Apple has released new technical details on the foundation models powering Apple Intelligence, its suite of generative AI features introduced across iOS, macOS, and other platforms. The report outlines the development of two core language models: a ~3B-parameter on-device model optimized for Apple silicon, and a server-side model utilizing a novel Parallel-Track Mixture-of-Experts (PT-MoE) architecture, which runs on Apple’s Private Cloud Compute infrastructure.
Both models support multilingual and multimodal input and are designed to strike a balance between performance, efficiency, and privacy. According to the report, the models were trained using responsibly sourced web data, licensed corpora, and synthetic datasets, then fine-tuned with supervised learning and reinforcement learning from human feedback (RLHF). The on-device model supports 16 languages, while broader multilingual support is being phased in.
Privacy-Preserving AI at Scale
Apple’s architectural choices emphasize efficiency and user privacy. The on-device model uses KV-cache sharing and 2-bit quantization-aware training, which let it run efficiently on Apple silicon. Meanwhile, the server model uses a track-parallel transformer design combined with MoE layers to reduce synchronization overhead while maintaining accuracy. Both models have been compressed to reduce memory usage without significantly compromising performance.
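To make the idea of 2-bit weights concrete, here is a minimal Python sketch of "fake quantization", the operation that quantization-aware training typically applies in the forward pass. It is an illustration only: the four-level grid, the `scale` value, and the function name `fake_quantize_2bit` are assumptions chosen for this example, not details confirmed by the article.

```python
def fake_quantize_2bit(weights, scale):
    """Snap each weight to the nearest of four levels, then map it back to a float.

    Two bits can encode only four distinct values, so every weight ends up
    as one of four multiples of `scale`. The grid below is an assumed,
    symmetric example, not a documented Apple setting.
    """
    levels = (-1.5, -0.5, 0.5, 1.5)  # assumption: 4 levels = 2 bits per weight
    quantized = []
    for w in weights:
        # Pick the grid level closest to the rescaled weight.
        nearest = min(levels, key=lambda q: abs(w / scale - q))
        quantized.append(nearest * scale)
    return quantized


weights = [0.31, -0.07, 0.92, -0.55]   # hypothetical full-precision weights
scale = 0.4                            # hypothetical per-tensor scale
quantized = fake_quantize_2bit(weights, scale)
```

With a scale of 0.4, the four example weights land at roughly 0.2, -0.2, 0.6, and -0.6. Training against these rounded values, rather than rounding only after training, is what gives the model a chance to adapt to the precision loss.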
Apple’s Private Cloud Compute allows the server-side model to process requests securely, with guarantees that user data remains protected. No personal data or user interactions are used to train the models.
Multilingual Focus and Tool Use
With increased domain-specific data and enhanced sampling strategies, the models now support expanded multilingual understanding, including OCR and text-rich image interpretation across 15 languages. During evaluation, the on-device model performed well against comparably sized models like Qwen-2.5 and Gemma-3, while the server model showed competitive results against Llama 4 Scout, albeit behind much larger proprietary systems like GPT-4o.
The models also enable structured tool-calling and guided generation. Through Apple’s new Foundation Models framework, developers can build AI-powered features directly into their apps with minimal code, benefiting from structured decoding and safety-focused defaults.
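Structured tool-calling generally means the model emits a machine-readable request that the app validates before running any code. The Python sketch below shows that pattern in miniature. It is not the Foundation Models framework API, which is a Swift framework; the JSON shape, the `TOOLS` registry, and the `get_weather` stub are all hypothetical names invented for illustration.

```python
import json


def get_weather(city: str) -> str:
    """Stub tool that returns a canned answer (hypothetical)."""
    return f"Sunny in {city}"


# Hypothetical registry: tool name -> (expected argument types, handler).
TOOLS = {"get_weather": ({"city": str}, get_weather)}


def dispatch(model_output: str) -> str:
    """Parse a JSON tool call, validate it against the registry, and run it."""
    call = json.loads(model_output)
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name = call.get("tool")
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    schema, handler = TOOLS[name]
    args = call.get("arguments")
    if not isinstance(args, dict) or set(args) != set(schema):
        raise ValueError(f"expected arguments {sorted(schema)}")
    for key, expected_type in schema.items():
        if not isinstance(args[key], expected_type):
            raise ValueError(f"argument {key!r} must be {expected_type.__name__}")
    return handler(**args)


# Simulated model output; a real model would generate this string.
result = dispatch('{"tool": "get_weather", "arguments": {"city": "Paris"}}')
```

Guided or structured decoding aims to make the validation step redundant by constraining the model so it can only produce output that matches the declared schema, but checking at the boundary remains a reasonable default in a sketch like this one.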
Built with Responsible AI Principles
The report reinforces Apple’s Responsible AI commitment, outlining safety guardrails, locale-specific evaluation, and content moderation protocols. As Apple expands support to new regions and languages, it continues to tailor risk mitigation strategies and evaluation metrics to local norms and user expectations.
This update marks a significant step in Apple’s AI roadmap, positioning its models not only as privacy-conscious but also as increasingly capable in complex multilingual, multimodal contexts.

