Meta MTIA AI Chips: Scaling AI Infrastructure for Billions
The Meta Training and Inference Accelerator (MTIA) is Meta's family of custom AI chips, developed in partnership with Broadcom, that powers AI experiences across Meta's platforms. The chips are engineered to serve diverse AI models at global scale while keeping infrastructure costs under control.
Key Features
- Four Generations in Two Years: MTIA 300, 400, 450, and 500 chips developed with rapid iteration cycles
- Modular Chiplet Design: Reusable building blocks for compute, I/O, and networking components
- PyTorch-Native Software Stack: Seamless integration with industry-standard ML frameworks (see the sketch after this list)
- High-Velocity Development: Approximately six-month chip development cycles to adapt to evolving AI techniques
- Inference-First Optimization: Specifically tuned for GenAI inference workloads while supporting training
- Hardware Innovations: High HBM bandwidth, low-precision data types, and hardware acceleration for attention and FFN computation
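As a minimal sketch of what the PyTorch-native claim means in practice: models target the accelerator through PyTorch's standard device and `torch.compile` entry points rather than a bespoke API. Recent PyTorch releases do ship a `torch.mtia` backend, but whether a given build exposes it is an assumption here, so the snippet falls back to CPU and runs anywhere.

```python
import torch
import torch.nn as nn

# Select the MTIA device when this PyTorch build exposes the torch.mtia
# backend; otherwise fall back to CPU so the sketch runs anywhere.
device = "mtia" if getattr(torch, "mtia", None) and torch.mtia.is_available() else "cpu"

# A transformer-style FFN block -- the kind of computation the bullet
# above says MTIA accelerates in hardware.
ffn = nn.Sequential(
    nn.Linear(512, 2048),
    nn.GELU(),
    nn.Linear(2048, 512),
).to(device)

# Standard PyTorch 2.x compilation entry point; the backend compiler for
# the selected device takes over, with no accelerator-specific model code.
ffn = torch.compile(ffn)

x = torch.randn(8, 512, device=device)
with torch.no_grad():
    print(ffn(x).shape)  # torch.Size([8, 512])
```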
Use Cases
- Ranking and Recommendation Systems: Powering personalized content delivery for billions of users
- Generative AI Inference: Efficiently running large language models such as Llama (a usage sketch follows this list)
- AI-Powered Experiences: Supporting AI assistants, content recommendations, and other AI features across Meta's platforms
- Production AI Infrastructure: Deployed at scale in Meta's data centers with hundreds of thousands of chips
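To make the Llama-inference use case concrete, here is a minimal sketch using vLLM's standard Python API (which the specs below say the stack supports). The model name is illustrative, and hardware selection is handled by the vLLM build rather than by this code; that an MTIA-backed build behaves identically at this level is an assumption.

```python
from vllm import LLM, SamplingParams

# Model name is illustrative; any Llama checkpoint served by the stack would do.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

params = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["Summarize what an inference accelerator does."], params)

# vLLM returns one RequestOutput per prompt, each holding its completions.
print(outputs[0].outputs[0].text)
```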
Technical Specifications
The MTIA family has advanced rapidly: from MTIA 300 to MTIA 500, HBM bandwidth increased 4.5x and compute FLOPS improved 25x. The chips feature built-in NIC chiplets, dedicated message engines for communication collectives, and near-memory compute capabilities. The software stack includes PyTorch integration, vLLM support, a Triton-based compiler, and production-grade monitoring tools.
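Since the stack's compiler is described as Triton-based, the canonical vector-add kernel from Triton's tutorials illustrates the programming model such a compiler consumes. Whether this exact kernel runs on MTIA is an assumption here; executing it requires a Triton backend for the hardware at hand (on a stock install, typically a CUDA GPU).

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    # One program per 1024-element block of the input.
    grid = (triton.cdiv(n, 1024),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```

The kernel itself is hardware-agnostic Triton; retargeting it is the backend compiler's job, which is the point of building an accelerator toolchain on Triton.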

