Blog

Democratizing AI Compute Series
Go behind the scenes of the AI industry with Chris Lattner

Modular 25.6: Unifying the latest GPUs from NVIDIA, AMD, and Apple
We’re excited to announce Modular Platform 25.6 – a major milestone in our mission to build AI’s unified compute layer. With 25.6, we’re delivering the clearest proof yet of our mission: a unified compute layer that spans from laptops to the world’s most powerful datacenter GPUs. The platform now delivers:

Modular Platform 25.5: Introducing Large Scale Batch Inference
Modular Platform 25.5 is here, and introduces Large Scale Batch Inference: a highly asynchronous, at-scale batch API built on open standards and powered by Mammoth. We're launching this new capability through our partner SF Compute, enabling high-volume AI performance with a fast, accurate, and efficient platform that seamlessly scales workloads across any hardware.

Modular 25.4: One Container, AMD and NVIDIA GPUs, No Lock-In
We're excited to announce Modular Platform 25.4, a major release that brings the full power of AMD GPUs to our entire platform. This release marks a major leap toward democratizing access to high-performance AI by enabling seamless portability to AMD GPUs.
Sign up for our newsletter
Get all our latest news, announcements and updates delivered directly to your inbox. Unsubscribe at anytime.
Thank you for your submission.
Your report has been received and is being reviewed by the Sales team. A member from our team will reach out to you shortly.
Thank you,
Modular Sales Team
Start building with Modular
Quick start resources
Get started guide
With just a few commands, you can install MAX as a conda package and deploy a GenAI model on a local endpoint.
Browse open source models
500+ supported models, most of which have been optimized for lightning fast speed on the Modular platform.
Find examples
Follow step by step recipes to build Agents, chatbots, and more with MAX.