MAX 25.2 is turning heads — and for good reason. This powerful update delivers industry-leading performance for large language models on NVIDIA GPUs, all without CUDA. MAX 25.2 builds on the momentum of 25.1 and introduces major upgrades to help you build GenAI systems that are faster, leaner, and easier to scale. Highlights include:
- State-of-the-art performance on H100 and H200 GPUs
- Multi-GPU support
- Enhanced LLM serving
- Ultra-slim containers for rapid deployment
- Mojo-powered GPU programming
- Advanced features like GPTQ quantization
Curious to see it in action? Join us at Modular HQ on April 24 for an evening dedicated to GPU programming with Mojo and MAX: Next-Gen GPU Programming: Hands-On with Mojo & MAX. Hear from Chris Lattner and Jack Clayton, meet fellow developers, and explore how these tools are reshaping what’s possible in AI.
And if you haven’t already, check out the latest community meeting recording—featuring critical updates to the MAX and Mojo license, a live demo of Kelvin, a typesafe dimensional analysis library for Mojo, and Q&A with the team.
Blogs, Tutorials, and Videos
- MAX 25.2 delivers industry-leading performance on NVIDIA GPUs, built from the ground up without CUDA, to power faster, more responsive, and more customizable GenAI deployments at scale. Dive into the release blog post.
- Catch up on the latest installments in Chris Lattner’s Democratizing AI Compute blog post series:
- In part 5, What about CUDA C++ alternatives like OpenCL?, we explore why C++ alternatives to CUDA failed to gain traction in AI, the challenges they faced, and key lessons for the future of GPU programming.
- Part 6, What about AI compilers (TVM and XLA)?, breaks down how kernel fusion boosts performance, why AI compilers struggle with GenAI workloads, and what prevented TVM and XLA from fully unlocking GPU performance or democratizing AI compute.
- Part 7, What about Triton and Python eDSLs?, examines Python eDSLs and takes a closer look at Triton, one of the most popular approaches in this space, as well as a few others.
- In part 8, What about the MLIR compiler infrastructure?, Chris Lattner shares the origin story of the MLIR compiler framework: how it started, what it changed, and the power struggles along the way.
- Ready to make your GPUs go brrr with Mojo? The Mojo Manual just got an upgrade — now with a deep dive into GPU programming, designed for everyone from experienced GPU developers to complete beginners.
- Check out our new recipes, which are step-by-step guides to deploying GenAI with MAX:
- GPU Functions in Mojo
- Custom Operations: An Introduction
- Custom Operations: Optimizing Matrix Multiplication
- Custom Operations: Applications In AI Models
- Code Executor Agent With E2B Sandbox
- Auto Documentation Agent For GitHub Repository
- AI Agent with DeepSeek-R1, AutoGen And MAX Serve
- Use AnythingLLM and Llama3.1 With MAX Serve
- The MAX 25.2 livestream brought it all: Mojo magic from Connor, MAX insights with Brad, founder vision from Tim, a recap of Modular at NVIDIA GTC, and live Q&A with the team.
- Missed Chris Lattner’s lightning talk at Modular’s NVIDIA GTC booth? The recording is available on YouTube. In just a few minutes, Chris breaks down what’s holding GPU programming back—and how MAX and Mojo are making it fast, scalable, and developer-friendly again.
- Community meeting #14 featured a Q&A with Chris Lattner on his Democratizing AI Compute blog post series, a demo from Brian on his Conway’s Game of Life MAX custom op, and a presentation from Hammad on CombustUI, a Mojo GUI library.
- During community meeting #15, we announced key updates to the MAX and Mojo license, and enjoyed a live demo of Kelvin, a new typesafe dimensional analysis library for Mojo.
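Part 6 above covers how kernel fusion boosts performance. As a rough, framework-agnostic illustration (plain Python, not MAX's actual compiler machinery), fusing two elementwise passes into one avoids materializing an intermediate buffer and touching memory twice:

```python
def unfused(xs):
    # Two separate "kernels": the first materializes an
    # intermediate buffer, the second makes a full extra
    # pass over that buffer (extra memory traffic).
    squared = [x * x for x in xs]
    return [s + 1.0 for s in squared]

def fused(xs):
    # One fused pass: each element is read once and the
    # combined expression is computed in place, with no
    # intermediate buffer.
    return [x * x + 1.0 for x in xs]

# Both produce identical results; fusion only changes
# how much memory traffic is needed to get there.
assert unfused([1.0, 2.0, 3.0]) == fused([1.0, 2.0, 3.0])
```

On real accelerators the win is larger still, since each unfused kernel also pays a launch and memory round-trip cost.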
Awesome MAX + Mojo
- jake-danton created a Mojo Gameboy emulator.
- forfudan launched DeciMojo, a correctly-rounded, fixed-point decimal arithmetic library in Mojo.
- bgreni shared ChronoFlare, a library for representing time intervals in a type-safe way, inspired by std::chrono::duration. bgreni also shared Kelvin, a typesafe dimensional analysis library for scientific computing.
- Samufi developed Larecs, a performance-oriented archetype-based ECS for Mojo, based on the Go ECS Arche.
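Kelvin and ChronoFlare both encode dimensions in the type system so that unit mismatches are caught early. As a rough illustration of the general idea only (a runtime Python sketch, not Kelvin's API; Kelvin enforces this at compile time using Mojo's parameter system), quantities can carry their dimension exponents and refuse to mix incompatibly:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    value: float
    # Exponents of base dimensions (metre, second);
    # e.g. velocity is (1, -1), area is (2, 0).
    dims: tuple

    def __add__(self, other):
        # Addition only makes sense for like dimensions.
        if self.dims != other.dims:
            raise TypeError("cannot add quantities with different dimensions")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        # Multiplication adds the dimension exponents.
        return Quantity(self.value * other.value,
                        tuple(a + b for a, b in zip(self.dims, other.dims)))

def metres(v): return Quantity(v, (1, 0))
def seconds(v): return Quantity(v, (0, 1))

d = metres(10.0) + metres(5.0)   # fine: both are lengths
# metres(1.0) + seconds(1.0) would raise TypeError
```

In Mojo, the dimension exponents can live in the type's parameters instead of a runtime field, turning the `TypeError` above into a compile-time error.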
New packages available in the Modular community channel via Magic:
- Mojo-websockets: a lightweight Mojo library for handling WebSocket connections
- Mimage: a Mojo library for reading images
- Mosaic: an open source computer vision Mojo library
- Decimojo: a comprehensive decimal and integer mathematics Mojo library
- Infrared: a geometric algebra Mojo library
Open-Source Contributions
If you’ve recently had your first PR merged, message Caroline Frasca in the forum to claim your epic Modular swag!
Check out the recently merged contributions from our valued community members:
- martinvuyk [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27][28][29][30][31][32][33][34][35][36][37][38][39][40][41][42]
- fnands [1]
- rd4com [1][2][3][4]
- soraros [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22]
- illiasheshyn [1][2][3]
- msaelices [1]
- Ahajha [1][2]
- bgreni [1][2][3]
- izo0x90 [1]
- christianbator [1]
- eltociear [1]
- kasmith11 [1]
- thatstoasty [1][2][3][4][5]
- KamilGucik [1]
Coming Up
Next-Gen GPU Programming: Hands-On with Mojo & MAX @ Modular HQ
Join fellow developers and tech enthusiasts for an evening exploring the future of GPU programming. Chris Lattner will give a talk on GPU programmability with Mojo and MAX, followed by a GPU programming demo from Jack Clayton.
The event will include time for discussion and networking. Refreshments and swag provided.