January 22, 2025

Modverse #45: MAX 24.6, featuring our first release of MAX GPU

Three years ago, we began reimagining AI development by rebuilding its infrastructure to be more performant, programmable, and portable. Just a few weeks ago, we introduced MAX 24.6, featuring MAX GPU: a preview of the first vertically integrated generative AI serving stack that eliminates the dependency on vendor-specific libraries like NVIDIA CUDA.

MAX GPU is built on two groundbreaking technologies:

  • MAX Engine: A high-performance AI model compiler and runtime supporting vendor-agnostic Mojo GPU kernels for NVIDIA GPUs.
  • MAX Serve: A Python-native serving layer engineered for LLMs, handling complex request batching and scheduling for reliable performance under heavy workloads.
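Because MAX Serve exposes an OpenAI-compatible REST API, you can talk to it with a standard chat-completions request. The sketch below is illustrative, not official client code: the endpoint URL, port, and model name are assumptions for a locally running MAX Serve instance, so adjust them to your deployment.

```python
import json
import urllib.request

# Assumed local MAX Serve endpoint and model name; adjust for your setup.
SERVER_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "modularai/Llama-3.1-8B-Instruct-GGUF"

def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completions payload for MAX Serve."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

if __name__ == "__main__":
    # Sending the request requires a running MAX Serve instance.
    payload = build_chat_request("What is MAX GPU?")
    req = urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Since the API shape matches OpenAI's, existing OpenAI client libraries can also be pointed at the server by overriding the base URL.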

We appreciate the community's feedback and questions about this exciting release. As we refine MAX GPU, please keep sharing your thoughts, ideas, and experiences in the forum and on Discord. Your input plays a key role in shaping the future of this technology.

Blogs, Tutorials, and Videos

  • Community Meeting #11 was a special event called Modular milestones: GPUs, 2024 reflections, and the road ahead. Chris Lattner and team shared updates on MAX 24.6 and MAX GPU, our Mojo open source approach, the standard library contribution policy, and more.
  • In Community Meeting #12, Max Hutchinson and Tyler Kenney covered MAX GPU benchmarking, Brad Larson shared his new MAX-powered image processing framework, MAX-CV, and the team answered questions from the community.
  • MAX 24.6 is here, featuring MAX GPU, the first vertically integrated gen AI serving stack, with state-of-the-art (SOTA) performance on NVIDIA A100 and support for deployment across all major clouds. Check out the announcement.
  • MAX GPU preview delivers SOTA LLM throughput performance on NVIDIA A100s. Dig into our benchmarking methodology, results, and takeaways with our blog post.
  • Want to see MAX 24.6 in action? Build your own GPU-accelerated chat app with MAX Serve and Llama 3.1 by following our blog post.
  • Mojo 24.6 brings important changes to argument conventions and lifetime management, making Mojo’s memory and ownership model more intuitive while maintaining strong safety guarantees. Deep dive into the exciting Mojo 24.6 updates with our blog post.

Awesome MAX + Mojo

Want to learn more about these projects? Chat with their creators in the #community-showcase channel of our Discord server or the Community Showcase category of our forum.

[Image: HEPJo's mascot]

Open-Source Contributions

If you’ve recently had your first PR merged, message Caroline Frasca (@Caroline_Frasca) on Discord or the forum to claim your epic Mojo swag!

Check out the recently merged contributions from our valued community members:

Coming Up

Democratize Intelligence Summit

Chris Lattner will speak at the second Democratize Intelligence Summit this Friday, January 24th, at 11 AM PT in San Francisco. Chris will also participate in a panel discussion at 12:35 PM PT.

Modular Community Meeting

Our next community meeting will take place on February 3rd at 10 AM PT. RSVP in Discord.



Caroline Frasca

Technical Community Manager