All Articles

Product

MAX 25.1 - Introducing MAX Builds

February 18, 2025

Modular Team

Industry

How did CUDA succeed? (Democratizing AI Compute, Part 3)

If we as an ecosystem hope to make progress, we need to understand how the CUDA software empire became so dominant.

February 12, 2025

Chris Lattner

Product

Paged Attention & Prefix Caching Now Available in MAX Serve

February 6, 2025

Ehsan M. Kermani

Industry

What exactly is “CUDA”? (Democratizing AI Compute, Part 2)

February 5, 2025

Chris Lattner

Industry

DeepSeek's Impact on AI (Democratizing AI Compute, Part 1)

Part 1 of a series exploring the future of hardware acceleration for AI beyond CUDA, framed by the release of DeepSeek.

January 30, 2025

Chris Lattner

Engineering

Agentic Building Blocks: Creating AI Agents with MAX Serve and OpenAI Function Calling

January 30, 2025

Ehsan M. Kermani

Developer

Use MAX with Open WebUI for RAG and Web Search

Learn how quickly MAX and Open WebUI get you up and running with RAG, web search, and Llama 3.1 on GPU.

January 23, 2025

Bill Welense

Developer

Hands-on with Mojo 24.6

Mojo 24.6 introduces key improvements in argument conventions, memory management, and reference tracking, enhancing code clarity and safety with features like 'mut' for mutable arguments, 'origins' for references, and new collection types.

January 21, 2025

Ehsan M. Kermani

Developer

Evaluating Llama Guard with MAX 24.6 and Hugging Face

Imagine unlocking a world of open innovation while ensuring secure, reliable, and enterprise-ready Gen AI deployments—MAX 24.6 enables enterprise AI teams to seamlessly run a vast range of cutting-edge AI models from Hugging Face on NVIDIA GPUs.

December 19, 2024

Bill Welense

Engineering

MAX GPU: State of the Art Throughput on a New GenAI platform

Measuring state-of-the-art GPU throughput on Modular's MAX 24.6, compared to vLLM.

December 17, 2024

Max Hutchinson

Tyler Kenney
