MAX unifies, extends, and superpowers your AI

MAX is an integrated, composable suite of products that simplifies your AI infrastructure so you can develop, deploy, and innovate faster.

Download & install in your terminal now

curl -s https://get.modular.com | sh -

Available on Linux & Windows (WSL) now, Mac 🍎 soon!

By downloading, you accept our Terms.

Modular Accelerated Xecution

MAX is composed of the MAX Engine, MAX Serving, and the Mojo programming language – everything you need to deploy low-latency, high-throughput inference pipelines into production.

Unified, simple infrastructure

MAX simplifies inference by unifying AI development frameworks and hardware backends.

Unparalleled performance

MAX delivers industry-leading latency and efficiency gains, helping you productionize larger models and lower costs.

Just works

MAX works out of the box without asking you to rewrite your stack or configure a bunch of knobs.

Plug into what you already use.

Drop-in compatible with all your existing models, including all AI framework ops, quantized types, dynamic shapes, and your custom ops.

Integrated with industry-standard inference servers, including Triton, with seamless deployment to existing cloud systems such as Kubernetes.

Interoperable with your existing Python and C/C++ programs, including standard data libraries such as pandas and NumPy.

Support your whole pipeline with just one set of tools

MAX provides a composable set of technologies that optimize your end-to-end inference pipeline, from input processing, to model execution and optimization, to deploying to production.

[Diagram: Mojo, trained model, Mojo custom operations, cloud, MAX Serving]

Optimize your model input loading and transformations with Mojo 🔥

Models aren’t always the bottleneck — rewrite your data loading and input processing (e.g., tokenization) with high-performance Mojo code and get more out of your models.

Discover Mojo 🔥

Execute any model on any hardware with SOTA performance on the MAX Engine

Execute models from any AI framework (e.g., PyTorch) on any AI hardware (e.g., AMD CPU) with the MAX Engine to achieve unparalleled out-of-the-box latency and throughput wins.

Discover MAX Engine

Extend your models with custom operations using Mojo 🔥

Use Mojo to extend your model with custom operations that MAX Engine can natively analyze and fuse, creating a highly optimized model for incredible speed.

Discover Mojo 🔥

Streamline deployment to any cloud service with MAX Serving

Deploy MAX Engine to any cloud service, with full interoperability with existing inference systems such as Triton and support for dynamic batching, load balancing, and more.

Discover MAX Serving
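
Curious what dynamic batching means in practice? Here is a minimal, generic sketch in Python of the idea: incoming requests are grouped so the model runs once per batch, and a batch is closed when it is full or when its oldest request has waited too long. This is not MAX Serving's or Triton's implementation; `Request`, `dynamic_batches`, `max_batch_size`, and `max_wait_ms` are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    arrival_ms: float  # time the request reached the server (hypothetical field)
    payload: str       # stand-in for the real model input


def dynamic_batches(
    requests: List[Request], max_batch_size: int, max_wait_ms: float
) -> List[List[Request]]:
    """Group requests into batches (illustrative only, not a real server API).

    A batch is closed when it reaches max_batch_size, or when the next
    request arrives more than max_wait_ms after the batch's first request.
    """
    batches: List[List[Request]] = []
    current: List[Request] = []
    for req in sorted(requests, key=lambda r: r.arrival_ms):
        # The oldest request in the open batch has waited long enough: flush.
        if current and req.arrival_ms - current[0].arrival_ms > max_wait_ms:
            batches.append(current)
            current = []
        current.append(req)
        # The batch is full: flush immediately.
        if len(current) == max_batch_size:
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches


# Simulated traffic: a burst of four requests, then two more much later.
arrivals = [0, 1, 2, 3, 50, 51]
traffic = [Request(arrival_ms=t, payload=f"input-{t}") for t in arrivals]
batch_sizes = [len(b) for b in dynamic_batches(traffic, max_batch_size=3, max_wait_ms=10)]
print(batch_sizes)
```

Larger batches generally improve throughput, while the wait limit caps the latency any single request pays for being batched. A production server tunes both against real traffic.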

MAX can be downloaded free for local development and experimentation, and deployed through our Cloud SaaS for production usage.

License         Non-commercial usage    Production usage
Pricing         Free                    Consumption based
Availability    Download now            Early access

Get Started | Contact Sales

Ready to get started?

Sign up to gain access to Modular’s infrastructure.

Read the docs