Engineering Articles

Modular natively supports dynamic shapes for AI workloads

Today’s AI infrastructure is difficult to evaluate, so many teams converge on simple, quantifiable metrics like QPS, latency, and throughput. This is one reason today’s AI industry is rife with bespoke tools that deliver high performance on benchmarks but pose significant usability challenges in real-world AI deployment scenarios.

June 22, 2023 / Eric Johnson, Kate Caldwell

The world's fastest unified matrix multiplication

In this post, we describe Modular’s approach to solving this problem and its game-changing benefits, including a new standard in state-of-the-art (SOTA) performance on CPU as compared to existing solutions.

April 20, 2023 / Abdul Dakkak, Chad Jarvis, Eric Johnson, Hengjie Wang, Ian Tramble

AI’s compute fragmentation: what matrix multiplication teaches us

AI is powered by a virtuous circle of data, algorithms (“models”), and compute. Growth in one drives needs in the others and can dramatically affect the developer experience in areas like usability and performance. Today, we have more data and more AI model research than ever before, but compute isn’t scaling at the same speed due to … well, physics.

March 23, 2023 / Eric Johnson, Abdul Dakkak, Chad Jarvis

If AI serving tech can’t solve today’s problems, how do we scale into the future?

The technological progress that has been made in AI over the last ten years is breathtaking — from AlexNet in 2012 to the recent release of ChatGPT, which has taken large foundational models and conversational AI to another level.

December 8, 2022 / Eric Johnson, Tim Davis

Part 2: Increasing development velocity of giant AI models

The first four requirements address one fundamental problem with how we've been using MLIR: weights are constant data, but shouldn't be managed like other MLIR attributes. Until now, we've been trying to place a square peg into a round hole, creating a lot of wasted space that's costing us development velocity (and, therefore, money for users of the tools).

November 10, 2022 / Abdul Dakkak, Eric Johnson

Increasing development velocity of giant AI models

Machine learning models are getting larger and larger — some might even say, humongous. The world’s most advanced technology companies have been in an arms race to see who can train the largest model (MUM, OPT, GPT-3, Megatron), while other companies focused on production systems have scaled their existing models to great effect. Through all the excitement, what’s gone unsaid is the myriad of practical challenges larger models present for existing AI infrastructure and developer workflows.

August 12, 2022 / Eric Johnson
