Mojo SDK v0.5 is now available for download and includes exciting new features. In this blog post, I’ll discuss what these features are and how to use them, with code examples. ICYMI, in last week’s Modular community livestream we dove deep into all things Mojo SDK v0.5, with live demos of the examples shared in this blog post, and answered your questions live. If you missed it, check out the recording here:
VIDEO
And don't forget to register for Modular’s first-ever annual conference, ModCon, happening on Dec 4th in San Francisco. Register now if you want to meet the team in person, get access to workshops and panels, epic swag, and even more surprises! 🔥
Update your Mojo🔥 SDK
Before we dive into examples and new features, make sure you’re running the latest version. If you already have Mojo SDK v0.4 installed, run modular update mojo to get the latest release. If you don’t have Mojo, or have an earlier version, follow the getting started instructions in the documentation. For a complete list of what’s new, what changed, and what’s fixed in this release, I recommend reviewing the changelog. In this blog post I’ll focus on the following 5 features:
- Keyword parameters and keyword arguments
- Automatic parameterization of functions
- Tensor enhancements: load and save to file, print() works on Tensor types
- String enhancements: new count(), find() and replace() functions
- Benchmark enhancements: new print() function to print benchmark reports, and the ability to take non-capturing functions

Let’s take a closer look at these features with examples. All the code examples shared below are available in a single Jupyter notebook here. To access it:
Bash
git clone https://github.com/modularml/mojo.git
cd mojo/examples/blogs-videos/
If you want to follow along, open whats_new_v0.5.ipynb in Visual Studio Code or JupyterLab and run each cell as I discuss the features below.
New feature: Keyword parameters and keyword arguments
Mojo SDK v0.3 first introduced Python-style keyword arguments, which let you specify argument default values and pass values by keyword argument name. In the current v0.5 release, you can do the same with keyword parameters. If you’re a Python user who’s new to Mojo, you might ask: “Aren’t parameters and arguments the same thing?” In Mojo🔥 they mean different things. Unlike Python, Mojo is a compiled language, and parameters in Mojo represent compile-time values or types, whereas arguments represent runtime values. In code, we differentiate them by putting compile-time parameters in square brackets and runtime arguments in parentheses.
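To make the distinction concrete, here’s a minimal sketch; the repeat function is just an illustrative example and isn’t part of the examples notebook. The parameter count is fixed at compile time and passed in square brackets, while the argument msg is a runtime value passed in parentheses.
Mojo
fn repeat[count: Int](msg: String):
    # count is a compile-time parameter; msg is a runtime argument
    for i in range(count):
        print(msg)

# parameter in square brackets, argument in parentheses
repeat[3]("Hello Mojo🔥!")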
Below I have a simple Mojo struct called SquareMatrix. A struct in Mojo is similar to a class in Python, but Mojo structs are static and compile-time bound, unlike Python classes, which are dynamic and allow changes at runtime. Here the SquareMatrix struct, as the name suggests, creates a square matrix by restricting the shape of the Tensor type during initialization. When initialized, it creates a square matrix with dimension dim, a compile-time value, and fills the matrix with val, a runtime value. SquareMatrix also defines a function called print() to print the underlying tensor.
Mojo
from tensor import Tensor
from algorithm import vectorize
from sys.info import simdwidthof  # for querying the SIMD width of a dtype

struct SquareMatrix[dtype: DType = DType.float32, dim: Int = 4]():
    var mat: Tensor[dtype]

    fn __init__(inout self, val: SIMD[dtype, 1] = 5):
        self.mat = Tensor[dtype](self.dim, self.dim)
        alias simd_width = simdwidthof[dtype]()

        @parameter
        fn fill_val[simd_width: Int](idx: Int) -> None:
            self.mat.simd_store(idx, self.mat.simd_load[simd_width](idx).splat(val))

        vectorize[simd_width, fill_val](self.mat.num_elements())

    fn __getitem__(self, x: Int, y: Int) -> SIMD[dtype, 1]:
        return self.mat[x, y]

    fn print(self):
        print(self.mat)
Let’s instantiate SquareMatrix with the default keyword parameters and print the result. Notice that we didn’t specify any parameters or arguments.
Mojo
SquareMatrix().print()
Output:
Bash
Tensor([[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0]], dtype=float32, shape=4x4)
If you take a closer look at the SquareMatrix definition, you’ll see that we use the new keyword parameter feature to specify default keyword parameters: struct SquareMatrix[dtype: DType = DType.float32, dim: Int = 4]()
- dtype: with default value DType.float32
- dim: with default value 4

Since we don’t provide any parameters or arguments, all the default values are used, i.e. val=5.0, dtype=DType.float32, and dim=4.
Notice also that in SquareMatrix’s print() function we call print(self.mat), where self.mat is a Tensor type. In this new release, the print() function now works on Tensor types!
Let’s try a few different combinations of inputs. We can specify only the arguments, either positionally or by keyword:
Mojo
SquareMatrix(10).print()
#or
SquareMatrix(val=10).print()
Output:
Bash
Tensor([[10.0, 10.0, 10.0, 10.0],
[10.0, 10.0, 10.0, 10.0],
[10.0, 10.0, 10.0, 10.0],
[10.0, 10.0, 10.0, 10.0]], dtype=float32, shape=4x4)
Or specify a combination of keyword parameters and keyword arguments:
Mojo
SquareMatrix[DType.float64](10).print()
Output:
Bash
Tensor([[10.0, 10.0, 10.0, 10.0],
[10.0, 10.0, 10.0, 10.0],
[10.0, 10.0, 10.0, 10.0],
[10.0, 10.0, 10.0, 10.0]], dtype=float64, shape=4x4)
And, just like with Python arguments, you can specify both positional and keyword parameters:
Mojo
SquareMatrix[DType.float64,dim=3](1).print()
Output:
Bash
Tensor([[1.0, 1.0, 1.0],
[1.0, 1.0, 1.0],
[1.0, 1.0, 1.0]], dtype=float64, shape=3x3)
You can also specify keyword arguments in the __getitem__() dunder method:
Mojo
let sm = SquareMatrix()
sm.print()
print()
print('Keyword argument in __getitem__()')
print(sm[x=0, y=3])
Output:
Bash
Tensor([[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0]], dtype=float32, shape=4x4)
Keyword argument in __getitem__()
5.0
New feature: Automatic parameterization of functions
Mojo also adds support for automatic parameterization of functions. If you have a function that takes an argument whose type has parameters, those parameters are automatically added to the function as function parameters. For example:
fn multiply(sm: SquareMatrix, val: SIMD[sm.dtype,1])
Is equivalent to:
fn multiply[dtype: DType = DType.float32, dim: Int = 4](sm: SquareMatrix[dtype, dim], val: SIMD[dtype, 1])
Also, notice that the parameters of a function argument can now be referenced within the function’s signature, as with sm.dtype above, since those parameters are automatically added to the function. This enables you to write clean-looking code. This feature is better explained with an example, so let’s implement the multiply function that takes a SquareMatrix sm and a floating-point value val as function arguments, scales all the values in the matrix by val, i.e. sm*val, and returns the scaled matrix.
Mojo
from math import mul

fn multiply(sm: SquareMatrix, val: SIMD[sm.dtype, 1]) -> Tensor[sm.dtype]:
    alias simd_width: Int = simdwidthof[sm.dtype]()
    let result_tensor = Tensor[sm.dtype](sm.mat.shape())

    @parameter
    fn vectorize_multiply[simd_width: Int](idx: Int) -> None:
        result_tensor.simd_store[simd_width](idx, mul[sm.dtype, simd_width](sm.mat.simd_load[simd_width](idx), val))

    vectorize[simd_width, vectorize_multiply](sm.mat.num_elements())
    return result_tensor

fn main():
    let sm = SquareMatrix(5)
    let res = multiply(sm, 100.0)
    print(res)

main()
The multiply function above is automatically parameterized with the parameters of SquareMatrix, so we don’t have to specify them. To access SquareMatrix’s parameters, we can use the sm variable: sm.dtype, sm.dim.
Output:
Bash
Tensor([[500.0, 500.0, 500.0, 500.0],
[500.0, 500.0, 500.0, 500.0],
[500.0, 500.0, 500.0, 500.0],
[500.0, 500.0, 500.0, 500.0]], dtype=float32, shape=4x4)
New feature: Tensor and String enhancements
The Tensor type in the Mojo standard library allows us to work with n-dimensional arrays, and in this release it supports loading and saving tensors to disk as bytes. String manipulation also gets much easier with the new count(), find() and replace() functions. As before, let’s take a look at an example to see how to use them.
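Before extending SquareMatrix, here’s a minimal standalone sketch of the three new String functions; the file name below is just an illustrative value, and the expected results are shown in the comments.
Mojo
let s: String = "saved_matrix.data"
print(s.count("."))             # 1: number of '.' occurrences
print(s.find("."))              # 12: index of the first '.'
print(s.replace(".", "_4x4."))  # saved_matrix_4x4.data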
We’ll extend our SquareMatrix struct with new functions: prepare_filename(), which demonstrates the new String features, and save() and load(), which demonstrate saving and loading a tensor. The code below only shows the functions we added to SquareMatrix; for the full example, refer to the notebook that accompanies this blog post.
Mojo
from tensor import Tensor
from algorithm import vectorize
from time import now
from memory import memcpy

struct SquareMatrix[dtype: DType = DType.float32, dim: Int = 4]():
    var mat: Tensor[dtype]

    # ...
    # ...
    # ...

    fn prepare_filename(self, fname: String) -> String:
        var fpath = fname
        if fpath.count('.') < 2:
            fpath += '.data'
        fpath = fpath.replace(".", "_" + self.mat.spec().__str__() + ".")
        if fpath.find('/') == -1:
            fpath = './' + fpath
        return fpath

    fn save(self, fname: String = 'saved_matrix') raises -> String:
        let fpath = self.prepare_filename(fname)
        self.mat.tofile(fpath)
        print('File saved:', fpath)
        return fpath

    @staticmethod
    fn load[dtype: DType, dim: Int](fpath: String) raises -> Tensor[dtype]:
        let load_mat = Tensor[dtype].fromfile(fpath)
        let new_tensor = Tensor[dtype](dim, dim)
        memcpy(new_tensor.data(), load_mat.data(), load_mat.num_elements())
        _ = load_mat
        return new_tensor
Let’s start with saving a Tensor:
Mojo
let m = SquareMatrix()
m.print()
let fpath = m.save('saved_matrix')
Output:
Bash
Tensor([[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0]], dtype=float32, shape=4x4)
File saved: ./saved_matrix_4x4xfloat32.data
The save() function takes in a file name and calls the prepare_filename() function to convert the file name into a file path with an extension, then saves the tensor to disk using the tofile() function. Note: tofile() does not preserve the Tensor shape, so the data is saved as a 1-dimensional tensor. If you know the shape ahead of time, you can reshape it back to the original shape, as we do in the load() function.
In prepare_filename() we use the new count(), find() and replace() functions. We use:

- count() to count the occurrences of '.' in the string, to check whether the provided file name already has an extension
- replace() to replace '.' with '_' + the tensor's spec (shape and dtype) + '.'
- find() to check whether the path already contains a '/' and, if not, prepend './' so the tensor is saved in the current directory

Note: I created the prepare_filename() function purely to demonstrate the new String features. There are likely easier and more efficient ways to do the same without using count(), replace() and find() the way I do.
Now, let’s load the file we just saved and reshape it to the original shape in SquareMatrix’s load() function.
Mojo
print('Loading Tensor from file:',fpath)
print()
let load_mat = SquareMatrix.load[DType.float32,4](fpath)
print(load_mat)
Output:
Bash
Loading Tensor from file: ./saved_matrix_4x4xfloat32.data
Tensor([[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0]], dtype=float32, shape=4x4)
We defined our load() function as a static method that takes the type and dimensions as parameters and the file path as an argument, and uses Tensor’s fromfile() function to load the data. To reshape the tensor, we create a new tensor with the desired dimensions and copy the data into it.
New feature: Benchmark enhancements
To demonstrate Benchmark’s new reporting feature, we’ll use a computationally intensive example that calculates the row-wise mean() of a matrix with a few rows and a large number of columns.
First, we’ll compute this naively using nested loops and then in a performant way by vectorizing across columns and parallelizing across rows. We’ll print benchmark reports for both using the new report printing feature and show speedups.
Mojo
from tensor import Tensor
from random import rand
import benchmark
from time import sleep
from algorithm import vectorize, parallelize
from sys.info import simdwidthof  # for querying the SIMD width of a dtype

alias dtype = DType.float32
alias simd_width = simdwidthof[DType.float32]()

fn row_mean_naive[dtype: DType](t: Tensor[dtype]) -> Tensor[dtype]:
    var res = Tensor[dtype](t.dim(0), 1)
    for i in range(t.dim(0)):
        for j in range(t.dim(1)):
            res[i] += t[i, j]
        res[i] /= t.dim(1)
    return res

fn row_mean_fast[dtype: DType](t: Tensor[dtype]) -> Tensor[dtype]:
    var res = Tensor[dtype](t.dim(0), 1)

    @parameter
    fn parallel_reduce_rows(idx1: Int) -> None:
        @parameter
        fn vectorize_reduce_row[simd_width: Int](idx2: Int) -> None:
            res[idx1] += t.simd_load[simd_width](idx1 * t.dim(1) + idx2).reduce_add()

        vectorize[2 * simd_width, vectorize_reduce_row](t.dim(1))
        res[idx1] /= t.dim(1)

    parallelize[parallel_reduce_rows](t.dim(0), t.dim(0))
    return res

fn main():
    let t = rand[dtype](1000, 100000)
    var result = Tensor[dtype](t.dim(0), 1)

    @parameter
    fn bench_mean():
        _ = row_mean_naive(t)

    @parameter
    fn bench_mean_fast():
        _ = row_mean_fast(t)

    let report = benchmark.run[bench_mean]()
    let report_fast = benchmark.run[bench_mean_fast]()
    report.print()
    report_fast.print()
    print("Speed up:", report.mean() / report_fast.mean())

main()
Benchmark can now print easy-to-read reports with the average time, total time, iterations, and other useful benchmark details. On my Apple M2 Pro with 12 cores, I see a ~52x speedup with the vectorized and parallelized implementation vs. the naive nested-loop implementation, both in pure Mojo.
Output:
Bash
---------------------
Benchmark Report (s)
---------------------
Mean: 0.360315
Total: 1.080945
Iters: 3
Warmup Mean: 0.36441600000000002
Warmup Total: 0.72883200000000004
Warmup Iters: 2
Fastest Mean: 0.360315
Slowest Mean: 0.360315
---------------------
Benchmark Report (s)
---------------------
Mean: 0.006859210256410256
Total: 1.3375459999999999
Iters: 195
Warmup Mean: 0.010933
Warmup Total: 0.021866
Warmup Iters: 2
Fastest Mean: 0.0068472000000000003
Slowest Mean: 0.0068707272727272723
Speed up: 52.530099899367947
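As noted in the feature list at the top, benchmark.run can now also take non-capturing functions, not just the @parameter closures used above. Here’s a minimal sketch, where work() is a hypothetical stand-in for whatever you want to time:
Mojo
import benchmark
from time import sleep

fn work():
    # a non-capturing fn: it references no variables from an enclosing scope,
    # so it can be passed to benchmark.run without an @parameter closure
    sleep(0.001)

fn main():
    let report = benchmark.run[work]()
    report.print()

main()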
But wait, there is more!
In this blog post I shared several new features, but this release also includes enhancements to the SIMD type, TensorShape and TensorSpec, and a host of bug fixes. Check out the changelog for a full list of what’s new, what’s changed, and what’s fixed in this release: https://docs.modular.com/mojo/changelog.html
And, don’t forget to watch the recording of our Mojo SDK v0.5 demo and Q&A livestream with Modular engineers. If you prefer in-person meetings to virtual livestreams, come to ModCon to meet the Modular team and leading AI experts to discuss the future of AI development and deployment at our annual developer conference.
Until next time! 🔥