June 4, 2024

Deep dive into ownership in Mojo

This blog post is the second part of our series on ownership in Mojo. Please make sure to check out the first part, What Ownership is Really About: A Mental Model Approach, as we will build on the concepts developed there. This post serves as accompanying material for the deep dive on ownership by our CEO, Chris Lattner. Be sure to watch the video as well; it covers how ownership is implemented in Mojo's compiler and provides further insights and technical details.

Understanding how ownership works in Mojo is essential to using its memory management capabilities effectively. Mojo aims for memory safety similar to Rust's and efficiency comparable to C/C++ by enforcing strict ownership rules. In this part, we will explore how this is achieved through different kinds of values, ownership modifiers, and lifecycle management.

Value kinds in Mojo

When the Mojo compiler parses and type-checks code, it produces three kinds of values, each with distinct ownership and reference behavior:

  • RValue (Owned Value): Represents unique ownership, i.e., a unique reference or a unique typed pointer with a lifetime.
  • LValue (Mutable Reference): Represents a mutable reference, allowing read and write access to the original value.
  • BValue (Immutable Reference): Represents an immutable reference, allowing read-only access to the original value.

Intuitively, these values can be defined using the ownership model we built in the first part as follows (Mojo-pseudocode):

Pseudocode
alias RValue = UniqueRef[T, Lifetime, Ownership = Unique]
alias BValue = ImmutableRef[T, Lifetime, Ownership = Immutable]
alias LValue = MutableRef[T, Lifetime, Ownership = Mutable]

These values propagate through the expression trees produced by the Mojo parser. This is crucial for ensuring that the correct ownership and reference semantics are maintained throughout the program.

Next, we will see how ownership can be modified in function arguments.

Ownership Modifiers in Function Arguments

Mojo's argument conventions for functions are as follows:

  • borrowed: The function receives an immutable reference (similar to ImmutableRef discussed previously). This means the function can read the original value (it is not a copy), but it cannot mutate (modify) it. Note that this is the default: fn foo(x: T) is equivalent to fn foo(borrowed x: T).
  • inout: The function receives a mutable reference (similar to MutableRef). This means the function can read and mutate the original value (it is not a copy).
  • owned: The function receives an exclusive, unique reference (similar to UniqueRef) that uniquely identifies the underlying value. Often, this also implies that the caller should transfer ownership to this function, but that's not always what happens and this might instead be a copy. More on this later.
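
As a rough illustration of the three conventions, here is a minimal sketch. The helper names read_only, append_bang, and consume are hypothetical and chosen only for this example, and the code assumes the same Mojo version and keywords used throughout this post.

```mojo
# Illustrative helpers; the names are hypothetical, not part of Mojo.
fn read_only(borrowed s: String):
    # s is an immutable reference: reading is fine, mutating would not compile
    print(s)

fn append_bang(inout s: String):
    # s is a mutable reference: the caller's value is modified in place
    s += "!"

fn consume(owned s: String):
    # s is uniquely owned here, so it can be mutated freely
    s += " (consumed)"
    print(s)

fn main():
    var greeting = String("hello")
    read_only(greeting)    # reads the original value, no copy
    append_bang(greeting)  # mutates the original value
    print(greeting)
    consume(greeting^)     # transfers ownership; greeting is no longer usable
```

Writing borrowed explicitly in read_only is optional, since it is the default convention for fn arguments.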

The first rule to note is:

Rule 1: owned arguments take RValue on the caller side but are LValue on the callee side.

When a function argument is declared as owned, it takes an RValue on the caller side but becomes an LValue on the callee side. This ensures that the function has exclusive access to the value, preventing other references from accessing or modifying it simultaneously. (The following is valid Mojo code):

Mojo
# a function that takes ownership of String
fn take_ownership(owned y: String):
    y = String("Bye!")  # y is an LValue here
    print(y)

fn main():
    var x = String("hello, world!")
    take_ownership(x^)  # x^ is passed as RValue
    # print(x)  # This would cause a compile-time error as x's ownership has been transferred

In the above, String("hello, world!") uses the constructor __init__ to create an RValue, which the var named x then owns. Then, in take_ownership(x^), x^ is passed as an RValue on the caller side, whereas it is received as an LValue on the callee side, where it is reassigned by y = String("Bye!"). The output of the program is Bye! as expected.

For completeness, we should mention that Mojo's dataflow analysis determines where the last use of a variable occurs and eagerly injects a call to its destructor __del__ at that point. This is known as As Soon As Possible (ASAP) destruction. However, when the transfer operator ^ is used, Mojo disables the caller's call to String.__del__ for x, and destruction is correctly delegated to the callee, happening after take_ownership is done printing the newly assigned value. For more, please check out the "Death of a value" section of the Mojo manual.
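
To make ASAP destruction easier to picture, here is a small sketch. Noisy is a hypothetical type whose destructor prints a message. The comments mark where the destructor is expected to run under the policy described above; this is an illustration under the same Mojo version as the rest of this post, not a statement about compiler internals.

```mojo
# Noisy is a hypothetical type used only to make destruction visible.
struct Noisy:
    var name: String

    fn __init__(inout self, name: String):
        self.name = name

    fn __del__(owned self):
        print("destroying", self.name)

fn main():
    var a = Noisy("a")
    print(a.name)  # last use of a
    # Under ASAP destruction, a.__del__() is expected to run here,
    # before the next statement, rather than at the end of main.
    print("end of main")
```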

Relations of the value kinds and ownership modifiers

This table depicts how Value kinds behave with function ownership modifiers.

Establish ownership behavior through __copyinit__ and __moveinit__

To implement custom behavior for copying and moving values, Mojo provides the __copyinit__ and __moveinit__ dunder methods, respectively. These methods allow you to define how values are copied or moved, ensuring that ownership is correctly transferred without unintended side effects. For more details, please check out the Mojo manual.

A type in Mojo can be:

Movable

Values that can be moved through __moveinit__. Moving is triggered by the transfer operator ^ and transfers ownership without copying the underlying data. The transfer operator ^ converts a BValue or LValue into an RValue, which is essential for transferring ownership while maintaining the original value's integrity.

The following code simply prints when __moveinit__ is called:

Mojo
struct MovableOnly:
    fn __init__(inout self):
        ...

    fn __moveinit__(inout self, owned existing: Self):
        print("Move is called")

fn main():
    var a = MovableOnly()
    # var b = a  # compiler error: 'MovableOnly' is not copyable because it has no '__copyinit__'
    var b = a^

In the above example, var b = a fails to compile because that assignment requires __copyinit__, which MovableOnly does not define. Executing the code outputs Move is called.

Copyable

Values that can be copied through __copyinit__. The following code simply prints when __copyinit__ is called:

Mojo
from benchmark import keep

struct CopyableOnly:
    fn __init__(inout self):
        ...

    fn __copyinit__(inout self, existing: Self):
        print("Copy is called")

fn main():
    var a = CopyableOnly()
    var b = a
    keep(a)  # need to `keep` it otherwise, Mojo compiler sees it's unused and optimizes it away

In the above, var b = a calls __copyinit__, and running the code confirms this by printing Copy is called. Note that we need keep(a) to prevent the Mojo compiler from optimizing a away as unused.

An important note is that the transfer operator ^ is still usable here and its behavior is delegated to __copyinit__. We can verify it as follows:

Mojo
fn main():
    var a = CopyableOnly()
    var b = a
    var c = a^
    keep(c)

which outputs:

Output
Copy is called
Copy is called

This might be surprising at first! For the transfer operator to actually move, we need to implement __moveinit__.

Copyable and Movable

Values can implement both __copyinit__ and __moveinit__. The following prints when __copyinit__ or __moveinit__ is called.

Mojo
from benchmark import keep

struct CopyableMovable:
    fn __init__(inout self):
        ...

    fn __copyinit__(inout self, existing: Self):
        print("Copy is called")

    fn __moveinit__(inout self, owned existing: Self):
        print("Move is called")

fn main():
    var a = CopyableMovable()
    var b = a
    var c = a^
    keep(c)

which outputs:

Output
Copy is called
Move is called

Given the above examples, we can deduce the following:

Rule 2: An owned argument takes ownership of the value if the transfer operator ^ is used; otherwise, it receives a copy if the type is Copyable.
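
Rule 2 is stated in terms of function arguments, so a small sketch with an owned argument may help. Token and consume are hypothetical names introduced only for this illustration, and the code assumes the same Mojo version as the rest of this post. The comments describe what Rule 2 leads us to expect rather than a recorded output.

```mojo
# Token and consume are hypothetical names used for illustration.
struct Token:
    fn __init__(inout self):
        pass

    fn __copyinit__(inout self, existing: Self):
        print("Copy is called")

    fn __moveinit__(inout self, owned existing: Self):
        print("Move is called")

fn consume(owned t: Token):
    # t is uniquely owned inside this function
    pass

fn main():
    var a = Token()
    consume(a)   # a is used again below, so the callee receives a copy
    consume(a^)  # ownership is transferred; a must not be used afterwards
```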

The Mojo compiler takes any opportunity to optimize memory management. In the CopyableMovable example above, we get the same result even without writing var c = a^:

Mojo
fn main():
    var a = CopyableMovable()
    var b = a
    var c = a
    keep(c)

The Mojo compiler sees that var c = a is the last use of a, so it replaces the copy with a move.

Output
Copy is called
Move is called

Therefore, we can deduce the last rule:

Rule 3: A copy is optimized into a move on last use if the type is both Copyable and Movable.

This is particularly beneficial for large or complex data structures, as it reduces the overhead associated with copying. For example, in the following, avoiding an extra deep copy is an important optimization that the Mojo compiler can do for us.

Mojo
# Imports assumed for the Mojo version this post targets
from benchmark import keep
from memory import memcpy
from memory.unsafe import DTypePointer

struct Str:
    var ptr: DTypePointer[DType.uint8]
    var sz: Int

    fn __init__(inout self):
        self.ptr = DTypePointer[DType.uint8]()
        self.sz = 0

    fn __init__(inout self, str: StringLiteral):
        self.sz = len(str)
        self.ptr = DTypePointer[DType.uint8].alloc(self.sz)
        memcpy(self.ptr, str.as_uint8_ptr(), self.sz)

    fn __copyinit__(inout self, existing: Self):
        # deep copy
        self.sz = existing.sz
        self.ptr = DTypePointer[DType.uint8].alloc(self.sz)
        for i in range(self.sz):
            self.ptr.store(i, existing.ptr.load(i))
        print("Copy is called")

    fn __moveinit__(inout self, owned existing: Self):
        # shallow ptr copy
        self.ptr = existing.ptr
        self.sz = existing.sz
        print("Move is called")

    fn __del__(owned self):
        self.ptr.free()

fn main():
    var s = Str("hello")
    var t = s
    var u = s
    keep(u)

for which the expected output is:

Output
Copy is called
Move is called

Immovable

Values that cannot be copied or moved, ensuring that they remain in a fixed memory location. An example of such special immovable types is Atomic which cannot be copied or moved. Such behavior is critical to ensure correctness in multithreaded environments.
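
As a minimal sketch of an immovable type, the struct below defines neither __copyinit__ nor __moveinit__. Pinned is a hypothetical name used only for illustration (it is not how Atomic is implemented), and the commented-out lines are expected to be rejected by the compiler for the reasons given in the earlier sections.

```mojo
# Pinned is a hypothetical type: it defines neither __copyinit__ nor __moveinit__.
struct Pinned:
    var value: Int

    fn __init__(inout self, value: Int):
        self.value = value

fn main():
    var p = Pinned(42)
    # var q = p   # expected compile error: Pinned has no __copyinit__
    # var r = p^  # expected compile error: Pinned has no __moveinit__
    print(p.value)
```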

Conclusion

In the second part of the ownership series in Mojo, we built on the mental model developed in the first part and provided practical examples to illustrate how ownership works in Mojo. We covered the different kinds of values (BValue, LValue, and RValue) and how they propagate through expressions. We also explained the function argument conventions (borrowed, inout, owned) and demonstrated how these conventions help manage memory safely and efficiently. We concluded with three fundamental rules: 

  • Rule 1: Owned arguments take RValue on the caller side but are LValue on the callee side.
  • Rule 2: Owned arguments own the type if the transfer operator ^ is used; otherwise, they copy the type if it is Copyable. 
  • Rule 3: Copy operations are optimized to move operations if the type is Copyable and Movable and isn’t used anymore, reducing unnecessary overhead. 

Lastly, we emphasized that the main goals of ownership in Mojo are:

  • Memory Safety: Enforcing exclusive ownership and proper lifetimes to prevent memory errors such as use-after-free and double-free. 
  • Performance Optimization: Converting unnecessary copy operations into move operations to reduce overhead and enhance performance. 
  • Ease of Use: Automating memory management through ownership rules and the transfer operator, simplifying development. 
  • Compile-Time Guarantees: Providing strong compile-time guarantees through type-checking and dataflow lifetime analysis, catching errors early in the development process. 

By integrating these principles, Mojo offers a powerful and intuitive framework for memory management, enabling developers to write efficient, safe, and high-performance code. For a deeper understanding of the technical details and implementation of ownership in Mojo, make sure to watch the accompanying video by our CEO, Chris Lattner.

Additional resources:

Report feedback, including issues on our Mojo and MAX GitHub tracker.

Until next time!🔥

Ehsan M. Kermani, AI DevRel

Ehsan is a Seasoned Machine Learning Engineer with a decade of experience and a rich background in Mathematics and Computer Science. His expertise lies in the development of cutting-edge Machine Learning and Deep Learning systems ranging from Natural Language Processing, Computer Vision, Generative AI and LLMs, Time Series Forecasting and Anomaly Detection while ensuring proper MLOps practices are in-place. Beyond his technical skills, he is very passionate about demystifying complex concepts by creating high-quality and engaging content. His goal is to empower and inspire the developer community through clear, accessible communication and innovative problem-solving. Ehsan lives in Vancouver, Canada.