Nils Hasenbanck is the founder of Tsukisoft GmbH and a senior developer. His passion is building technically elegant, easy to maintain …
Ways not to fight with the 'Borrow Checker'
The problem
The borrow checker is probably the biggest hurdle for newcomers to Rust. Many struggle with it and often give up. This is partly because Rust’s borrow checker is quite unique and partly because many programmers want to carry over concepts from other programming languages unchanged. However, the longer programmers write Rust, the less often they see error messages from the borrow checker, and after a certain point the borrow checker changes from being an “enemy” to a “friend”. I have summarized a few tips here so that newcomers struggle less with the borrow checker.
Tip 1: Clarify who owns something
Many newcomers make the mistake of structuring their Rust programs like programs in C / C++ or Java. In these languages it is common to use pointers and references everywhere and to build a web of owners. Rust, however, prefers to have exactly one owner for each memory area. The best structure for a program in Rust is that of a tree. While it is possible to break up this structure with tools (for example smart pointers), Rust programs tend to use these less and place more emphasis on clarifying “who owns what” and on accurately planning the lifetimes of objects.
This, however, is not a “quick win” tip. Only with more experience in Rust will programmers begin to adapt their program structures to the way Rust works. Adapting the program structure and execution logic to the question “who owns what” is nevertheless the most important step towards fighting less with the borrow checker and taking advantage of the benefits of Rust.
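A minimal sketch of such a tree-shaped ownership structure is shown below. The types Library, Shelf and Book and the function build_library are hypothetical and only serve as an illustration: every value has exactly one owner, and references are only handed out for the duration of a method call.

```rust
// Hypothetical types for illustration: a `Library` owns its `Shelf`s,
// and every `Shelf` owns its `Book`s. Each value has exactly one owner.
struct Book {
    title: String,
}

struct Shelf {
    books: Vec<Book>,
}

struct Library {
    shelves: Vec<Shelf>,
}

impl Library {
    // The borrow of `self` only lasts for the duration of the call.
    fn count_books(&self) -> usize {
        self.shelves.iter().map(|shelf| shelf.books.len()).sum()
    }

    // The returned references borrow from `self`, the root of the tree.
    fn titles(&self) -> Vec<&str> {
        self.shelves
            .iter()
            .flat_map(|shelf| shelf.books.iter())
            .map(|book| book.title.as_str())
            .collect()
    }
}

fn build_library() -> Library {
    let fiction = Shelf {
        books: vec![
            Book { title: String::from("Dune") },
            Book { title: String::from("Solaris") },
        ],
    };
    let science = Shelf {
        books: vec![Book { title: String::from("Cosmos") }],
    };
    // Moving the shelves into the library transfers ownership to it.
    Library { shelves: vec![fiction, science] }
}

fn main() {
    let library = build_library();
    println!("{} books: {:?}", library.count_books(), library.titles());
}
```

When the library goes out of scope, the shelves and books are dropped with it, so there is never a question of who has to clean up what.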
Tip 2: Avoid structures with lifetimes
This is probably one of the biggest pitfalls most programmers of other languages will encounter. Many newcomers use references like pointers or smart pointers in other languages and want to store them in structures:
struct Container<'a> {
    a_reference: &'a u32,
    b_reference: &'a u32,
}

fn main() {
    let a = 1;
    let b = 1;
    let list = vec![Container {
        a_reference: &a,
        b_reference: &b,
    }];
    dbg!(list.len());
}
This is generally possible, but the vector “list” is now tied to a lifetime that will spread through the whole program. If this vector is returned by a function, for example, problems with lifetimes will appear very soon.
In general I recommend keeping the lifetime of references as short as possible. Structures with lifetimes should only be used if their lifetime is clearly defined. This is the case, for example, when data is processed in phases (classically lexer / parser / compiler) or when the structures are only temporary (e.g. structures used for configuration).
If references nevertheless need to be held in structures or containers, use a smart pointer such as Rc or Arc (see tip 4).
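For small values there is often an even simpler way out: let the structure own its data. The following sketch assumes a hypothetical OwnedContainer that stores the u32 values of the Container example directly instead of borrowing them. Because no lifetime parameter is involved, the hypothetical function build_list can return the vector without any trouble.

```rust
// Hypothetical variant of `Container` that owns its values
// instead of borrowing them, so it needs no lifetime parameter.
#[derive(Debug, Clone, PartialEq)]
struct OwnedContainer {
    a_value: u32,
    b_value: u32,
}

// Returning the vector is unproblematic: it does not borrow from
// the local variables `a` and `b`, it holds copies of them.
fn build_list() -> Vec<OwnedContainer> {
    let a = 1;
    let b = 2;
    vec![OwnedContainer { a_value: a, b_value: b }]
}

fn main() {
    let list = build_list();
    dbg!(list.len());
    dbg!(&list[0]);
}
```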
Tip 3: Copy for simple data types
Structures that consist only of simple data types can implement the Copy trait. Simple data types in this case are the data types that implement the Copy trait themselves. However, Copy should only be implemented for structures that are cheap to copy, since otherwise unnecessary copy operations would be executed, which would hurt performance:
#[derive(Debug, Copy, Clone)]
struct Container {
    a: usize,
    v: usize,
}

fn a_function_that_moves(a: Container) {
    dbg!(a);
}

fn main() {
    let a = Container { a: 0, v: 0 };
    a_function_that_moves(a);
    a_function_that_moves(a);
}
Copy is best seen as a convenience: the structure does not have to be copied explicitly with “clone” every time, because the copy is made implicitly for the programmer.
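For contrast, here is a small sketch of a structure that cannot implement Copy because it contains a String. The names Label, consume and use_twice are hypothetical. Passing the value to a function moves it, so a second use requires an explicit clone.

```rust
// Hypothetical type: `String` does not implement `Copy`,
// so `Label` can only derive `Clone`, not `Copy`.
#[derive(Debug, Clone)]
struct Label {
    text: String,
}

// Takes ownership of the label and returns the length of its text.
fn consume(label: Label) -> usize {
    label.text.len()
}

fn use_twice(text: &str) -> usize {
    let label = Label { text: text.to_string() };
    // The explicit clone keeps `label` usable after this call.
    let first = consume(label.clone());
    // The last use may simply move the value.
    let second = consume(label);
    first + second
}

fn main() {
    println!("{}", use_twice("hello"));
}
```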
Tip 4: Use Rc / Arc
If you want to share objects whose lifetime cannot be determined at compile time, you can use smart pointers. Rust then uses reference counting to decide whether an object is still alive or not. Rust offers two implementations for this: Rc and Arc. Rc uses simple reference counting and can be used if only one thread accesses the object. Arc uses atomic operations to implement reference counting and is used when objects are to be shared across thread boundaries.
However, note that objects stored in these smart pointers cannot be modified through ordinary “outer” mutability, because Rc and Arc generally only hand out shared references. For that, use the structures described in tip 5.
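As a rough illustration of reference counting with Rc, the following sketch lets two hypothetical Player values share one Team. The type names and the function make_players are assumptions made for this example.

```rust
use std::rc::Rc;

// Hypothetical types: two `Player`s share a single `Team` value.
struct Team {
    name: String,
}

struct Player {
    name: String,
    team: Rc<Team>,
}

fn make_players() -> (Player, Player) {
    let team = Rc::new(Team { name: String::from("Red") });
    let alice = Player {
        name: String::from("Alice"),
        // `Rc::clone` only increments the counter, the team is not copied.
        team: Rc::clone(&team),
    };
    let bob = Player {
        name: String::from("Bob"),
        // The original handle is moved into the second player.
        team,
    };
    (alice, bob)
}

fn main() {
    let (alice, bob) = make_players();
    println!("{} and {} play for {}", alice.name, bob.name, alice.team.name);

    // Both players point to the same allocation.
    assert_eq!(Rc::strong_count(&alice.team), 2);
    drop(bob);
    // The team stays alive as long as one handle is left.
    assert_eq!(Rc::strong_count(&alice.team), 1);
}
```

For sharing across threads, the same structure should work with Arc in place of Rc, provided the shared type is safe to send between threads.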
Tip 5: Use Cell, RefCell, Mutex and RwLock
To control access to certain memory areas / objects at runtime, Cell and RefCell can be used in the single-threaded context and Mutex and RwLock in the multi-threaded context. These containers implement “inner mutability”: they allow an object to be changed through a shared reference, so they combine well with smart pointers. RefCell, Mutex and RwLock enforce exclusive access at run time.
This also allows structures to offer an API that does not require exclusive (&mut) access to an object:
use std::cell::Cell;

#[derive(Debug, Default)]
struct Adder {
    value: Cell<u64>,
}

impl Adder {
    // Adds `value` to the stored number and returns the previous one.
    fn add(&self, value: u64) -> u64 {
        let old_value = self.value.get();
        self.value.set(old_value + value);
        old_value
    }
}

fn main() {
    // `adder` doesn't need to be declared with `mut`,
    // because we use "inner mutability".
    let adder = Adder::default();
    adder.add(10);
    adder.add(12);
    dbg!(adder);
}
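The multi-threaded counterpart can be sketched by combining Arc from tip 4 with a Mutex. The function count_in_parallel and its parameters are hypothetical. Each worker thread holds its own Arc handle and locks the Mutex before changing the shared counter.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: spawns `threads` workers that each add 1
// to a shared counter `per_thread` times.
fn count_in_parallel(threads: u64, per_thread: u64) -> u64 {
    let counter = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Every thread gets its own handle to the same counter.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // `lock` blocks until this thread has exclusive access.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    // Wait for all workers before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_in_parallel(4, 1000));
}
```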
Tip 6: Arena / Slab
The use of an arena / slab is also often appropriate. Instead of references, keys can then be stored. Behind an arena hides a Vec, in which objects of the same kind are stored in contiguous memory. This effectively moves the check whether an object still exists from compile time to run time. This type of storage also has the advantage of being very CPU cache friendly.
Such a container can be built quite easily with a Vec, or one can use one of the already published crates.
Example using thunderdome:
use thunderdome::Arena;

fn main() {
    let mut arena = Arena::new();
    let foo = arena.insert("Foo");
    let bar = arena.insert("Bar");
    assert_eq!(arena[foo], "Foo");
    assert_eq!(arena[bar], "Bar");
}
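As noted above, such a container can also be built by hand on top of a Vec. The following is a deliberately minimal sketch under that assumption; SimpleArena, Key, Node and parent_name are hypothetical names. Unlike the published crates, it supports neither removal nor any protection against stale keys.

```rust
// Keys are plain indices into the underlying `Vec`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Key(usize);

// Minimal arena sketch: insert-only, no removal, no generation counters.
struct SimpleArena<T> {
    items: Vec<T>,
}

impl<T> SimpleArena<T> {
    fn new() -> Self {
        Self { items: Vec::new() }
    }

    fn insert(&mut self, item: T) -> Key {
        self.items.push(item);
        Key(self.items.len() - 1)
    }

    // The existence check happens at run time and yields an `Option`.
    fn get(&self, key: Key) -> Option<&T> {
        self.items.get(key.0)
    }
}

// Hypothetical node type that links to its parent via a key
// instead of a reference, so it needs no lifetime parameter.
struct Node {
    name: &'static str,
    parent: Option<Key>,
}

fn parent_name(arena: &SimpleArena<Node>, key: Key) -> Option<&'static str> {
    let parent_key = arena.get(key)?.parent?;
    arena.get(parent_key).map(|node| node.name)
}

fn main() {
    let mut arena = SimpleArena::new();
    let root = arena.insert(Node { name: "root", parent: None });
    let child = arena.insert(Node { name: "child", parent: Some(root) });

    assert_eq!(parent_name(&arena, child), Some("root"));
    assert_eq!(parent_name(&arena, root), None);
}
```

Because Key is a small Copy type, it can be stored in as many places as needed without involving the borrow checker.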
Tip 7: Inner structures
A common problem is that a method of a structure iterates mutably over a container and, while doing so, has to call another method of the same structure. Since one already holds &mut self, the borrow checker will report an error here. This example just shows the problem in a simple manner:
struct Summer {
    list: Vec<u64>,
    sum: u64,
}

impl Summer {
    fn work(&mut self) {
        // This won't work, since we already borrowed self exclusively.
        self.list.iter_mut().for_each(|x| *x = self.sum(*x));
    }

    fn sum(&mut self, x: u64) -> u64 {
        // In theory this function could access `list` over
        // which we are already iterating.
        self.sum += x;
        self.sum
    }
}
error[E0500]: closure requires unique access to `*self` but it is already borrowed
  --> example\lib.rs:17:39
   |
17 |         self.list.iter_mut().for_each(|x| *x = self.sum(*x));
   |         -------------------- -------- ^^^      ---- second borrow occurs due to use of `*self` in closure
   |         |                    |        |
   |         |                    |        closure construction occurs here
   |         |                    first borrow later used by call
   |         borrow occurs here
By using inner structures, you can make the access more granular and avoid this error:
struct Summer {
    list: Vec<u64>,
    inner: InnerSummer,
}

impl Summer {
    fn work(&mut self) {
        self.list.iter_mut().for_each(|x| *x = self.inner.sum(*x));
    }
}

struct InnerSummer {
    sum: u64,
}

impl InnerSummer {
    fn sum(&mut self, x: u64) -> u64 {
        self.sum += x;
        self.sum
    }
}
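A related variation, which is not part of the example above and should be read as a sketch only: for small cases the same granularity can be reached without an extra structure by borrowing the needed field separately and handing it to an associated function. The names add_to and running_sums are hypothetical.

```rust
struct Summer {
    list: Vec<u64>,
    sum: u64,
}

impl Summer {
    fn work(&mut self) {
        // Borrow only the `sum` field. `list` and `sum` are disjoint
        // parts of `self`, so both may be borrowed mutably at once.
        let sum = &mut self.sum;
        self.list.iter_mut().for_each(|x| *x = Self::add_to(sum, *x));
    }

    // Associated function that only sees the field it needs.
    fn add_to(sum: &mut u64, x: u64) -> u64 {
        *sum += x;
        *sum
    }
}

// Replaces every element with the running total up to that element.
fn running_sums(list: Vec<u64>) -> Vec<u64> {
    let mut summer = Summer { list, sum: 0 };
    summer.work();
    summer.list
}

fn main() {
    println!("{:?}", running_sums(vec![1, 2, 3]));
}
```

The inner structure from the example above scales better once several fields and methods belong together, so this variation is mainly useful for one-off cases.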
Tip 8: And last but not least: Clone
And if everything else fails, then, especially in code that is not performance-critical, it is okay to use clone().
Because in the end it is better to perform an action inefficiently than not to be able to perform it at all.
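One way this can look in practice is sketched below. The Inventory type and the function restock_log are hypothetical. Cloning the list before iterating ends the borrow of the field immediately, so a method taking &mut self can be called inside the loop, at the cost of one extra allocation.

```rust
// Hypothetical example: `restock` has to call a `&mut self` method
// while walking over one of its own fields.
struct Inventory {
    low_items: Vec<String>,
    log: Vec<String>,
}

impl Inventory {
    fn record(&mut self, entry: String) {
        self.log.push(entry);
    }

    fn restock(&mut self) {
        // Iterating over a clone means `self.low_items` is no longer
        // borrowed inside the loop, so `self.record` is allowed.
        for item in self.low_items.clone() {
            self.record(format!("restocked {}", item));
        }
        self.low_items.clear();
    }
}

fn restock_log(items: &[&str]) -> Vec<String> {
    let mut inventory = Inventory {
        low_items: items.iter().map(|s| s.to_string()).collect(),
        log: Vec::new(),
    };
    inventory.restock();
    inventory.log
}

fn main() {
    println!("{:?}", restock_log(&["nails", "screws"]));
}
```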