Lecture 2: References
Ownership and References
We recommend that you read the Rust book sections on ownership and on references to learn about this. In the lecture we discuss both at length, but the Rust book already contains an excellent written explanation of the ownership rules (and why ownership rules!).
However, here is a summary of ownership rules and references:
- Ownership:
  - Each value in Rust has an owner, in the form of a variable binding.
  - A value can have only one owner at a time.
  - When the owner goes out of scope, the value will be dropped.
  - Ownership can be moved to another owner using assignments or function calls.
- References:
  - You can reference a value using & or &mut. This borrows the value, but does not transfer ownership.
  - A reference cannot keep existing after the owner is dropped or moved.
  - While a mutable reference to a value exists (that is, until the mutable reference is no longer used), no other references to the value can exist.
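The rules above can be sketched in a few lines. This is only an illustration: the helper functions consume, inspect and shout are hypothetical names made up for this example, not part of any library.

```rust
// Hypothetical helper: takes ownership of the String, which is dropped
// when the function returns.
fn consume(s: String) -> usize {
    s.len()
}

// Hypothetical helper: only borrows the String, the caller keeps ownership.
fn inspect(s: &String) -> usize {
    s.len()
}

// Hypothetical helper: borrows mutably so it can change the caller's value.
fn shout(s: &mut String) {
    s.push('!');
}

fn main() {
    let mut text = String::from("hello");

    let len = inspect(&text); // shared borrow, over once the call returns
    shout(&mut text); // mutable borrow, allowed because no other reference is in use
    println!("{} {}", len, text);

    let moved = text; // ownership moves from `text` to `moved`
    // println!("{}", text); // would not compile: `text` has been moved
    println!("{}", consume(moved)); // `moved` is moved into `consume` and dropped there
}
```

Uncommenting the marked line should make the compiler reject the program, which is the ownership rule at work.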
Traits
In lecture 5, we will talk about traits in much more detail. That means that although the information stated here is true, it may not be the whole truth.
A trait is something that marks a type. For example, a trait may indicate that values of a certain type can be copied, or compared to one another for equality. If a type is marked by a trait, it is said that the type implements the trait. For example, it is possible to determine whether two integers are equal. Therefore, integers implement Eq. However, floating point numbers do not implement Eq.
The main reason is that floats have a special Not a Number (NaN) value that is not equal to anything, not even to itself, while Eq requires every value to be equal to itself. There are also multiple different bit patterns that all mean Not a Number. Floats can still be compared with ==, because they implement the weaker PartialEq trait.
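As a small sketch of why floats cannot promise full equality, the following checks whether a value compares equal to itself. The helper name equals_itself is made up for this example, and the two bit patterns are just two of the many that encode NaN.

```rust
// Hypothetical helper: true when a value compares equal to itself.
fn equals_itself(x: f64) -> bool {
    x == x
}

fn main() {
    println!("{}", equals_itself(1.5)); // true
    println!("{}", equals_itself(f64::NAN)); // false: NaN is not equal to itself

    // Two different bit patterns that both mean Not a Number.
    let nan_a = f64::from_bits(0x7ff8_0000_0000_0000);
    let nan_b = f64::from_bits(0x7ff8_0000_0000_0001);
    println!("{} {}", nan_a.is_nan(), nan_b.is_nan()); // true true
}
```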
For this lecture, only a few traits are important: Clone, Copy and Sized.
If a type implements Clone, it is possible to duplicate a value of that type. For example:
    fn main() {
        let a = 3;
        let b = a.clone();
        // both a and b are usable now (they are both 3)
    }
Both a and b contain the value 3, and each of them owns its own copy of that value, since a value can have only one owner. If a type implements Clone, it has a method called .clone() which can be used to clone the value.
Some types that implement Clone also implement Copy. If a type implements Copy, it signifies to the Rust compiler that cloning the type is trivial. For example, cloning an integer is trivial. You do that all the time by passing it around. When a value of a Copy type is assigned or passed to a function, it is copied instead of moved, so the original stays usable.
In the example above, the a.clone() is not necessary. a is an integer, so
    fn main() {
        let a = 3;
        let b = a;
        // a and b are both still usable now, since integers implement Copy
    }
leaves a 3 in both a and b.
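The same holds when a Copy value is passed to a function. A minimal sketch, where double is a hypothetical helper invented for this example:

```rust
// Hypothetical helper: takes its argument by value.
fn double(n: i32) -> i32 {
    n * 2
}

fn main() {
    let a = 3;
    let b = double(a); // `a` is copied into the function, not moved
    println!("{} {}", a, b); // `a` is still usable here
}
```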
Not all types that are Clone are also Copy. Most structs are not Copy. Let's take a Vec for example:
    fn main() {
        // the element type is spelled out because nothing else here lets the compiler infer it
        let a: Vec<i32> = Vec::new();
        let b = a;
        // only b is usable now, a is moved into b.
    }
The same code as before now leaves a unusable (the compiler will complain if you use a) after a is assigned to b.
Vec is not Copy, so simply assigning it does not copy it. It moves the ownership from a to b. If you want a and b to each own a Vec with the same contents, you need to use .clone().
    fn main() {
        let a: Vec<i32> = Vec::new();
        let b = a.clone();
        // both a and b are usable now
    }
This makes the fact that you are cloning explicit. Cloning a Vec may take a considerable amount of time if the Vec is large. If the compiler were to do it in the background, you might get weird performance issues. Instead, you need to explicitly say when you want to clone a Vec, so you know at which points you're paying the performance cost.
Do note that cloning itself is not bad. Sometimes you need to, and usually it's not actually that slow.
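To illustrate that a clone is an independent value with its own owner, here is a small sketch. The function clone_and_push is a hypothetical helper written for this example.

```rust
// Hypothetical helper: clones the input and appends to the clone only.
fn clone_and_push(original: &Vec<i32>, extra: i32) -> Vec<i32> {
    let mut copy = original.clone(); // the explicit, possibly expensive copy
    copy.push(extra);
    copy
}

fn main() {
    let a = vec![1, 2, 3];
    let b = clone_and_push(&a, 4);
    // `a` is untouched, `b` owns its own data
    println!("{:?} {:?}", a, b); // [1, 2, 3] [1, 2, 3, 4]
}
```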
To see what traits a type implements, you can go to the type's documentation page. For example, for Vec: https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#trait-implementations. You will see a line like impl<T> Clone for Vec<T> where T: Clone ... . That means that a Vec containing a type T implements Clone only if it is possible to Clone that type T, which makes sense.
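A sketch of what that condition means in practice. Point, Handle and duplicate are hypothetical names invented for this example. Point derives Clone, so a Vec of Point can be cloned. Handle does not, so a Vec of Handle cannot.

```rust
// Hypothetical type that derives Clone, so Vec<Point> is Clone too.
#[derive(Clone, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical type without Clone: Vec<Handle> has no usable .clone().
struct Handle {
    id: u32,
}

// Hypothetical helper that mirrors the `where T: Clone` condition.
fn duplicate<T: Clone>(items: &Vec<T>) -> Vec<T> {
    items.clone()
}

fn main() {
    let points = vec![Point { x: 1, y: 2 }, Point { x: 3, y: 4 }];
    let copies = duplicate(&points);
    println!("{:?} {}", copies, copies[0].x + copies[0].y);

    let handles = vec![Handle { id: 7 }];
    println!("{}", handles[0].id);
    // let more = duplicate(&handles); // would not compile: Handle is not Clone
}
```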
Lastly, there are types that are Sized, which we will talk about in the next section.
Sized data
Some types in Rust are Sized. Actually, many types are Sized. A type implements Sized if that type has a size known at compile time. A struct automatically implements Sized if all of its members also implement Sized. That makes sense: if all members have a known size at compile time, the struct's size is simply the sum of the sizes of its members [1].
Almost all types implement Sized. For example, integers, floats, booleans, Vec, most structs, and references to types.
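You can ask the compiler for the size of any Sized type with std::mem::size_of. The sketch below prints a few sizes. The struct Mixed is a hypothetical example type, and its exact size depends on the padding the compiler inserts, so no specific value is claimed for it.

```rust
use std::mem::size_of;

// Hypothetical struct, used only to show that padding may be added.
#[allow(dead_code)]
struct Mixed {
    flag: u8,
    count: u32,
}

fn main() {
    println!("u8:        {}", size_of::<u8>()); // 1
    println!("u32:       {}", size_of::<u32>()); // 4
    println!("[u32; 4]:  {}", size_of::<[u32; 4]>()); // 16
    // At least 1 + 4 bytes, usually more because of padding.
    println!("Mixed:     {}", size_of::<Mixed>());
    // A Vec has the same size no matter what it stores or how many
    // elements it holds, because the elements live on the heap.
    println!("Vec<u8>:   {}", size_of::<Vec<u8>>());
    println!("Vec<u64>:  {}", size_of::<Vec<u64>>());
}
```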
So what types don't implement Sized? One example is the slice type. You can read a lot more about it in the Rust book. You write the slice type as [T]. That looks a lot like the type of an array: [T; n], where n is the length. A slice is like an array whose length is not part of its type. Therefore, we can't know its size at compile time, and thus [T] can't implement Sized.
Values of types that don't implement Sized can't be stored in variables on the stack. So how do we use a slice? A reference to any type always implements Sized, regardless of whether the referenced type implements Sized. Thus, we can't say let a: [T], but we can say let a: &[T]. A reference simply denotes a location in memory. We may not know the length of the array at that location at compile time, but we can store the location of the data in a variable and pass it around.
Note that when I say that the compiler doesn't know a size at compile time, I don't mean that the size can change constantly, like with a Vec. Consider the following function:
    // does not compile: the parameter type [u32] has no size known at compile time
    fn test(a: [u32]) {
        unimplemented!()
    }
    fn main() {
        test([1, 2, 3]);
        test([1, 2, 3, 4]);
    }
test is called twice, each time with an array of a different length. The size of each array is perfectly known at compile time (3 and 4 elements). But what should the size of a be in the test function? 3 or 4 elements?
    fn test(a: &[u32]) {
        unimplemented!()
    }
    fn main() {
        test(&[1, 2, 3]);
        test(&[1, 2, 3, 4]);
    }
However, if we give test a reference, as above, we only give test the location of the array we pass it. So regardless of the length of the array, what we pass to test always has the same size.
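This can be checked with std::mem::size_of_val. In the sketch below, count_elements is a hypothetical stand-in for test that returns the length it sees at run time. The two references have the same size, while the data they point to does not.

```rust
use std::mem::size_of_val;

// Hypothetical stand-in for `test`: reports the length seen at run time.
fn count_elements(a: &[u32]) -> usize {
    a.len()
}

fn main() {
    let short: &[u32] = &[1, 2, 3];
    let long: &[u32] = &[1, 2, 3, 4];

    // Size of the references themselves: identical for both.
    println!("{} {}", size_of_val(&short), size_of_val(&long));
    // Size of the data behind the references: 12 and 16 bytes.
    println!("{} {}", size_of_val(short), size_of_val(long));
    // The length is only looked up at run time.
    println!("{} {}", count_elements(short), count_elements(long)); // 3 4
}
```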
Slices
Generally, we call a reference to an array of unspecified length (like above) a slice. A slice comprises two parts: the location the data lives at (as discussed above), and the length of that data. This makes it possible to refer to segments of arrays and pass those around. Let's look at another example.
    fn remove_first_last(a: &[i32]) -> &[i32] {
        if a.len() >= 2 {
            // the end of a range is exclusive, so len - 1 drops only the last element
            &a[1..a.len() - 1]
        } else {
            a
        }
    }
    fn main() {
        let array /*:[i32; 4]*/ = [1, 2, 3, 4];
        let result = remove_first_last(&array);
        println!("{:?}", result)
    }
This program should be pretty easy to understand. In main, where remove_first_last is called, we give it a slice (with length 4, and pointing at array). However, remove_first_last doesn't actually remove any elements. It just returns a new slice with a different starting position and length.
result acts like it's a new array. However, it actually is just a reference to the elements [2, 3] of the original array variable. You can still use both the original array and result. However, at this point you can use neither to modify the array. Remember the rules of borrowing! There can only be a single mutable reference to a value, and if there is one, there can be no non-mutable references. Because result references array, array cannot be mutated for as long as result is still in use (the compiler will reject your code if you even try).
And now you may start to understand why this rule exists. Since both result and array refer to the same data, if one of the two were allowed to modify the array, the other would immediately notice. That would make your program extremely hard to reason about!
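A minimal sketch of how the compiler enforces this. The function middle is a hypothetical helper that drops the first and last element of a slice, restated here so the example is self-contained. Mutating the array is rejected while result is still in use, and accepted after its last use.

```rust
// Hypothetical helper: returns the slice without its first and last element.
fn middle(a: &[i32]) -> &[i32] {
    if a.len() >= 2 {
        &a[1..a.len() - 1]
    } else {
        a
    }
}

fn main() {
    let mut array = [1, 2, 3, 4];
    let result = middle(&array);

    // array[1] = 20; // would not compile: `array` is still borrowed by `result`
    println!("{:?}", result); // [2, 3], the last use of `result`

    array[1] = 20; // fine now, the borrow is over
    println!("{:?}", array); // [1, 20, 3, 4]
}
```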
Strings
Rust has a lot of different types that all seem to just mean "a string of text". If you did C before, you may know that it represents all strings as char *s. What are all these extra types for in Rust?
Let's start out with the simplest. &str is pretty much the same thing as a char * in C, and it will be the string type you will use most. A string literal has this type, so you can write:
    fn main() {
        let a: &str = "test";
    }
There is a difference, however. A &str in Rust is not the same as a &[u8], like it would be in C. This is because a &str always holds UTF-8 encoded Unicode text. Sometimes you do want to work with just bytes, in which case there's the &[u8] type. So that covers two string types already.
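To make the difference concrete, the sketch below compares the number of bytes with the number of characters in a &str, and shows that arbitrary bytes are not always valid text. The helper byte_and_char_len is a hypothetical name for this example.

```rust
// Hypothetical helper: returns (number of bytes, number of characters).
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // "h\u{e9}llo" is "hello" with an accented e, which takes two bytes in UTF-8.
    let word: &str = "h\u{e9}llo";
    println!("{:?}", byte_and_char_len(word)); // (6, 5)

    // The raw bytes behind the string.
    let bytes: &[u8] = word.as_bytes();
    println!("{:?}", bytes);

    // Not every byte sequence is valid UTF-8, so the conversion back can fail.
    let invalid: &[u8] = &[0xff, 0xfe];
    println!("{}", std::str::from_utf8(invalid).is_err()); // true
}
```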
You may know that in C, you can't always just add more letters to a string. To do that, you may need to use the malloc function and first find a space large enough for the letters to fit in. Note that we had a similar problem previously with arrays. We called a resizable array a Vec.
Well, we call a resizable &str a String! Internally it's pretty much a Vec of bytes holding UTF-8 encoded text. It's allocated on the heap, and automatically resizes if you add more data.
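A short sketch of growing a String and then borrowing it as a &str. The function greet is a hypothetical helper for this example. The printed capacity depends on the allocation strategy, so no specific value is claimed for it.

```rust
// Hypothetical helper: builds up a String piece by piece.
fn greet(name: &str) -> String {
    let mut s = String::from("hello");
    s.push_str(", "); // append a &str
    s.push_str(name);
    s.push('!'); // append a single character
    s
}

fn main() {
    let owned: String = greet("world");
    let view: &str = &owned; // a String can be borrowed as a &str
    println!("{}", view); // hello, world!
    println!("{} {}", owned.len(), owned.capacity());
}
```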
And those are all the string types you really need to know about for now. There are more, specifically to interoperate with C code (CStr, CString), or to represent strings received from the operating system (OsStr, OsString), but you will probably not need those much in the near future.
[1] Actually, the size of a struct isn't strictly the sum of the sizes of its members. Usually, some padding bytes are inserted to ensure alignment and optimize access times.