Borrow Checking from Scratch

Let us try to make a C-like language memory safe.

Nov 17, 2024

Anyone who has programmed in Rust has at one point or another “fought the Borrow Checker”1. You write code that appears to be (and possibly is) completely fine but the compiler is unable to statically2 prove that it is memory safe, so it screams at you.

The so called “Borrow Checker” is a part of Rust whose job is to enforce a pair of rules at compile-time:

Any borrow must last for a scope no greater than that of the owner.
You may have any number of immutable borrows to the same data, or a single mutable borrow to that data. 1 writer xor N readers.

But what’s a “borrow”? What’s an “owner”? Why those two rules? Is Rule 2 necessary if we don’t care about multithreading?

I’m not going to explain the rust borrow checker. You can read the Rust documentation if you want a very detailed explanation. Rather, the goal of this post is to naturally arrive at a similar solution by tackling the same problem: memory safety through compile-time checks.

There are two main vulnerabilities we’re trying to prevent: Double-Free (alongside memory leaks) and Use-After-Free. The nastiest vulnerability, Out-of-Bounds Access, is trivial to prevent with minimal runtime support3.

We’ll start from a simple C-like language and add compiler checks (our own “borrow checker” of sorts) to make it memory safe, and hopefully show you how you could have arrived at a similar solution in the process.

Did you know C# has something like a “borrow checker” too4?

Meet Cnile

We’re going to turn a made-up C-like language memory safe. We’ll call it Cnile5.

Cnile is very similar to C with a few minor differences: In Cnile the declaration syntax puts all relevant information next to the type, e.g. int[] x instead of int x[]. All data is initialized to 0. “Constness” is a property of a binding and not of a type, and is transitive.

The most important change is that it supports slices. Arrays don’t decay to pointers, they implicitly cast to slices, and slices track their length:

int[] x = {1, 2, 3, 4};

int example(int[..] y) {
  // will crash the program
  return y[5];
}

example(x);

Slices are equivalent to the following anonymous struct:

{ T* data; size_t length; }

The T[..] syntax was proposed for C by Walter Bright back in 2009. Works well enough. Note that an array parameter would have a different meaning.

The above takes care of bounds checking. If the bounds check is too expensive for whatever reason, just grab the internal pointer and index that.

Memory Safety at Runtime without GC

If we ignore memory leaks, there’s actually a pretty easy way to also handle both double-free and use-after-free at runtime.

If all allocations are stored in arenas/regions, and the smallest arena size is at least one memory page in size, you can just use operating system features to protect the relevant pages after an arena is cleared. Any attempt to access that memory will cause a segmentation fault. There’s barely any performance cost to doing this6.

With a 64-bit address space it’ll take a very very long time before any page needs to be reused, and the chances that an invalid pointer to such an old page survived without being dereferenced is extremely low.

This only works for bulk allocations. If you want memory safety with individual malloc/free style memory management, you can do so with memory tagging.

Instead of a raw pointer, a safe_malloc would return a special reference type, which is equivalent to the following anonymous struct:

{ size_t tag; void* ptr }

The Vale programming language calls these generational references. Each allocation generates and stores a non-zero 64-bit random number tag at the top of the allocation. A simple xorshift generator works great for this.

When dereferencing one of these tagged references, the tag stored in the reference and the tag stored in the allocation are compared. If they don’t match then we’ve detected a use-after-free.

There’s around a 0.0000000000000002% chance of collision. Extremely unlikely. Your odds of winning a lottery that would make you a multi-millionaire are over a billion times higher.

A safe_free function would only accept references like the above, not pointers, and would zero-out the tag in the allocation.

As with bounds checks the check for the tag does not need to be done on every access. As soon as you know the reference is valid, you can use the pointer directly until you call a function that might free the reference.

But we don’t want any runtime costs, we want to do everything at compile-time.

Double-Frees and Memory Leaks

Consider the following data structure:

struct DynArray {
  int* data;
  size_t length;
  size_t capacity;
}

void push(DynArray* a, int i) { 
  if (a->length == a->capacity) {
    a->capacity = (a->capacity + 1) * 2;
    a->data = unsafe_realloc(...);
  }
  a->data[a->length] = i;
  a->length += 1;
}

void delete(DynArray* a) {
  unsafe_free(a->data);
  a->data = null;
  a->length = 0;
  a->capacity = 0;
}

DynArray a = {};
push(&a, 1);
delete(&a);

Our first goal is to ensure that delete is called one time for each DynArray, thereby preventing memory leaks and double-free-related vulnerabilities.

Calling delete on “a” twice is actually perfectly safe, the problem only occurs if we copy “a”. This would result in a shallow copy, as the internal pointer of both arrays would point to the same memory.

Deleting both “a” and a shallow copy of “a” results in a Double-Free. We need to prevent DynArray’s from being shallow copied7. Automatically performing a deep copy would be an option, but Cnile is a systems programming language, we don’t want any stinky hidden operations like that.

First Attempt: No Implicit Copies

Let us add a special annotation to DynArray, which tells the compiler we don’t want any implicit copies of it happening:

[no_copy]
struct DynArray {
  ...
}

DynArray a = {};
push(&a, 1);
DynArray b = a; // error
a = {}; // ok but now we're in trouble
delete(&a) // not freeing the right "a"

We’ve forbidden the compiler from inserting implicit copies of DynArrays. The only way to pass a DynArray around now is through a pointer. We would have to go out of our way to make a function that shallow copies DynArrays to get into trouble again8.

Note: dereferencing a DynArray pointer would create a shallow copy so that is not allowed either, only the individual members can be accessed. If we were bothering with access control, the internal pointer would be private.

We designed delete to be safe to call multiple times, so double-free cannot occur without “cheating” and messing with the pointer inside the array directly, but there are major problems with this solution:

We can’t return a DynArray from a function directly, we have to use an out parameter pointer.
Nothing actually forces us to call delete, so memory leaks are not solved.
We can assign a new value to a DynArray, making the original impossible to delete (another source of memory leaks).

If we forget to call delete, once a DynArray goes out of scope we have a memory leak. We need to ensure delete always gets called.

Second Attempt: Single-Use Types

Let us change our perspective a little bit. Instead of thinking about preventing copies, we enforce instead that a DynArray must be used (without going through a pointer) exactly 1 time. In Computer Science these single-use types are called Linear Types, so we’ll call the annotation linear:

[linear]
struct DynArray {
  ...
}

DynArray a = {};
DynArray b = a; // ok, 1 use
printf("%d\n", a.length) // error, a is used up
// error, we never used (e.g. deleted) b

This might not seem like much of a change, but we’ve gotten quite a lot from it. The line DynArray b = a is now allowed: it shallow copies “a” into “b” but because “a” cannot be used afterwards, there’s no issue, we are only able to call delete on “b”. And we have to either call delete on b, or assign it to something else.

In Rust this is called a “move” but there’s no need to think of it as “moving”. It’s just a side-effect of how the single-use rule works.

To make a.length work without using up “a”, it can be treated as if it were actually:

(&a)->length

Which would be allowed since accessing “a” through a pointer doesn’t consume it.

To make delete actually consume a DynArray, we just need to change it to take a DynArray by value:

void delete(DynArray a) {
  unsafe_free(a.data);
  unsafe_drop(a);
}

Passing a DynArray to delete will shallow-copy it, consuming it in the process and making it inaccessible. Inside delete we only need to free the pointer, no need to zero-out other data since it is no longer valid and cannot be accessed.

The unsafe_drop builtin is used to get rid of “a” inside of delete, since here too we must use it once.

This solution, if we ignore aliasing through pointers, is actually enough to achieve memory safety. The language ParaSail, which lacks pointers, achieves memory safety via a similar approach (though it inserts implicit deep copies and destructor cals).

A “Borrow Checker” is only necessary to retain memory safety in the presence of aliasing. Aliasing is where the challenge is. For example:

DynArray a;
push(&a, 1);
int* x = &a.data[0];
delete(a);
printf("%d\n", *x) // Use-After-Free

We need to prevent the above somehow.

Use-After-Free and Aliasing

We’ve already partially solved Use-After-Free with the previous solution. As long as we don’t actually store a pointer to a DynArray anywhere, there is no way to use it after it has been deleted. The single-use rule takes care of that.

Aliasing due to pointers in function parameters also does not introduce any problems. The pointers are scoped to the function so they’ll be gone by the time the function returns. Returning a pointer from a function is also ok as long as it is immediately passed as an argument to another function, not stored in a variable.

Pointers that are restricted to being used in this manner are called “second-class references” and are a perfectly reasonable and simple solution to the problem. The Hylo language works like this for example.

This is not the approach we will use, we want a bit more control. Before that though there’s still one problem that we need to tackle. Remember how Rust had that weird Rule 2 enforcing 1 writer xor N reader “borrows” (note: borrow=reference)? It’s important regardless of multithreading. Consider:

void concat(DynArray* a, int[..] b) {
  for (int i = 0; i < b.length; i++) {
    push(&a, b[i]);
  }
}

DynArray x = ...;
concat(&x, x.data[0..2]); // Kaboom

If you’ve done serious C or C++ programming the above should immediately trigger PTSD. The v[i..j] syntax is how we create a slice from existing data in Cnile: we’re creating a slice of the first 3 elements of x. So the idea is for the concat call above to append the first 3 elements of x to its end.

The problem is that the push call inside concat might need to reallocate, which may then perform a new allocation, copy the data to it, and free the old allocation. If this happens, the slice is now pointing to freed memory, causing a use-after-free.

Forbidding internal pointers could be an option for a more higher level language design. If all pointers point to the “source”, which is only ever invalidated by a call to a function like delete, then we avoid the issue. Many languages (e.g. Java) lack internal pointers. Not an option for us since we’re Cnile.

There is one more advantage to the 1 writer xor N reader rule (outside of multithreading): it prevents iterator invalidation bugs. Without it we must either allow iterator invalidation bugs (like C++), or catch them at runtime (like Java).

We’ll go with 1 writer xor N readers since it prevents lots of bugs, but this is an interesting design space and I encourage you to think about it.

Safe References

We’ll keep pointers as actual pointers and leave them as “unsafe” constructs in Cnile. As a safe alternative, we’ll provide references that enforce the “borrow checking” rules. Slices will be made safe references as well.

Let us consider concat above, and make a safe version of it with references:

void concat(DynArray& a, int[..] b) {
  for (int i = 0; i < b.length; i++) {
    push(&a, b[i]);
  }
}

DynArray x = ...;
concat(&x, x.data[0..2]); // error

All we’ve done is change * (pointer) to & (safe reference). But now the concat line fails because of the 1 writer xor N reader rule. How do we check this though? Well, internally the compiler is tracking extra information:

void concat(DynArray& <?p1> a, int[..] <?p2> b) {
  for (int i = 0; i < b.length; i++) {
    push(&a, b[i]);
  }
}

DynArray x = ...;
concat(&x <x>, x.data[0..2] <x.data>); // error

Every reference carries alongside it a hidden “path” to the data it refers to. Unlike Rust, but like C#, the user never writes these out themselves, they have no syntax.

The compiler checks the hidden reference paths for overlap. Overlapping paths that have more than 1 writer or a writer and a reader are rejected.

Function parameters have “variables” (marked with ?) for the paths, that are assigned when the function is called. Within concat the two inputs are assumed to be independent, the check happens on each callsite.

The same paths can be used to detect use-after-free:

DynArray  a = {}; // available: {a}
DynArray& x = &a; // available: {a, x <a>}
DynArray  b = a;  // available: {x <a>, b}
concat(x <a>, {1, 2, 3});  // error, a no longer available

Things get trickier when we want to return references from functions, what path should the output reference have in the following example:

int[..] <?> shortest(int[..] <?p1> a, int[..] <?p2> b);

The shortest function returns the slice with the smallest length from the two inputs, meaning the return reference path <?> could end up being either p1 or p2.

In these situations all we can do is a safe approximation, and record both paths:

int[..] <?p1 & ?p2> 
shortest(int[..] a <?p1>, int[..] <?p2> b);

When using this function, the output will only last for as long as either of the inputs lasts, regardless of the actual result:

int[..] x = a[0..2]; // x <a>
int[..] y = b[0..3]; // y <b>
int[..] z = shortest(x, y); // z is x, but has path <a & b>
delete(b);
printf("%d/n", z[0])  // error, b is no longer available

This is the unavoidable limitation with compile-time checking, some things will only be known at runtime, so we need to consider all situations and assume the worst.

Path Precision

For the example above the most specific we can be is the intersection of the two input paths, but the compiler will do this even for cases where it isn’t necessary:

int[..] <?p1 & ?p2> 
find_subslice(int[..] a <?p1>, int[..] <?p2> b);

This function returns a sub-slice of “a” that is equal to “b” (if any), or just returns an empty slice. The result will never point to “b” but the compiler has no way to tell this from the function’s declaration. So we need to provide more information.

In Rust this would be done with lifetime annotations. In C# this is done by explicitly saying which inputs cannot be part of the output. We’ll follow the C# solution:

int[..] find_subslice(int[..] a, [in] int[..] b);

The [in] annotation means that “b” will not be part of the output. In C# the keyword scoped is used instead, but the meaning is the same.

This gets us most of the power of Rust without introducing lifetime annotations. There are two major capabilities that we lose by not having them:

We have no way to state in a function or struct declaration that two references must have overlapping paths. Paths are always allowed to be disjoint.
We have no way to state in a function declaration with a struct that stores multiple references as input, which reference paths will be part of the output.

To make the second case clear, let us talk about reference structs.

Reference Structs

In C#, a struct that contains a “ref” type (like Span) must itself be a “ref” type. In Cnile, we have the same constraint:

[reference]
struct SlicePair {
  int[..] a;
  int[..] b;
}

Structs that contain reference types must themselves be annotated as being references, so the compiler knows it needs to track paths for them.

The struct SlicePair above stores two slices together. What is its path?

int[..] x   = ... // x has path <foo>
int[..] y   = ... // y has path <bar>
SlicePair p = {x, y} // p has path <?>

If we treat a SlicePair as a unit, then “p” must have path <foo & bar>. The fields p.a and p.b, however, could each track individual paths.

Field-level tracking depends on how much information the compiler carries around for the type. For a rather messy example:

int[..] get_x(SlicePair& p);
int[..] get_y(SlicePair& p);

What are the paths inferred by the compiler for get_x and get_y above? It’s a bit weird at first glance:

int[..] <?p1 & ?p2> get_x(SlicePair <?p2> & <?p1> p);
int[..] <?p1 & ?p2> get_y(SlicePair <?p2> & <?p1> p);

There are two paths to consider for “p”. The path to the SlicePair itself (p1) and the path that the SlicePair references (p2).

There’s no way for the compiler to know which of the two paths matters for the output, so if the SlicePair goes out of scope, the output of get_x and get_y is no longer valid even if the actual slices stored in the SlicePair are perfectly valid.

If we had explicit lifetime annotations, the user could be fully specific in this case:

int[..]<'a> get_x(SlicePair<'a,'b> &<'c> p);
int[..]<'b> get_y(SlicePair<'a,'b> &<'c> p);

This is how it works in Rust. Many times the compiler can figure out the lifetime annotations for you, but not always. When it can’t, as in the example above, you better understand how lifetimes annotations work. You also always need to write them out by hand for struct declarations, which gets pretty annoying.

By lacking any syntax for lifetime annotations we avoid functions with complex declarations like the above, but are also more limited in the programs we can express. Personally I prefer to avoid the complexity that lifetime annotations introduce.

Conclusion

Well, there you have it, a memory safe C-like language through the use of compile-time checks (except for bounds checks and escape hatches like raw pointers of course).

Note that we didn’t really mention “ownership”, “moves” and “borrowing” except when contrasting with Rust terminology. We have no destructors9 nor lifetime annotations. There’s quite a bit of design space to work with.

Single-use (linear) types alongside scoped references with implicit path tracking (as done in C#) provide a simpler alternative to Rust’s borrow checker implementation.

Hopefully this clarifies how and why a “borrow checker” works the way it does and the options one has when designing one. Personally I find Vale’s idea of mixing single-use (linear) types with generational references to be a very compelling alternative to borrow checking with minimal runtime overhead.

This used to be explicitly called out in the official rust documentation.

Statically means the check occurs entirely at compile-time. If checks happen at runtime, we say they performed dynamically.

C and C++ are the only languages that refuse to deal with bounds checking properly. Their respective committees should be sued for gross negligence.

It needs one for types like “Span” (a slice type) which introduce so called “internal pointers”. Internal pointers are a total pain in the ass to track in a garbage collector, so C# went with a compile-time solution. There is also a “Memory” type which is similar to Span but instead of storing a pointer + length pair, it stores a GC-friendly reference + offset + length triple, and doesn’t need to be “borrow checked” as a result.

A term of endearment for C developers, it’s not meant to be derogatory. The C equivalent of Rustacean or Gopher, only funnier.

Clearing an arena is not longer just resetting a pointer, it requires obtaining new pages. Since clearing is not done very often, this is not a major cost.

For this sort of datastructure in particular, the chances of someone accidentally making a shallow copy are very very low. But remember we’re trying to solve the general problem, and a more complicated example would make the this blog post harder to follow.

So just don’t do that.

Rust needs destructors because rather than Linear Types it uses Affine Types (i.e., at-most-one-use types). If a variable is not used by the end of its scope, the Rust compiler inserts an implicit destructor call. We could have supported the same if we wanted to.

Burning the Midnight Coffee

Discussion about this post