eze anyanwu

A peek inside the rust bumpalo crate

Sat, Aug 30 2025

Using the rust bumpalo crate has opened my eyes to what you can do with alternative memory allocation strategies. I usually don't think to look at the code of low-level libraries; I'm afraid I won't understand. But I'm trying to get rid of that habit. Humans wrote this code after all, surely I can understand it given enough time and patience.

Why bumpalo?

The problem

I briefly covered that in a previous note. I needed a collection of nodes where (among other things) each node keeps track of its siblings. I chose to do this using shared references; each node stores shared references to its next and previous sibling. These references need to point somewhere, so I need a collection that owns the actual nodes:

use std::cell::Cell;

#[derive(Default)]
struct MyNode<'a> {
    next: Cell<Option<&'a MyNode<'a>>>,
    prev: Cell<Option<&'a MyNode<'a>>>,
}

fn main() {
    let mut nodes = vec![
        MyNode::default(),
        MyNode::default(),
        MyNode::default(),
    ];

    // We can manipulate the pointers!
    // This is the begining of a nice tree API!
    &nodes[0].next.set(Some(&nodes[1]));
    &nodes[1].prev.set(Some(&nodes[0]));
    &nodes[1].next.set(Some(&nodes[2]));
    &nodes[2].prev.set(Some(&nodes[1]));
    
    // Uh oh. I guess not
    // Error: cannot borrow `nodes` as mutable because it is also borrowed as immutable
    // nodes.push(MyNode::default());
}

Everything is great until the last line. The crux of the problem is that you can't be sure that a pointer to an element in a Vec continues to be valid after you've added a new element to the Vec; it could have copied all its elements somewhere else and your pointer now points to garbage. The rust compiler prevents you from doing this by enforcing that you can't have any existing references into the vec when calling nodes.push.

Thank you rustc, but that disrupts my plans >:(

The solution

Thankfully bumpalo exists. You can think of it as Vec that lets you push a new element into it and guarantees that all existing references are stable.

When you push an object into a Vec, it asks the system memory allocator for a chunk of memory, and stores your object in there. As you keep pushing, you use up that space. At some point you won't have any more space.

Vec handles this by asking the system memory allocator for a bigger chunk of space, copying all existing objects to that new space and appending your new element.

bumpalo does something different: It asks the system memory allocator for a bigger chunk of space, records the address of the old chunk of space in it, then adds your new element to it.

This means that existing references remain valid! The following compiles:

use std::cell::Cell;
use bumpalo::Bump;

#[derive(Default)]
struct MyNode<'a> {
    next: Cell<Option<&'a MyNode<'a>>>,
    prev: Cell<Option<&'a MyNode<'a>>>,
}

fn main() {
    // The collection of nodes
    let nodes = Bump::new();
    
    let a = nodes.alloc(MyNode::default());
    let b = nodes.alloc(MyNode::default());
    let c = nodes.alloc(MyNode::default());

    // We can manipulate the pointers
    a.next.set(Some(b));
    b.prev.set(Some(a));
    b.next.set(Some(c));
    c.prev.set(Some(b));
    
    nodes.alloc(MyNode::default());
}

So how does bumpalo do it?

Diagram time. This is what a non-empty Bump looks like:

+----------+
| metadata |
+----------+ <--------------------------- +
                                          | 
+------------------------+----------+ --- +
|___CCCCCCBBBBBBBBBBBAAAA| metadata |
+------------------------+----------+ <--------------------------- +
                                                                   |
+------------------------------------------------+----------+ ---- +
|_____FFFFFFFFFFFFFFFFFEEEEEEEEEDDDDDDDDDDDDDDDDD| metadata | 
+------------------------------------------------+----------+ <--- +
                                                                   |  
+------------------------+ --------------------------------------- +
| let bump = Bump::new() |
+----------------------- +

Our non-empty bump variable points to a sort of linked list of data + metadata objects. In the code, metadata is called a ChunkFooter, and is where the book-keeping information is stored.

The cells with letters (e.g. A, B, C) represent arbitrary bytes. Consecutive cells of the same letter represent one "object" that was allocated in the bump. Unlike Vec, Bump can store any type of object.

What does an empty bump look like?

+----------+
| metadata |
+----------+

It's represented as an "empty" chunk footer.

The first call to Bump::alloc(<blah>) causes bumpalo to request a chunk of empty space from the system memory allocator. It creates a new chunk footer with ChunkFooter::prev pointing to the initial empty chunk footer and writes it to the end of the space it was just given.

+----------+
| metadata |
+----------+ <--------------------------- +
                                          |
+------------------------+----------+ --- +
|________________________| metadata |   
+------------------------+----------+
                         ^
                         ^
      [metadata.ptr points here now]

The new chunk's ChunkFooter::ptr is a pointer that starts out pointing to the start of metadata. To allocate our object, we calculate how many bytes we need to store it, then decrement ptr by that amount. We then write our object to the space pointed to by the footer's newly updated ptr.

+----------+
| metadata |
+----------+ <--------------------------- +
                                          |
+------------------------+----------+ --- +
|__________________AAAAAA| metadata |   
+------------------------+----------+
                   ^
                   ^
      [metadata.ptr points here now]

If the object we are trying to allocate is larger than the availble space in the current chunk, we allocate another chunk, setting it's ChunkFooter::prev pointer to the old one. And so on.

And that's the core of it really as far as I can tell. There is a lot of code to handle proper alignment calculation, optimizations and allocating more complex objects. But at the heart of bumpalo is a linked list of heap-allocated byte buffers. Because objects that are allocated are never moved, even when appending the linked list:

What are the tradeoffs of this approach?

bumpalo is not a magical "better Vec".

👾