A peek inside the rust bumpalo crate
Sat, Aug 30 2025
Using the rust bumpalo crate has opened my eyes to what you can do with alternative memory allocation strategies. I usually don't think to look at the code of low-level libraries; I'm afraid I won't understand. But I'm trying to get rid of that habit. Humans wrote this code after all, surely I can understand it given enough time and patience.
Why bumpalo?
The problem
I briefly covered that in a previous note. I needed a collection of nodes where (among other things) each node keeps track of its siblings. I chose to do this using shared references; each node stores shared references to its next and previous sibling. These references need to point somewhere, so I need a collection that owns the actual nodes:
use std::cell::Cell;
#[derive(Default)]
struct MyNode<'a> {
next: Cell<Option<&'a MyNode<'a>>>,
prev: Cell<Option<&'a MyNode<'a>>>,
}
fn main() {
let mut nodes = vec![
MyNode::default(),
MyNode::default(),
MyNode::default(),
];
// We can manipulate the pointers!
// This is the begining of a nice tree API!
&nodes[0].next.set(Some(&nodes[1]));
&nodes[1].prev.set(Some(&nodes[0]));
&nodes[1].next.set(Some(&nodes[2]));
&nodes[2].prev.set(Some(&nodes[1]));
// Uh oh. I guess not
// Error: cannot borrow `nodes` as mutable because it is also borrowed as immutable
// nodes.push(MyNode::default());
}
Everything is great until the last line.
The crux of the problem is that you can't be sure that a pointer to an element in a Vec
continues to be valid after you've added a new element to the Vec
; it could have copied all its elements somewhere else and your pointer now points to garbage.
The rust compiler prevents you from doing this by enforcing that you can't have any existing references into the vec when calling nodes.push
.
Thank you rustc
, but that disrupts my plans >:(
The solution
Thankfully bumpalo
exists.
You can think of it as Vec
that lets you push a new element into it and guarantees that all existing references are stable.
When you push an object into a Vec
, it asks the system memory allocator for a chunk of memory, and stores your object in there.
As you keep pushing, you use up that space.
At some point you won't have any more space.
Vec
handles this by asking the system memory allocator for a bigger chunk of space, copying all existing objects to that new space and appending your new element.
bumpalo
does something different: It asks the system memory allocator for a bigger chunk of space, records the address of the old chunk of space in it, then adds your new element to it.
This means that existing references remain valid! The following compiles:
use std::cell::Cell;
use bumpalo::Bump;
#[derive(Default)]
struct MyNode<'a> {
next: Cell<Option<&'a MyNode<'a>>>,
prev: Cell<Option<&'a MyNode<'a>>>,
}
fn main() {
// The collection of nodes
let nodes = Bump::new();
let a = nodes.alloc(MyNode::default());
let b = nodes.alloc(MyNode::default());
let c = nodes.alloc(MyNode::default());
// We can manipulate the pointers
a.next.set(Some(b));
b.prev.set(Some(a));
b.next.set(Some(c));
c.prev.set(Some(b));
nodes.alloc(MyNode::default());
}
So how does bumpalo do it?
Diagram time. This is what a non-empty Bump
looks like:
+----------+
| metadata |
+----------+ <--------------------------- +
|
+------------------------+----------+ --- +
|___CCCCCCBBBBBBBBBBBAAAA| metadata |
+------------------------+----------+ <--------------------------- +
|
+------------------------------------------------+----------+ ---- +
|_____FFFFFFFFFFFFFFFFFEEEEEEEEEDDDDDDDDDDDDDDDDD| metadata |
+------------------------------------------------+----------+ <--- +
|
+------------------------+ --------------------------------------- +
| let bump = Bump::new() |
+----------------------- +
Our non-empty bump
variable points to a sort of linked list of data + metadata
objects.
In the code, metadata
is called a ChunkFooter
, and is where the book-keeping information is stored.
The cells with letters (e.g. A, B, C) represent arbitrary bytes.
Consecutive cells of the same letter represent one "object" that was allocated in the bump.
Unlike Vec
, Bump
can store any type of object.
What does an empty bump look like?
+----------+
| metadata |
+----------+
It's represented as an "empty" chunk footer.
The first call to Bump::alloc(<blah>)
causes bumpalo to request a chunk of empty space from the system memory allocator.
It creates a new chunk footer with ChunkFooter::prev
pointing to the initial empty chunk footer and writes it to the end of the space it was just given.
+----------+
| metadata |
+----------+ <--------------------------- +
|
+------------------------+----------+ --- +
|________________________| metadata |
+------------------------+----------+
^
^
[metadata.ptr points here now]
The new chunk's ChunkFooter::ptr
is a pointer that starts out pointing to the start of metadata
.
To allocate our object, we calculate how many bytes we need to store it, then decrement ptr by that amount.
We then write our object to the space pointed to by the footer's newly updated ptr
.
+----------+
| metadata |
+----------+ <--------------------------- +
|
+------------------------+----------+ --- +
|__________________AAAAAA| metadata |
+------------------------+----------+
^
^
[metadata.ptr points here now]
If the object we are trying to allocate is larger than the availble space in the current chunk, we allocate another chunk, setting it's ChunkFooter::prev
pointer to the old one.
And so on.
And that's the core of it really as far as I can tell. There is a lot of code to handle proper alignment calculation, optimizations and allocating more complex objects. But at the heart of bumpalo is a linked list of heap-allocated byte buffers. Because objects that are allocated are never moved, even when appending the linked list:
- Bump can add elements without requiring a unique reference
- You can keep shared references to objects in bump for as long as the bump lives.
What are the tradeoffs of this approach?
bumpalo
is not a magical "better Vec
".
- Its construction guarantees that references to items within it remain stable, so you can't de-allocate individual items.
- Because it uses
Cell
, you can't share references toBump
across threads (though you can move the ownedBump
into a thread).
👾