Garbage collection
Introduction
Your goal in this final assignment is to implement a mark-and-sweep garbage collector for the L3 virtual machine, L3VM.
Initial setup
To start working on the assignment, you should once again fetch the new code from our git repository, and then rebase your branch on ours. As a reminder, this can be done using the following two commands:
$ git fetch origin $ git rebase origin/master
Overview
After rebasing your branch on ours, you will have access to the following new code:
- the phases required to generate assembly code for L3VM, which we will not see in this course, and
- a partial version of the VM, to be completed for this assignment.
There are two versions of the VM: one written in C, the other in Rust. You must choose one of the two VMs to complete, and ignore the other.
C VM
The C VM is in the vm/c
subdirectory. A Makefile
is given to you, which you can use to build the VM.
The Makefile
assumes that you will compile your code with the clang compiler, which we strongly recommend you do, as it provides two "sanitizers" that help tremendously during debugging:
- the address sanitizer, which detects a lot of common memory-related errors,
- the undefined behavior sanitizer, which detects a lot of errors related to C's (unfortunately many) undefined behaviors.
By default, both sanitizers are enabled, and we encourage you to keep them during development, by setting the CFLAGS
variable of the Makefile
to the value of CFLAGS_DEBUG
. Once you're done debugging, you can use the value of CFLAGS_RELEASE
instead, to see how fast the test programs run.
Rust VM
The Rust VM is in the vm/rust
subdirectory. It can be built as usual with the cargo
command. To build the debug version, simply type cargo build
, and to build the faster release version, type cargo build --release
.
To run one of these versions, and to build it before if needed, type cargo run
with or without the --release
flag depending on the version you want to run.
L3VM
(Note: only the C version of the VM is described below, but the Rust one is very similar.)
The L3VM contains two major components:
- the (threaded) interpreter, in
engine.c
, - the memory module, currently in
memory_nofree.c
.
The current version of the memory module allocates but never frees memory. It can therefore run programs with small inputs correctly, but fails on bigger inputs. For example, it can compute the factorial of 40 using the bignums
test:
$ cd vm/c $ make (output elided) $ ./bin/vm ../test/bignums.l3a Factorial of? 40 40! = 815915283247897734345611269596115894272000000000
As described in the lecture, the VM uses a (virtual) 32-bit address space. Its total size, in bytes, is specified on startup using the -m
option. By default, it is 1 000 000, i.e. one megabyte.
When the VM starts, it first loads the code at the beginning of the address space, starting at 0, and then initializes the memory module, passing it the lowest address which can be used for the heap, named heap_start
below. Assuming the default size, the address space therefore looks as follows:
Internally, the VM manipulates two kinds of addresses:
- virtual addresses, which have type
value_t
and are therefore 32 bits, refer to the address space shown above, e.g. address 0 is that of the first instruction, - physical addresses, which have type
value_t*
and are therefore 32 or 64 bits pointers (depending on the host machine), refer to the address space of the host machine on which the VM is running.
Don't be mislead by the "physical" qualifier of the second kind of addresses: they are physical only from the point of view of the VM. From the point of view of the host operating system, they are virtual addresses, but this does not concern us here.
As you will see by looking at the code, the VM uses a mixture of physical and virtual addresses, for performance reasons. For example, the addresses stored in the three base registers (Ib
, Lb
and Ob
) are physical, while those stored in the heap are of course virtual.
The garbage collector
The goal of this assignment is to write a version of the memory
module that collects unreachable blocks using a mark-and-sweep algorithm.
As a starting point, we suggest that you copy the memory_nofree.c
file to a new one called memory_mark_n_sweep.c
and change the Makefile
to use that new memory module instead of the original one. The memory_nofree
module is a good starting point as it already handles some aspects of memory management correctly, e.g. the packing and unpacking of a block's size and tag in its header.
Since you'll often have to translate between physical and virtual addresses, you should also copy the addr_v_to_p
and addr_p_to_v
functions from the engine
module into your memory module. (Don't try to share them between the two modules by removing the static
qualifier, this prevents their inlining and slows down the VM noticeably.)
The bitmap
Since untagged integers can be present in registers, and since registers are stored in the heap, the GC can encounter them during its marking phase. Being untagged, these integers can look like valid heap pointers, and blindly following and interpreting the memory they "point" to as if it was a valid object leads to problems.
To avoid this, the garbage collector must be able to know, given a heap address, whether an allocated block starts at that address or not. To that end, it maintains a bitmap with one bit per word of the heap. When a block is allocated, the bit corresponding to its first word is set in the bitmap.
During the marking phase, the GC can check the bitmap to make sure that a value that looks like a pointer really refers to the beginning of an allocated block, and only follow it in that case. Notice that this doesn't mean that the value is actually a pointer, it could very well be an untagged integer, but at least the GC never treats some random part of the heap like an allocated block, which is what leads to serious problems.
Surprisingly, the bitmap can also be used to store the mark bits during the marking phase! To understand how, notice that at the beginning of the marking phase, all allocated blocks (and only them) have their bit set in the bitmap. Therefore, during the marking phase, the GC can clear the bit corresponding to a block to mark it. That way, at the end of the marking phase, the bits that are 0 correspond to either:
- allocated and reachable blocks, or
- unallocated blocks, i.e. those that were already in the free list when GC started.
Provided that unallocated blocks can be identified, for example with a special tag reserved for them (tag_None
in memory.h
, which is 255), it is then easy for the sweep phase to add them to the free list, along with unreachable blocks.
Notice that the bitmap must be stored in the same memory as the one used for the code and the heap! Therefore the address space of the VM looks as follows if one includes the bitmap (the position of the heap and bitmap could also be swapped):
The size of the bitmap must be 1/32th of the size of the heap, rounded up to the next integer. Beware that correctly computing that size is not completely trivial!
The free list
The blocks of memory that are currently free must be stored in some data structure, known as the free list, in which the allocator looks for blocks to satisfy allocation requests.
The simplest data structure one can use for the free list is a unique, singly-linked list containing all the free blocks. Notice that the links from one block to the next can be stored in the blocks themselves, since they are not used by the program. But beware: this only works if all blocks have at least enough space to store the link! Therefore, the allocator must make sure that even if the program requests blocks that are too small (i.e. of size 0, which happens), it allocates at least enough memory to put the block back in the free list later.
While a unique free list works, it is not very efficient as the allocator has to iterate over its elements on each allocation request, in order to find a free block of an appropriate size.
A better solution is to have several segregated free lists, e.g. 32 of them. The list at index i contains only blocks of size i, except for the last one, which contains all blocks that do not fit elsewhere. This allows allocation to be very fast in the (common) case where the program requests a block of a size that has a dedicated free list, and that free list is not empty.
Splitting and coalescing
During allocation, blocks that are too big compared to the requested size must be split. Beware however of not creating blocks that are too small to be put back in the free list (for the reasons mentioned above) while splitting!
During the sweeping phase, adjacent free blocks must be coalesced into a single, bigger block.
Tips and tricks
Debugging garbage collectors is tricky, as you're about to find out. Therefore, make sure that you think — a lot, and clearly — before writing code!
Also, sprinkle your code with assertions, using the assert
macro, to check all the invariants that you can think of. Don't hesitate to write specific functions that check the most important ones, e.g. that the bitmap is consistent with the contents of the heap.
Finally, be careful not to write code that depends on specific characteristics of the platform on which you test it!
Testing and submission
To test your garbage collector, you can use the compiled programs given in the vm/tests
subdirectory, or compile and run your own tests.
To run the tests, you should install shelltestrunner on your computer. Once this is done, place yourself in the directory containing the tests and launch the following command:
$ shelltest -D VM=../c/bin/vm *.test
to test the C VM, or:
$ shelltest -D VM=../rust/target/release/l3vm *.test
to test the (release version of the) Rust VM.
Notice that 38 out of those 43 tests are so small that they pass even without a GC! What's hard is making the last 5 of them work.
For 4 of the 5 "big" tests that are given, the table below shows the input values that should work correctly both without a GC — using the memory_nofree
module — and with a mark-and-sweep GC, always using the default memory size of 1 Mb. The time taken by our reference C and Rust implementations, which use segregated free lists, to execute the benchmarks on a recent (2020) laptop is also shown and should give you an idea of what to aim for.
Test | no GC | M&S GC | C time (s) | Rust time (s) |
---|---|---|---|---|
bignums |
241 | 5000 | 1.2 | 1.5 |
maze |
7; 1 | 40; 1 | 1.5 | 1.7 |
queens |
11 | 18 | 3.5 | 4.1 |
unimaze |
30; 15; 1 | 80; 55; 1 | 0.05 | 0.06 |
Submission is done as usual through the submission server. Notice that this time you have two tokens at your disposal, and you have to use the right one depending on the version of the VM you want to submit. The command to use to create the archive also differs.
If you worked on the C VM, then create the archive as follows:
$ cd vm/c $ zip -r l3vm-c.zip Makefile src
and submit the Zip archive created by the second command (l3vm-c.zip
).
If you worked on the Rust VM, then create the archive as follows:
$ cd vm/rust $ zip -r l3vm-rust.zip Cargo.toml src
and submit the Zip archive created by the second command (l3vm-rust.zip
).
Grading
A total of 100 points are assigned to this project part, out of a total of 500 for the whole semester.
You can choose to use either a single free list, or segregated free lists in your garbage collector. Notice however that only doing the former will cost you 15 points.
Keep in mind that a working GC with a single free list, which is easier to write, is better than a crashing GC with segregated free lists. Therefore, we suggest you start implementing a single free list, submit that version when it works, and only then start working on segregated free lists.
Your submission will be graded based on:
- its correctness, i.e. whether it passes all the functional tests that were made available to you, and additional ones that we might have,
- whether it uses a single or multiple segregated free lists: as explained above, only using a single free list will cost you 15 points,
- its performance, which should be within an order of magnitude of our reference implementation,
- its style, which will be worth 5 points for this assignment, so please pay attention to the feedback you received for the previous assignments.