Essentials of Garbage Collectors
Essentials of Garbage Collectors
Memory leaks and dangling pointers are the main issues of the manual memory management. You delete a parent node in a linked list, forgetting to delete all its children first — and your memory is leaking. You delete an object chain in correct order — but suddenly your program crashes since you forgot about second owner of this resource, which now tries to dereference a null-pointer.
To avoid these issues, most of the modern high-level programming languages implement automatic memory management. You allocate objects manually, however don’t bother with their deallocation: a special program, garbage collector, knows how to automatically deallocate them correctly, and reclaim for future reuse.
In the Essentials of Garbage Collectors class we study all different techniques and algorithms related to the automatic memory management, which are used today on practice.
Who this class is for?
First of all, for compiler engineers.
In implementing your programming language, there is a very high chance you’ll need to implement a garbage collector. Even languages which initially were positioned as “memory-safe”, such as Rust, eventual implemented automatic reference counting (ARC) and other collectors.
To reiterate: in most of the modern high-level programming languages, a garbage collector module (or multiple GC modules, like in Java) is pretty much a requirement today.
What if I don't implement programming languages every day?
If you are not a compiler engineer, then the class can still be interesting for you. Implementing a garbage collector or a memory manager in general, is a pretty advanced engineering task. It's a simple trick: you take some complex project (such as a garbage collector, compiler, interpreter, etc), and while building it, you learn all different data structures and algorithms. And then come back to "every-day programming", improved as a better engineer, with the transferable generic knowledge of complex systems.
Do I need C or C++ for this project?
Not really. Of course, C and C++ are probably the best languages for raw memory manipulations and fit well here, however in the course we study generic design algorithms and focus mainly on theoretical aspects of garbage collectors and memory allocators. This means you can implement them in any language you want. For example, you can allocate an `ArrayBuffer` in JavaScript for a virtual heap, or similarly `bytearray` in Python, Rust, etc.
Most of the algorithms in the course are described in generic pseudo-code, so you can port them to any language.
What's specific in this class?
The main things of these lectures are:
Concise and straight to the point. Each lecture is self-sufficient, concise, and describes information directly related to the topic, not distracting on unrelated materials or talks.
Animated presentation combined with live-editing notes. This makes understanding of the topics easier, and shows how (and when at time) the object structures are connected. Static slides simply don’t work for a complex content.
Reading materials
As further reading and additional literature for this course the following books are recommended:
The Garbage Collection Handbook: The Art of Automatic Memory Management by Antony Hosking, Eliot Moss, and Richard Jones
The Compiler Design Handbook: Optimizations and Machine Code generation by Y.N. Srikant, Priti Shankar
You will learn:
Static, Stack, Heap allocations
Compile vs. Runtime allocation
Fixed size vs. Dynamic size allocation
Recursion
Stack frame
Stack overflow
Memory leaks, dangling pointers
You will learn:
Memory leaks: forget to deallocate
Dangling pointers: deallocate too early
Human factor in memory management
You will learn:
Memory alignment
Object header
Collector meta-information
You will learn:
Virtual address space
Kernel space vs. User space
Code segment, data segment
Stack and stack pointer
Heap and break pointer
Memory mapping
Memory management unit (MMU)
Translation lookaside buffer (TLB)
You will learn:
Life cycle of a GC'ed program
Mutator: user program
Allocator: memory allocation
Collector: garbage collector
Mutator vs. Collector object graph view
"Stop the World" state
System calls to invoke GC from Mutator
You will learn:
Sequential (aka "Bump") allocator
Free-list allocator
First fit, Next-fit, Best-fit searches
Segregated list
Blocks coalescing and splitting
Lab session to implement `malloc` and `free`
Lab session and example of implementation: http://dmitrysoshnikov.com/compilers/writing-a-memory-allocator/
You will learn:
Semantic: objects that will not be reached
Syntactic: object that cannot be reached
Example of Stack structure
Example of invalidated cache
You will learn:
Tracing: trace for alive objects
Direct: directly identify the garbage
Main four collection algorithms
You will learn:
Mark phase: objects trace
Sweep phase: reclamation
Detailed algorithm description with code example
Non-moving collector
Free-list allocation
Heap fragmentation
Example of implementing a Mark-Sweep GC in C++: http://dmitrysoshnikov.com/compilers/writing-a-mark-sweep-garbage-collector/
You will learn:
Mark phase: objects trace
Compact phase: objects relocation
Fixing fragmentation issue
Moving collector
Forwarding address
Lisp 2 algorithm detailed description
Sequential allocator
You will learn:
Semi-space collector
Heap reservation
Forwarding address
Objects evacuation
Fixing child pointers
Bump-allocator
Detailed algorithm description with code example
You will learn:
Directly identify the garbage
Strong vs. Weak references
Deeply integrated into Mutator
Barrier pointer operations
Pauses on deep nested structures removal
Cyclic references
Detailed algorithm description with code example
You will learn:
Heap partitioning
Young vs. Old generations
Minor vs. Major collection cycles
Objects promotion
Intergenerational pointers
Write barrier
You will learn:
Mark-Region collector
Mark-Sweep + Copying + Heap partitioning
Immix collector
Heap segregation: Blocks and Lines
Free, Recyclable, Occupied blocks
Bump-allocation in recyclable blocks
Sweep-to-region technique
Heap defragmentation
Evacuation from Source to Destination blocks
You will learn:
"Stop the World" state
Multiple collection threads
Collector interruption by Mutator
Concurrently running collection thread
Synchronization barriers
You will learn:
Black, Gray, and White objects
Collection time slices
Illegal BW-pointers
Legal BW-pointers
GC barriers
You will learn:
Illegal BW-pointers
Read and Write barriers
Snapshot-at-Beginning
Incremental update (Dijkstra and Steel's)
Baker's Copying method
Brook's method
Final notes
Java and HotSpot VM
Different GC types in Java
Generational collector
Minor and Major collection cycles
Intergenerational pointers
Write barrier
Card table
CMS heap layout
G1 heap layout
Region-based collection
Eden, Survivor, Old and Humongous regions
Stop The World state
Parallel collectors
Evacuation and promotion
Concurrent marking
Tri-color abstraction
GC barriers
Snapshot-at-beginning barrier
Mixed GC cycle
OpenCourser helps millions of learners each year. People visit us to learn workspace skills, ace their exams, and nurture their curiosity.
Our extensive catalog contains over 50,000 courses and twice as many books. Browse by search, by topic, or even by career interests. We'll match you to the right resources quickly.
Find this site helpful? Tell a friend about us.
We're supported by our community of learners. When you purchase or subscribe to courses and programs or purchase books, we may earn a commission from our partners.
Your purchases help us maintain our catalog and keep our servers humming without ads.
Thank you for supporting OpenCourser.