Programming language resources
Resources for getting started with writing programming languages, emulators, assembly, and operating systems. The original PL post is by Max Bernstein.
Compilers
- Tufts compilers course COMP/CS 181 (2006, but it’s been taught more recently. I should probably ping Sam.)
- Cornell compilers course CS 6120 and an interesting approach to project-based learning
- Nora Sandler’s minimal C compiler
- Jack Crenshaw’s let’s build a compiler
- Recursive descent parsing in C. Note that this just verifies the input string, and more has to be done to build a tree out of the input.
- Vidar Hokstad’s Writing a compiler in Ruby, bottom up
- Rui Ueyama’s chibicc, a C compiler in the Ghuloum style
- The Natalie compiler for Ruby
- Compiler passes
- I’ve heard good things about Engineering a Compiler (3rd edition coming soon!)
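The recursive descent entry above points out that verifying an input string is not the same as building a tree from it. As a rough illustration of that extra step, here is a minimal sketch in Python (not the C from the linked post). The arithmetic grammar, the tuple-based tree shape, and the names `tokenize`, `Parser`, and `parse` are all assumptions made up for this example.

```python
import re

# Hypothetical grammar, chosen only for illustration:
#   expr   := term (("+" | "-") term)*
#   term   := factor (("*" | "/") factor)*
#   factor := NUMBER | "(" expr ")"


def tokenize(text):
    # Integers and single-character operators. Anything else (including
    # whitespace) is silently skipped, which a real lexer should not do.
    return re.findall(r"\d+|[-+*/()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # Each loop iteration wraps the tree built so far, which makes
        # "+" and "-" left-associative.
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input at {parser.peek()!r}")
    return tree
```

A pure recognizer would have the same three functions but return nothing; the only change here is that each function returns the node it recognized, so `parse("1 + 2 * 3")` yields a nested tuple rather than a yes/no answer.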
Lisp specific
- kanaka’s mal
- leo (lwh)’s Building LISP
- Peter Michaux’s Scheme from Scratch
- Daniel Holden’s Build Your Own Lisp
- Anthony C. Hay’s fairly readable Lisp interpreter in 90 lines of C++
- My own Writing a Lisp blog post series
- carld’s Lisp in less than 200 lines of C
- UTexas’s A simple scheme compiler
- Rui Ueyama’s minilisp
- The Bones Scheme compiler
- The lecture notes for a course developing a Ghuloum-style compiler
- Ghuloum implementations
- Abdulaziz Ghuloum’s minimal Scheme to x86 compiler (PDF)
- My adaptation in C (with implementation)
- Let’s build a compiler
- Thorsten Ball’s adaptation
- Nada Amin’s adaptation
- Tao of Mac’s Lisp implementation list
- sectorlisp and sectorlisp2 and lambda calculus in 383 bytes
Runtimes
- munificent’s Crafting Interpreters book
- Mario Wolczko’s CS 294-113, a course on managed runtimes
- My own bytecode compiler/VM blog post
- Justin Meiners and Ryan Pendleton’s Write your own virtual machine
- Maxime Chevalier-Boisvert’s website
- Serge’s toy JVM
- Dragon taming with Tailbiter
- Phil Eaton’s list of JS implementations
- Chris Seaton’s The Ruby Compiler Survey and RubyConf 2021 talk (video) about it
- Laurence Tratt’s “Why aren’t more users more happy with our VMs?” Part 1 and Part 2
- Interesting runtimes
- Russ Cox’s Regular expression matching: the virtual machine approach
- Webkit’s post about the FTL JIT
Runtime optimization
Research around optimizing dynamic languages.
- Efficient implementation of the Smalltalk-80 system
- Stefan Brunthaler’s work
- Optimizing dynamically-typed object-oriented languages with polymorphic inline caches (PDF)
- Garbage collection in a large LISP system
- Urs Hölzle’s thesis, Adaptive Optimization for Self (PDF)
- An inline cache isn’t just a cache
- Baseline JIT and inline caches
- Javascript hidden classes and inline caching in V8
- Basic block versioning
- Stack Caching for Interpreters (PDF)
- Hotspot performance techniques
- Assembly interpreters and follow-up
- Make sure to take a look at “Further Reading”
- A post including a snippet on direct-threaded dispatch in an assembly interpreter
- Stefan Marr’s page about efficient and safe implementations of dynamic languages
- The Wikipedia page for Cheney’s algorithm
- This web page about V8 internals
- Vyacheslav Egorov’s inline cache explanation for JavaScript
- Caio Lima’s inline cache explanation for JSC (with assembly!)
- V8’s blog post about their baseline/template JIT
- Object shapes
- Chris Seaton’s RubyKaigi talk
- Aaron Patterson and Jemma Issroff’s livestream (video)
- Kate Temkin’s QEMU fork with a gadget-based pseudo-JIT and associated Twitter thread
- When pigs fly: optimizing bytecode interpreters
- I particularly like the snippet on bytecode VM traces
- Optimized Python runtimes
- This SSA paper: Simple and Efficient Construction of Static Single Assignment Form (PDF)
- Resources on mechanical sympathy and optimization coaching
- This paper about encoding low-level semantics in a higher-level language for optimizing code: Demystifying Magic: High-level Low-level Programming
- Meta-tracing JITs in native code
- Bump allocators: always bump downwards!
And here are runtime optimization resources that I wrote!
- Inline caching, a post containing a small demo of how to speed up attribute lookups in an interpreter
- Inline caching: quickening, a post about speeding up interpreters using self-modifying bytecode (“bytecode rewriting” or “quickening”)
- Small objects and pointer tagging, a post about speeding up interpreters using pointer tagging and encoding small objects inside pointers
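To make "self-modifying bytecode" a little more concrete, here is a minimal sketch of quickening in a toy stack machine. It is not the code from the posts above: the opcode names (`PUSH`, `ADD`, `ADD_INT`, `RET`), the `run` loop, and `generic_add` are hypothetical and exist only for this illustration. Since Python's own `+` is already dynamic, the sketch shows the structure of the technique rather than any real speedup.

```python
# Hypothetical opcodes for a tiny stack machine.
PUSH, ADD, ADD_INT, RET = range(4)


def generic_add(left, right):
    # Stand-in for a slow, fully dynamic dispatch path.
    if isinstance(left, str) and isinstance(right, str):
        return left + right
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return left + right
    raise TypeError("unsupported operand types for ADD")


def run(code, consts):
    # `code` must be a mutable list: the interpreter rewrites it in place.
    stack = []
    pc = 0
    while True:
        op = code[pc]
        if op == PUSH:
            stack.append(consts[code[pc + 1]])
            pc += 2
        elif op == ADD:
            right = stack.pop()
            left = stack.pop()
            if type(left) is int and type(right) is int:
                # Quicken: specialize this instruction for next time.
                code[pc] = ADD_INT
            stack.append(generic_add(left, right))
            pc += 1
        elif op == ADD_INT:
            right = stack.pop()
            left = stack.pop()
            if type(left) is int and type(right) is int:
                # Fast path: the guard passed, skip generic dispatch.
                stack.append(left + right)
            else:
                # Guard failed: fall back and undo the specialization.
                code[pc] = ADD
                stack.append(generic_add(left, right))
            pc += 1
        elif op == RET:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


code = [PUSH, 0, PUSH, 1, ADD, RET]
run(code, [2, 3])      # first run observes ints and rewrites ADD
run(code, ["a", "b"])  # guard fails, instruction reverts to ADD
```

The guard in `ADD_INT` is what keeps the rewrite safe: the specialized instruction never assumes the types it saw last time still hold.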
Pointer tagging and NaN boxing
Resources on representing small values efficiently.
- nikic’s Pointer magic…
- Sean’s NaN-Boxing
- zuiderkwast’s nanbox
- albertnetymk’s NaN Boxing
- Ghuloum’s Incremental approach (PDF), which introduces this in a compiler setting
- Chicken Scheme’s data representation
- Guile Scheme’s Faster Integers
- Femtolisp object implementation
- Leonard Schütz’s NaN Boxing article
- Piotr Duperas’s NaN boxing or how to make the world dynamic
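For a feel of what these articles describe, here is a minimal NaN-boxing sketch in Python. The bit layout is an assumption picked for illustration (a quiet-NaN prefix, bit 48 as an integer tag, and a 32-bit payload), not the layout used by any of the projects linked above, and the function names are hypothetical.

```python
import struct

# Assumed layout for this sketch only:
#   ordinary doubles are stored as their IEEE 754 bit pattern;
#   boxed integers are quiet NaNs with bit 48 set and the value in the low 32 bits.
QNAN = 0x7FF8000000000000
TAG_INT = 0x0001000000000000
INT_MASK = QNAN | TAG_INT


def box_double(value):
    if value != value:
        # Canonicalize real NaNs so they can never look like a tagged value.
        return QNAN
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def unbox_double(bits):
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def box_int(value):
    if not -(2**31) <= value < 2**31:
        raise OverflowError("value does not fit in the 32-bit payload")
    return INT_MASK | (value & 0xFFFFFFFF)


def is_int(bits):
    return (bits & INT_MASK) == INT_MASK


def unbox_int(bits):
    payload = bits & 0xFFFFFFFF
    # Sign-extend the 32-bit payload back to a Python int.
    return payload - 2**32 if payload >= 2**31 else payload
```

Every value, double or integer, fits in one 64-bit word, and a single mask-and-compare tells them apart. In C the same idea is usually done with a `uint64_t` and a `memcpy` or union instead of `struct`.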
Just-In-Time compilers
Small JITs to help understand the basics. Note that these implementations tend to focus on compiling ASTs or IRs to machine code, rather than on the parts of a JIT that offer the most performance: inline caching and code inlining. Compiling is great, but unless you’re producing good machine code, it may not do a whole lot.
- Antonio Cuni’s jit30min
- Christian Stigen Larsen’s Writing a basic x86-64 JIT compiler from scratch in stock Python
- Ben Hoyt’s Compiling Python syntax to x86-64 assembly for fun and (zero) profit
- My very undocumented (but hopefully readable) implementation of the Ghuloum compiler
- Matt Page’s template_jit for CPython, which also contains a readable CFG implementation
Assembler libraries
Sometimes you want to generate assembly from a host language. Common use cases include compilers, both ahead-of-time and just-in-time. Here are some libraries that can help with that.
- Tachyon’s x86-64 assembler (JS)
- Higgs’ x86-64 assembler (D), which is based on Tachyon’s
- yjit’s x86-64 assembler (C) from Shopify’s Ruby JIT, which is based on Higgs’
- Dart’s multi-arch assembler (C++) and relevant constants, both of which need some extracting from the main project
- Strongtalk’s x86 assembler (C)
- AsmJit’s multi-arch assembler (C++)
- PeachPy’s x86-64 assembler (Python)
- PPCI’s x86-64 assembler (Python) and other great compiler infrastructure
- My small x86-64 assembler (C), which I forked from pervognsen’s original (C)
- A guide to using GCC inline assembly
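To show roughly what "generating assembly from a host language" looks like at its smallest, here is a sketch of a three-instruction x86-64 assembler in Python. The `Assembler` class and its method names are hypothetical and do not mirror the API of any library listed above; only the opcode bytes (`B8` for `mov eax, imm32`, `05` for `add eax, imm32`, `C3` for `ret`) come from the x86 instruction encoding.

```python
import struct


class Assembler:
    """Appends encoded x86-64 instructions to a byte buffer."""

    def __init__(self):
        self.code = bytearray()

    def _imm32(self, value):
        # Immediates are encoded as 32-bit little-endian.
        return struct.pack("<I", value & 0xFFFFFFFF)

    def mov_eax_imm32(self, value):
        # B8 id: mov eax, imm32
        self.code += b"\xb8" + self._imm32(value)

    def add_eax_imm32(self, value):
        # 05 id: add eax, imm32
        self.code += b"\x05" + self._imm32(value)

    def ret(self):
        # C3: ret
        self.code += b"\xc3"


asm = Assembler()
asm.mov_eax_imm32(42)
asm.add_eax_imm32(1)
asm.ret()
print(asm.code.hex())
```

This only produces bytes. Actually running them would mean copying the buffer into executable memory, which is the part the real libraries and JITs above handle, along with registers, addressing modes, and labels.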
SystemV ABI
This is a sort of grab-bag for helpful or interesting tools for programming language implementation.
- Blinkenlights, a visual x86-64 emulator
- Cosmopolitan libc
- Cosmopolitan ftrace
Game Boy Emulators
- The Pan Docs, which give technical data about the Game Boy hardware, I/O ports, flags, cartridges, memory map, etc
- This excellent explanation of the boot ROM
- This opcode table that details the full instruction set, including CB opcodes
- This full opcode reference for the GBZ80
- The Game Boy CPU manual (PDF)
- The GameBoy memory map
- This blog post that gives a pretty simple state machine for the different rendering steps
- The Ultimate Game Boy Talk (video) by Michael Steil at CCC
- This ROM generator for custom logos
- This sample DAA implementation
- This awesome-gbdev list
- This excellent emulator and debugger
- Another emulator and debugger
- The Game Boy complete technical reference (PDF)
- This Gameboy Overview
- blargg’s test ROMs which have instruction tests, sound tests, etc
- gekkio’s emulator and his test ROMs
- This fairly readable Go emulator, which has helped me make sense of some features
- This fairly readable C emulator
- This fairly readable C++ implementation
- This helpful GPU implementation in Rust
- This reference for decoding GameBoy instructions.
- NOTE: This has one bug that someone and I independently found. The original repo has fixed the bug but not the page linked above.
- This summary blog post explaining GPU modes
- And of course /r/emudev
- DIY emulator/VM resources
This is a potentially fun way to render the screen without SDL, but only for non-interactive purposes.
This YouTube playlist looks like it could be worth a watch, but it’s a lot of hours.
Lists
Communities
Operating System
Assembly language
Picking up assembly is something I would like to do ASAP.
PC Assembly Language programming 32bits
Learning assembly for linux-x64 asm
Introduction to Programming Systems by Princeton
Assembly Language / Reversing / Malware Analysis / Game Hacking -resources