# Rava
A Java interpreter written in C99. Compiles and executes Java source code. Faster than Python on the three benchmarks listed under Performance.
Author: retoor <retoor@molodetz.nl>
## Introduction
Rava is a Java interpreter implemented in C. It provides a full compilation pipeline from source to execution for the language features listed under "Supported features".
The pipeline:
- Lexer tokenizes Java source code
- Parser builds an abstract syntax tree
- Semantic analyzer performs type checking
- IR generator produces stack-based bytecode
- Runtime VM executes the bytecode
Supported features:
- Primitives: int, long, double, boolean, char
- Arrays and strings
- Objects and instance methods
- Inheritance
- Control flow: if/else, while, for, break, continue
- File I/O
- Recursion
- System.out.println
Compiles with `-Wall -Wextra -Werror`. Zero warnings. No memory leaks.
## Installation
```bash
make
```
## Usage
Example source code:
```java
public class Fibonacci {
    public static int fib(int n) {
        if (n <= 1) {
            return n;
        }
        return fib(n - 1) + fib(n - 2);
    }

    public static int main() {
        System.out.println(fib(30));
        return 0;
    }
}
```
Run the benchmark:
```bash
make benchmark
```
Run tests:
```bash
make test_runtime && ./test_runtime
```
## Performance
Rava is faster than Python on all three benchmarks measured:
| Benchmark | Rava | Python | Speedup |
|-----------|------|--------|---------|
| Fibonacci(30) | 257ms | 291ms | 1.13x faster |
| Primes(100k) | 273ms | 416ms | 1.52x faster |
| Sum(10M) | 666ms | 1104ms | 1.66x faster |
Started at 1402ms for Fibonacci. After 9 optimization phases: 257ms. That is 5.5x faster.
## Structure
```
rava/
├── lexer/
│   ├── lexer.h
│   ├── lexer_tokenizer.c
│   ├── lexer_keywords.c
│   └── lexer_literals.c
├── parser/
│   ├── parser.h
│   ├── parser.c
│   ├── parser_expressions.c
│   ├── parser_statements.c
│   ├── parser_declarations.c
│   └── parser_printer.c
├── types/
│   ├── types.h
│   └── types.c
├── semantic/
│   ├── semantic.h
│   ├── semantic.c
│   ├── symbol_table.h
│   └── symbol_table.c
├── ir/
│   ├── ir.h
│   ├── ir.c
│   ├── ir_gen.h
│   └── ir_gen.c
├── runtime/
│   ├── runtime.h
│   ├── runtime.c
│   ├── nanbox.h
│   ├── fastframe.h
│   ├── fastframe.c
│   ├── labeltable.h
│   ├── labeltable.c
│   ├── methodcache.h
│   ├── methodcache.c
│   ├── superinst.h
│   └── superinst.c
├── tests/
│   └── test_*.c
├── examples/
│   └── *.java
└── Makefile
```
## Optimization
Nine phases of optimization using industry-standard techniques from V8, LuaJIT, and CPython.
### NaN Boxing
64-bit value representation using IEEE 754 NaN space. Variants of this technique are used by LuaJIT and by the JavaScriptCore and SpiderMonkey JavaScript engines. All types packed into 8 bytes instead of 16. Type checking via bitwise operations on the tag bits.
Location: `runtime/nanbox.h`
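The packing described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `runtime/nanbox.h`: the tag values, the 32-bit payload, and every name (`Value`, `box_int`, `is_double`, ...) are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: a double is stored as its own bit pattern; every
   other type is a quiet NaN with a 3-bit tag (bits 48..50) and a payload
   in the low 32 bits. Tag 0 is left for genuine NaN results. */
typedef uint64_t Value;

#define QNAN     UINT64_C(0x7ff8000000000000)
#define TAG_MASK (QNAN | (UINT64_C(7) << 48))
#define BOX_INT  (QNAN | (UINT64_C(1) << 48))
#define BOX_BOOL (QNAN | (UINT64_C(2) << 48))

static inline Value box_double(double d) {
    Value v;
    memcpy(&v, &d, sizeof v);            /* bit copy, no aliasing issues */
    return v;
}

static inline double unbox_double(Value v) {
    double d;
    memcpy(&d, &v, sizeof d);
    return d;
}

static inline Value box_int(int32_t i) {
    return BOX_INT | (uint32_t)i;        /* payload in the low 32 bits */
}

static inline int32_t unbox_int(Value v) {
    uint32_t bits = (uint32_t)v;
    int32_t i;
    memcpy(&i, &bits, sizeof i);
    return i;
}

static inline Value box_bool(bool b) { return BOX_BOOL | (b ? 1u : 0u); }
static inline bool unbox_bool(Value v) { return (v & 1u) != 0; }

/* Type checks are a mask and a compare. */
static inline bool is_int(Value v)  { return (v & TAG_MASK) == BOX_INT; }
static inline bool is_bool(Value v) { return (v & TAG_MASK) == BOX_BOOL; }

/* Anything that does not carry one of the tags above is a plain double.
   This assumes arithmetic only produces canonical NaNs (tag bits zero). */
static inline bool is_double(Value v) { return !is_int(v) && !is_bool(v); }
```

With this layout `sizeof(Value)` is 8, and `unbox_int(box_int(-7))` round-trips to `-7`. Object references would need a pointer tag and a wider payload, which the sketch leaves out.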
### Fast Frames
Pre-allocated frame pool with LIFO stack discipline. Standard technique from Lua and LuaJIT. No heap allocation per function call. Constant-time allocation. Cache-friendly contiguous memory.
Location: `runtime/fastframe.h`, `runtime/fastframe.c`
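A frame pool of this kind might look like the sketch below. The struct fields, the pool size, and the function names are hypothetical and chosen only to show the LIFO discipline; the real definitions live in the files above.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_POOL_SIZE  256   /* hypothetical maximum call depth */
#define FRAME_MAX_LOCALS 16    /* hypothetical local slots per frame */

typedef struct {
    uint64_t locals[FRAME_MAX_LOCALS];
    size_t   return_pc;
} Frame;

typedef struct {
    Frame  frames[FRAME_POOL_SIZE];  /* one contiguous block, set up once */
    size_t top;                      /* number of frames currently in use */
} FramePool;

/* Push is a bounds check and an increment: O(1), no malloc.
   Returns NULL when the call stack would overflow. */
static inline Frame *frame_push(FramePool *pool, size_t return_pc) {
    if (pool->top == FRAME_POOL_SIZE) return NULL;
    Frame *f = &pool->frames[pool->top++];
    f->return_pc = return_pc;
    return f;
}

/* Pop releases the newest frame and returns the caller's frame,
   or NULL when no frame remains. */
static inline Frame *frame_pop(FramePool *pool) {
    if (pool->top == 0) return NULL;
    pool->top--;
    return pool->top ? &pool->frames[pool->top - 1] : NULL;
}
```

Because consecutive calls land in adjacent array slots, a callee's frame sits right after its caller's in memory, which is where the cache-friendliness comes from.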
### Label Table
O(1) jump resolution via a pre-computed label-to-PC mapping. Resolving jump targets ahead of execution is standard practice in bytecode interpreters. Replaces an O(n) linear search on every jump.
Location: `runtime/labeltable.h`, `runtime/labeltable.c`
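As an illustration of the idea, the sketch below builds the mapping in one pass and then resolves a jump with a single array read. The instruction format, the `MAX_LABELS` limit, and all names are assumptions for the example rather than Rava's actual IR.

```c
#include <stddef.h>

typedef enum { OP_NOP, OP_LABEL, OP_JUMP } Opcode;  /* hypothetical opcodes */

typedef struct {
    Opcode op;
    int    label;   /* label id, meaningful for OP_LABEL and OP_JUMP */
} Instr;

#define MAX_LABELS 64   /* hypothetical upper bound on label ids */

typedef struct {
    size_t pc_of[MAX_LABELS];   /* label id -> instruction index */
} LabelTable;

/* One linear pass at load time records where each label lives. */
static inline void label_table_build(LabelTable *t, const Instr *code, size_t n) {
    for (size_t pc = 0; pc < n; pc++) {
        if (code[pc].op == OP_LABEL &&
            code[pc].label >= 0 && code[pc].label < MAX_LABELS) {
            t->pc_of[code[pc].label] = pc;
        }
    }
}

/* At run time, resolving a jump target is a single indexed load. */
static inline size_t label_table_lookup(const LabelTable *t, int label) {
    return t->pc_of[label];
}
```

The build cost is paid once per method, while the lookup runs on every taken jump, so loops benefit the most.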
### Method Cache
Hash-based method lookup cache. Inspired by the inline caching used in V8 and the HotSpot JVM. O(1) instead of O(n*m) nested search. Cache hit rate typically above 90%.
Location: `runtime/methodcache.h`, `runtime/methodcache.c`
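One way such a cache could work is a small direct-mapped table keyed by a hash of the class and method names. The sketch below assumes string keys, FNV-1a hashing, and 256 slots; none of these choices are taken from the Rava sources, and the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS 256   /* hypothetical size, must be a power of two */

typedef struct {
    const char *class_name;
    const char *method_name;
    void       *method;      /* resolved method, opaque in this sketch */
} CacheEntry;

typedef struct {
    CacheEntry slots[CACHE_SLOTS];
} MethodCache;

/* 32-bit FNV-1a over "class.method". */
static inline uint32_t cache_hash(const char *cls, const char *name) {
    uint32_t h = 2166136261u;
    for (const char *p = cls; *p; p++)  { h ^= (unsigned char)*p; h *= 16777619u; }
    h ^= (unsigned char)'.';
    h *= 16777619u;
    for (const char *p = name; *p; p++) { h ^= (unsigned char)*p; h *= 16777619u; }
    return h;
}

/* Hit: one hash, one slot probe, two string compares.
   Miss: returns NULL and the caller falls back to the slow search. */
static inline void *cache_lookup(const MethodCache *c, const char *cls, const char *name) {
    const CacheEntry *e = &c->slots[cache_hash(cls, name) & (CACHE_SLOTS - 1)];
    if (e->method && strcmp(e->class_name, cls) == 0 &&
        strcmp(e->method_name, name) == 0) {
        return e->method;
    }
    return NULL;
}

/* After a slow lookup, remember the result; a colliding entry is evicted. */
static inline void cache_insert(MethodCache *c, const char *cls, const char *name, void *method) {
    CacheEntry *e = &c->slots[cache_hash(cls, name) & (CACHE_SLOTS - 1)];
    e->class_name = cls;
    e->method_name = name;
    e->method = method;
}
```

A direct-mapped table keeps the hit path short at the cost of occasional evictions when two call targets hash to the same slot.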
### Superinstructions
Bytecode fusion combining common opcode sequences into single opcodes. A long-established interpreter technique, studied in depth by Ertl and Gregg and used in CPython 3.11+. Reduces instruction dispatch overhead.
Fused opcodes:
- INC_LOCAL: load + const 1 + add + store
- DEC_LOCAL: load + const 1 + sub + store
- ADD_LOCAL_TO_LOCAL: fused accumulator pattern
- LOAD_LOCAL_CONST_LT_JUMPFALSE: fused loop condition
- LOAD_TWO_LOCALS: combined local loads
Location: `runtime/superinst.h`, `runtime/superinst.c`
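The `INC_LOCAL` and `DEC_LOCAL` fusions listed above could be produced by a peephole pass along these lines. The instruction struct and opcode names are simplified stand-ins, and the pass ignores jump targets that point into a fused sequence, which a real implementation would have to handle.

```c
#include <stddef.h>

typedef enum {
    OP_LOAD_LOCAL, OP_CONST, OP_ADD, OP_SUB, OP_STORE_LOCAL,
    OP_INC_LOCAL, OP_DEC_LOCAL   /* fused forms */
} Opcode;

typedef struct {
    Opcode op;
    int    arg;   /* local slot or constant, depending on the opcode */
} Instr;

/* Rewrites `code` in place and returns the new instruction count.
   Pattern: LOAD_LOCAL x, CONST 1, ADD|SUB, STORE_LOCAL x
   becomes: INC_LOCAL x or DEC_LOCAL x. */
static inline size_t fuse_inc_dec(Instr *code, size_t n) {
    size_t out = 0;
    size_t i = 0;
    while (i < n) {
        if (i + 3 < n &&
            code[i].op == OP_LOAD_LOCAL &&
            code[i + 1].op == OP_CONST && code[i + 1].arg == 1 &&
            (code[i + 2].op == OP_ADD || code[i + 2].op == OP_SUB) &&
            code[i + 3].op == OP_STORE_LOCAL &&
            code[i + 3].arg == code[i].arg) {
            Instr fused;
            fused.op  = (code[i + 2].op == OP_ADD) ? OP_INC_LOCAL : OP_DEC_LOCAL;
            fused.arg = code[i].arg;
            code[out++] = fused;   /* four dispatches collapse into one */
            i += 4;
        } else {
            code[out++] = code[i++];
        }
    }
    return out;
}
```

A loop counter update such as `i = i + 1` then costs one dispatch instead of four, which matters most in tight loops like the Sum(10M) benchmark.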
### Computed Goto
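GCC extension (labels as values, also supported by Clang) for faster opcode dispatch. Uses a jump table of handler addresses instead of a switch statement. Each handler ends with its own indirect jump, which removes the switch bounds check and gives the branch predictor a separate branch per opcode.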
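A minimal dispatch loop in this style is sketched below. The three opcodes and the `run_program` function are made up for the example and do not reflect Rava's instruction set; the code relies on the GCC/Clang `&&label` extension and will not compile with a strictly conforming C99 compiler.

```c
#include <stdint.h>

enum { OP_PUSH, OP_ADD, OP_HALT };   /* hypothetical opcodes */

/* Runs a tiny stack program and returns the value left on top of the stack.
   The program is assumed to be well formed (no stack under/overflow). */
int64_t run_program(const int64_t *code) {
    /* One handler address per opcode, indexed by opcode number. */
    static void *dispatch[] = { &&do_push, &&do_add, &&do_halt };

    int64_t stack[16];
    int sp = 0;
    const int64_t *pc = code;

/* Fetch the next opcode and jump straight to its handler. */
#define NEXT() goto *dispatch[*pc++]

    NEXT();

do_push:
    stack[sp++] = *pc++;          /* operand follows the opcode */
    NEXT();

do_add:
    sp--;
    stack[sp - 1] += stack[sp];
    NEXT();

do_halt:
    return sp > 0 ? stack[sp - 1] : 0;

#undef NEXT
}
```

For the program `{ OP_PUSH, 2, OP_PUSH, 40, OP_ADD, OP_HALT }` this returns 42. There is no central loop: every handler jumps directly to the next one.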
### Profile-Guided Optimization
PGO build using GCC profile instrumentation. Collects runtime data from benchmark runs. Rebuilds with optimization hints for hot paths.
```bash
make pgo
```
## References
### Source Repositories
- V8 JavaScript Engine: https://github.com/v8/v8
- LuaJIT: https://github.com/LuaJIT/LuaJIT
- CPython: https://github.com/python/cpython
- PyPy: https://github.com/pypy/pypy
- OpenJDK Hotspot: https://github.com/openjdk/jdk
- SpiderMonkey: https://github.com/mozilla/gecko-dev
### Documentation
- Lua Manual: https://www.lua.org/manual/5.4/
- GCC Optimization: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
- LLVM Documentation: https://llvm.org/docs/
- JVM Specification: https://docs.oracle.com/javase/specs/
### Standards
- IEEE 754 Floating Point: https://ieeexplore.ieee.org/document/8766229
### Performance Resources
- Agner Fog CPU Optimization: https://www.agner.org/optimize/
- Systems Performance by Brendan Gregg: http://www.brendangregg.com/