Rava

A Java interpreter written in C99. Compiles and executes Java source code. Beats Python on all benchmarks.

Author: retoor retoor@molodetz.nl

Introduction

Rava is a complete Java interpreter implemented in C. It provides a full compilation pipeline from source to execution.

The pipeline:

Lexer tokenizes Java source code
Parser builds an abstract syntax tree
Semantic analyzer performs type checking
IR generator produces stack-based bytecode
Runtime VM executes the bytecode

Supported features:

Primitives: int, long, double, boolean, char
Arrays, strings, and array initializers
Objects, instance methods, and instanceof
Inheritance and interfaces
Control flow: if/else, while, do-while, for, enhanced for-each, switch/case, break, continue
Operators: arithmetic, bitwise (AND, OR, XOR, shifts), ternary (? :)
Exception handling: try/catch/finally, throw
Math functions and String methods
File I/O
Recursion
System.out.println

Compiles with -Wall -Wextra -Werror. Zero warnings. No memory leaks.

Installation

make

Usage

Run a Java file:

./rava file.java
./rava file.java ClassName
./rava file.java ClassName method

Start interactive REPL:

./rava

Example source code:

public class Fibonacci {
    public static int fib(int n) {
        if (n <= 1) {
            return n;
        }
        return fib(n - 1) + fib(n - 2);
    }

    public static int main() {
        System.out.println(fib(30));
        return 0;
    }
}

Run the benchmark:

make benchmark

Run all tests:

make test

Interactive REPL

Rava includes a full-featured interactive interpreter.

$ ./rava
Rava 1.0 Interactive Interpreter
Type "%help" for commands, "%quit" to exit

>>> int x = 10;
>>> int y = 20;
>>> x + y
30
>>> int fib(int n) { if (n <= 1) return n; return fib(n-1) + fib(n-2); }
Method 'fib' defined.
>>> fib(10)
55
>>> class Point { public int x; public int y; public Point(int px, int py) { this.x = px; this.y = py; } }
Class 'Point' defined.
>>> Point p = new Point(3, 4);
>>> p.x
3
>>> %whos
Variable             Type            Value
--------             ----            -----
x                    int             10
y                    int             20
p                    Point           null
>>> %quit

REPL Features

Variable declarations with persistence across executions
Expression evaluation with automatic output
User-defined methods callable after definition
User-defined classes instantiable after definition
Array declarations
Multi-line input with brace/bracket/paren tracking
String methods and Math functions
Control flow statements (for, while, if/else, switch)

Magic Commands

Command	Description
%help	Show help for commands
%whos	List all variables with types
%who	List all variable names
%methods	List session methods
%classes	List session classes
%reset	Clear all session state
%clear	Clear screen
%debug	Toggle debug mode
%history	Show input history
%quit	Exit REPL

REPL Tests

make test_repl

Performance

Rava beats Python on 18 out of 21 benchmarks (85.7% win rate).

Comprehensive Benchmark Results

Benchmark	Rava	Python	Winner	Speedup
Fibonacci(30) recursive	219ms	587ms	Rava	2.68x
Fibonacci(40) iterative	0ms	0ms	Tie	1.00x
Primes (100K)	337ms	685ms	Rava	2.03x
Sum (10M)	690ms	1317ms	Rava	1.91x
Array sum (1M)	161ms	331ms	Rava	2.06x
Array reverse (1M)	258ms	278ms	Rava	1.08x
Nested loops (1000x1000)	86ms	119ms	Rava	1.38x
Factorial(15) recursive	0ms	0ms	Tie	1.00x
Ackermann(3,6)	16ms	84ms	Rava	5.25x
Method calls (10M)	1439ms	2032ms	Rava	1.41x
Arithmetic ops (10M)	1632ms	4259ms	Rava	2.61x
Conditional branches (10M)	1384ms	2035ms	Rava	1.47x
Complex conditions (10M)	2797ms	3369ms	Rava	1.20x
Matrix multiply (50x50)	32ms	31ms	Python	0.97x
Bubble sort (5000)	3405ms	4391ms	Rava	1.29x
Quick sort (1000)	108ms	110ms	Rava	1.02x
Insertion sort (3000)	791ms	1029ms	Rava	1.30x
Binary search (100K)	13ms	16ms	Rava	1.23x
String concat (50K)	19ms	29ms	Rava	1.53x
Bitwise ops (10M)	2735ms	7392ms	Rava	2.70x
Array copy (1M)	328ms	240ms	Python	0.73x

Notable Victories

Ackermann function: 5.25x faster - Deep recursion handling
Bitwise operations: 2.70x faster - Efficient bit manipulation
Fibonacci recursive: 2.68x faster - Optimized function calls
Arithmetic operations: 2.61x faster - Fast numeric computation
Array sum: 2.06x faster - Optimized array access
Primes: 2.03x faster - Efficient loop execution

Started at 1402ms for Fibonacci(30). After optimization: 219ms. 6.4x improvement.

Structure

rava/
├── lexer/
│   ├── lexer.h
│   ├── lexer_tokenizer.c
│   ├── lexer_keywords.c
│   └── lexer_literals.c
├── parser/
│   ├── parser.h
│   ├── parser.c
│   ├── parser_expressions.c
│   ├── parser_statements.c
│   ├── parser_declarations.c
│   └── parser_printer.c
├── types/
│   ├── types.h
│   └── types.c
├── semantic/
│   ├── semantic.h
│   ├── semantic.c
│   ├── symbol_table.h
│   └── symbol_table.c
├── ir/
│   ├── ir.h
│   ├── ir.c
│   ├── ir_gen.h
│   └── ir_gen.c
├── runtime/
│   ├── runtime.h
│   ├── runtime.c
│   ├── nanbox.h
│   ├── fastframe.h/c
│   ├── labeltable.h/c
│   ├── methodcache.h/c
│   ├── superinst.h/c
│   └── gc/
├── repl/
│   ├── repl.h/c
│   ├── repl_session.h/c
│   ├── repl_input.h/c
│   ├── repl_executor.h/c
│   ├── repl_output.h/c
│   ├── repl_commands.h/c
│   ├── repl_history.h/c
│   ├── repl_types.h
│   ├── tests/
│   └── examples/
├── tests/
│   └── test_*.c
├── examples/
│   └── *.java
└── Makefile

Optimization

Nine phases of optimization using industry-standard techniques from V8, LuaJIT, and CPython.

NaN Boxing

64-bit value representation using IEEE 754 NaN space. Invented by Andreas Gal for SpiderMonkey. All types packed into 8 bytes instead of 16. Branchless type checking via bitwise operations.

Location: runtime/nanbox.h

Fast Frames

Pre-allocated frame pool with LIFO stack discipline. Standard technique from Lua and LuaJIT. No heap allocation per function call. Constant-time allocation. Cache-friendly contiguous memory.

Location: runtime/fastframe.h, runtime/fastframe.c

Label Table

O(1) jump resolution via pre-computed label to PC mapping. Used in all bytecode interpreters including CPython and LuaJIT. Replaces O(n) linear search.

Location: runtime/labeltable.h, runtime/labeltable.c

Method Cache

Hash-based method lookup cache. Based on inline cache technique from V8 and Hotspot JVM. O(1) instead of O(n*m) nested search. Cache hit rate typically above 90%.

Location: runtime/methodcache.h, runtime/methodcache.c

Superinstructions

Bytecode fusion combining common opcode sequences. Developed by Ertl and Krall, used in LuaJIT and CPython 3.11+. Reduces instruction dispatch overhead.

Fused opcodes:

INC_LOCAL: load + const 1 + add + store
DEC_LOCAL: load + const 1 + sub + store
ADD_LOCAL_TO_LOCAL: fused accumulator pattern
LOAD_LOCAL_CONST_LT_JUMPFALSE: fused loop condition
LOAD_TWO_LOCALS: combined local loads

Location: runtime/superinst.h, runtime/superinst.c

Computed Goto

GCC extension for faster opcode dispatch. Uses jump table instead of switch statement. Eliminates branch prediction overhead.

Profile-Guided Optimization

PGO build using GCC profile instrumentation. Collects runtime data from benchmark runs. Rebuilds with optimization hints for hot paths.

make pgo

References

Performance Resources

Agner Fog CPU Optimization: https://www.agner.org/optimize/
Systems Performance by Brendan Gregg: http://www.brendangregg.com/