jayesh1126/Java-Compiler-in-Go

Java → JVM Bytecode Compiler in Go

A compiler written in Go that translates a subset of Java source code directly into JVM bytecode (.class files). The generated class files target Java 8, so any Java 8 or later JVM can run them.

Project Overview

This project implements a complete three-stage compiler pipeline:

  1. Lexer - Tokenizes Java source code
  2. Parser - Builds an Abstract Syntax Tree (AST)
  3. Code Generator - Emits valid JVM bytecode

It generates .class files that conform to the Java Class File Format specification and can be executed with the standard java command.

Architecture

Phase 1: Lexical Analysis (LEXER)

File: src/lexer/lexer.go

Converts raw Java source code into a token stream.

Supported tokens:

  • Keywords: class, public, static, void, int, return
  • Operators: +, -, *, /, =
  • Delimiters: {, }, (, ), ;, ,, ., [, ]
  • Literals and names: integer literals, string literals, identifiers
  • Special: Comments (skipped), Whitespace (skipped)
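
As a rough illustration of the token categories above, here is a minimal lexer sketch. The names Token, TokenKind, and Lex are hypothetical and are not taken from src/lexer; the sketch handles only // line comments and does not process string escapes.

```go
package main

import (
	"fmt"
	"unicode"
)

// TokenKind and Token are hypothetical names for this sketch; the real
// definitions live in src/lexer/token.go and may differ.
type TokenKind string

const (
	Keyword TokenKind = "KEYWORD"
	Ident   TokenKind = "IDENT"
	Int     TokenKind = "INT"
	Str     TokenKind = "STRING"
	Symbol  TokenKind = "SYMBOL" // operators and delimiters
)

type Token struct {
	Kind TokenKind
	Text string
}

var keywords = map[string]bool{
	"class": true, "public": true, "static": true,
	"void": true, "int": true, "return": true,
}

// Lex splits src into tokens, skipping whitespace and // line comments.
func Lex(src string) []Token {
	var toks []Token
	rs := []rune(src)
	for i := 0; i < len(rs); {
		c := rs[i]
		switch {
		case unicode.IsSpace(c):
			i++
		case c == '/' && i+1 < len(rs) && rs[i+1] == '/':
			for i < len(rs) && rs[i] != '\n' {
				i++
			}
		case unicode.IsDigit(c):
			start := i
			for i < len(rs) && unicode.IsDigit(rs[i]) {
				i++
			}
			toks = append(toks, Token{Int, string(rs[start:i])})
		case unicode.IsLetter(c) || c == '_':
			start := i
			for i < len(rs) && (unicode.IsLetter(rs[i]) || unicode.IsDigit(rs[i]) || rs[i] == '_') {
				i++
			}
			word := string(rs[start:i])
			kind := Ident
			if keywords[word] {
				kind = Keyword
			}
			toks = append(toks, Token{kind, word})
		case c == '"':
			i++ // skip opening quote
			start := i
			for i < len(rs) && rs[i] != '"' {
				i++
			}
			toks = append(toks, Token{Str, string(rs[start:i])})
			if i < len(rs) {
				i++ // skip closing quote
			}
		default:
			// Every other character is treated as a one-rune operator or delimiter.
			toks = append(toks, Token{Symbol, string(c)})
			i++
		}
	}
	return toks
}

func main() {
	for _, t := range Lex("int x = 5 + 3; // sum") {
		fmt.Printf("%s %q\n", t.Kind, t.Text)
	}
}
```

Classifying a word as a keyword only after scanning the whole identifier keeps names such as integer from being split into int plus a suffix.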

Phase 2: Syntax Analysis (PARSER)

File: src/parser/parser.go

Builds an Abstract Syntax Tree from the token stream using recursive descent parsing.

Parses:

  • Class declarations
  • Method declarations (with parameters)
  • Variable declarations (int x = value;)
  • Expressions with operator precedence (Binary expressions: +, -, *, /)
  • Method calls (System.out.println(...))
  • Return statements
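
The sketch below shows one common way a recursive descent parser can encode operator precedence: one function per precedence level, with * and / parsed one level below + and -. Node, Parser, and the method names are hypothetical and do not come from src/parser; error handling is omitted.

```go
package main

import "fmt"

// Node is a hypothetical AST node for this sketch: either a leaf
// (Op == "") holding a literal or identifier, or a binary expression.
type Node struct {
	Op          string
	Value       string
	Left, Right *Node
}

// Parser walks an already tokenized expression.
type Parser struct {
	toks []string
	pos  int
}

// peek returns the current token, or "" at end of input.
func (p *Parser) peek() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return ""
}

// parseExpr handles the lowest-precedence operators: + and -.
func (p *Parser) parseExpr() *Node {
	left := p.parseTerm()
	for p.peek() == "+" || p.peek() == "-" {
		op := p.peek()
		p.pos++
		right := p.parseTerm()
		left = &Node{Op: op, Left: left, Right: right}
	}
	return left
}

// parseTerm handles * and /, which bind tighter than + and -.
func (p *Parser) parseTerm() *Node {
	left := p.parsePrimary()
	for p.peek() == "*" || p.peek() == "/" {
		op := p.peek()
		p.pos++
		right := p.parsePrimary()
		left = &Node{Op: op, Left: left, Right: right}
	}
	return left
}

// parsePrimary consumes one literal or identifier.
func (p *Parser) parsePrimary() *Node {
	n := &Node{Value: p.peek()}
	p.pos++
	return n
}

// String renders the tree fully parenthesized so the grouping is visible.
func (n *Node) String() string {
	if n.Op == "" {
		return n.Value
	}
	return "(" + n.Left.String() + " " + n.Op + " " + n.Right.String() + ")"
}

func main() {
	p := &Parser{toks: []string{"x", "+", "y", "*", "2", "-", "1"}}
	fmt.Println(p.parseExpr()) // prints ((x + (y * 2)) - 1)
}
```

The loops make both levels left-associative, which matches Java's rules for these four operators.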

Phase 3: Code Generation (CODEGEN)

File: src/compiler/codegen.go

Generates valid JVM bytecode with:

  • Constant Pool - Manages strings, integers, class references, method references
  • Method Code Generator - Emits bytecode instructions (IADD, ILOAD, ISTORE, etc.)
  • Class File Writer - Produces valid .class file format with proper headers

JVM bytecode instructions supported:

  • ICONST_n - Load small integer constants
  • BIPUSH - Push byte as integer
  • LDC - Load from constant pool
  • ILOAD/ISTORE - Load/store local integer variables
  • IADD/ISUB/IMUL/IDIV - Arithmetic operations
  • INVOKEVIRTUAL - Call instance methods
  • IRETURN/RETURN - Return from method
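
A code generator typically picks the shortest of the constant-loading instructions listed above for each integer literal. The following is a minimal sketch of that selection, assuming a hypothetical pushInt helper and a poolIndex callback that returns the literal's constant pool index; neither name comes from src/compiler. The opcode values are those defined by the JVM specification.

```go
package main

import "fmt"

// Opcode values from the JVM specification.
const (
	opIconst0 = 0x03 // ICONST_0; ICONST_M1 is 0x02 and ICONST_1..ICONST_5 follow
	opBipush  = 0x10
	opLdc     = 0x12
)

// pushInt returns the instruction bytes that push v onto the operand stack.
// poolIndex is a hypothetical callback that interns v in the constant pool
// and returns its index; it is only called for values BIPUSH cannot encode.
func pushInt(v int32, poolIndex func(int32) uint8) []byte {
	switch {
	case v >= -1 && v <= 5:
		// ICONST_M1 through ICONST_5 are consecutive one-byte opcodes.
		return []byte{byte(opIconst0 + v)}
	case v >= -128 && v <= 127:
		// BIPUSH takes one signed byte operand.
		return []byte{opBipush, byte(int8(v))}
	default:
		// LDC takes a one-byte constant pool index.
		return []byte{opLdc, poolIndex(v)}
	}
}

func main() {
	// Assumption for the demo: every large literal lands at pool index 2.
	idx := func(int32) uint8 { return 2 }
	fmt.Printf("% x\n", pushInt(3, idx))    // prints 06
	fmt.Printf("% x\n", pushInt(100, idx))  // prints 10 64
	fmt.Printf("% x\n", pushInt(1000, idx)) // prints 12 02
}
```

LDC can only address pool indices up to 255. A fuller implementation would fall back to LDC_W for larger indices and could use SIPUSH for 16-bit values; neither is in the supported instruction list above.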

Target Version: Java 8 (class file major version 52), chosen for broad compatibility. Java 8 and all later JVMs can load version 52 class files.

Supported Java Features

Working:

  • Class declarations
  • Public static main methods
  • Integer variables and arithmetic expressions
  • Variable assignment
  • Method calls (System.out.println)
  • Comments and whitespace handling
  • Method parameters with types
  • Return statements

Not yet supported:

  • Instance variables/methods
  • Inheritance
  • If/else statements
  • Loops (for, while)
  • Object instantiation (new)
  • Arrays
  • Exception handling
  • Generics
  • Multiple classes per file

How to Build & Run

Prerequisites

  • Go 1.25 or later
  • Java JDK (to run compiled bytecode)

Build the Compiler

go build -o compiler.exe ./src

Compile Java Code

# Compile a Java file
./compiler.exe src/Example.java

# This generates: Example.class

Run the Bytecode

java Example

Example

Input Java file (Test.java):

class Test {
    public static void main(String[] args) {
        int x = 5;
        int y = 3;
        int z = x + y;
        System.out.println(z);
    }
}

Compile and run:

./compiler.exe Test.java
java Test                    # Output: 8
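
To show how the statement int z = x + y; from Test.java might become stack instructions, the sketch below walks an expression tree in post-order and emits mnemonics. The Expr types, the emit function, and the slot numbers (args in slot 0, then x, y, z) are assumptions made for illustration; the actual compiler emits binary opcodes and may assign slots differently.

```go
package main

import "fmt"

// Hypothetical AST node types for this sketch.
type Expr interface{}

type IntLit struct{ Value int }

type Var struct{ Name string }

type Binary struct {
	Op          byte
	Left, Right Expr
}

var arithmetic = map[byte]string{
	'+': "iadd", '-': "isub", '*': "imul", '/': "idiv",
}

// emit appends instructions for e to out in post-order: both operands are
// pushed before the operator that consumes them.
func emit(e Expr, slots map[string]int, out []string) []string {
	switch n := e.(type) {
	case IntLit:
		// Simplification: always uses bipush, valid only for -128..127.
		return append(out, fmt.Sprintf("bipush %d", n.Value))
	case Var:
		return append(out, fmt.Sprintf("iload %d", slots[n.Name]))
	case Binary:
		out = emit(n.Left, slots, out)
		out = emit(n.Right, slots, out)
		return append(out, arithmetic[n.Op])
	}
	return out
}

func main() {
	// Assumed slot layout for main(String[] args): args=0, x=1, y=2, z=3.
	slots := map[string]int{"x": 1, "y": 2, "z": 3}
	sum := Binary{Op: '+', Left: Var{Name: "x"}, Right: Var{Name: "y"}}

	code := emit(sum, slots, nil)
	code = append(code, fmt.Sprintf("istore %d", slots["z"]))
	for _, ins := range code {
		fmt.Println(ins)
	}
}
```

Under these assumptions the program prints iload 1, iload 2, iadd, and istore 3, one per line.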

File Structure

.
├── README.md
└── src/
    ├── main.go              # Entry point
    ├── Example.java         # Test input
    ├── test_simple.java     # Minimal test
    ├── ast/
    │   └── ast.go           # AST node definitions
    ├── lexer/
    │   ├── lexer.go         # Lexical analyzer
    │   └── token.go         # Token types & keywords
    ├── parser/
    │   └── parser.go        # Syntax analyzer (recursive descent)
    └── compiler/
        └── codegen.go       # Bytecode generator

Technical Details

Constant Pool

The JVM class file format requires a constant pool: a table of constants (strings, integers, method references, etc.) that the bytecode references by index. This compiler:

  • Automatically deduplicates constants
  • Builds Utf8, Integer, Methodref, and NameAndType entries
  • Handles cross-references between pool entries
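
One way to get deduplication and cross-references together is to key every entry by its kind and contents and to resolve referenced entries before adding the entry that points at them. The sketch below is a simplified model, not the code in src/compiler: ConstantPool and its methods are hypothetical, and entries are stored as descriptive strings rather than encoded bytes.

```go
package main

import "fmt"

// ConstantPool is a hypothetical, simplified pool for this sketch.
type ConstantPool struct {
	entries []string          // one descriptive key per entry, in insertion order
	index   map[string]uint16 // key -> 1-based pool index
}

func NewConstantPool() *ConstantPool {
	return &ConstantPool{index: map[string]uint16{}}
}

// add returns the existing index for key, or appends a new entry.
func (cp *ConstantPool) add(key string) uint16 {
	if i, ok := cp.index[key]; ok {
		return i
	}
	cp.entries = append(cp.entries, key)
	i := uint16(len(cp.entries)) // JVM constant pool indices start at 1
	cp.index[key] = i
	return i
}

func (cp *ConstantPool) Utf8(s string) uint16 { return cp.add("Utf8:" + s) }

// Class refers to a Utf8 entry holding the class's internal name.
func (cp *ConstantPool) Class(name string) uint16 {
	return cp.add(fmt.Sprintf("Class:%d", cp.Utf8(name)))
}

// NameAndType refers to two Utf8 entries: the name and the descriptor.
func (cp *ConstantPool) NameAndType(name, desc string) uint16 {
	return cp.add(fmt.Sprintf("NameAndType:%d:%d", cp.Utf8(name), cp.Utf8(desc)))
}

// Methodref refers to a Class entry and a NameAndType entry.
func (cp *ConstantPool) Methodref(class, name, desc string) uint16 {
	return cp.add(fmt.Sprintf("Methodref:%d:%d",
		cp.Class(class), cp.NameAndType(name, desc)))
}

func main() {
	cp := NewConstantPool()
	first := cp.Methodref("java/io/PrintStream", "println", "(I)V")
	second := cp.Methodref("java/io/PrintStream", "println", "(I)V")
	fmt.Println(first, second, len(cp.entries)) // prints 6 6 6
	for i, e := range cp.entries {
		fmt.Printf("#%d %s\n", i+1, e)
	}
}
```

Requesting the same method reference twice returns the same index and adds no entries, because each of the six underlying entries is found in the index map on the second call.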

Local Variable Management

  • Variables are stored in a symbol table mapping names to local variable indices
  • The code generator tracks the number of local variable slots in use (the Code attribute's max_locals value) and emits ILOAD/ISTORE with each variable's index
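
A symbol table of this kind can be very small. The sketch below is an illustration only: Locals and its methods are hypothetical names, and it assumes that parameters take the first slots, so in a static main(String[] args) the args array occupies slot 0.

```go
package main

import "fmt"

// Locals is a hypothetical symbol table mapping variable names to
// local variable slots.
type Locals struct {
	slots map[string]int
	next  int // next free slot; also the number of slots used so far
}

// NewLocals assigns the first slots to the method's parameters, in order.
func NewLocals(params ...string) *Locals {
	l := &Locals{slots: map[string]int{}}
	for _, p := range params {
		l.slots[p] = l.next
		l.next++
	}
	return l
}

// Declare assigns the next free slot to name, rejecting redeclarations.
func (l *Locals) Declare(name string) (int, error) {
	if _, exists := l.slots[name]; exists {
		return 0, fmt.Errorf("variable %q already declared", name)
	}
	slot := l.next
	l.slots[name] = slot
	l.next++
	return slot, nil
}

// Lookup returns the slot for name and whether it was declared.
func (l *Locals) Lookup(name string) (int, bool) {
	slot, ok := l.slots[name]
	return slot, ok
}

// MaxLocals is the value to write into the Code attribute's max_locals.
func (l *Locals) MaxLocals() int { return l.next }

func main() {
	locals := NewLocals("args") // static main: args lives in slot 0
	for _, name := range []string{"x", "y", "z"} {
		slot, err := locals.Declare(name)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("istore %d  // %s\n", slot, name)
	}
	fmt.Println("max_locals =", locals.MaxLocals()) // prints max_locals = 4
}
```

Each int takes one slot. Types such as long and double take two, so a compiler that adds them would need to advance the counter by two for those declarations.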

Bytecode Format

The output .class file follows the Java Class File Format specification:

  • Magic number: 0xCAFEBABE
  • Version: Java 8 (major version 52)
  • Constant pool with deduplication
  • Class metadata (flags, this class, super class)
  • Method table with Code attributes
  • Bytecode instructions and exception tables
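
All multi-byte values in a class file are stored big-endian. As a small illustration, the sketch below builds the first eight bytes of a class file (magic number, minor version, major version) with Go's encoding/binary package. The classHeader function is a hypothetical helper, not taken from src/compiler.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// classHeader returns the first eight bytes of a class file:
// u4 magic, u2 minor_version, u2 major_version, all big-endian.
func classHeader(minor, major uint16) []byte {
	var b []byte
	b = binary.BigEndian.AppendUint32(b, 0xCAFEBABE)
	b = binary.BigEndian.AppendUint16(b, minor)
	b = binary.BigEndian.AppendUint16(b, major)
	return b
}

func main() {
	// Major version 52 corresponds to Java 8.
	fmt.Printf("% X\n", classHeader(0, 52)) // prints CA FE BA BE 00 00 00 34
}
```

The constant pool count and entries follow immediately after these eight bytes, then the access flags, this_class, and super_class fields.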

Future Phases

Phase 3.5: Semantic Analysis

  • Type checking across expressions
  • Variable scope validation
  • Method signature verification

Phase 4: Lowering/IR

  • Convert high-level constructs to stack machine operations
  • Flatten expression trees into linear instruction sequences

Phase 5: Advanced Features

  • If/else conditionals (IFEQ, GOTO)
  • Loops (IFEQ, GOTO jump targets)
  • Instance variables and constructors
  • Inheritance (extends, super calls)
  • Arrays and objects

Author

Jay
