codeFafnir/sdx_hackathon_404_not_found

Legacy Architecture Modernization Engine

An AI-powered tool that takes a legacy GitHub repository, analyses it with the Nia API, generates a step-by-step migration plan with an LLM, and then executes that plan: it writes real file changes, commits them with git, and produces a Markdown migration report.

Built as a three-person parallel hackathon project with a shared JSON contract so each part could be developed independently.


Architecture overview

User / CLI
    │
    ▼
src/cli.py  (lme analyze / execute / report)
    │
    ├──▶  Part 1: Core SDK
    │         src/nia_client/      ← HTTP wrapper for the Nia API
    │         src/models/          ← Shared Pydantic contracts
    │
    ├──▶  Part 2: Architect Agent
    │         src/architect/       ← Analyses repo, generates RefactorPlan
    │
    └──▶  Part 3: Worker Orchestrator
              src/worker/          ← Executes plan, writes files, reports

End-to-end data flow

engine_input.json
        │
        ▼
  ArchitectAgent
        │  (Nia search + grep + LLM)
        ▼
  refactor_plan.json       ← the only coupling point between Part 2 and Part 3
        │
        ▼
   Orchestrator
        │  (git clone + file writes + validation)
        ▼
  file changes on disk
        │
        ▼
   Reporter
        │
        ▼
  migration_report.md

refactor_plan.json is a RefactorPlan Pydantic model serialised as JSON. Part 2 writes it; Part 3 reads it. They never import each other.
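
To illustrate the JSON-only coupling, here is a minimal sketch of a producer and a consumer that share nothing but the serialised text. It uses stdlib dataclasses as a stand-in for the project's Pydantic models so that it runs without dependencies; the field subset, the helper names dump_plan / load_plan, and the sample values are assumptions, not the real contract.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RefactorStep:
    id: str
    title: str
    depends_on: list[str] = field(default_factory=list)


@dataclass
class RefactorPlan:
    repo: str
    source_id: str
    steps: list[RefactorStep] = field(default_factory=list)


def dump_plan(plan: RefactorPlan) -> str:
    """Producer side (Part 2): serialise the plan to a JSON string."""
    return json.dumps(asdict(plan), indent=2)


def load_plan(text: str) -> RefactorPlan:
    """Consumer side (Part 3): rebuild the plan from the JSON text alone."""
    raw = json.loads(text)
    steps = [RefactorStep(**step) for step in raw.pop("steps", [])]
    return RefactorPlan(steps=steps, **raw)


plan = RefactorPlan(
    repo="example/legacy-app",  # hypothetical repo name
    source_id="src-123",        # hypothetical source id
    steps=[
        RefactorStep(id="step-001", title="Extract config module"),
        RefactorStep(id="step-002", title="Replace globals", depends_on=["step-001"]),
    ],
)
round_tripped = load_plan(dump_plan(plan))
```

Because the consumer only needs the JSON text, either side can be replaced or tested against a fixture file such as tests/fixtures/sample_plan.json.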


Project structure

sdx_hackathon_404_not_found/
├── pyproject.toml
├── .env.example                   # NIA_API_KEY=  LLM_API_KEY=
├── engine_input.json              # Example input for the CLI
├── run_local_test.py              # Standalone offline test (no API keys needed)
├── smoke_test.py                  # Full end-to-end smoke test (needs real keys)
│
├── src/
│   ├── cli.py                     # Typer CLI: analyze, execute, report
│   │
│   ├── models/                    # Part 1 — shared Pydantic contracts
│   │   ├── plan.py                # RefactorPlan, RefactorStep, FileChange, SymbolReference
│   │   ├── input.py               # EngineInput, EngineConfig, ModernizationTarget
│   │   └── analysis.py            # CodebaseProfile, DependencyNode, AntiPattern
│   │
│   ├── nia_client/                # Part 1 — Nia REST API wrapper
│   │   ├── client.py              # NiaClient + MockNiaClient
│   │   ├── indexer.py             # index_repo(), index_doc_url(), wait_for_index()
│   │   └── searcher.py            # search(), grep(), read_file(), get_tree(), github_search()
│   │
│   ├── architect/                 # Part 2 — analysis and planning
│   │   ├── agent.py               # ArchitectAgent: entry point
│   │   ├── analyzer.py            # build_dependency_graph(), detect_patterns(), get_codebase_profile()
│   │   ├── planner.py             # generate_plan() — LLM call + JSON parse + validation
│   │   └── prompts.py             # SYSTEM_PROMPT, format_user_prompt(), format_retry_prompt()
│   │
│   └── worker/                    # Part 3 — execution and reporting
│       ├── orchestrator.py        # Orchestrator: topological run, failure cascade
│       ├── writer.py              # clone_repo(), apply_step(), commit_step()
│       └── reporter.py            # Reporter: generate() + save() -> migration_report.md
│
└── tests/
    ├── fixtures/
    │   └── sample_plan.json       # Realistic 3-step RefactorPlan fixture
    ├── test_nia_client.py         # NiaClient unit tests (respx mocks)
    ├── test_architect.py          # Analyzer, planner, prompts, ArchitectAgent unit tests
    ├── test_writer.py             # writer.py tests using local bare git repos
    └── test_integration.py        # Live Nia API tests + full Part 3 + e2e pipeline

Part 1 — Core SDK

Models (src/models/)

All three parts import these. They are the frozen contracts.

| Model | Purpose |
|---|---|
| RefactorPlan | Top-level plan: repo, source_id, steps, dependency graph, risk assessment |
| RefactorStep | One atomic unit of work: id, title, depends_on, affected_symbols, changes, validation_queries |
| FileChange | A single file operation: action (create/modify/delete/move), old_content, new_content |
| SymbolReference | A named symbol with its file, line range, and kind (class/function/etc.) |
| EngineInput | CLI input: repo, ref, goal, instructions, API keys, LLM settings |
| CodebaseProfile | Analyzer output fed to the planner: dependency graph, anti-patterns, entry points |

Nia client (src/nia_client/)

NiaClient is a thin httpx wrapper around https://apigcp.trynia.ai/v2.

| Method | Nia endpoint |
|---|---|
| index_repo(repo) | POST /sources |
| wait_for_index(source_id) | GET /sources/{id} (polls until ready) |
| search(repo, query, mode) | POST /search (modes: query, deep, universal) |
| grep(source_id, pattern) | POST /sources/{id}/grep |
| read_file(repo, path, ref) | POST /github/read |
| get_tree(owner, repo, ref) | GET /github/tree/{owner}/{repo} |
| github_search(repo, query) | POST /github/search |
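
The "polls until ready" behaviour of wait_for_index could look roughly like the sketch below. It is not the project's implementation: fetch_status stands in for the GET /sources/{id} call, and the "status" key with its "ready" / "failed" values, the interval, and the timeout are all assumptions.

```python
import time
from typing import Callable


def wait_for_index(
    fetch_status: Callable[[str], dict],
    source_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_status(source_id) until the source reports it is ready.

    The response shape ({"status": ...}) is an assumption for illustration.
    """
    deadline = time.monotonic() + timeout
    while True:
        source = fetch_status(source_id)
        status = source.get("status")
        if status == "ready":
            return source
        if status == "failed":
            raise RuntimeError(f"indexing failed for {source_id}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{source_id} not ready after {timeout}s")
        sleep(interval)


# Usage with a fake status function that becomes ready on the third poll.
_responses = iter(["indexing", "indexing", "ready"])
result = wait_for_index(
    lambda sid: {"id": sid, "status": next(_responses)},
    "src-123",                    # hypothetical source id
    sleep=lambda _seconds: None,  # skip real waiting in the demo
)
```

Injecting fetch_status and sleep keeps the loop testable without network access or real delays.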

MockNiaClient is a drop-in replacement that returns deterministic dummy data. It is used in tests and offline development, so no API key is required.
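
The drop-in idea can be sketched with a structural Protocol: calling code depends on the method names only, so a mock with the same surface can replace the real client. The class below is an illustrative stand-in, not the project's MockNiaClient; the return shapes, the NiaLike protocol, and the count_matches helper are assumptions.

```python
from typing import Protocol


class NiaLike(Protocol):
    """The subset of the client surface this sketch relies on."""

    def index_repo(self, repo: str) -> str: ...

    def grep(self, source_id: str, pattern: str) -> list[dict]: ...


class MockNiaClient:
    """Offline stand-in: same method names, canned deterministic results."""

    def index_repo(self, repo: str) -> str:
        # Derive a stable fake source id from the repo name.
        return f"mock-{repo.replace('/', '-')}"

    def grep(self, source_id: str, pattern: str) -> list[dict]:
        # Always return one canned match, regardless of the pattern.
        return [{"path": "app/main.py", "line": 1, "text": f"match for {pattern}"}]


def count_matches(client: NiaLike, repo: str, pattern: str) -> int:
    """Caller code depends only on the protocol, not on a concrete client."""
    source_id = client.index_repo(repo)
    return len(client.grep(source_id, pattern))


matches = count_matches(MockNiaClient(), "example/legacy-app", r"^import ")
```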


Part 2 — Architect Agent (src/architect/)

Input: EngineInput
Output: refactor_plan.json

ArchitectAgent.analyze(engine_input)
        │
        ├─ NiaClient.index_repo()         → source_id
        │
        ├─ analyzer.get_codebase_profile()
        │       ├─ build_dependency_graph()   grep for cross-file imports
        │       └─ detect_patterns()          anti-patterns, entry points, TODOs
        │
        └─ planner.generate_plan(profile, target, config)
                ├─ prompts.format_user_prompt()     CodebaseProfile → LLM message
                ├─ _call_llm()                      OpenAI / Gemini / Anthropic
                ├─ _parse_and_validate()             JSON → RefactorPlan
                └─ _validate_step_dependencies()    fix broken depends_on refs
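
As a rough idea of what repairing depends_on references might involve, the sketch below drops references to unknown step ids and self-references. How the real _validate_step_dependencies() fixes broken references is not documented here, so this repair strategy and the plain-dict step shape are assumptions.

```python
def validate_step_dependencies(steps: list[dict]) -> list[dict]:
    """Return copies of the steps whose depends_on only names known, other steps."""
    known_ids = {step["id"] for step in steps}
    cleaned = []
    for step in steps:
        deps = [
            dep
            for dep in step.get("depends_on", [])
            if dep in known_ids and dep != step["id"]  # drop unknown ids and self-references
        ]
        cleaned.append({**step, "depends_on": deps})
    return cleaned


steps = [
    {"id": "step-001", "depends_on": []},
    {"id": "step-002", "depends_on": ["step-001", "step-999"]},  # step-999 does not exist
    {"id": "step-003", "depends_on": ["step-003", "step-002"]},  # refers to itself
]
fixed = validate_step_dependencies(steps)
```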

The planner supports OpenAI, Gemini, and Anthropic behind a common dispatch table and retries automatically with a corrective prompt if the LLM returns malformed JSON.
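
A minimal sketch of the dispatch-table-plus-retry pattern, assuming each provider is wrapped as a function from prompt string to reply string. The function name generate_plan_json, the corrective prompt wording, and the retry count are hypothetical; the real planner calls the provider SDKs and validates against RefactorPlan rather than plain json.loads.

```python
import json
from typing import Callable

ProviderFn = Callable[[str], str]


def generate_plan_json(
    provider: str,
    prompt: str,
    providers: dict[str, ProviderFn],
    max_retries: int = 2,
) -> dict:
    """Call the selected provider and retry with a corrective prompt on bad JSON."""
    call = providers[provider]  # dispatch table lookup
    current_prompt = prompt
    last_error = None
    for _attempt in range(max_retries + 1):
        reply = call(current_prompt)
        try:
            return json.loads(reply)
        except json.JSONDecodeError as exc:
            last_error = exc
            # Corrective prompt: repeat the request and describe the parse error.
            current_prompt = (
                f"{prompt}\n\nYour previous reply was not valid JSON "
                f"({exc.msg}). Reply with JSON only."
            )
    raise ValueError("provider never returned valid JSON") from last_error


# Demo with a fake provider that fails once, then returns valid JSON.
_replies = iter(['{"steps": [', '{"steps": []}'])
seen_prompts: list[str] = []


def flaky_provider(prompt: str) -> str:
    seen_prompts.append(prompt)
    return next(_replies)


plan = generate_plan_json(
    "flaky", "Plan the migration.", providers={"flaky": flaky_provider}
)
```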


Part 3 — Worker Orchestrator (src/worker/)

Input: refactor_plan.json
Output: file changes committed to a cloned repo + migration_report.md

Orchestrator.run()
        │
        ├─ clone_repo()                   git clone to a temp directory
        ├─ topological sort               respects depends_on across steps
        │
        └─ for each step (in order):
                ├─ writer.apply_step()    write new_content / delete / move files
                ├─ writer.commit_step()   git commit with step title as message
                ├─ validate              passed / failed
                └─ if failed → mark all downstream steps as skipped

Reporter(plan, results).save("migration_report.md")
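
The ordering and failure cascade described above can be sketched with the standard library's graphlib. This is an illustration under assumptions, not the Orchestrator itself: steps are reduced to a mapping of step id to depends_on ids, and execute is a hypothetical callback standing in for apply, commit, and validate.

```python
from graphlib import TopologicalSorter
from typing import Callable


def run_steps(
    steps: dict[str, list[str]],
    execute: Callable[[str], bool],
) -> dict[str, dict]:
    """Run steps in dependency order; skip anything downstream of a failure."""
    # static_order() yields every step after all of its dependencies.
    order = list(TopologicalSorter(steps).static_order())
    results: dict[str, dict] = {}
    for step_id in order:
        blocked = [
            dep for dep in steps.get(step_id, [])
            if results[dep]["status"] != "passed"
        ]
        if blocked:
            # A failed or skipped dependency cascades to this step.
            results[step_id] = {"status": "skipped", "reason": f"depends on {blocked[0]}"}
            continue
        ok = execute(step_id)
        results[step_id] = {
            "status": "passed" if ok else "failed",
            "reason": "" if ok else "validation failed",
        }
    return results


steps = {
    "step-001": [],
    "step-002": ["step-001"],
    "step-003": ["step-002"],
    "step-004": [],
}
results = run_steps(steps, execute=lambda step_id: step_id != "step-002")
```

Because a skipped step also counts as "not passed", the skip propagates transitively without a separate graph traversal. TopologicalSorter raises CycleError if the depends_on graph contains a cycle.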

StepResult — the Orchestrator → Reporter contract

{
    "step-001": {
        "status": "passed" | "failed" | "skipped",
        "reason": "",               # empty on pass, error message on fail/skip
        "changes_applied": [...]    # file paths actually written
    }
}
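
One way to make this contract checkable in code is a TypedDict with a Literal status, sketched below. The make_result helper, its validation rules, and the sample paths are assumptions for illustration; the project may represent StepResult differently.

```python
from typing import Literal, Optional, TypedDict, get_args

Status = Literal["passed", "failed", "skipped"]


class StepResult(TypedDict):
    status: Status
    reason: str                 # empty on pass, error message on fail/skip
    changes_applied: list[str]  # file paths actually written


def make_result(
    status: str,
    reason: str = "",
    changes_applied: Optional[list[str]] = None,
) -> StepResult:
    """Build a StepResult, rejecting statuses outside the contract."""
    if status not in get_args(Status):
        raise ValueError(f"unknown status: {status!r}")
    if status == "passed" and reason:
        raise ValueError("a passed step should have an empty reason")
    return {
        "status": status,
        "reason": reason,
        "changes_applied": list(changes_applied or []),
    }


results: dict[str, StepResult] = {
    "step-001": make_result("passed", changes_applied=["src/config.py"]),
    "step-002": make_result("failed", reason="validation query returned no matches"),
}
```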

Report sections

The generated migration_report.md contains:

  1. Header — repo, source ID, timestamps
  2. Summary — pass/fail/skip counts and overall success rate
  3. Risk Assessment — verbatim from the plan
  4. Steps Overview — one-row-per-step table with statuses
  5. Step Details — affected symbols, file changes, validation queries, runtime outcome
  6. Manual Review Required — list of every non-passed step with its reason
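
The Summary and Manual Review Required sections can be derived directly from the step results. The sketch below assumes the success rate is passed steps divided by total steps, which the report may define differently; summarise and manual_review are hypothetical helper names.

```python
from collections import Counter


def summarise(results: dict[str, dict]) -> dict:
    """Count statuses and compute an overall success rate (passed / total)."""
    counts = Counter(result["status"] for result in results.values())
    total = len(results)
    passed = counts.get("passed", 0)
    return {
        "passed": passed,
        "failed": counts.get("failed", 0),
        "skipped": counts.get("skipped", 0),
        "total": total,
        "success_rate": passed / total if total else 0.0,
    }


def manual_review(results: dict[str, dict]) -> list[str]:
    """List every non-passed step with its reason, one Markdown bullet each."""
    return [
        f"- {step_id}: {result['status']} ({result['reason']})"
        for step_id, result in results.items()
        if result["status"] != "passed"
    ]


results = {
    "step-001": {"status": "passed", "reason": ""},
    "step-002": {"status": "failed", "reason": "validation failed"},
    "step-003": {"status": "skipped", "reason": "depends on step-002"},
    "step-004": {"status": "passed", "reason": ""},
}
summary = summarise(results)
review = manual_review(results)
```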

CLI

# Analyse a repo and produce a plan
lme analyze --input engine_input.json --output refactor_plan.json

# Execute the plan (writes files, commits changes)
lme execute --input engine_input.json --plan refactor_plan.json

# Generate a Markdown report from a plan
lme report --plan refactor_plan.json --output migration_report.md

Setup

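# 1. Clone and install (Python 3.11+)
git clone https://github.com/codeFafnir/sdx_hackathon_404_not_found.git
cd sdx_hackathon_404_not_found
pip install -e ".[dev]"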

# 2. Copy and fill in credentials
cp .env.example .env
# NIA_API_KEY=nk_...
# LLM_API_KEY=sk-...

Running tests

# Part 3 unit + integration tests (no API key needed)
pytest tests/test_integration.py -v -k "not TestLiveNiaAPI"

# Full suite including live Nia API calls
NIA_API_KEY=nk_... pytest tests/test_integration.py -v -s

# All tests
pytest

Test structure

| Class / function | Requires key | What it covers |
|---|---|---|
| TestLiveNiaAPI | Yes | All NiaClient methods against the real Nia API |
| TestMockNiaClient | No | All MockNiaClient return types and values |
| TestReporter | No | generate() content, stats accuracy, save() file I/O |
| TestOrchestrator | No | Orchestrator contract (auto-skipped until orchestrator.py exists) |
| test_e2e_* | No | Full fixture → Reporter pipeline; full pipeline when Orchestrator is present |

Offline local test

python run_local_test.py   # or: py run_local_test.py (Windows launcher)

Loads tests/fixtures/sample_plan.json, runs the Orchestrator with MockNiaClient, and saves migration_report.md — no credentials required.


Dependencies

| Package | Used for |
|---|---|
| httpx | Nia API HTTP client |
| pydantic | All shared data models |
| typer | CLI |
| python-dotenv | .env loading |
| openai | LLM calls (OpenAI provider) |
| google-genai | LLM calls (Gemini provider) |
| anthropic | LLM calls (Anthropic provider) |
| rich | Terminal output formatting |
| pytest / respx / pytest-mock | Testing |
