
Purestreams/HTE-Hackathon-HKUer


Lighthouse Hero

Lighthouse: EduAI

An AI-powered educational platform that surfaces the real innovations in the latest papers, saving time and accelerating learning with real-time multi-agent consensus validation. Whether you're newly aboard or a renowned sailor, Lighthouse guides YOUR way through the most turbulent, ever-changing waters.

Lighthouse supports real-world learning through an automated workflow: it ingests PDF papers into manageable summarization sessions, challenges you with configurable quizzes, and then runs the results through multi-agent validation. The revision step delivers interactive results with retrieval-augmented generation support.


Special award requirements

This section is written for the three sponsor awards. It’s intentionally explicit about what is already implemented vs. what is planned/extendable.

MiniMax Creative Usage Award (HKD 15,000)

  • MiniMax API usage (implemented): We use MiniMax’s LLM via the Anthropic-compatible endpoint (ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic) for both generation and evaluation workflows.
    • Mockpaper generation can run on MiniMax (MOCKPAPER_PROVIDER=minimax, default model MiniMax-M2.5) in app/mockpaper.py.
    • Consensus validation can run MiniMax as a reviewer model (MiniMax-M2.5) in app/validate.py.
  • Creative angle (implemented): Our “Consensus Engine” is a transparent multi-agent learning experience: two models debate, converge, and produce a final conclusion, with the full communication + conclusion saved into a Markdown report users can keep and audit.
  • Multimodal (video/audio/music) tools (planned extension): The current repo focuses on the learning workflow + consensus validation. For this award’s “Must: video/audio/music tools” requirement, the next step is to add an optional Explain-it-like-a-podcast mode:
    • Generate a short audio narration of the mockpaper feedback (audio/TTS).
    • Generate a 30–60s recap video (video tool) as a “study reel”.
    • Generate a low-volume focus track (music tool) attached to a session.
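As a sketch of the MiniMax integration described above: the call goes through the Anthropic-compatible endpoint, so the request shape is the standard Anthropic Messages payload. The helper below is illustrative (not the repo's exact code in app/mockpaper.py or app/validate.py); only the base URL and model name come from this README.

```python
DEFAULT_BASE_URL = "https://api.minimax.io/anthropic"
DEFAULT_MODEL = "MiniMax-M2.5"

def minimax_request(prompt: str, model: str = DEFAULT_MODEL) -> dict:
    """Build the kwargs passed to anthropic.Anthropic(...).messages.create()."""
    return {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }

# Actual call (requires the `anthropic` package and a MiniMax key):
# client = anthropic.Anthropic(
#     api_key=os.environ["ANTHROPIC_API_KEY"],  # MiniMax key
#     base_url=os.environ.get("ANTHROPIC_BASE_URL", DEFAULT_BASE_URL),
# )
# reply = client.messages.create(**minimax_request("Review this mock paper: ..."))
```

Because the endpoint is Anthropic-compatible, swapping providers is mostly a matter of changing the base URL and model string.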

RevisionDojo Future of Learning Award (HKD 15,000)

  • Reimagined feedback loop (implemented): Instead of “generate once and hope”, we treat feedback as a first-class step. The Validate stage isn’t a single model score — it’s a two-model discussion that proposes fixes and outputs a revised document.
  • Supports both strong and struggling learners (implemented):
    • Strong learners get faster iteration and higher-quality practice via configurable mockpaper generation.
    • Struggling learners benefit from step-by-step validation notes, issue lists, and an auditable consensus conclusion rather than a black-box answer.
  • Pedagogy-aligned workflow (implemented): session-scoped sources + snapshots encourage deliberate practice and spaced iteration (you can fork sources, regenerate, revalidate, and compare).
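The two-model feedback loop can be sketched roughly as follows. This is a hypothetical outline, not the code in app/validate.py: `call_model` stands in for whatever provider call the app actually makes, and the AGREE convention is an assumption for illustration.

```python
def consensus_review(document: str, call_model, max_rounds: int = 3):
    """Main and Sub model exchange critiques until the Sub model agrees.

    call_model(role, prompt) -> str is an injected provider call
    (e.g. Ark or MiniMax); the transcript becomes the auditable log.
    """
    transcript = []
    for round_no in range(max_rounds):
        critique = call_model("main", f"Review and propose fixes:\n{document}")
        verdict = call_model("sub", f"Do you agree with this review?\n{critique}")
        transcript.append((round_no, critique, verdict))
        if verdict.strip().upper().startswith("AGREE"):
            break
    # Once consensus is reached, the main model emits the revised document.
    revised = call_model("main", f"Apply the agreed fixes to:\n{document}")
    return revised, transcript
```

The transcript is what gets rendered into the Markdown report users can keep and audit.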

OAX Foundation AI EdTech Platform Award (HKD 15,000)

  • Content overload → structured pipeline (implemented): Lighthouse turns a pile of PDFs into a curated session library (PDF→Markdown ingest), then generates practice + validation artifacts that are easier to consume than raw papers.
  • Timesaving & learning acceleration (implemented): one-click ingest and batch workflows (multi-file upload/ingest) plus streaming job logs reduce “busy work” and waiting time.
  • Personalized, future-proof learning path (partially implemented):
    • Implemented: session isolation, snapshots, and chat over session docs.
    • Next: add true retrieval (embeddings + citations) and continuously updated reading lists for fast-moving domains.

Frontend Pages

The React UI includes these pages/routes:

  • / — Dashboard (overview + quick navigation)
  • /upload — Upload files into the current session
  • /ingest/pdf — PDF → Markdown ingest (smart routing)
  • /mockpaper — Generate practice/mock paper from session sources
  • /validate — Validation + consensus-style review with streaming logs
  • /library — Browse session files; view Markdown/PDF; export to PDF
  • /chat — Ask questions over selected session files (RAG)
  • /snapshots — Create/activate/fork source snapshots
  • /jobs — View job history and job status
  • /view/md?path=... — Render a Markdown file
  • /view/pdf?path=... — View a PDF file

Detailed Summary

Lighthouse is designed to streamline the academic lifecycle of study materials—from raw PDF papers to validated mock examinations. Unlike broad AI tools, Lighthouse provides a session-centric workspace where all documents, snapshots, and AI interactions are pinned to a specific context.

Sample Mock Paper (Automatically Generated)

demo/mockpaper/mockpaper.pdf

The source material for the mock paper is in demo/assignment/, which contains a mix of text-heavy and diagram-heavy PDFs. The system intelligently chooses the best ingestion method for each page, resulting in a clean Markdown library that serves as the basis for question generation.

Core Workflow

  1. Ingest: High-fidelity conversion of PDFs to Markdown. It intelligently chooses between fast text extraction and vision-based conversion (using Ark/Doubao models) to handle complex layouts and diagrams.
  2. Generate: Mock papers are synthesized from your ingested sources. You can control the distribution of Multiple Choice, Short Answer, and Coding questions to match specific exam styles.
  3. Validate (The Consensus Engine): This is the platform's flagship feature. Instead of a single AI review, it launches a multi-agent discussion. A "Main Model" and "Sub Model" review the generated content, debate potential improvements, and must reach a consensus. The entire "thinking" process is streamed live to the user and finally exported as a revised document.
  4. Query & Export: Use the built-in streaming chat to ask questions across all session documents. Once satisfied with a document, export it to a clean, formatted PDF via the system-integrated Pandoc pipeline.
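The routing decision in step 1 can be illustrated with a simple heuristic. This is an assumed rule for illustration only (the repo's actual thresholds and signals are not documented here): pages with plenty of extractable text take the fast path, while sparse or image-heavy pages fall back to vision-based conversion.

```python
def choose_ingest_method(extracted_chars: int, image_count: int,
                         min_chars: int = 200) -> str:
    """Pick an ingestion path per page (illustrative heuristic).

    extracted_chars: characters recovered by plain text extraction
    image_count: embedded images/diagrams detected on the page
    """
    if extracted_chars >= min_chars and image_count == 0:
        return "text-extraction"       # fast path for text-heavy pages
    return "vision-model"              # e.g. Ark/Doubao vision conversion
```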

Technical Architecture

  • Backend: Async Flask server managing an in-memory job store for background processing.
  • AI Routing: Multi-provider support for Ark (Doubao/DeepSeek) and MiniMax.
  • Frontend: A responsive React SPA with real-time SSE (Server-Sent Events) for job progress and AI streaming logs.
  • Data Consistency: Session-based snapshots allow users to "fork" their source library at any time, ensuring no work is lost during iterative AI generation.
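The SSE transport mentioned above has a simple wire format; the framing helper below is a minimal sketch (the repo's route names and job store are not shown here). Each frame is one or more `data:` lines terminated by a blank line, which the browser's `EventSource` API consumes.

```python
def sse_frame(data: str, event: str = "") -> str:
    """Encode one Server-Sent Events frame.

    Multi-line payloads get one 'data:' field per line; a blank line
    terminates the frame. Served from Flask with a generator Response
    and mimetype "text/event-stream".
    """
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.extend(f"data: {part}" for part in data.splitlines() or [""])
    return "\n".join(lines) + "\n\n"
```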

Key Features

  • Session-Scoped Workspace: Multiple isolated sessions with file snapshots and library management.
  • Smart PDF Ingest: Multi-file PDF-to-Markdown conversion using Ark (Doubao) vision models or text extraction heuristics.
  • Mock Paper Generation: Generate exams based on session documents with customizable style, topic, and difficulty ratios.
  • Consensus Validate: A two-model "consensus" workflow where a Main Model and Sub Model discuss and review mock papers, generating revised versions and a streaming "AI thinking" log.
  • Streaming Chat: ChatGPT-like interface for querying session documents (using context stuffing) with Markdown and LaTeX rendering.
  • Markdown to PDF: View any Markdown source as a formatted PDF using system pandoc with automatic LaTeX symbol support (e.g., ✓).
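The Markdown-to-PDF step can be sketched as a pandoc invocation. The wrapper function below is an assumption (the repo's actual export code is not shown), but the pandoc flags are standard: `--pdf-engine=xelatex` is what enables the Unicode glyph support mentioned above.

```python
import subprocess  # used by the commented-out invocation below

def pandoc_pdf_cmd(md_path: str, pdf_path: str, use_xelatex: bool = True):
    """Build the pandoc command line for Markdown -> PDF export."""
    cmd = ["pandoc", md_path, "-o", pdf_path]
    if use_xelatex:
        # xelatex renders Unicode glyphs such as ✓ that pdflatex rejects
        cmd.append("--pdf-engine=xelatex")
    return cmd

# subprocess.run(pandoc_pdf_cmd("notes.md", "notes.pdf"), check=True)
```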

Project Structure

  • app/: Backend logic for PDF ingest, mockpapers, validation, and chat routing.
  • frontui/: React + Tailwind CSS + Vite frontend.
  • main.py: Flask API server and background job runner.
  • data/: Local storage for sessions, source files, and snapshots.

Environment Setup

  1. Python Dependencies:

    python -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt
  2. Frontend Dependencies:

    cd frontui
    npm install
  3. System Dependencies:

    • pandoc: Required for "View PDF" functionality.
    • xelatex (optional): Recommended for rendering Unicode glyphs in PDFs.
  4. API Keys: Create a .env file in the root directory:

    ARK_API_KEY=your_ark_key_here
    ANTHROPIC_API_KEY=your_minimax_key_here
    ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic
    EXA_API_KEY=your_exa_key_here

How to Run

  1. Start the Backend:

    python main.py

    The server runs on http://127.0.0.1:8000.

  2. Start the Frontend:

    cd frontui
    npm run dev

    Navigate to http://localhost:5173.

Usage Tips

  • Multi-File Ingest: On the Ingest page, select multiple PDFs to process them in parallel jobs.
  • AI Thinking Log: During Validation, use the side panel to view the live streaming "thoughts" of the models as they reach consensus.
  • Snapshotting: Use the Snapshots page to save the state of your session's sources before major changes.
