An AI-powered educational platform that surfaces the real innovations in the latest papers, saving time and accelerating learning with real-time multi-agent consensus validation. Whether you're newly aboard or a renowned sailor, Lighthouse guides YOUR way through the most turbulent, ever-changing waters.
Lighthouse supports real-world learning through an automated workflow: it ingests PDF papers into manageable summarization sessions, challenges you with configurable quizzes, and then runs the results through multi-agent validation. The revision step delivers interactive results with retrieval-augmented generation support.
- Hosted demo: https://win7.win/HTE-Hackathon-HKUer/demo/index.html (static page)
This section is written for the three sponsor awards. It’s intentionally explicit about what is already implemented vs. what is planned/extendable.
- MiniMax API usage (implemented): We use MiniMax's LLM via the Anthropic-compatible endpoint (`ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic`) for both generation and evaluation workflows.
- Mockpaper generation can run on MiniMax (`MOCKPAPER_PROVIDER=minimax`, default model `MiniMax-M2.5`) in `app/mockpaper.py`.
- Consensus validation can run MiniMax as a reviewer model (`MiniMax-M2.5`) in `app/validate.py`.
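Because the endpoint is Anthropic-compatible, the standard `anthropic` SDK can talk to MiniMax simply by overriding the base URL. The helpers below are an illustrative sketch that only assembles the client and request parameters; the helper names are assumptions, not the app's actual code.

```python
import os

# Illustrative helpers: they build the configuration an Anthropic-compatible
# client (e.g. the `anthropic` SDK) would consume. The env-var names and
# model name follow this README; nothing here is the app's exact code.
def minimax_client_config() -> dict:
    return {
        "base_url": os.environ.get("ANTHROPIC_BASE_URL", "https://api.minimax.io/anthropic"),
        "api_key": os.environ.get("ANTHROPIC_API_KEY", ""),
    }

def minimax_request(prompt: str, model: str = "MiniMax-M2.5", max_tokens: int = 512) -> dict:
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

# With the real SDK, usage would look roughly like:
#   client = anthropic.Anthropic(**minimax_client_config())
#   msg = client.messages.create(**minimax_request("Summarize attention."))
```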
- Creative angle (implemented): Our “Consensus Engine” is a transparent multi-agent learning experience: two models debate, converge, and produce a final conclusion, with the full communication + conclusion saved into a Markdown report users can keep and audit.
- Multimodal (video/audio/music) tools (planned extension): The current repo focuses on the learning workflow + consensus validation. For this award’s “Must: video/audio/music tools” requirement, the next step is to add an optional Explain-it-like-a-podcast mode:
- Generate a short audio narration of the mockpaper feedback (audio/TTS).
- Generate a 30–60s recap video (video tool) as a “study reel”.
- Generate a low-volume focus track (music tool) attached to a session.
- Reimagined feedback loop (implemented): Instead of “generate once and hope”, we treat feedback as a first-class step. The Validate stage isn’t a single model score — it’s a two-model discussion that proposes fixes and outputs a revised document.
- Supports both strong and struggling learners (implemented):
- Strong learners get faster iteration and higher-quality practice via configurable mockpaper generation.
- Struggling learners benefit from step-by-step validation notes, issue lists, and an auditable consensus conclusion rather than a black-box answer.
- Pedagogy-aligned workflow (implemented): session-scoped sources + snapshots encourage deliberate practice and spaced iteration (you can fork sources, regenerate, revalidate, and compare).
- Content overload → structured pipeline (implemented): Lighthouse turns a pile of PDFs into a curated session library (PDF→Markdown ingest), then generates practice + validation artifacts that are easier to consume than raw papers.
- Timesaving & learning acceleration (implemented): one-click ingest and batch workflows (multi-file upload/ingest) plus streaming job logs reduce “busy work” and waiting time.
- Personalized, future-proof learning path (partially implemented):
- Implemented: session isolation, snapshots, and chat over session docs.
- Next: add true retrieval (embeddings + citations) and continuously updated reading lists for fast-moving domains.
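As a sketch of the planned retrieval step, a toy bag-of-words cosine ranking stands in below for real embeddings; all names are illustrative, not the shipped implementation.

```python
import math
from collections import Counter

# Toy retrieval sketch: rank session documents against a query.
# Real embeddings would replace this bag-of-words cosine similarity.
def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_docs(query: str, docs: dict[str, str]) -> list[str]:
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(text.lower().split())), name)
              for name, text in docs.items()]
    # Highest similarity first; drop documents with no overlap at all.
    return [name for score, name in sorted(scored, reverse=True) if score > 0]
```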
The React UI includes these pages/routes:
- `/` — Dashboard (overview + quick navigation)
- `/upload` — Upload files into the current session
- `/ingest/pdf` — PDF → Markdown ingest (smart routing)
- `/mockpaper` — Generate practice/mock paper from session sources
- `/validate` — Validation + consensus-style review with streaming logs
- `/library` — Browse session files; view Markdown/PDF; export to PDF
- `/chat` — Ask questions over selected session files (RAG)
- `/snapshots` — Create/activate/fork source snapshots
- `/jobs` — View job history and job status
- `/view/md?path=...` — Render a Markdown file
- `/view/pdf?path=...` — View a PDF file
Lighthouse is designed to streamline the academic lifecycle of study materials, from raw PDF papers to validated mock examinations. Unlike broad AI tools, Lighthouse provides a session-centric workspace where all documents, snapshots, and AI interactions are pinned to a specific context.
The resource to generate the mock paper is in demo/assignment/, which contains a mix of text-heavy and diagram-heavy PDFs. The system intelligently chooses the best ingestion method for each page, resulting in a clean Markdown library that serves as the basis for question generation.
- Ingest: High-fidelity conversion of PDFs to Markdown. It intelligently chooses between fast text extraction and vision-based conversion (using Ark/Doubao models) to handle complex layouts and diagrams.
- Generate: Mock papers are synthesized from your ingested sources. You can control the distribution of Multiple Choice, Short Answer, and Coding questions to match specific exam styles.
- Validate (The Consensus Engine): This is the platform's flagship feature. Instead of a single AI review, it launches a multi-agent discussion. A "Main Model" and "Sub Model" review the generated content, debate potential improvements, and must reach a consensus. The entire "thinking" process is streamed live to the user and finally exported as a revised document.
- Query & Export: Use the built-in streaming chat to ask questions across all session documents. Once satisfied with a document, export it to a clean, formatted PDF via the system-integrated Pandoc pipeline.
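The consensus loop above can be sketched as plain Python. In the real `app/validate.py` both roles are LLM calls streamed over SSE; here they are stub functions so the control flow is visible, and every name is illustrative rather than the app's actual API.

```python
from typing import Callable

# Toy sketch of the two-model consensus loop: the Main Model revises the
# draft, the Sub Model critiques it, and the loop repeats until the Sub
# Model agrees or the round budget is exhausted.
def consensus(main_model: Callable[[str], str],
              sub_model: Callable[[str], str],
              draft: str, max_rounds: int = 3) -> tuple[str, list[str]]:
    transcript = []
    for round_no in range(max_rounds):
        revision = main_model(draft)
        critique = sub_model(revision)
        transcript.append(f"round {round_no}: revision={revision!r} critique={critique!r}")
        if critique == "AGREE":            # Sub Model accepts the revision
            return revision, transcript
        draft = revision + " " + critique  # fold the critique into the next draft
    return draft, transcript
```

The saved transcript plays the role of the auditable Markdown report: users can see every round of the debate, not just the final answer.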
- Backend: Async Flask server managing an in-memory job store for background processing.
- AI Routing: Multi-provider support for Ark (Doubao/DeepSeek) and MiniMax.
- Frontend: A responsive React SPA with real-time SSE (Server-Sent Events) for job progress and AI streaming logs.
- Data Consistency: Session-based snapshots allow users to "fork" their source library at any time, ensuring no work is lost during iterative AI generation.
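The SSE stream mentioned above follows the standard `text/event-stream` framing: `event:`/`data:` fields terminated by a blank line. A minimal parser sketch, assuming nothing about the app's exact event names:

```python
# Minimal SSE parser sketch for a stream like the job-log feed the frontend
# consumes. Field handling follows the text/event-stream format; the event
# names the app emits are not assumed here.
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of raw SSE lines."""
    event, data = "message", []
    for line in lines:
        line = line.rstrip("\n")
        if line == "":                         # blank line dispatches the event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
```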
- Session-Scoped Workspace: Multiple isolated sessions with file snapshots and library management.
- Smart PDF Ingest: Multi-file PDF-to-Markdown conversion using Ark (Doubao) vision models or text extraction heuristics.
- Mock Paper Generation: Generate exams based on session documents with customizable style, topic, and difficulty ratios.
- Consensus Validate: A two-model "consensus" workflow where a Main Model and Sub Model discuss and review mock papers, generating revised versions and a streaming "AI thinking" log.
- Streaming Chat: ChatGPT-like interface for querying session documents (using context stuffing) with Markdown and LaTeX rendering.
- Markdown to PDF: View any Markdown source as a formatted PDF using system `pandoc`, with automatic LaTeX symbol support (e.g., ✓).
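The "context stuffing" used by the streaming chat can be sketched as a simple prompt builder: concatenate the selected session documents up to a rough character budget. All names and the budget here are illustrative assumptions, not the app's exact implementation.

```python
# Illustrative "context stuffing" sketch: selected session documents are
# concatenated into the prompt, subject to a rough character budget, and
# the user's question is appended at the end.
def stuff_context(question: str, docs: dict[str, str], budget: int = 4000) -> str:
    parts, used = [], 0
    for name, text in docs.items():
        chunk = f"## {name}\n{text}\n"
        if used + len(chunk) > budget:
            break                      # stop before exceeding the budget
        parts.append(chunk)
        used += len(chunk)
    return "".join(parts) + f"\nQuestion: {question}"
```

This trades recall for simplicity: no index is needed, but only documents that fit the budget reach the model, which is why the roadmap adds true retrieval later.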
- `app/`: Backend logic for PDF ingest, mockpapers, validation, and chat routing.
- `frontui/`: React + Tailwind CSS + Vite frontend.
- `main.py`: Flask API server and background job runner.
- `data/`: Local storage for sessions, source files, and snapshots.
1. Python dependencies:

   ```bash
   python -m venv .venv
   source .venv/bin/activate
   pip install -r requirement.txt
   ```

2. Frontend dependencies:

   ```bash
   cd frontui
   npm install
   ```

3. System dependencies:
   - `pandoc`: required for the "View PDF" functionality.
   - `xelatex` (optional): recommended for rendering Unicode glyphs in PDFs.

4. API keys: create a `.env` file in the root directory:

   ```
   ARK_API_KEY=your_ark_key_here
   ANTHROPIC_API_KEY=your_minimax_key_here
   ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic
   EXA_API_KEY=your_exa_key_here
   ```

5. Start the backend:

   ```bash
   python main.py
   ```

   The server runs on http://127.0.0.1:8000.

6. Start the frontend:

   ```bash
   cd frontui
   npm run dev
   ```

   Navigate to http://localhost:5173.
- Multi-File Ingest: On the Ingest page, select multiple PDFs to process them in parallel jobs.
- AI Thinking Log: During Validation, use the side panel to view the live streaming "thoughts" of the models as they reach consensus.
- Snapshotting: Use the Snapshots page to save the state of your session's sources before major changes.
