English | 简体中文
AI Research Agent — Academic Knowledge Graph Engine powered by LLM
Upload research PDFs → LLM extracts hierarchical concepts →
build an interactive knowledge graph → discover research opportunities via AI Agents
Quick Start • Features • AI Agents • Architecture • Roadmap
| Feature | Description |
|---|---|
| 📄 PDF Parsing | Auto-extract title, authors, abstract from research papers (MarkItDown, no Java dependency) |
| 🌍 Auto-Translation | LLM-powered bilingual concept names (ZH/EN) for cross-language search |
| 🧠 Two-Stage Concept Extraction | Stage 1: Paper understanding → Stage 2: Hierarchical concept extraction with 8 categories |
| 🌐 Semantic Scholar Integration | Auto-enhance paper metadata (DOI, citations, venue, citation count) |
| 📊 Interactive Graph Visualization | Force-directed graph with category-based node sizes, search & filter |
| 🔍 Research Point Discovery | 4 methodologies: Gap Filling, Leaf Extension, Bottleneck, Transfer |
| 🏷️ Research Point Badges | Difficulty, novelty, and impact ratings with color-coded badges |
| 📤 Multi-format Export | HTML (interactive D3), Obsidian Canvas, Markdown |
| 📁 Folder Management | Organize papers into folders with sidebar navigation |
| ⚡ Queue Processing | Sequential batch processing with time estimation |
| 🔄 Smart Deduplication | Synonym merging, absorption, translation detection |
| 🤖 AI Research Agents | Chat-based agents for paper Q&A, citation analysis, deep research |
Drag nodes, zoom, search concepts, filter by category
Click concept → Discover research points → View analysis context
Upload PDFs → Process → Explore graph → Export
Configure API Key → Test connection → Start processing
docker pull danceinsophy/meta-knowledge-graph:latest
docker run -d -p 8089:8089 \
-v mkg-data:/app/data \
-v mkg-papers:/app/papers \
--restart unless-stopped \
  danceinsophy/meta-knowledge-graph:latest

Open http://localhost:8089 and configure your LLM API key in the Settings page.
API Keys are saved locally in the database. Supports Claude, OpenAI, Gemini, Qwen, DeepSeek, and more.
git clone https://github.com/Seaual/meta-knowledge-graph.git
cd meta-knowledge-graph/docker
docker-compose up -d

# Clone
git clone https://github.com/Seaual/meta-knowledge-graph.git
cd meta-knowledge-graph
# Backend
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
python -m uvicorn backend.main:app --host 0.0.0.0 --port 8089 --reload
# Frontend (in another terminal)
cd frontend && npm install && npm run dev

Open http://localhost:5173 for the full dev experience with hot reload.
MKG includes a multi-agent system built on LangGraph for intelligent research assistance:
Routes your question to the appropriate specialist agent:
- Concept Search — find concepts in the knowledge graph
- Paper Search — find papers by title or concept
- Recommendation — recommend relevant papers
Answer detailed questions about specific papers:
- Fetches paper metadata from the database
- Reads full paper content when needed
- Provides accurate answers sourced from the paper
Analyzes paper citation relationships:
- Citation statistics and trends
- Key citing papers and their impact
- Citation network within your collection
Deep analysis of concepts and research opportunities:
- Retrieves concept graph structure (parent/child concepts)
- Analyzes research gaps using 4 methodologies
- Recommends frontier papers from Semantic Scholar
Multi-dimensional research synthesis running asynchronously:
- Spawns specialized research agents per dimension
- Synthesizes findings into a comprehensive report
- Progress tracking via session ID
Automatically condenses long agent outputs into concise summaries.
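The agent endpoints described above are reachable over the REST API (`POST /api/agent/chat` per the API reference below). A minimal sketch of building such a request, including the per-session `conversationId` that keeps LangGraph checkpoints isolated; the payload field names here are assumptions, so check the live OpenAPI docs at `/docs` for the actual schema:

```python
import json
from urllib import request

def build_chat_request(message: str, conversation_id: str,
                       base_url: str = "http://localhost:8089") -> request.Request:
    """Build a POST request for the /api/agent/chat endpoint.

    The payload field names ("message", "conversationId") are illustrative
    assumptions -- verify them against the OpenAPI docs at /docs.
    """
    body = json.dumps({
        "message": message,
        "conversationId": conversation_id,  # keeps checkpoints per-session
    }).encode("utf-8")
    return request.Request(
        f"{base_url}/api/agent/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Live call (requires a running backend):
#   with request.urlopen(build_chat_request("Which papers discuss QMIX?",
#                                           "session-001")) as resp:
#       print(json.load(resp))
```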
┌─────────────────────────────────────────────────┐
│ Frontend │
│ React + TypeScript + D3.js │
└─────────────────────┬───────────────────────────┘
│ REST API
┌─────────────────────▼───────────────────────────┐
│ Backend │
│ FastAPI + SQLite + LangGraph Agents │
└─────────────────────┬───────────────────────────┘
│ LLM API / S2 API
┌─────────────────────▼───────────────────────────┐
│ External Services │
│ LLM: Claude/Gemini/Qwen S2: Metadata API │
└─────────────────────────────────────────────────┘
Data Flow: PDF → S2 Enhancement → LLM Extract (Two-Stage) → Knowledge Graph → Agent Analysis
| Category | Description | Example | Node Size |
|---|---|---|---|
| field | Major domain | Artificial Intelligence | Largest |
| direction | Research direction | Multi-Agent RL | Large |
| subdirection | Sub-direction | Value Decomposition | Medium |
| task | Research task | Credit Assignment | Small |
| method | Algorithm | QMIX | Smaller |
| technique | Technical detail | Attention-weighted mixing | Smallest |
| dataset | Benchmark/Dataset | ImageNet, SMAC | Medium |
| finding | Key discovery | Scaling Laws | Medium |
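The size ordering above can be sketched as a simple category-to-radius lookup; the radii here are illustrative placeholders, not the project's actual rendering constants:

```python
# Illustrative radii reflecting the ordering in the table above;
# the project's real D3 sizing constants may differ.
CATEGORY_RADIUS = {
    "field": 30,         # Largest
    "direction": 24,     # Large
    "subdirection": 18,  # Medium
    "dataset": 18,       # Medium
    "finding": 18,       # Medium
    "task": 14,          # Small
    "method": 11,        # Smaller
    "technique": 8,      # Smallest
}

def node_radius(category: str) -> int:
    """Map a concept category to a node radius, defaulting to the smallest."""
    return CATEGORY_RADIUS.get(category, 8)
```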
| Method | Description |
|---|---|
| 🔍 Gap Filling | Missing connections between related branches |
| 🌱 Leaf Extension | Leaf nodes applied to other branches |
| 🔥 Bottleneck | Node with many children but few siblings |
| 🔄 Transfer | Mature methods transferred to unsolved problems |
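As one example, the Bottleneck heuristic (a node with many children but few siblings) can be sketched over a toy parent/child edge list; the thresholds are illustrative, not the project's actual discovery logic:

```python
from collections import defaultdict

def find_bottlenecks(edges, min_children=3, max_siblings=1):
    """Return nodes with many children but few siblings.

    `edges` is a list of (parent, child) pairs; the thresholds are
    illustrative defaults, not MKG's real parameters.
    """
    children = defaultdict(set)
    parent_of = {}
    for parent, child in edges:
        children[parent].add(child)
        parent_of[child] = parent

    bottlenecks = []
    for node, kids in children.items():
        parent = parent_of.get(node)
        siblings = len(children[parent]) - 1 if parent else 0
        if len(kids) >= min_children and siblings <= max_siblings:
            bottlenecks.append(node)
    return bottlenecks
```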
- Go to Papers page → Upload PDF files (batch supported)
- Papers appear in Pending list with auto-enhanced metadata from Semantic Scholar
- Click Process or Batch Process
- LLM extracts concept trees with bilingual names (EN/ZH)
- Concepts are merged into the knowledge graph
- Go to Concepts page → drag nodes, scroll to zoom
- Search concepts by name, filter by category
- Click any concept for details
- Click a concept → Discover Research Points
- LLM analyzes graph structure, generates 3-5 research directions
- Go to Chat page → ask questions about your papers or concepts
- Agents automatically route to the right specialist and return structured results with interactive cards
- Click Dedup Scan → review merge suggestions → execute selected merges
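A minimal sketch of how a normalization-based dedup scan might group merge candidates; the actual module also handles synonym merging, absorption, and cross-language translation detection via the LLM:

```python
import unicodedata

def normalize(name: str) -> str:
    """Collapse case, punctuation, and Unicode width variants
    so near-duplicate concept names collide (illustrative heuristic)."""
    name = unicodedata.normalize("NFKC", name)
    return "".join(ch for ch in name.lower() if ch.isalnum())

def dedup_candidates(names):
    """Group concept names whose normalized forms are identical."""
    groups = {}
    for n in names:
        groups.setdefault(normalize(n), []).append(n)
    return [g for g in groups.values() if len(g) > 1]
```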
- HTML — standalone interactive D3.js graph
- Canvas — Obsidian Canvas format
- Markdown — double-link format for notes
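Exports can also be fetched programmatically. A minimal sketch for the interactive HTML export, using the `/api/graph/export/obsidian/html` endpoint from the API reference below (the download itself requires a running backend):

```python
from urllib import request

def export_html_url(base_url: str = "http://localhost:8089") -> str:
    """URL of the standalone interactive HTML export endpoint."""
    return f"{base_url}/api/graph/export/obsidian/html"

def download_export(path: str = "graph.html",
                    base_url: str = "http://localhost:8089") -> str:
    """Save the exported HTML graph to disk (requires a running backend)."""
    with request.urlopen(export_html_url(base_url)) as resp:
        data = resp.read()
    with open(path, "wb") as f:
        f.write(data)
    return path
```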
| Provider | Type | Configuration |
|---|---|---|
| Anthropic Claude | Native API | ANTHROPIC_API_KEY |
| Google Gemini | Native API | GOOGLE_API_KEY |
| OpenAI | OpenAI Compatible | OPENAI_API_KEY |
| Alibaba DashScope | OpenAI Compatible | DASHSCOPE_API_KEY |
| Qwen | OpenAI Compatible | Custom base_url |
| DeepSeek | OpenAI Compatible | Custom base_url |
| OpenRouter | OpenAI Compatible | OPENAI_API_KEY + base_url |
| MiniMax | OpenAI Compatible | Custom base_url |
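For the OpenAI-compatible providers, the only configuration difference is usually the `base_url`. A sketch of assembling client kwargs; the base URLs listed are the providers' commonly published values, but verify them against each provider's own documentation before use:

```python
# Commonly published OpenAI-compatible base URLs -- verify against each
# provider's documentation; these are assumptions, not values shipped by MKG.
OPENAI_COMPATIBLE_BASE_URLS = {
    "deepseek": "https://api.deepseek.com",
    "dashscope": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "openrouter": "https://openrouter.ai/api/v1",
}

def client_config(provider: str, api_key: str) -> dict:
    """Return kwargs suitable for an OpenAI-compatible client,
    e.g. openai.OpenAI(**client_config("deepseek", key))."""
    cfg = {"api_key": api_key}
    base = OPENAI_COMPATIBLE_BASE_URLS.get(provider)
    if base:
        cfg["base_url"] = base  # native OpenAI needs no base_url override
    return cfg
```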
Backend: Python 3.10+ • FastAPI • SQLite • MarkItDown • LangGraph
Frontend: React 18 • TypeScript • Vite • TailwindCSS • D3.js • i18n
LLM: Claude / Gemini / Qwen / DeepSeek / OpenRouter / OpenAI
External APIs: Semantic Scholar (paper metadata enhancement)
meta-knowledge-graph/
├── backend/ # FastAPI backend
│ ├── main.py # App entry, CORS, router registration
│ ├── routes/ # API route handlers
│ ├── services/ # Business logic services
│ ├── schemas.py # Pydantic models
│ └── dependencies.py # DI providers
├── frontend/ # React + TypeScript frontend
│ └── src/
│ ├── pages/ # Page components
│ ├── components/ # Shared components + cards
│ ├── i18n/ # Chinese/English translations
│ ├── lib/api/ # API client modules
│ └── store/ # Zustand state management
├── mkg/ # Core library
│ ├── database.py # SQLite database manager
│ ├── repositories/ # Data access layer
│ ├── agent/ # LangGraph agent system
│ │ ├── nodes/ # Agent nodes (lead, research, citation, etc.)
│ │ ├── tools.py # Tool definitions
│ │ └── research_graph.py # Deep research orchestration
│ ├── dedup/ # Concept deduplication module
│ ├── semantic_scholar.py # S2 API client
│ └── llm.py # LLM provider abstraction
├── scripts/ # Utility scripts (demo data generation)
├── docker/ # Docker configuration
├── icon/ # Project icons
├── docs/ # Demo screenshots and gifs
└── Dockerfile # Multi-stage Docker build
Access http://localhost:8089/docs after starting the backend.
| Endpoint | Method | Description |
|---|---|---|
| `/api/papers/upload` | POST | Upload PDF file |
| `/api/papers/batch-upload` | POST | Batch upload PDFs |
| `/api/papers/batch-process` | POST | Batch process papers |
| `/api/concepts/` | GET | Get all concepts |
| `/api/concepts/{id}/research-points` | GET | Discover research points |
| `/api/concepts/{id}/search-papers` | GET | Search papers by concept |
| `/api/concepts/dedup/scan` | POST | Scan for duplicates |
| `/api/graph/export/obsidian/html` | GET | Export interactive HTML |
| `/api/agent/chat` | POST | Chat with AI agents |
| `/api/agent/deep-research/start` | POST | Start deep research session |
| `/api/agent/deep-research/{id}/status` | GET | Check research progress |
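A sketch of driving an asynchronous deep-research session with the start/status pair from the table above. The request field `topic` and response fields `session_id`/`status` are assumptions, so check the OpenAPI docs at `/docs` for the actual schema:

```python
import json
import time
from urllib import request

BASE = "http://localhost:8089"  # adjust for your deployment

def status_url(session_id: str) -> str:
    """Build the status-polling URL for a deep-research session."""
    return f"{BASE}/api/agent/deep-research/{session_id}/status"

def start_deep_research(topic: str) -> str:
    """Start a session and return its id (requires a running backend).

    The "topic" and "session_id" field names are illustrative assumptions.
    """
    body = json.dumps({"topic": topic}).encode("utf-8")
    req = request.Request(f"{BASE}/api/agent/deep-research/start", data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["session_id"]

def wait_for_report(session_id: str, interval: float = 5.0) -> dict:
    """Poll the status endpoint until the session finishes."""
    while True:
        with request.urlopen(status_url(session_id)) as resp:
            status = json.load(resp)
        if status.get("status") in ("completed", "failed"):
            return status
        time.sleep(interval)
```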
- Agent session isolation: chat requests now pass a `conversationId` through the frontend and backend, so LangGraph checkpoints no longer share one global thread.
- Faster concept persistence: concept trees are collected and written in a single database transaction, which reduces commit overhead during paper processing.
- More stable external calls: LLM-backed research point discovery, concept translation, and shared text generation now use retry-aware wrappers with structured logging.
- Graph interaction performance: the Concepts page keeps one ForceGraph instance alive and refreshes data incrementally instead of destroying and rebuilding the graph on common UI state changes.
- Initial regression coverage: added resilience-focused tests for retry behavior and LLM wrapper handling.
See CHANGELOG.md for the release summary.
- Two-stage concept extraction
- Research point discovery (4 methodologies)
- Academic light theme UI
- Bilingual support (Chinese/English)
- Semantic Scholar metadata enhancement
- Graph search and filter
- Concept deduplication
- Multi-format export
- Batch processing
- Multiple LLM backends
- AI Research Agents (Chat, Paper Q&A, Citation Analysis, Research)
- Deep Research with async progress tracking
- CI/CD (GitHub Actions - lint, type-check, test)
- Auto-translation for Chinese concept names (LLM-powered)
- Research points difficulty/novelty/impact badges
- MarkItDown PDF parsing (no Java dependency)
- Real-time collaboration
- Neo4j support
Issues and Pull Requests are welcome!
MIT License
Made with ❤️ by Seaual



