
Potency AI

Deep Research Agent for Technical & Engineering Intelligence

Python 3.11+ | FastAPI | License: MIT


What is Potency AI?

Potency AI is a multi-source AI research agent that retrieves, analyzes, and synthesizes technical information from documentation, academic papers, blog posts, and code repositories in parallel. It produces structured engineering reports complete with architecture diagrams, source credibility scoring, and actionable follow-up questions -- all streamed to the browser in real time via Server-Sent Events.

The system maintains a persistent knowledge graph (SQLite-backed) that grows across research sessions, linking technologies, concepts, patterns, and organizations discovered during each investigation. Every entity is deduplicated, scored for relevance strength, and exportable in JSON, CSV, or Obsidian markdown format.

Potency AI supports five LLM providers (Gemini, OpenAI, Groq, HuggingFace, Ollama) with automatic fallback and hybrid routing based on real-time connectivity monitoring. When the network degrades, classification and extraction tasks shift to a local model while synthesis stays on the cloud. When fully offline, the agent operates entirely on local LLMs with cached sources -- no API keys required.


Key Features

  • Multi-source parallel retrieval -- documentation, academic papers, blog posts, and code repositories searched concurrently with semantic reranking (BAAI/bge embeddings)
  • Real-time streaming pipeline -- SSE-powered Agent Brain feed showing intent classification, planning, retrieval, reasoning, and synthesis stages with live progress
  • 8 Mermaid diagram types -- architecture, sequence, flowchart, class, ER, mindmap, timeline, and C4 context diagrams, auto-detected per query with Mermaid v11 compatibility
  • Persistent knowledge graph -- SQLite-backed with UUID entities, fuzzy deduplication, session tracking, entity merge, backlinks, and Cytoscape.js visualization
  • Multi-LLM support -- Gemini, OpenAI, Groq, HuggingFace, and Ollama with automatic provider detection and rate-limit fallback chains
  • Hybrid routing -- cloud, hybrid, or offline mode determined by real-time connectivity monitoring; tasks are routed to cloud or local models based on latency
  • Web page fetching and analysis -- fetch any URL, extract structured content with trafilatura, analyze key facts/entities/sentiment, and crawl with configurable depth
  • Redis caching -- query results and source content cached with configurable TTL; automatic file-based fallback when Redis is unavailable
  • Kafka event streaming -- optional integration for publishing research pipeline events to a Kafka topic
  • Model comparison mode -- run the same query against two LLMs simultaneously with side-by-side streaming output
  • User preference memory -- detects "remember I prefer X" in queries and applies saved preferences to future reports automatically
  • Four reasoning modules -- architecture analysis, tradeoff comparison, performance evaluation, and code quality review, selected automatically by query intent
  • Export -- Markdown copy, print-to-PDF, and PNG export from the browser

Architecture

                           +---------------------+
                           |    Browser (SPA)    |
                           | Tailwind + Mermaid  |
                           |   + Cytoscape.js    |
                           +----------+----------+
                                      | SSE / REST
                                      v
                          +-----------+-----------+
                          |        FastAPI        |
                          |   Middleware (Auth,   |
                          |    Rate Limiting)     |
                          +-----------+-----------+
                                      |
           +--------------------------+--------------------------+
           |                          |                          |
   +-------v-------+         +--------v--------+         +-------v-------+
   |   Research    |         |    Diagrams     |         |   Knowledge   |
   |   Pipeline    |         |     Engine      |         |   Graph API   |
   +-------+-------+         +--------+--------+         +-------+-------+
           |                          |                          |
     +-----+-----+-----------+        |                  +-------v-------+
     |           |           |        |                  |    SQLite     |
+----v---+ +-----v----+ +----v----+   |                  |  (aiosqlite)  |
| Intent | | Planning | | Reason- |   |                  +---------------+
| Class. | |          | |   ing   |   |
+--------+ +----------+ +---------+   |
                 |                    |
         +-------v-------+            |
         |   Retrieval   |            |
         |  Aggregator   |            |
         +-------+-------+            |
                 |                    |
   +------+------+------+------+      |
   |      |      |      |      |      |
 Docs  Papers  Blogs  Code    Web     |
                 |                    |
         +-------v-------+            |
         |   Synthesis   |<-----------+
         |    Engine     |
         +-------+-------+
                 |
         +-------v-------+
         |  LLM Router   |
         | (Hybrid Cloud/|
         |    Local)     |
         +-------+-------+
                 |
   +------+------+------+------+
   |      |      |      |      |
Gemini  OpenAI  Groq    HF   Ollama

Quick Start

Prerequisites

  • Python 3.11 or higher
  • Either an LLM API key (Gemini, OpenAI, Groq, or HuggingFace) or Ollama installed locally
  • (Optional) Redis for caching, ChromaDB for vector storage

Installation

# Clone the repository
git clone https://github.com/your-org/potency-ai.git
cd potency-ai

# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate   # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Configuration

# Copy the example environment file
cp .env.example .env

# Edit .env and add at least one LLM API key
# (or leave all empty to use Ollama locally)

Run

# Start the server
uvicorn app.main:app --reload

# Open http://localhost:8000
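
With the server running, you can exercise the API straight from the terminal. A minimal sketch using curl, assuming the defaults above (port 8000, no API_KEY_SECRET set) and that POST /research accepts a JSON body with a query field -- check the auto-generated docs at http://localhost:8000/docs for the exact schema:

# Submit a research query (field name is illustrative -- see /docs)
curl -s -X POST http://localhost:8000/research \
  -H "Content-Type: application/json" \
  -d '{"query": "Compare Kafka and RabbitMQ for event streaming"}'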

Docker Compose

# Start the full stack (app + Redis + ChromaDB)
docker compose up --build

# Access at http://localhost:8000

Configuration

All settings are controlled via environment variables or a .env file. The list below covers the key variables and their defaults; see .env.example for a fully annotated template.

GEMINI_API_KEY -- Google Gemini API key (default: "")
OPENAI_API_KEY -- OpenAI API key (default: "")
GROQ_API_KEY -- Groq API key (default: "")
HUGGINGFACE_API_KEY -- HuggingFace API key (default: "")
ANTHROPIC_API_KEY -- Anthropic API key (default: "")
OLLAMA_BASE_URL -- Ollama server URL (default: http://localhost:11434)
OLLAMA_MODEL -- Default Ollama model (default: llama3.1:8b)
DEFAULT_FAST_MODEL -- Override model for classification/extraction (default: auto-detected)
DEFAULT_REASONING_MODEL -- Override model for reasoning tasks (default: auto-detected)
DEFAULT_SYNTHESIS_MODEL -- Override model for report synthesis (default: auto-detected)
LOCAL_LLM_BACKEND -- Local LLM backend: ollama, llamacpp, or lmstudio (default: ollama)
TAVILY_API_KEY -- Tavily search API key (default: "")
GITHUB_TOKEN -- GitHub personal access token for code retrieval (default: "")
SEMANTIC_SCHOLAR_API_KEY -- Semantic Scholar API key (default: "")
REDIS_URL -- Redis connection URL (default: redis://localhost:6379/0)
KAFKA_ENABLED -- Enable Kafka event streaming (default: false)
KAFKA_BOOTSTRAP_SERVERS -- Kafka broker addresses (default: localhost:9092)
API_KEY_SECRET -- Optional API key for endpoint authentication (default: "")
LOG_LEVEL -- Logging level (default: INFO)
ENVIRONMENT -- Runtime environment: development, production, or testing (default: development)
MAX_CONCURRENT_RETRIEVALS -- Max parallel retrieval tasks (default: 5)
QUICK_MODE_TIMEOUT_SECONDS -- Timeout for quick research mode (default: 120)
DEEP_MODE_TIMEOUT_SECONDS -- Timeout for deep research mode (default: 600)
RATE_LIMIT_REQUESTS_PER_MINUTE -- API rate limit per client (default: 30)
CONNECTIVITY_CHECK_INTERVAL_SECONDS -- Interval between connectivity probes (default: 60)
SOURCE_CACHE_TTL_DAYS -- Days before cached sources expire (default: 7)
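
As an illustration, a minimal .env for a cloud-first setup with Redis caching might look like the following; all values are placeholders, and any one provider key is enough:

# Example .env -- values are placeholders
GEMINI_API_KEY=your-gemini-key
TAVILY_API_KEY=your-tavily-key
REDIS_URL=redis://localhost:6379/0
LOG_LEVEL=INFO
ENVIRONMENT=development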

API Reference

Research

Method Endpoint Description
POST /research Execute a research query and return a structured report
POST /research/stream Execute research with real-time SSE streaming progress
POST /research/clarify Check if a query needs clarification before starting
POST /research/compare Run a query against two LLMs with side-by-side streaming
POST /research/summarize-source Summarize source text using a local BART model
GET /research/knowledge-graph Return the current knowledge graph for visualization
GET /research/provider Return the active LLM provider name
GET /research/models List available HuggingFace models and their status
GET /research/demo/list List available pre-seeded demo queries
GET /research/demo/{id} Load a pre-seeded demo result instantly
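
For example, the streaming endpoint can be followed from the terminal with curl's -N flag, which disables output buffering so SSE events print as they arrive. The request field shown is an assumption -- consult /docs for the exact schema:

# Stream pipeline progress as Server-Sent Events
curl -N -X POST http://localhost:8000/research/stream \
  -H "Content-Type: application/json" \
  -d '{"query": "How does Raft leader election work?"}'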

Diagrams

Method Endpoint Description
GET /diagrams/types List all 8 supported diagram types with metadata
POST /diagrams/detect-type Auto-detect the best diagram type for a query
POST /diagrams/generate Generate a single Mermaid diagram of a given type
POST /diagrams/generate-all Auto-detect types and generate multiple diagrams
POST /diagrams/regenerate Regenerate a diagram with user feedback incorporated
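
A sketch of driving the diagram engine directly; the body fields shown (query, diagram_type) are assumptions, so verify them against /docs:

# Auto-detect the best diagram type for a query
curl -s -X POST http://localhost:8000/diagrams/detect-type \
  -H "Content-Type: application/json" \
  -d '{"query": "microservice checkout flow"}'

# Generate a single Mermaid diagram of a given type
curl -s -X POST http://localhost:8000/diagrams/generate \
  -H "Content-Type: application/json" \
  -d '{"query": "microservice checkout flow", "diagram_type": "sequence"}'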

Knowledge Graph

Method Endpoint Description
GET /knowledge-graph Full graph with optional category and strength filters
GET /knowledge-graph/search Search entities by name (case-insensitive substring)
GET /knowledge-graph/timeline Entities ordered by discovery date
GET /knowledge-graph/entity/{id} Single entity with backlinks
PUT /knowledge-graph/entity/{id} Update entity notes and tags
DELETE /knowledge-graph/entity/{id} Delete an entity and all its relationships
GET /knowledge-graph/backlinks/{id} Entities that link to a given entity
POST /knowledge-graph/entity/merge Merge two entities into one
POST /knowledge-graph/export Export graph as JSON, CSV, or Obsidian markdown
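
For instance, searching and exporting the graph might look like this; the q and format parameter names are assumptions based on the endpoint descriptions:

# Search entities by name (case-insensitive substring)
curl -s "http://localhost:8000/knowledge-graph/search?q=kafka"

# Export the graph as Obsidian markdown
curl -s -X POST http://localhost:8000/knowledge-graph/export \
  -H "Content-Type: application/json" \
  -d '{"format": "obsidian"}'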

LLM and Connectivity

Method Endpoint Description
GET /llm/status Current connectivity quality, active provider, routing mode
GET /llm/status/stream SSE stream of real-time connectivity changes
GET /llm/local/models List locally available models (Ollama, llama.cpp, LM Studio)
POST /llm/local/switch Switch the active local model at runtime
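
For example, to check the current routing decision and follow connectivity changes live:

# Current connectivity quality, active provider, and routing mode
curl -s http://localhost:8000/llm/status

# Follow connectivity changes as an SSE stream (-N disables buffering)
curl -N http://localhost:8000/llm/status/stream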

Web Fetcher

Method Endpoint Description
POST /fetch/url Fetch a URL and analyze its content (facts, entities, sentiment)
POST /fetch/crawl Crawl from a URL at a specified depth
GET /fetch/monitors List all monitored URLs
POST /fetch/monitors Add a URL to the change watchlist
DELETE /fetch/monitors/{id} Remove a URL from the watchlist
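
A sketch of fetching and crawling from the command line; the url and depth field names are assumptions, so verify them against /docs:

# Fetch a URL and analyze its content
curl -s -X POST http://localhost:8000/fetch/url \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/post"}'

# Crawl outward from a URL to depth 2
curl -s -X POST http://localhost:8000/fetch/crawl \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "depth": 2}'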

Memory and Cache

Method Endpoint Description
GET /memory/history/{user_id} Retrieve research session history for a user
GET /memory/knowledge/stats Knowledge graph entity and relationship counts
GET /cache/health Redis cache health and connectivity status

Health

Method Endpoint Description
GET /health Basic health check
GET /health/ollama Ollama connectivity check and available models
GET /health/detailed Detailed health check with all dependency statuses
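
For example:

# Basic liveness check
curl -s http://localhost:8000/health

# Detailed check covering Redis, Ollama, and other dependencies
curl -s http://localhost:8000/health/detailed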

Project Structure

potency-ai/
|-- app/
|   |-- main.py                    # FastAPI application entry point and lifespan
|   |-- config.py                  # Pydantic settings, .env loading
|   |-- cli.py                     # CLI entry point (typer)
|   |-- api/
|   |   |-- middleware/
|   |   |   |-- auth.py            # API key authentication middleware
|   |   |   |-- rate_limit.py      # Per-client rate limiting
|   |   |-- routes/
|   |       |-- research.py        # Research pipeline endpoints
|   |       |-- diagrams.py        # Diagram generation endpoints
|   |       |-- knowledge_graph.py # Knowledge graph CRUD and export
|   |       |-- llm.py             # LLM status, model switching, connectivity SSE
|   |       |-- fetch.py           # Web fetcher, crawler, and page monitors
|   |       |-- memory.py          # Session history and knowledge stats
|   |       |-- cache.py           # Cache health endpoint
|   |       |-- health.py          # Health check endpoints
|   |-- core/
|   |   |-- orchestrator.py        # Main 9-stage research pipeline controller
|   |   |-- events.py              # SSE event emitter and pipeline stage enum
|   |   |-- intent.py              # LLM-based query intent classification
|   |   |-- planner.py             # Research plan decomposition into sub-tasks
|   |   |-- pipeline.py            # Data models (ResearchReport, ResearchMode, etc.)
|   |-- llm/
|   |   |-- router.py              # Hybrid LLM router (cloud / local / offline)
|   |   |-- providers.py           # LiteLLM wrapper with multi-provider fallback
|   |   |-- connectivity.py        # Real-time connectivity monitor with adaptive polling
|   |   |-- local_adapter.py       # Ollama, llama.cpp, and LM Studio adapter
|   |   |-- hf_models.py           # HuggingFace model catalog and local inference
|   |   |-- prompts.py             # All prompt templates (10+)
|   |-- retrieval/
|   |   |-- aggregator.py          # Multi-source parallel retrieval orchestrator
|   |   |-- documentation.py       # Documentation retriever (Tavily)
|   |   |-- papers.py              # Academic paper retriever (Semantic Scholar)
|   |   |-- blogs.py               # Blog and article retriever
|   |   |-- code.py                # Code repository retriever (GitHub)
|   |   |-- web.py                 # Web search retriever
|   |   |-- web_fetcher.py         # URL fetcher and multi-page crawler
|   |   |-- reranker.py            # Semantic reranking (sentence-transformers, BAAI/bge)
|   |   |-- cache.py               # Source caching for offline use
|   |   |-- monitor.py             # URL change detection and monitoring
|   |-- reasoning/
|   |   |-- architecture.py        # Architecture pattern analysis module
|   |   |-- tradeoff.py            # Technology tradeoff comparison module
|   |   |-- performance.py         # Performance and benchmark evaluation module
|   |   |-- code_quality.py        # Code quality review module
|   |   |-- base.py                # Base reasoning module interface
|   |-- synthesis/
|   |   |-- engine.py              # Report generation (streaming and batch)
|   |   |-- templates.py           # Report section templates
|   |   |-- export.py              # Export utilities
|   |-- diagrams/
|   |   |-- engine.py              # Mermaid generation, validation, auto-fix, retry
|   |   |-- types.py               # 8 diagram type specs and auto-detection logic
|   |-- memory/
|   |   |-- knowledge_graph.py     # SQLite-backed knowledge graph with dedup
|   |   |-- manager.py             # Memory manager (KG + sessions + context)
|   |   |-- session.py             # Session history tracking
|   |   |-- user_prefs.py          # User preference storage
|   |-- analysis/
|   |   |-- page_analyzer.py       # Web page content analysis (facts, entities)
|   |-- cache/
|   |   |-- redis_client.py        # Redis cache client with file-based fallback
|   |-- events/
|   |   |-- kafka_producer.py      # Kafka event producer (optional)
|   |-- utils/
|       |-- errors.py              # Custom exception hierarchy
|       |-- logging.py             # Structured logging (structlog)
|       |-- tokens.py              # Token usage tracking and cost calculation
|-- static/
|   |-- index.html                 # Single-page application shell
|   |-- css/style.css              # Tailwind-based dark glass UI
|   |-- js/
|       |-- app.js                 # Main app logic, SSE handling, demo mode
|       |-- research.js            # Pipeline visualization and source cards
|       |-- knowledge.js           # Cytoscape.js knowledge graph visualization
|       |-- diagrams.js            # Mermaid diagram rendering and export
|       |-- compare.js             # Side-by-side model comparison UI
|       |-- charts.js              # Chart utilities
|-- tests/
|   |-- unit/                      # Unit tests (20+ test files)
|   |-- integration/               # Integration tests
|   |-- conftest.py                # Shared test fixtures
|-- data/
|   |-- knowledge_graph.db         # SQLite knowledge graph database
|   |-- demo/                      # Pre-seeded demo query results
|-- scripts/
|   |-- seed_data.py               # Database seeding script
|   |-- setup_db.py                # Database setup script
|-- .env.example                   # Annotated environment configuration template
|-- requirements.txt               # Python dependencies
|-- pyproject.toml                 # Project metadata, tool config, CLI entry point
|-- Dockerfile                     # Container image (Python 3.11-slim)
|-- docker-compose.yml             # Full stack: app + Redis + ChromaDB

Tech Stack

Backend: FastAPI + Uvicorn -- async web framework with auto-generated OpenAPI docs
LLM Routing: LiteLLM -- unified interface to Gemini, OpenAI, Groq, HuggingFace, and Ollama
Streaming: Server-Sent Events (SSE) -- real-time pipeline progress and token streaming
Knowledge Graph: SQLite via aiosqlite -- persistent entity/relationship storage with session tracking
Vector Search: ChromaDB -- embedding-based retrieval (optional)
Cache: Redis (hiredis) -- query and source caching with configurable TTL
Event Bus: Apache Kafka via aiokafka -- optional pipeline event streaming
Semantic Reranking: sentence-transformers (BAAI/bge) -- local cross-encoder reranking of retrieved sources
Web Retrieval: Tavily, Semantic Scholar, GitHub API -- multi-source parallel document search
Content Extraction: trafilatura, BeautifulSoup4 -- clean text extraction from web pages
Diagrams: Mermaid.js v11 -- 8 diagram types rendered client-side
Graph Visualization: Cytoscape.js -- interactive knowledge graph in the browser
Frontend: Vanilla JS + Tailwind CSS -- single-page application with no build step
Validation: Pydantic v2 + pydantic-settings -- request/response validation and .env configuration
Logging: structlog -- structured JSON logging
Metrics: prometheus-client -- Prometheus-compatible metrics export
Testing: pytest + pytest-asyncio -- async-first test suite with respx for HTTP mocking
Linting: Ruff -- fast Python linter and formatter
Type Checking: mypy (strict mode) -- static type analysis
Containerization: Docker + Docker Compose -- reproducible multi-service deployments

License

This project is licensed under the MIT License. See pyproject.toml for details.
