Deep Research Agent for Technical & Engineering Intelligence

Potency AI is a multi-source AI research agent that retrieves, analyzes, and synthesizes technical information from documentation, academic papers, blog posts, and code repositories in parallel. It produces structured engineering reports complete with architecture diagrams, source credibility scoring, and actionable follow-up questions -- all streamed to the browser in real time via Server-Sent Events.
The system maintains a persistent knowledge graph (SQLite-backed) that grows across research sessions, linking technologies, concepts, patterns, and organizations discovered during each investigation. Every entity is deduplicated, scored for relevance strength, and exportable in JSON, CSV, or Obsidian markdown format.
Potency AI supports five LLM providers (Gemini, OpenAI, Groq, HuggingFace, Ollama) with automatic fallback and hybrid routing based on real-time connectivity monitoring. When the network degrades, classification and extraction tasks shift to a local model while synthesis stays in the cloud. When fully offline, the agent operates entirely on local LLMs with cached sources -- no API keys required.
- Multi-source parallel retrieval -- documentation, academic papers, blog posts, and code repositories searched concurrently with semantic reranking (BAAI/bge embeddings)
- Real-time streaming pipeline -- SSE-powered Agent Brain feed showing intent classification, planning, retrieval, reasoning, and synthesis stages with live progress
- 8 Mermaid diagram types -- architecture, sequence, flowchart, class, ER, mindmap, timeline, and C4 context diagrams, auto-detected per query with Mermaid v11 compatibility
- Persistent knowledge graph -- SQLite-backed with UUID entities, fuzzy deduplication, session tracking, entity merge, backlinks, and Cytoscape.js visualization
- Multi-LLM support -- Gemini, OpenAI, Groq, HuggingFace, and Ollama with automatic provider detection and rate-limit fallback chains
- Hybrid routing -- cloud, hybrid, or offline mode determined by real-time connectivity monitoring; tasks are routed to cloud or local models based on latency
- Web page fetching and analysis -- fetch any URL, extract structured content with trafilatura, analyze key facts/entities/sentiment, and crawl with configurable depth
- Redis caching -- query results and source content cached with configurable TTL; automatic file-based fallback when Redis is unavailable
- Kafka event streaming -- optional integration for publishing research pipeline events to a Kafka topic
- Model comparison mode -- run the same query against two LLMs simultaneously with side-by-side streaming output
- User preference memory -- detects "remember I prefer X" in queries and applies saved preferences to future reports automatically
- Four reasoning modules -- architecture analysis, tradeoff comparison, performance evaluation, and code quality review, selected automatically by query intent
- Export -- Markdown copy, print-to-PDF, and PNG export from the browser

```
                                 +---------------------+
                                 |    Browser (SPA)    |
                                 | Tailwind + Mermaid  |
                                 |   + Cytoscape.js    |
                                 +----------+----------+
                                            | SSE / REST
                                            v
                                 +----------+----------+
                                 |       FastAPI       |
                                 |  Middleware (Auth,  |
                                 |   Rate Limiting)    |
                                 +----------+----------+
                                            |
                 +--------------------------+----------------------+
                 |                          |                      |
         +-------v-------+         +--------v--------+     +-------v-------+
         |   Research    |         |    Diagrams     |     |   Knowledge   |
         |   Pipeline    |         |     Engine      |     |   Graph API   |
         +-------+-------+         +--------+--------+     +-------+-------+
                 |                          |                      |
                 |                          |              +-------v-------+
     +-----------+-----------+              |              |    SQLite     |
     |           |           |              |              |  (aiosqlite)  |
+----v---+ +-----v----+ +----v----+         |              +---------------+
| Intent | | Planning | | Reason- |         |
| Class. | |          | |   ing   |         |
+--------+ +----------+ +---------+         |
                 |                          |
         +-------v-------+                  |
         |   Retrieval   |                  |
         |  Aggregator   |                  |
         +-------+-------+                  |
                 |                          |
 +-------+-------+-------+-------+          |
 |       |       |       |       |          |
Docs   Papers  Blogs   Code     Web         |
                                            |
         +-------v-------+                  |
         |   Synthesis   +<-----------------+
         |    Engine     |
         +-------+-------+
                 |
         +-------v-------+
         |  LLM Router   |
         |(Hybrid Cloud/ |
         |    Local)     |
         +-------+-------+
                 |
 +-------+-------+-------+-------+
 |       |       |       |       |
Gemini OpenAI  Groq     HF    Ollama
```

- Python 3.11 or higher
- At least one of: an LLM API key (Gemini, OpenAI, Groq, HuggingFace) or Ollama installed locally
- (Optional) Redis for caching, ChromaDB for vector storage

```bash
# Clone the repository
git clone https://github.com/your-org/potency-ai.git
cd potency-ai

# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate   # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Copy the example environment file
cp .env.example .env
# Edit .env and add at least one LLM API key
# (or leave all empty to use Ollama locally)

# Start the server
uvicorn app.main:app --reload
# Open http://localhost:8000
```

```bash
# Start the full stack (app + Redis + ChromaDB)
docker compose up --build
# Access at http://localhost:8000
```
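
Once the server is running, you can exercise the pipeline directly from the command line. The request body below is illustrative -- the exact field names live in the auto-generated OpenAPI docs (served at /docs by default in FastAPI):

```bash
# Run a first research query (field names are assumptions; see /docs for the real schema)
curl -X POST http://localhost:8000/research \
  -H "Content-Type: application/json" \
  -d '{"query": "Compare FastAPI and Flask for async APIs", "mode": "quick"}'
```
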
All settings are controlled via environment variables or a .env file. The table below lists the key variables; see .env.example for a fully annotated template.

| Variable | Description | Default |
| --- | --- | --- |
| GEMINI_API_KEY | Google Gemini API key (get one) | "" |
| OPENAI_API_KEY | OpenAI API key | "" |
| GROQ_API_KEY | Groq API key (get one) | "" |
| HUGGINGFACE_API_KEY | HuggingFace API key (get one) | "" |
| ANTHROPIC_API_KEY | Anthropic API key | "" |
| OLLAMA_BASE_URL | Ollama server URL | http://localhost:11434 |
| OLLAMA_MODEL | Default Ollama model | llama3.1:8b |
| DEFAULT_FAST_MODEL | Override model for classification/extraction | auto-detected |
| DEFAULT_REASONING_MODEL | Override model for reasoning tasks | auto-detected |
| DEFAULT_SYNTHESIS_MODEL | Override model for report synthesis | auto-detected |
| LOCAL_LLM_BACKEND | Local LLM backend (ollama, llamacpp, lmstudio) | ollama |
| TAVILY_API_KEY | Tavily search API key (get one) | "" |
| GITHUB_TOKEN | GitHub personal access token for code retrieval | "" |
| SEMANTIC_SCHOLAR_API_KEY | Semantic Scholar API key | "" |
| REDIS_URL | Redis connection URL | redis://localhost:6379/0 |
| KAFKA_ENABLED | Enable Kafka event streaming | false |
| KAFKA_BOOTSTRAP_SERVERS | Kafka broker addresses | localhost:9092 |
| API_KEY_SECRET | Optional API key for endpoint authentication | "" |
| LOG_LEVEL | Logging level | INFO |
| ENVIRONMENT | Runtime environment (development, production, testing) | development |
| MAX_CONCURRENT_RETRIEVALS | Max parallel retrieval tasks | 5 |
| QUICK_MODE_TIMEOUT_SECONDS | Timeout for quick research mode | 120 |
| DEEP_MODE_TIMEOUT_SECONDS | Timeout for deep research mode | 600 |
| RATE_LIMIT_REQUESTS_PER_MINUTE | API rate limit per client | 30 |
| CONNECTIVITY_CHECK_INTERVAL_SECONDS | Interval between connectivity probes | 60 |
| SOURCE_CACHE_TTL_DAYS | Days before cached sources expire | 7 |
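
As a concrete example, a minimal .env for a cloud-first setup only needs a couple of the variables above (values are placeholders); anything omitted falls back to the defaults in the table:

```bash
# Minimal .env sketch -- replace the placeholder values with real keys
GEMINI_API_KEY=your-gemini-key          # at least one LLM key (or leave empty to use Ollama)
TAVILY_API_KEY=your-tavily-key          # enables documentation and web retrieval
OLLAMA_BASE_URL=http://localhost:11434  # local fallback for hybrid/offline mode
LOG_LEVEL=INFO
```
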
| Method | Research endpoint | Description |
| --- | --- | --- |
| POST | /research | Execute a research query and return a structured report |
| POST | /research/stream | Execute research with real-time SSE streaming progress |
| POST | /research/clarify | Check if a query needs clarification before starting |
| POST | /research/compare | Run a query against two LLMs with side-by-side streaming |
| POST | /research/summarize-source | Summarize source text using a local BART model |
| GET | /research/knowledge-graph | Return the current knowledge graph for visualization |
| GET | /research/provider | Return the active LLM provider name |
| GET | /research/models | List available HuggingFace models and their status |
| GET | /research/demo/list | List available pre-seeded demo queries |
| GET | /research/demo/{id} | Load a pre-seeded demo result instantly |
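
For example, the streaming endpoint can be watched from the terminal with curl's -N flag, which prints SSE events as each pipeline stage completes. The JSON body is illustrative; consult /docs for the exact request schema.

```bash
# Stream a research run as Server-Sent Events (request fields are illustrative)
curl -N -X POST http://localhost:8000/research/stream \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{"query": "How does Raft leader election work?"}'

# List the pre-seeded demo queries, then load one by id via /research/demo/{id}
curl http://localhost:8000/research/demo/list
```
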
| Method | Diagram endpoint | Description |
| --- | --- | --- |
| GET | /diagrams/types | List all 8 supported diagram types with metadata |
| POST | /diagrams/detect-type | Auto-detect the best diagram type for a query |
| POST | /diagrams/generate | Generate a single Mermaid diagram of a given type |
| POST | /diagrams/generate-all | Auto-detect types and generate multiple diagrams |
| POST | /diagrams/regenerate | Regenerate a diagram with user feedback incorporated |
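
A typical flow is to let the engine pick the diagram type for a query and then generate it. The body fields here are assumptions; verify them against /docs.

```bash
# Auto-detect a suitable diagram type, then generate it (field names are assumptions)
curl -X POST http://localhost:8000/diagrams/detect-type \
  -H "Content-Type: application/json" \
  -d '{"query": "microservice checkout flow"}'

curl -X POST http://localhost:8000/diagrams/generate \
  -H "Content-Type: application/json" \
  -d '{"query": "microservice checkout flow", "diagram_type": "sequence"}'
```
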
| Method | Knowledge graph endpoint | Description |
| --- | --- | --- |
| GET | /knowledge-graph | Full graph with optional category and strength filters |
| GET | /knowledge-graph/search | Search entities by name (case-insensitive substring) |
| GET | /knowledge-graph/timeline | Entities ordered by discovery date |
| GET | /knowledge-graph/entity/{id} | Single entity with backlinks |
| PUT | /knowledge-graph/entity/{id} | Update entity notes and tags |
| DELETE | /knowledge-graph/entity/{id} | Delete an entity and all its relationships |
| GET | /knowledge-graph/backlinks/{id} | Entities that link to a given entity |
| POST | /knowledge-graph/entity/merge | Merge two entities into one |
| POST | /knowledge-graph/export | Export graph as JSON, CSV, or Obsidian markdown |
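
For instance, the graph can be searched and exported straight from the shell. The query parameter and export body shown here are assumptions; check /docs for the exact names.

```bash
# Search entities by name (query parameter name is an assumption)
curl "http://localhost:8000/knowledge-graph/search?q=fastapi"

# Export the graph as Obsidian markdown (the "format" field is an assumption)
curl -X POST http://localhost:8000/knowledge-graph/export \
  -H "Content-Type: application/json" \
  -d '{"format": "obsidian"}'
```
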
| Method | LLM routing endpoint | Description |
| --- | --- | --- |
| GET | /llm/status | Current connectivity quality, active provider, routing mode |
| GET | /llm/status/stream | SSE stream of real-time connectivity changes |
| GET | /llm/local/models | List locally available models (Ollama, llama.cpp, LM Studio) |
| POST | /llm/local/switch | Switch the active local model at runtime |
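
For example, you can inspect the current routing mode and swap the local model without restarting the server; the switch payload is illustrative.

```bash
# Show connectivity quality, active provider, and routing mode
curl http://localhost:8000/llm/status

# Switch the active local model (the "model" field name is an assumption)
curl -X POST http://localhost:8000/llm/local/switch \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.1:8b"}'
```
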
| Method | Web fetch endpoint | Description |
| --- | --- | --- |
| POST | /fetch/url | Fetch a URL and analyze its content (facts, entities, sentiment) |
| POST | /fetch/crawl | Crawl from a URL at a specified depth |
| GET | /fetch/monitors | List all monitored URLs |
| POST | /fetch/monitors | Add a URL to the change watchlist |
| DELETE | /fetch/monitors/{id} | Remove a URL from the watchlist |
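
Fetching and analyzing a single page, or putting it on the change watchlist, looks roughly like this (the "url" field name is an assumption):

```bash
# Fetch a URL and analyze its content (facts, entities, sentiment)
curl -X POST http://localhost:8000/fetch/url \
  -H "Content-Type: application/json" \
  -d '{"url": "https://fastapi.tiangolo.com/"}'

# Add the same page to the change watchlist
curl -X POST http://localhost:8000/fetch/monitors \
  -H "Content-Type: application/json" \
  -d '{"url": "https://fastapi.tiangolo.com/"}'
```
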
| Method | Memory & cache endpoint | Description |
| --- | --- | --- |
| GET | /memory/history/{user_id} | Retrieve research session history for a user |
| GET | /memory/knowledge/stats | Knowledge graph entity and relationship counts |
| GET | /cache/health | Redis cache health and connectivity status |
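
Both the memory and cache endpoints are plain GETs, for example:

```bash
# Knowledge graph entity/relationship counts and Redis cache health
curl http://localhost:8000/memory/knowledge/stats
curl http://localhost:8000/cache/health
```
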
| Method | Health endpoint | Description |
| --- | --- | --- |
| GET | /health | Basic health check |
| GET | /health/ollama | Ollama connectivity check and available models |
| GET | /health/detailed | Detailed health check with all dependency statuses |
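
The health endpoints work well as container or uptime probes:

```bash
# Liveness probe and full dependency status
curl http://localhost:8000/health
curl http://localhost:8000/health/detailed
```
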
```
potency-ai/
|-- app/
| |-- main.py # FastAPI application entry point and lifespan
| |-- config.py # Pydantic settings, .env loading
| |-- cli.py # CLI entry point (typer)
| |-- api/
| | |-- middleware/
| | | |-- auth.py # API key authentication middleware
| | | |-- rate_limit.py # Per-client rate limiting
| | |-- routes/
| | |-- research.py # Research pipeline endpoints
| | |-- diagrams.py # Diagram generation endpoints
| | |-- knowledge_graph.py # Knowledge graph CRUD and export
| | |-- llm.py # LLM status, model switching, connectivity SSE
| | |-- fetch.py # Web fetcher, crawler, and page monitors
| | |-- memory.py # Session history and knowledge stats
| | |-- cache.py # Cache health endpoint
| | |-- health.py # Health check endpoints
| |-- core/
| | |-- orchestrator.py # Main 9-stage research pipeline controller
| | |-- events.py # SSE event emitter and pipeline stage enum
| | |-- intent.py # LLM-based query intent classification
| | |-- planner.py # Research plan decomposition into sub-tasks
| | |-- pipeline.py # Data models (ResearchReport, ResearchMode, etc.)
| |-- llm/
| | |-- router.py # Hybrid LLM router (cloud / local / offline)
| | |-- providers.py # LiteLLM wrapper with multi-provider fallback
| | |-- connectivity.py # Real-time connectivity monitor with adaptive polling
| | |-- local_adapter.py # Ollama, llama.cpp, and LM Studio adapter
| | |-- hf_models.py # HuggingFace model catalog and local inference
| | |-- prompts.py # All prompt templates (10+)
| |-- retrieval/
| | |-- aggregator.py # Multi-source parallel retrieval orchestrator
| | |-- documentation.py # Documentation retriever (Tavily)
| | |-- papers.py # Academic paper retriever (Semantic Scholar)
| | |-- blogs.py # Blog and article retriever
| | |-- code.py # Code repository retriever (GitHub)
| | |-- web.py # Web search retriever
| | |-- web_fetcher.py # URL fetcher and multi-page crawler
| | |-- reranker.py # Semantic reranking (sentence-transformers, BAAI/bge)
| | |-- cache.py # Source caching for offline use
| | |-- monitor.py # URL change detection and monitoring
| |-- reasoning/
| | |-- architecture.py # Architecture pattern analysis module
| | |-- tradeoff.py # Technology tradeoff comparison module
| | |-- performance.py # Performance and benchmark evaluation module
| | |-- code_quality.py # Code quality review module
| | |-- base.py # Base reasoning module interface
| |-- synthesis/
| | |-- engine.py # Report generation (streaming and batch)
| | |-- templates.py # Report section templates
| | |-- export.py # Export utilities
| |-- diagrams/
| | |-- engine.py # Mermaid generation, validation, auto-fix, retry
| | |-- types.py # 8 diagram type specs and auto-detection logic
| |-- memory/
| | |-- knowledge_graph.py # SQLite-backed knowledge graph with dedup
| | |-- manager.py # Memory manager (KG + sessions + context)
| | |-- session.py # Session history tracking
| | |-- user_prefs.py # User preference storage
| |-- analysis/
| | |-- page_analyzer.py # Web page content analysis (facts, entities)
| |-- cache/
| | |-- redis_client.py # Redis cache client with file-based fallback
| |-- events/
| | |-- kafka_producer.py # Kafka event producer (optional)
| |-- utils/
| |-- errors.py # Custom exception hierarchy
| |-- logging.py # Structured logging (structlog)
| |-- tokens.py # Token usage tracking and cost calculation
|-- static/
| |-- index.html # Single-page application shell
| |-- css/style.css # Tailwind-based dark glass UI
| |-- js/
| |-- app.js # Main app logic, SSE handling, demo mode
| |-- research.js # Pipeline visualization and source cards
| |-- knowledge.js # Cytoscape.js knowledge graph visualization
| |-- diagrams.js # Mermaid diagram rendering and export
| |-- compare.js # Side-by-side model comparison UI
| |-- charts.js # Chart utilities
|-- tests/
| |-- unit/ # Unit tests (20+ test files)
| |-- integration/ # Integration tests
| |-- conftest.py # Shared test fixtures
|-- data/
| |-- knowledge_graph.db # SQLite knowledge graph database
| |-- demo/ # Pre-seeded demo query results
|-- scripts/
| |-- seed_data.py # Database seeding script
| |-- setup_db.py # Database setup script
|-- .env.example # Annotated environment configuration template
|-- requirements.txt # Python dependencies
|-- pyproject.toml # Project metadata, tool config, CLI entry point
|-- Dockerfile # Container image (Python 3.11-slim)
|-- docker-compose.yml # Full stack: app + Redis + ChromaDB
```

| Layer | Technology | Purpose |
| --- | --- | --- |
| Backend | FastAPI + Uvicorn | Async web framework with auto-generated OpenAPI docs |
| LLM Routing | LiteLLM | Unified interface to Gemini, OpenAI, Groq, HuggingFace, Ollama |
| Streaming | Server-Sent Events (SSE) | Real-time pipeline progress and token streaming |
| Knowledge Graph | SQLite via aiosqlite | Persistent entity/relationship storage with session tracking |
| Vector Search | ChromaDB | Embedding-based retrieval (optional) |
| Cache | Redis (hiredis) | Query and source caching with configurable TTL |
| Event Bus | Apache Kafka via aiokafka | Optional pipeline event streaming |
| Semantic Reranking | sentence-transformers (BAAI/bge) | Local cross-encoder reranking of retrieved sources |
| Web Retrieval | Tavily, Semantic Scholar, GitHub API | Multi-source parallel document search |
| Content Extraction | trafilatura, BeautifulSoup4 | Clean text extraction from web pages |
| Diagrams | Mermaid.js v11 | 8 diagram types rendered client-side |
| Graph Visualization | Cytoscape.js | Interactive knowledge graph in the browser |
| Frontend | Vanilla JS + Tailwind CSS | Single-page application with no build step |
| Validation | Pydantic v2 + pydantic-settings | Request/response validation and .env configuration |
| Logging | structlog | Structured JSON logging |
| Metrics | prometheus-client | Prometheus-compatible metrics export |
| Testing | pytest + pytest-asyncio | Async-first test suite with respx for HTTP mocking |
| Linting | Ruff | Fast Python linter and formatter |
| Type Checking | mypy (strict mode) | Static type analysis |
| Containerization | Docker + Docker Compose | Reproducible multi-service deployments |

This project is licensed under the MIT License. See pyproject.toml for details.