Advanced AI-powered research system using diffusion-based iterative refinement
TTD-DR (Test-Time Diffusion Deep Researcher) is an innovative research system that applies diffusion-based algorithms to generate comprehensive, high-quality research reports. Unlike traditional retrieval-augmented generation (RAG) systems, TTD-DR uses a draft-centric approach where an evolving draft dynamically guides the research process through multiple iterations.
- 🔄 Iterative Draft Refinement: Starts with a "noisy" initial draft and progressively refines it
- 🎯 Draft-Centric Search: The evolving draft guides what information to search for next
- 🔍 Multi-Engine Search: Integrates Tavily, DuckDuckGo, and Naver search engines
- 🧠 Gap Analysis: Automatically identifies knowledge gaps and fills them systematically
- ⚖️ Quality Evaluation: Continuous assessment of research completeness and quality
- 🌐 Multi-Language Support: Works with English, Korean, and other languages
- 🚀 Async Support: Built with modern async/await patterns for optimal performance
The system implements the Denoising with Retrieval (Draft-Centric Approach) algorithm:
- Initialize: Generate a noisy initial draft R₀
- Analyze: Identify gaps in the current draft
- Search: Query multiple search engines to fill identified gaps
- Denoise: Update the draft with new information
- Evaluate: Assess quality and determine if more iterations are needed
- Iterate: Repeat until quality threshold is met or max iterations reached
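The six steps above can be sketched as a plain Python loop. This is a minimal illustration, not the project's implementation: `DraftState`, `denoise_with_retrieval`, the `toy_*` helpers, and the `quality_threshold` value of 0.8 are all hypothetical, and the toy functions stand in for the LLM and search calls so the loop runs offline.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DraftState:
    """Hypothetical container for the evolving draft (not the project's state.py)."""
    draft: str
    sources: List[str] = field(default_factory=list)
    iterations: int = 0


def denoise_with_retrieval(
    query: str,
    make_draft: Callable[[str], str],
    find_gaps: Callable[[str], List[str]],
    search: Callable[[str], List[str]],
    update: Callable[[str, List[str]], str],
    score: Callable[[str], float],
    max_iterations: int = 5,
    quality_threshold: float = 0.8,  # assumed value; the README does not state one
) -> DraftState:
    state = DraftState(draft=make_draft(query))                  # Initialize
    while state.iterations < max_iterations:                     # Iterate
        gaps = find_gaps(state.draft)                            # Analyze
        if not gaps:
            break
        findings = [hit for gap in gaps for hit in search(gap)]  # Search
        state.draft = update(state.draft, findings)              # Denoise
        state.sources.extend(findings)
        state.iterations += 1
        if score(state.draft) >= quality_threshold:              # Evaluate
            break
    return state


# Toy stand-ins for the LLM and search calls, so the loop can run offline.
def toy_draft(query: str) -> str:
    return f"{query}: [TODO cost] [TODO storage]"


def toy_gaps(draft: str) -> List[str]:
    return re.findall(r"\[TODO (\w+)\]", draft)[:1]  # report one gap per round


def toy_search(gap: str) -> List[str]:
    return [f"fact-{gap}"]


def toy_update(draft: str, findings: List[str]) -> str:
    for finding in findings:
        gap = finding.split("-", 1)[1]
        draft = draft.replace(f"[TODO {gap}]", finding)
    return draft


def toy_score(draft: str) -> float:
    return 0.0 if "[TODO" in draft else 1.0


result = denoise_with_retrieval(
    "solar", toy_draft, toy_gaps, toy_search, toy_update, toy_score
)
print(result.iterations, result.draft)
```

In this toy run the draft starts with two placeholder gaps, and each round fills one of them, so the loop stops after the second iteration once no placeholder remains.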
# Clone the repository
git clone <repository-url>
cd ttd-dr
# Install dependencies
pip install -r requirements.txt
Copy the example environment file and configure your API keys:
cp .env.example .env
Edit .env with your API keys:
# Required: Choose one
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-azure-openai-api-key
AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini
# OR
OPENAI_API_KEY=sk-your-openai-api-key-here
# Optional but recommended
TAVILY_API_KEY=tvly-your-tavily-api-key
NAVER_CLIENT_ID=your-naver-client-id
NAVER_CLIENT_SECRET=your-naver-client-secret
The easiest way to get started is with our simple chatbot interface:
python chatbot.py
Example interaction:
🤖 TTD-DR Deep Research Chatbot
============================================================
Welcome to the Test-Time Diffusion Deep Researcher!
This chatbot can conduct in-depth research on any topic.
🤔 Your research question: What are the latest developments in artificial intelligence in 2024?
🔍 Research Query: What are the latest developments in artificial intelligence in 2024?
⏳ Starting deep research... (this may take a few minutes)
📊 The system will show progress updates during research
------------------------------------------------------------
[Research process with real-time updates...]
============================================================
📋 RESEARCH REPORT COMPLETED
============================================================
📄 Report Length: 3,247 characters
🔄 Iterations: 3
📚 Sources Used: 12
⏱️ Execution Time: 87.3 seconds
🎯 Status: completed
📖 RESEARCH REPORT:
------------------------------
# Latest Developments in Artificial Intelligence (2024)
## Executive Summary
The year 2024 has marked significant advances in artificial intelligence...
[Comprehensive research report continues...]
For a quick demonstration:
python chatbot.py --example
| Command | Description |
|---|---|
| help | Show welcome message and instructions |
| status | Check system status and API configuration |
| examples | Display example research queries |
| quit / exit | Exit the chatbot |
- Technology: "What are the latest developments in artificial intelligence in 2024?"
- Science: "How does climate change affect global food security?"
- Comparison: "What are the key differences between quantum and classical computing?"
- Analysis: "What are the ethical implications of genetic engineering?"
- Current Events: "Describe recent advances in space exploration technology"
For programmatic access:
import asyncio
from langgraph_ttd_dr.interface import TTDResearcher
from langgraph_ttd_dr.client_factory import create_openai_client

async def research_example():
    # Create client and researcher
    client = create_openai_client()
    researcher = TTDResearcher(
        client=client,
        max_iterations=5,
        max_sources=15
    )
    # Conduct research
    report, metadata = await researcher.research(
        "What is the current state of renewable energy technology?"
    )
    print(f"Research completed with {len(metadata['all_sources'])} sources")
    print(f"Iterations: {metadata['iterations']}")
    print(f"Report: {report}")

# Run the example
asyncio.run(research_example())
📦 langgraph_ttd_dr/
├── 🎛️ interface.py # Main TTDResearcher class
├── 🔗 client_factory.py # OpenAI/Azure client management
├── 📊 state.py # Research state management
├── 🔄 workflow.py # LangGraph workflow definition
├── 🧩 nodes.py # Individual workflow nodes
├── 💬 prompts.py # Centralized prompt management
├── 🔍 tools.py # Web search tools
└── 🛠️ utils.py # Utility functions
📄 chatbot.py # Simple usage example
📄 interactive_chatbot.py # Advanced interactive interface
- QueryClarificationNode: Improves and clarifies the research question
- PlannerNode: Creates a structured research plan
- NoisyDraftGeneratorNode: Generates the initial draft R₀
- DraftBasedQuestionGeneratorNode: Identifies gaps and generates search queries
- SearchAgentNode: Executes multi-engine web searches
- DenoisingUpdaterNode: Updates the draft with new information
- IterationControllerNode: Decides whether to continue or finalize
- ReportGeneratorNode: Produces the final research report
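To make the control flow between these nodes concrete, here is a small dispatcher in plain Python. It is a stand-in for LangGraph's routing rather than the code in workflow.py. It assumes the nodes run in the order listed and that the iteration controller either loops back to question generation or hands off to the report generator. The node names, `route`, `run`, and the `step_limit` guard (loosely analogous to a recursion limit) are hypothetical.

```python
from typing import Callable, Dict, List

State = Dict[str, int]


def noop(state: State) -> None:
    """Placeholder for nodes whose real work is an LLM or search call."""


def denoise(state: State) -> None:
    # The real node would merge search findings into the draft.
    state["iterations"] += 1


NODES: Dict[str, Callable[[State], None]] = {
    "clarify": noop, "plan": noop, "noisy_draft": noop, "question_gen": noop,
    "search": noop, "denoise": denoise, "controller": noop, "report": noop,
}

# Fixed edges; the controller's outgoing edge is decided at run time by route().
EDGES: Dict[str, str] = {
    "clarify": "plan", "plan": "noisy_draft", "noisy_draft": "question_gen",
    "question_gen": "search", "search": "denoise", "denoise": "controller",
    "report": "END",
}


def route(state: State) -> str:
    """Loop back while iterations remain, otherwise finalize."""
    if state["iterations"] < state["max_iterations"]:
        return "question_gen"
    return "report"


def run(state: State, entry: str = "clarify", step_limit: int = 50) -> List[str]:
    trace: List[str] = []
    current = entry
    while current != "END":
        if len(trace) >= step_limit:
            raise RecursionError("step limit reached before the graph finished")
        NODES[current](state)
        trace.append(current)
        current = route(state) if current == "controller" else EDGES[current]
    return trace


trace = run({"iterations": 0, "max_iterations": 2})
print(" -> ".join(trace))
```

With `max_iterations` set to 2, the trace passes through the search and denoise nodes twice before reaching the report node.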
- 🔍 Tavily: High-quality, research-focused search results
- 🦆 DuckDuckGo: Privacy-focused web search (no API key required)
- 🔍 Naver: Korean and Asian content specialist
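One way to combine several engines is to query them concurrently, skip any that fail, and de-duplicate the results. The sketch below illustrates that idea with stub engines. `search_all` and the `fake_*` coroutines are hypothetical and do not call the real Tavily, DuckDuckGo, or Naver APIs.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

SearchFn = Callable[[str], Awaitable[List[str]]]


async def search_all(query: str, engines: Dict[str, SearchFn]) -> List[str]:
    """Query every engine concurrently, skip failures, de-duplicate URLs."""
    outcomes = await asyncio.gather(
        *(fn(query) for fn in engines.values()), return_exceptions=True
    )
    urls: List[str] = []
    failed: List[str] = []
    for name, outcome in zip(engines, outcomes):
        if isinstance(outcome, BaseException):
            failed.append(name)  # e.g. a missing API key or a network error
            continue
        for url in outcome:
            if url not in urls:
                urls.append(url)
    if len(failed) == len(engines):
        raise RuntimeError("All search engines failed")
    return urls


# Stub engines for illustration only.
async def fake_tavily(query: str) -> List[str]:
    raise RuntimeError("TAVILY_API_KEY is not set")


async def fake_duckduckgo(query: str) -> List[str]:
    return ["https://example.com/a", "https://example.com/b"]


async def fake_naver(query: str) -> List[str]:
    return ["https://example.com/b", "https://example.com/c"]


urls = asyncio.run(
    search_all(
        "renewable energy",
        {"tavily": fake_tavily, "duckduckgo": fake_duckduckgo, "naver": fake_naver},
    )
)
print(urls)
```

Because `return_exceptions=True` turns a failing engine into a returned exception instead of aborting the whole gather, the keyless engine in this example can still supply results when the others are misconfigured.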
| Parameter | Default | Description |
|---|---|---|
| max_iterations | 5 | Maximum research iterations |
| max_sources | 15 | Maximum sources to collect |
| search_results_per_gap | 3 | Results per knowledge gap |
| recursion_limit | 50 | LangGraph recursion limit |
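If you prefer to keep these settings in one place, a small dataclass can hold the defaults from the table and reject invalid values early. `ResearchConfig` is a hypothetical helper, not part of the package, and whether every field can be passed to `TTDResearcher` as a keyword argument is an assumption to verify against interface.py.

```python
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class ResearchConfig:
    """Hypothetical settings holder; defaults mirror the table above."""
    max_iterations: int = 5
    max_sources: int = 15
    search_results_per_gap: int = 3
    recursion_limit: int = 50

    def __post_init__(self) -> None:
        # Reject zero, negative, or non-integer values up front.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, int) or value < 1:
                raise ValueError(f"{f.name} must be a positive integer, got {value!r}")


defaults = ResearchConfig()
longer = ResearchConfig(max_iterations=10, search_results_per_gap=5)
print(asdict(longer))
```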
TTD-DR tracks multiple quality dimensions:
- Completeness: How thoroughly the topic is covered
- Accuracy: Factual correctness of information
- Relevance: How well content matches the query
- Coherence: Logical flow and organization
- Citation Quality: Source reliability and diversity
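The README does not say how these dimensions are combined into a stop-or-continue decision. One plausible approach is a weighted average compared against a threshold, sketched below. The equal weights, the 0 to 1 score range, the 0.8 threshold, and the names `overall_quality` and `needs_more_research` are all assumptions for illustration.

```python
from typing import Dict, Optional

DIMENSIONS = ("completeness", "accuracy", "relevance", "coherence", "citation_quality")


def overall_quality(
    scores: Dict[str, float], weights: Optional[Dict[str, float]] = None
) -> float:
    """Weighted average of per-dimension scores, each assumed to lie in [0, 1]."""
    weights = weights or {name: 1.0 for name in DIMENSIONS}
    missing = [name for name in DIMENSIONS if name not in scores]
    if missing:
        raise KeyError(f"missing quality scores: {missing}")
    total_weight = sum(weights[name] for name in DIMENSIONS)
    return sum(scores[name] * weights[name] for name in DIMENSIONS) / total_weight


def needs_more_research(scores: Dict[str, float], threshold: float = 0.8) -> bool:
    """True when the combined score is still below the (assumed) threshold."""
    return overall_quality(scores) < threshold


example_scores = {
    "completeness": 0.6,
    "accuracy": 0.9,
    "relevance": 0.8,
    "coherence": 0.7,
    "citation_quality": 0.5,
}
combined = overall_quality(example_scores)
print(f"{combined:.2f}", needs_more_research(example_scores))
```

Raising the weight of one dimension, for example accuracy, would make the loop more sensitive to factual errors than to gaps in coverage.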
- Python: 3.8 or higher
- Dependencies: See requirements.txt
- APIs: OpenAI or Azure OpenAI (required), search APIs (optional)
- Memory: 2GB+ RAM recommended for complex research
researcher = TTDResearcher(
    client=client,
    search_engines=['tavily', 'duckduckgo'],  # Customize search engines
    search_results_per_gap=5,                 # More results per gap
    max_iterations=10                         # Longer research
)
You can customize the research behavior by modifying prompts in langgraph_ttd_dr/prompts.py.
- API Key Errors
  ❌ Failed to create client: No API key found
  Solution: Check your .env file and ensure API keys are correctly set.
- Search Failures
  ❌ All search engines failed
  Solution: Verify search API keys or rely on DuckDuckGo (no key required).
- Long Processing Times
  - Reduce max_iterations or max_sources
  - Use faster models (e.g., gpt-4o-mini instead of gpt-4o)
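When diagnosing the API key and search failures above, it can help to check which credentials are actually visible to the process. The snippet below is a rough, hypothetical helper (not the chatbot's status command) that inspects the variable names from .env.example. It assumes the Azure path needs all three Azure variables and that the .env values have already been loaded into the environment.

```python
import os
from typing import Dict, Mapping


def detect_configuration(env: Mapping[str, str]) -> Dict[str, bool]:
    """Report which providers look configured, based only on variable presence."""
    azure_keys = (
        "AZURE_OPENAI_ENDPOINT",
        "AZURE_OPENAI_API_KEY",
        "AZURE_OPENAI_DEPLOYMENT",
    )
    return {
        "azure_openai": all(env.get(key) for key in azure_keys),
        "openai": bool(env.get("OPENAI_API_KEY")),
        "tavily": bool(env.get("TAVILY_API_KEY")),
        "naver": bool(env.get("NAVER_CLIENT_ID") and env.get("NAVER_CLIENT_SECRET")),
        "duckduckgo": True,  # no API key required
    }


def has_llm_provider(status: Mapping[str, bool]) -> bool:
    """At least one of the two LLM backends must be configured."""
    return status["azure_openai"] or status["openai"]


status = detect_configuration(os.environ)
for provider, configured in status.items():
    print(f"{provider}: {'configured' if configured else 'missing'}")
if not has_llm_provider(status):
    print("No LLM credentials found; check your .env file.")
```

This only checks that the variables are set, not that the keys are valid, so a failing request can still occur with every provider reported as configured.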
Enable debug logging:
export DEBUG=true
python chatbot.py
We welcome contributions! Please see our contributing guidelines for details on:
- Code style and standards
- Testing requirements
- Pull request process
- Issue reporting
This project is licensed under the MIT License - see the LICENSE file for details.
- Original Research: Based on "Deep Researcher with Test-Time Diffusion" by Han et al. (2025)
- LangGraph: For the excellent workflow framework
- OpenAI/Azure: For powerful language models
- Search Providers: Tavily, DuckDuckGo, and Naver for search capabilities
- Research Community: For insights into diffusion-based approaches
- OptILLM Project: Referenced the deep research plugin for research engine architecture insights
- Issues: Report bugs and feature requests via GitHub Issues
- Discussions: Join community discussions in GitHub Discussions
- Documentation: See the /docs folder for detailed documentation
🚀 Ready to conduct deep research? Start with python chatbot.py and explore the power of TTD-DR!