Testing Guide: AWS V2 Hybrid Intelligence Platform

This document outlines the testing strategy, tools, and procedures for the AWS V2 project. Our testing philosophy focuses on Domain Isolation, Protocol Consistency, and Full-Flow Simulation to ensure the reliability of both deterministic workflows and exploratory agents.

1. Testing Philosophy

The platform is tested across five primary dimensions:

Unit Reliability: Ensuring core Pydantic models, utility helpers, LLMResponse DTOs, and individual scrapers function correctly in isolation.
Stateful Resilience: Verifying that Checkpoints (Workflows) and Sessions (Agents) persist and resume correctly.
Data Orchestration: Validating the L1/L2 separation and the integrity of the mediated DataCache.
Intelligence Routing: Confirming the IntelligenceRouter correctly classifies tasks, routes to optimal LLMs (local/cloud), and applies output parsing.
Full-Flow Integration: Bridging the API Gateway, JobManager, and MCP Tools to ensure end-to-end correctness.

2. Environment Setup

To run tests, ensure you have the necessary testing dependencies installed:

source venv311/bin/activate
pip install pytest pytest-asyncio pytest-mock

All tests should be executed from the project root with the PYTHONPATH set to the current directory to ensure absolute imports resolve correctly.

3. Test Categories & Execution

A. Core Unit Tests

Test the foundational building blocks and utilities.

Location: tests/test_core_models.py, tests/test_core_utils.py, tests/test_core_telemetry.py

Command:

PYTHONPATH=. venv311/bin/pytest tests/test_core_models.py tests/test_core_utils.py tests/test_core_telemetry.py

B. Stateful Management Tests

Verify session persistence for Agents and execution checkpoints for Workflows.

Location: tests/test_agent_session.py, tests/test_workflow_engine.py, tests/test_checkpoint_resume.py

Command:

PYTHONPATH=. venv311/bin/pytest tests/test_agent_session.py tests/test_workflow_engine.py tests/test_checkpoint_resume.py

Note: AgentSession now tracks cloud_token_usage separately from token_usage. Tests should verify that local model tokens do not increment cloud_token_usage.

C. Data Orchestration (L1/L2) Tests

Validates that L1 scrapers write to the cache and L2 calculators consume from it correctly.

Location: tests/test_l1_l2_cache.py

Command:

PYTHONPATH=. venv311/bin/pytest tests/test_l1_l2_cache.py

D. LLM Provider Tests

Verifies individual LLM providers (GeminiProvider, ClaudeProvider, LlamaCppProvider) correctly initialize and return LLMResponse objects.

Location: tests/test_local_llm_direct.py (and potentially new cloud provider tests)

Command:

PYTHONPATH=. venv311/bin/pytest tests/test_local_llm_direct.py

E. Intelligence Routing & Processors Tests

Confirms the IntelligenceRouter's task classification, model routing, and specialized Processors (like the Monopoly Analyzer) are functioning correctly.

Location: tests/test_gemini_advanced_pricing.py (Price Manager), tests/test_monopoly_analyzer.py, src/intelligence/router/ (Router logic).
Key Coverage:
- Advanced Pricing: Verifies thinking tokens, prompt caching, and tiered pricing calculations.
- Heuristics: Confirms keyword and length-based pre-screening rules trigger correctly.
- Processors: Verifies CategoryMonopolyAnalyzer correctly calculates dynamic CR3, CV pricing, and relative review disparity.

Command:

PYTHONPATH=. venv311/bin/python -m pytest tests/test_gemini_advanced_pricing.py tests/test_monopoly_analyzer.py

F. Import Integrity Tests

Ensures that the Domain-Driven Design (DDD) structure remains free of circular imports.

Command:

PYTHONPATH=. venv311/bin/pytest tests/test_imports.py

G. Rate Limiting System Tests

Validates all three layers of the rate limiting architecture in isolation and combination.

Location: tests/test_rate_limiting_system.py

Coverage (37 tests):

Test Class	Cases	What Is Verified
`TestLayer3TokenBucket`	5	Token acquisition, burst depletion, refill over time, source isolation
`TestLayer2TenantQuota`	7	Counter increment, free/pro limits, unknown tier unlimited, tenant isolation
`TestLayer1aCooldown`	7	First trigger allowed, repeat blocked, different chats independent, blocked call does not reset timer
`TestLayer1bConcurrentSlot`	9	Normal release, release on exception, release on `CancelledError`, per-chat limit, global limit, multi-chat coexistence
`TestCheckLimit`	5	Combined gate: cooldown blocked → quota counter must NOT advance
`TestUnifiedRequestMetadata`	4	`entry_type` / `chat_id` field propagation and model serialisation

Command:

export PYTHONPATH=$PYTHONPATH:. && .venv311/bin/python3 -m unittest tests/test_rate_limiting_system.py -v

Note: Tests reset the RateLimiter singleton state in setUp (_concurrent, _tenant_counters, _chat_last, bucket tokens) to ensure isolation between runs.

H. Full-Flow Integration Tests

Simulates a complete request starting from the entry points through the API Gateway.

Scenario: Simulate a Feishu command, track its progress via the Telemetry Tracker, and verify the final Bitable/CSV output.

Command:

PYTHONPATH=. venv311/bin/pytest tests/test_feishu_full_flow.py -s

4. Writing New Tests

When adding new capabilities, follow these standards:

Asynchronous by Default: Use @pytest.mark.asyncio for any test involving MCP Tools, Scrapers, or LLMs.
Mock External I/O:
- Mock curl_cffi requests in scrapers to avoid network dependency.
- Mock LLM Provider responses to avoid token costs (especially in IntelligenceRouter tests).
- Use tempfile or local directory mocks for DataCache and AgentSession tests.
Schema Validation: When testing new MCP Tools, always validate the arguments against the tool's inputSchema.
Gateway Dispatch: When testing entry points, always use APIGateway.dispatch_* methods to ensure you are testing the full production path.

5. Troubleshooting Tests

ModuleNotFoundError: Ensure you are running with PYTHONPATH=..
State Leakage: If session/checkpoint tests fail, clear the data/sessions/ and data/checkpoints/ directories before re-running.
Asyncio Warnings: Ensure you have pytest-asyncio installed and configured correctly.
Local LLM Issues: If LlamaCppProvider fails to load or respond:
- Verify LOCAL_MODEL_PATH in .env points to the absolute path of your .gguf model file.
- Check llama-cpp-python installation and GPU support (n_gpu_layers).
- Use tests/test_local_llm_direct.py for isolated troubleshooting.
Cloud LLM SDK Issues: If GeminiProvider or ClaudeProvider fail during initialization or generation:
- Check for AttributeError: module 'google.genai' has no attribute 'configure' or 'Model' object has no attribute 'supported_generation_methods'. This indicates an google-generativeai SDK version mismatch. Consider pip install google-generativeai --upgrade or ensure compatibility with older APIs.
- Verify GEMINI_API_KEY or ANTHROPIC_API_KEY in .env.
Feishu Output Formatting: If messages fail to send or update (invalid message content, NOT a card):
- Confirm IntelligenceRouter correctly applies OutputParser.clean_for_feishu to LLM responses.
- Ensure FeishuCallback is using send_card_message and update_card_message for dynamic updates, as Feishu's patch API requires interactive card format.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Testing Guide: AWS V2 Hybrid Intelligence Platform

1. Testing Philosophy

2. Environment Setup

3. Test Categories & Execution

A. Core Unit Tests

B. Stateful Management Tests

C. Data Orchestration (L1/L2) Tests

D. LLM Provider Tests

E. Intelligence Routing & Processors Tests

F. Import Integrity Tests

G. Rate Limiting System Tests

H. Full-Flow Integration Tests

4. Writing New Tests

5. Troubleshooting Tests

FilesExpand file tree

TESTING.md

Latest commit

History

TESTING.md

File metadata and controls

Testing Guide: AWS V2 Hybrid Intelligence Platform

1. Testing Philosophy

2. Environment Setup

3. Test Categories & Execution

A. Core Unit Tests

B. Stateful Management Tests

C. Data Orchestration (L1/L2) Tests

D. LLM Provider Tests

E. Intelligence Routing & Processors Tests

F. Import Integrity Tests

G. Rate Limiting System Tests

H. Full-Flow Integration Tests

4. Writing New Tests

5. Troubleshooting Tests