Configuration Guide

This guide covers all configuration options available in the API Reliability Suite. The application uses Pydantic Settings for robust environment variable management and validation.


📄 Environment Variables

All settings can be configured via environment variables or a .env file in the project root. When ENVIRONMENT is set to staging or production, the app enforces a non-default SECRET_KEY and a shared RATE_LIMIT_STORAGE_URI.
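The staging/production rule above can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `check_strict_environment` is hypothetical, "shared" is taken to mean "anything other than `memory://`", and the real check in src/core/config.py may be written differently (for example as a Pydantic validator).

```python
DEFAULT_SECRET_KEY = "change-me-in-production"
STRICT_ENVIRONMENTS = {"staging", "production"}


def check_strict_environment(
    environment: str, secret_key: str, rate_limit_storage_uri: str
) -> list[str]:
    """Return a list of problems; an empty list means the settings pass."""
    problems: list[str] = []
    if environment.lower() not in STRICT_ENVIRONMENTS:
        return problems  # development and test are not checked

    if secret_key == DEFAULT_SECRET_KEY:
        problems.append("SECRET_KEY must not use the default value")

    # Assumption: "shared" means anything other than per-process memory storage.
    if rate_limit_storage_uri.startswith("memory://"):
        problems.append(
            "RATE_LIMIT_STORAGE_URI must point at shared storage such as Redis"
        )
    return problems
```

Returning a list rather than raising on the first failure lets a startup hook report every problem at once.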

Core Application Settings

The following settings are defined in src/core/config.py:

| Variable | Default | Description |
| --- | --- | --- |
| PROJECT_NAME | "API Reliability Suite" | Application name, used in logs and as the OpenTelemetry Service Name. |
| ENVIRONMENT | "development" | Deployment environment (development, test, staging, production). |
| DEBUG | False | Enable debug mode. |
| LOG_LEVEL | "info" | Logging level (debug, info, warning, error, critical). |
| LOG_FILE_PATH | "app.json" | Path to the structured log file used by the AI summarizer and file logging handler. |
| DATABASE_URL | "sqlite+aiosqlite:///./data/reliability_suite.db" | SQLAlchemy database URL. Use Postgres for shared or production-style environments. |
| DATABASE_ECHO | False | Enables SQLAlchemy SQL logging. |
| SEED_DEMO_USER | True | Seeds the demo admin account on startup for local runs. |
| SECRET_KEY | "change-me-in-production" | Secret key used for JWT signing. Must be changed for production! |
| ACCESS_TOKEN_EXPIRE_MINUTES | 30 | JWT token expiration time in minutes. |
| REFRESH_TOKEN_EXPIRE_DAYS | 14 | Refresh-token lifetime used for session rotation. |
| RATE_LIMIT_STORAGE_URI | "memory://" | Rate limit storage backend (use Redis in shared environments). |
| RATE_LIMIT_HEADERS_ENABLED | False | Adds standard rate limit headers to responses. |
| RATE_LIMIT_IN_MEMORY_FALLBACK_ENABLED | False | Allow in-memory fallback if storage is unavailable. |
| RATE_LIMIT_KEY_PREFIX | "api-reliability-suite" | Prefix for rate limit keys in shared storage. |
| TRUSTED_HOSTS | "*" | Comma-separated public hostnames allowed by TrustedHostMiddleware. |
| CORS_ALLOW_ORIGINS | "" | Comma-separated origins allowed by CORS middleware. |
| HTTPS_REDIRECT_ENABLED | False | Redirect incoming http traffic to https. |
| SETTINGS_SECRETS_DIR | None | Optional secrets directory path (defaults to /run/secrets when present). |

Observability Configuration

| Variable | Default | Description |
| --- | --- | --- |
| OTLP_ENDPOINT | None | The OTLP collector endpoint (e.g., http://jaeger:4317). If not set, traces are exported to the console. |
| PROMETHEUS_BASE_URL | None | Prometheus API base URL used by /slo/report to retrieve recording-rule values. |
| CIRCUIT_BREAKER_CACHE_URL | None | Redis URL for cache-backed circuit-breaker fallback payloads. |
| CIRCUIT_BREAKER_CACHE_TTL_SECONDS | 300 | TTL for the cached upstream payload returned during degraded fallback. |
| SLO_TARGET_SUCCESS_RATIO | 0.99 | Availability target used in SLO/error-budget reporting. |
| SLO_TARGET_P99_LATENCY_SECONDS | 1.0 | Latency objective used in SLO/error-budget reporting. |
| HTTP_CLIENT_TIMEOUT_SECONDS | 10.0 | Default timeout for outbound HTTP requests. |
| HTTP_CLIENT_MAX_CONNECTIONS | 20 | Global connection cap for the shared outbound HTTP client. |
| HTTP_CLIENT_MAX_KEEPALIVE_CONNECTIONS | 10 | Keep-alive pool size for the shared outbound HTTP client. |
| LLM_REQUEST_TIMEOUT_SECONDS | 20.0 | Timeout for AI summarization requests. |
| LLM_HEALTHCHECK_TIMEOUT_SECONDS | 5.0 | Timeout for configured LLM provider readiness checks. |
| LLM_MAX_RETRIES | 2 | Retry count for provider SDK calls that support retries. |
| LLM_MAX_CONCURRENCY | 4 | Bulkhead limit for concurrent LLM summarization requests. |
| ENABLE_LLM_READINESS_CHECKS | True | Include configured LLM provider health in /ready. |
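To make the "bulkhead limit" for LLM_MAX_CONCURRENCY concrete, here is a minimal sketch of how such a limit is commonly implemented with an asyncio semaphore plus a per-call timeout. The `Bulkhead` class, `fake_summarize`, and `demo` are hypothetical names for illustration; the suite's actual implementation is not shown in this guide and may differ.

```python
import asyncio

LLM_MAX_CONCURRENCY = 4            # default from the table above
LLM_REQUEST_TIMEOUT_SECONDS = 20.0  # default from the table above


class Bulkhead:
    """Caps how many calls run at once; extra callers wait for a free slot."""

    def __init__(self, limit: int) -> None:
        self._semaphore = asyncio.Semaphore(limit)
        self.in_flight = 0
        self.peak = 0  # highest concurrency observed, kept for the demo

    async def run(self, coro_factory, timeout: float):
        async with self._semaphore:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                # The timeout covers the call itself, not time spent queued.
                return await asyncio.wait_for(coro_factory(), timeout)
            finally:
                self.in_flight -= 1


async def fake_summarize() -> str:
    # Stand-in for a provider SDK call.
    await asyncio.sleep(0.01)
    return "summary"


async def demo() -> int:
    bulkhead = Bulkhead(LLM_MAX_CONCURRENCY)
    calls = (
        bulkhead.run(fake_summarize, LLM_REQUEST_TIMEOUT_SECONDS)
        for _ in range(10)
    )
    await asyncio.gather(*calls)
    return bulkhead.peak
```

Running `asyncio.run(demo())` submits ten calls but never lets more than LLM_MAX_CONCURRENCY of them run at the same time, which keeps a slow provider from exhausting the rest of the service.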

AI/LLM Provider Keys

To use the AI-powered CLI Debugger or the /debug/summarize-errors endpoint, you must provide at least one of the following keys:

| Variable | Default | Description |
| --- | --- | --- |
| OPENAI_API_KEY | None | OpenAI API Key. |
| GROQ_API_KEY | None | Groq API Key. |
| GOOGLE_API_KEY | None | Google AI (Gemini) API Key. |

Note

The application automatically selects the first available provider in this order: GROQ_API_KEY, OPENAI_API_KEY, then GOOGLE_API_KEY.
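The selection order in the note can be expressed as a short lookup. This is a sketch only: `select_provider` and the provider labels ("groq", "openai", "google") are hypothetical names, and treating a blank value as "not set" is an assumption.

```python
import os
from typing import Mapping, Optional

# Order taken from the note above: Groq, then OpenAI, then Google.
PROVIDER_KEY_ORDER = (
    ("groq", "GROQ_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("google", "GOOGLE_API_KEY"),
)


def select_provider(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first provider whose key is set, or None if none is."""
    env = os.environ if env is None else env
    for provider, variable in PROVIDER_KEY_ORDER:
        # Assumption: an empty or whitespace-only value counts as unset.
        if env.get(variable, "").strip():
            return provider
    return None
```

With both OPENAI_API_KEY and GOOGLE_API_KEY set, this picks OpenAI; adding GROQ_API_KEY switches the choice to Groq.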


🔧 Configuration Files

.env File

Copy the provided example (if available) or create a .env file in the root directory:

PROJECT_NAME="My Reliability Template"
ENVIRONMENT="development"
LOG_LEVEL=debug
LOG_FILE_PATH=app.json
DATABASE_URL="postgresql+asyncpg://app:app@localhost:5432/reliability_suite"
SECRET_KEY=y0ur-5ecur3-k3y-h3r3
RATE_LIMIT_STORAGE_URI="redis://localhost:6379/0"
CIRCUIT_BREAKER_CACHE_URL="redis://localhost:6379/1"
PROMETHEUS_BASE_URL="http://localhost:9099"
GROQ_API_KEY=gsk_...

Logging Configuration (src/core/logging.py)

Logging is pre-configured with the following defaults:

  • Format: Structured JSON for files, human-readable console output.
  • Log File: Written to LOG_FILE_PATH (defaults to app.json); rotates at 10 MB and keeps 5 backups.
  • Enrichment: Automatically includes trace_id and span_id for every log entry if a trace is active.
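The defaults listed above map naturally onto the standard library's `RotatingFileHandler`. The sketch below is not the contents of src/core/logging.py; `JsonFormatter` and `build_file_handler` are hypothetical names, "10MB" is assumed to mean 10 × 1024 × 1024 bytes, and the code assumes some other component (such as a logging filter) attaches `trace_id` and `span_id` to each record.

```python
import json
import logging
from logging.handlers import RotatingFileHandler


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per log line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # trace_id / span_id appear only if something attached them to the
        # record (for example a logging filter, not shown here).
        for field in ("trace_id", "span_id"):
            value = getattr(record, field, None)
            if value is not None:
                payload[field] = value
        return json.dumps(payload)


def build_file_handler(path: str = "app.json") -> RotatingFileHandler:
    handler = RotatingFileHandler(
        path,
        maxBytes=10 * 1024 * 1024,  # rotate at 10 MB (assumed to mean MiB)
        backupCount=5,              # keep 5 rotated files
        delay=True,                 # do not create the file until first write
    )
    handler.setFormatter(JsonFormatter())
    return handler
```

Attaching the handler is then a single call, for example `logging.getLogger().addHandler(build_file_handler())`.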

Tracing Configuration (src/core/tracing.py)

Tracing is handled via OpenTelemetry:

  • Exporter: OTLP (gRPC) if OTLP_ENDPOINT is set; otherwise, ConsoleSpanExporter.
  • Instrumentation: Automatically instruments the FastAPI app.
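The exporter rule above could look roughly like the following. This is a sketch under assumptions, not the contents of src/core/tracing.py: the function names are hypothetical, the OpenTelemetry imports are kept local so the pure helper works without the SDK installed, and exporter options such as TLS are left at their defaults.

```python
from typing import Optional


def exporter_kind(otlp_endpoint: Optional[str]) -> str:
    """Pure helper: which exporter the rule above selects."""
    return "otlp-grpc" if otlp_endpoint else "console"


def build_tracer_provider(service_name: str, otlp_endpoint: Optional[str]):
    # Requires the opentelemetry-sdk package (and the OTLP gRPC exporter
    # package when an endpoint is configured).
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import (
        BatchSpanProcessor,
        ConsoleSpanExporter,
    )

    if exporter_kind(otlp_endpoint) == "otlp-grpc":
        from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import (
            OTLPSpanExporter,
        )

        exporter = OTLPSpanExporter(endpoint=otlp_endpoint)
    else:
        exporter = ConsoleSpanExporter()

    # PROJECT_NAME is documented as the OpenTelemetry service name.
    resource = Resource.create({"service.name": service_name})
    provider = TracerProvider(resource=resource)
    provider.add_span_processor(BatchSpanProcessor(exporter))
    return provider
```

The resulting provider can be passed to `FastAPIInstrumentor.instrument_app(app, tracer_provider=provider)` from the `opentelemetry-instrumentation-fastapi` package to instrument the app.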

🧪 Testing Configuration

The test suite uses its own configuration, often overriding settings in tests/conftest.py or via environment variables during the test run.

To run tests with code coverage:

make test

🚀 Quick Config Commands

# Verify current settings (dump)
python -c "from src.core.config import settings; print(settings.model_dump())"

🔐 Secrets Files

For Docker and Kubernetes deployments, you can provide secrets as files by mounting them under /run/secrets or setting SETTINGS_SECRETS_DIR to a custom path. Each secret file should be named after its setting, for example:

/run/secrets/SECRET_KEY
/run/secrets/RATE_LIMIT_STORAGE_URI
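Pydantic Settings has built-in support for a secrets directory (the `secrets_dir` option), so the application most likely does not need custom code for this. The plain-Python sketch below only illustrates the naming convention, one file per setting with the value as the file's contents; `read_secret_files` is a hypothetical helper.

```python
from pathlib import Path
from typing import Iterable


def read_secret_files(secrets_dir: str, names: Iterable[str]) -> dict[str, str]:
    """Return {setting name: file contents} for each secret file that exists."""
    base = Path(secrets_dir)
    values: dict[str, str] = {}
    for name in names:
        candidate = base / name
        if candidate.is_file():
            # Mounted secrets often end with a newline, so strip whitespace.
            values[name] = candidate.read_text(encoding="utf-8").strip()
    return values
```

For example, `read_secret_files("/run/secrets", ["SECRET_KEY", "RATE_LIMIT_STORAGE_URI"])` returns only the entries whose files are actually mounted.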

🌐 Reverse Proxy Settings

When the API sits behind ingress or a TLS-terminating proxy, configure the middleware settings together:

  • TRUSTED_HOSTS=api.example.com
  • CORS_ALLOW_ORIGINS=https://frontend.example.com
  • HTTPS_REDIRECT_ENABLED=true

Each middleware is enabled only when its corresponding setting is configured.
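As a rough sketch of how these three settings might be wired into the app, the helper below parses the comma-separated values and registers Starlette's stock middleware classes. The function names are hypothetical, and treating `*` as "no host restriction" is an assumption; the project's own wiring may differ.

```python
def split_csv(raw: str) -> list[str]:
    """Turn 'a.example.com, b.example.com' into a clean list of items."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def configure_proxy_middleware(
    app,
    trusted_hosts: str,
    cors_allow_origins: str,
    https_redirect_enabled: bool,
) -> None:
    # Local imports keep split_csv usable without Starlette installed.
    from starlette.middleware.cors import CORSMiddleware
    from starlette.middleware.httpsredirect import HTTPSRedirectMiddleware
    from starlette.middleware.trustedhost import TrustedHostMiddleware

    hosts = split_csv(trusted_hosts)
    if hosts and hosts != ["*"]:  # assumption: "*" means no host restriction
        app.add_middleware(TrustedHostMiddleware, allowed_hosts=hosts)

    origins = split_csv(cors_allow_origins)
    if origins:
        # A real deployment would usually also set allow_methods/allow_headers.
        app.add_middleware(CORSMiddleware, allow_origins=origins)

    if https_redirect_enabled:
        app.add_middleware(HTTPSRedirectMiddleware)
```

With the example values above, `split_csv("api.example.com")` yields a single allowed host and all three middleware classes are registered.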


📋 Production Readiness Standard

Operational Configuration Audit

  • Secret Management: Verify SECRET_KEY is not using the default value.
  • Persistent Identity Store: Point DATABASE_URL at Postgres or another server-grade relational database for shared environments.
  • Shared Rate Limiting: Use RATE_LIMIT_STORAGE_URI with Redis for distributed deployments.
  • Fallback Cache: Configure CIRCUIT_BREAKER_CACHE_URL with Redis for cache-backed degraded responses.
  • Log Level Alignment: Confirm LOG_LEVEL is set to info or warning for production stability.
  • LLM Connectivity: Ensure at least one valid LLM provider API key is configured (environment variable, .env file, or secrets file).
  • Tracing Setup: Verify OTLP_ENDPOINT points to a valid collector if distributed tracing is required.
  • Trusted Hosts: Replace TRUSTED_HOSTS=* with the real public hostnames for the deployment.
  • CORS Policy: Restrict CORS_ALLOW_ORIGINS to the frontends that actually call the API.
  • Dependency Checks: Keep ENABLE_LLM_READINESS_CHECKS=true when AI summarization is a required dependency.
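Part of this checklist can be automated. The sketch below checks the unconditional items against a mapping of variable names to values; `audit_production_config` is a hypothetical helper, the defaults are copied from the tables in this guide, and the conditional items (OTLP_ENDPOINT, ENABLE_LLM_READINESS_CHECKS) are left to human judgment.

```python
from typing import Mapping

LLM_KEYS = ("GROQ_API_KEY", "OPENAI_API_KEY", "GOOGLE_API_KEY")


def audit_production_config(env: Mapping[str, str]) -> list[str]:
    """Return human-readable findings; an empty list means nothing was flagged."""
    findings: list[str] = []

    if env.get("SECRET_KEY", "change-me-in-production") == "change-me-in-production":
        findings.append("SECRET_KEY uses the default value")
    if env.get("DATABASE_URL", "sqlite+aiosqlite:///").startswith("sqlite"):
        findings.append("DATABASE_URL points at SQLite")
    rate_limit_uri = env.get("RATE_LIMIT_STORAGE_URI", "memory://")
    if not rate_limit_uri.startswith(("redis://", "rediss://")):
        findings.append("RATE_LIMIT_STORAGE_URI is not Redis")
    if not env.get("CIRCUIT_BREAKER_CACHE_URL"):
        findings.append("CIRCUIT_BREAKER_CACHE_URL is not set")
    if env.get("LOG_LEVEL", "info").lower() not in {"info", "warning"}:
        findings.append("LOG_LEVEL is not info or warning")
    if not any(env.get(key) for key in LLM_KEYS):
        findings.append("no LLM provider API key is set")
    if env.get("TRUSTED_HOSTS", "*").strip() == "*":
        findings.append("TRUSTED_HOSTS still allows every host")
    origins = [o.strip() for o in env.get("CORS_ALLOW_ORIGINS", "").split(",")]
    if "*" in origins:
        findings.append("CORS_ALLOW_ORIGINS allows every origin")
    return findings
```

Calling `audit_production_config(dict(os.environ))` in a deployment pipeline (after `import os`) gives a quick pre-flight report without starting the application.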