This guide covers all configuration options available in the API Reliability Suite. The application uses Pydantic Settings for robust environment variable management and validation.
All settings can be configured via environment variables or a `.env` file in the project root.
When `ENVIRONMENT` is set to `staging` or `production`, the app enforces a non-default `SECRET_KEY` and a shared `RATE_LIMIT_STORAGE_URI`.
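The startup check described above can be sketched roughly as follows. This is a stdlib-only illustration, not the code in `src/core/config.py`: the function name `validate_deployment_settings` and the error messages are hypothetical, and it assumes that "shared" means anything other than the `memory://` default.

```python
from __future__ import annotations

DEFAULT_SECRET_KEY = "change-me-in-production"
DEFAULT_RATE_LIMIT_STORAGE_URI = "memory://"
STRICT_ENVIRONMENTS = {"staging", "production"}


def validate_deployment_settings(env: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the settings pass."""
    problems: list[str] = []
    environment = env.get("ENVIRONMENT", "development").lower()
    if environment not in STRICT_ENVIRONMENTS:
        return problems  # development and test keep the permissive defaults

    if env.get("SECRET_KEY", DEFAULT_SECRET_KEY) == DEFAULT_SECRET_KEY:
        problems.append("SECRET_KEY must be changed from its default value")

    storage = env.get("RATE_LIMIT_STORAGE_URI", DEFAULT_RATE_LIMIT_STORAGE_URI)
    if storage.startswith("memory://"):  # assumption: in-memory is never "shared"
        problems.append("RATE_LIMIT_STORAGE_URI must point at shared storage")
    return problems
```

Calling `validate_deployment_settings(dict(os.environ))` before deploying would list any violations. The real application presumably raises a validation error at startup instead of returning a list.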
The following settings are defined in `src/core/config.py`:

| Variable | Default | Description |
|---|---|---|
| `PROJECT_NAME` | `"API Reliability Suite"` | Application name, used in logs and as the OpenTelemetry Service Name. |
| `ENVIRONMENT` | `"development"` | Deployment environment (`development`, `test`, `staging`, `production`). |
| `DEBUG` | `False` | Enable debug mode. |
| `LOG_LEVEL` | `"info"` | Logging level (`debug`, `info`, `warning`, `error`, `critical`). |
| `LOG_FILE_PATH` | `"app.json"` | Path to the structured log file used by the AI summarizer and file logging handler. |
| `DATABASE_URL` | `"sqlite+aiosqlite:///./data/reliability_suite.db"` | SQLAlchemy database URL. Use Postgres for shared or production-style environments. |
| `DATABASE_ECHO` | `False` | Enables SQLAlchemy SQL logging. |
| `SEED_DEMO_USER` | `True` | Seeds the demo admin account on startup for local runs. |
| `SECRET_KEY` | `"change-me-in-production"` | Secret key used for JWT signing. Must be changed for production! |
| `ACCESS_TOKEN_EXPIRE_MINUTES` | `30` | JWT token expiration time in minutes. |
| `REFRESH_TOKEN_EXPIRE_DAYS` | `14` | Refresh-token lifetime used for session rotation. |
| `RATE_LIMIT_STORAGE_URI` | `"memory://"` | Rate limit storage backend (use Redis in shared environments). |
| `RATE_LIMIT_HEADERS_ENABLED` | `False` | Adds standard rate limit headers to responses. |
| `RATE_LIMIT_IN_MEMORY_FALLBACK_ENABLED` | `False` | Allow in-memory fallback if storage is unavailable. |
| `RATE_LIMIT_KEY_PREFIX` | `"api-reliability-suite"` | Prefix for rate limit keys in shared storage. |
| `TRUSTED_HOSTS` | `"*"` | Comma-separated public hostnames allowed by `TrustedHostMiddleware`. |
| `CORS_ALLOW_ORIGINS` | `""` | Comma-separated origins allowed by CORS middleware. |
| `HTTPS_REDIRECT_ENABLED` | `False` | Redirect incoming http traffic to https. |
| `SETTINGS_SECRETS_DIR` | `None` | Optional secrets directory path (defaults to `/run/secrets` when present). |

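To make the two token lifetimes concrete, the sketch below computes expiry timestamps from the defaults in the table above. The helper name `token_expiry` is hypothetical, and the application's actual JWT code may compute its expiry claims differently.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

ACCESS_TOKEN_EXPIRE_MINUTES = 30  # default from the table above
REFRESH_TOKEN_EXPIRE_DAYS = 14    # default from the table above


def token_expiry(issued_at: datetime) -> tuple[datetime, datetime]:
    """Return (access_expiry, refresh_expiry) for a token issued at `issued_at`."""
    access = issued_at + timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES)
    refresh = issued_at + timedelta(days=REFRESH_TOKEN_EXPIRE_DAYS)
    return access, refresh


issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
access_exp, refresh_exp = token_expiry(issued)
print(access_exp.isoformat())   # 2024-01-01T12:30:00+00:00
print(refresh_exp.isoformat())  # 2024-01-15T12:00:00+00:00
```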
| Variable | Default | Description |
|---|---|---|
| `OTLP_ENDPOINT` | `None` | The OTLP collector endpoint (e.g., `http://jaeger:4317`). If not set, traces are exported to the console. |
| `PROMETHEUS_BASE_URL` | `None` | Prometheus API base URL used by `/slo/report` to retrieve recording-rule values. |
| `CIRCUIT_BREAKER_CACHE_URL` | `None` | Redis URL for cache-backed circuit-breaker fallback payloads. |
| `CIRCUIT_BREAKER_CACHE_TTL_SECONDS` | `300` | TTL for the cached upstream payload returned during degraded fallback. |
| `SLO_TARGET_SUCCESS_RATIO` | `0.99` | Availability target used in SLO/error-budget reporting. |
| `SLO_TARGET_P99_LATENCY_SECONDS` | `1.0` | Latency objective used in SLO/error-budget reporting. |
| `HTTP_CLIENT_TIMEOUT_SECONDS` | `10.0` | Default timeout for outbound HTTP requests. |
| `HTTP_CLIENT_MAX_CONNECTIONS` | `20` | Global connection cap for the shared outbound HTTP client. |
| `HTTP_CLIENT_MAX_KEEPALIVE_CONNECTIONS` | `10` | Keep-alive pool size for the shared outbound HTTP client. |
| `LLM_REQUEST_TIMEOUT_SECONDS` | `20.0` | Timeout for AI summarization requests. |
| `LLM_HEALTHCHECK_TIMEOUT_SECONDS` | `5.0` | Timeout for configured LLM provider readiness checks. |
| `LLM_MAX_RETRIES` | `2` | Retry count for provider SDK calls that support retries. |
| `LLM_MAX_CONCURRENCY` | `4` | Bulkhead limit for concurrent LLM summarization requests. |
| `ENABLE_LLM_READINESS_CHECKS` | `True` | Include configured LLM provider health in `/ready`. |

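As a worked example of how `SLO_TARGET_SUCCESS_RATIO` translates into an error budget, the sketch below uses the common definition of the budget as `1 - target`. The function name and the sample request counts are illustrative assumptions. The actual `/slo/report` endpoint reads its values from Prometheus recording rules and may expose different fields.

```python
SLO_TARGET_SUCCESS_RATIO = 0.99  # default from the table above


def error_budget_consumed(total_requests: int, failed_requests: int,
                          target: float = SLO_TARGET_SUCCESS_RATIO) -> float:
    """Fraction of the error budget used (1.0 means fully spent)."""
    if total_requests <= 0:
        return 0.0
    allowed_failures = (1.0 - target) * total_requests
    if allowed_failures == 0:
        # A 100% target leaves no budget: any failure overspends it.
        return float("inf") if failed_requests else 0.0
    return failed_requests / allowed_failures


# 100,000 requests at a 99% target allow about 1,000 failures,
# so 250 failures consume roughly a quarter of the budget.
print(round(error_budget_consumed(100_000, 250), 2))  # 0.25
```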
To use the AI-powered CLI Debugger or the `/debug/summarize-errors` endpoint, you must provide at least one of the following keys:

| Variable | Default | Description |
|---|---|---|
| `OPENAI_API_KEY` | `None` | OpenAI API Key. |
| `GROQ_API_KEY` | `None` | Groq API Key. |
| `GOOGLE_API_KEY` | `None` | Google AI (Gemini) API Key. |

> **Note:** The application automatically selects the first available provider in this order: `GROQ_API_KEY`, `OPENAI_API_KEY`, then `GOOGLE_API_KEY`.

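The selection order in the note can be sketched as a simple priority scan. The function name `select_provider` and the provider labels are hypothetical, and treating an empty value as "not configured" is an assumption rather than documented behavior.

```python
from __future__ import annotations

# Priority order as stated in the note above.
PROVIDER_PRIORITY = (
    ("groq", "GROQ_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("google", "GOOGLE_API_KEY"),
)


def select_provider(env: dict[str, str]) -> str | None:
    """Return the first provider whose API key is set, or None if none is."""
    for name, variable in PROVIDER_PRIORITY:
        if env.get(variable):  # assumption: unset and empty both count as missing
            return name
    return None


print(select_provider({"OPENAI_API_KEY": "sk-example", "GOOGLE_API_KEY": "g-example"}))  # openai
```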
Copy the provided example (if available) or create a `.env` file in the root directory:

    PROJECT_NAME="My Reliability Template"
    ENVIRONMENT="development"
    LOG_LEVEL=debug
    LOG_FILE_PATH=app.json
    DATABASE_URL="postgresql+asyncpg://app:app@localhost:5432/reliability_suite"
    SECRET_KEY=y0ur-5ecur3-k3y-h3r3
    RATE_LIMIT_STORAGE_URI="redis://localhost:6379/0"
    CIRCUIT_BREAKER_CACHE_URL="redis://localhost:6379/1"
    PROMETHEUS_BASE_URL="http://localhost:9099"
    GROQ_API_KEY=gsk_...

Logging is pre-configured with the following defaults:

- Format: Structured JSON for files, human-readable console output.
- Log File: Uses `LOG_FILE_PATH` (defaults to `app.json`) and rotates at 10 MB, keeping 5 backups.
- Enrichment: Automatically includes `trace_id` and `span_id` for every log entry if a trace is active.

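The file-logging defaults above roughly correspond to the following stdlib setup. This is a sketch, not the project's logging code: `JsonFormatter` and `build_file_handler` are hypothetical names, the JSON field names are illustrative, and interpreting "10 MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
import json
import logging
from logging.handlers import RotatingFileHandler


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname.lower(),
            "logger": record.name,
            "message": record.getMessage(),
        }
        # trace_id / span_id are included only when present on the record,
        # mirroring "if a trace is active".
        for key in ("trace_id", "span_id"):
            value = getattr(record, key, None)
            if value is not None:
                payload[key] = value
        return json.dumps(payload)


def build_file_handler(path: str = "app.json") -> RotatingFileHandler:
    """Rotate at 10 MB (assumed binary megabytes) and keep 5 backups."""
    handler = RotatingFileHandler(path, maxBytes=10 * 1024 * 1024, backupCount=5)
    handler.setFormatter(JsonFormatter())
    return handler
```

Note that `RotatingFileHandler` opens the target file as soon as it is constructed, so `build_file_handler()` should be called only once the log directory exists.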
Tracing is handled via OpenTelemetry:

- Exporter: OTLP (gRPC) if `OTLP_ENDPOINT` is set; otherwise, `ConsoleSpanExporter`.
- Instrumentation: Automatically instruments the FastAPI app.

The test suite uses its own configuration, often overriding settings in `tests/conftest.py` or via environment variables during the test run.

To run tests with code coverage:

    make test

    # Verify current settings (dump)
    python -c "from src.core.config import settings; print(settings.model_dump())"

For Docker and Kubernetes deployments, you can provide secrets as files by mounting them under `/run/secrets` or setting `SETTINGS_SECRETS_DIR` to a custom path. Each secret file should be named after its setting, for example:

    /run/secrets/SECRET_KEY
    /run/secrets/RATE_LIMIT_STORAGE_URI

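The file-per-setting convention can be illustrated with a small stdlib reader. The application itself most likely relies on the `secrets_dir` option of Pydantic Settings rather than code like this; `read_secret` is a hypothetical helper, and stripping surrounding whitespace is an assumption about how the files are written.

```python
from __future__ import annotations

from pathlib import Path


def read_secret(name: str, secrets_dir: str = "/run/secrets") -> str | None:
    """Return the contents of <secrets_dir>/<name>, or None if the file is absent."""
    secret_file = Path(secrets_dir) / name
    if not secret_file.is_file():
        return None
    # Mounted secret files often end with a newline, so strip whitespace.
    return secret_file.read_text(encoding="utf-8").strip()
```

For example, `read_secret("SECRET_KEY")` returns the value stored in `/run/secrets/SECRET_KEY`, or `None` when no such file is mounted.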
When the API sits behind ingress or a TLS-terminating proxy, configure the middleware settings together:

    TRUSTED_HOSTS=api.example.com
    CORS_ALLOW_ORIGINS=https://frontend.example.com
    HTTPS_REDIRECT_ENABLED=true

The middleware is only enabled when these settings are configured.
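Because `TRUSTED_HOSTS` and `CORS_ALLOW_ORIGINS` are comma-separated strings, they need to be split into lists before being handed to the middleware. The sketch below shows one plausible way to do that. `parse_csv_setting` is a hypothetical helper, and the application may parse these values differently.

```python
from __future__ import annotations


def parse_csv_setting(raw: str) -> list[str]:
    """Split a comma-separated setting, dropping blanks and surrounding spaces."""
    return [item.strip() for item in raw.split(",") if item.strip()]


print(parse_csv_setting("api.example.com, admin.example.com"))
# ['api.example.com', 'admin.example.com']
print(parse_csv_setting(""))  # []
```

An empty result for the default `CORS_ALLOW_ORIGINS=""` is consistent with the CORS middleware staying off until origins are configured.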
- Secret Management: Verify `SECRET_KEY` is not using the default value.
- Persistent Identity Store: Point `DATABASE_URL` at Postgres or another server-grade relational database for shared environments.
- Shared Rate Limiting: Use `RATE_LIMIT_STORAGE_URI` with Redis for distributed deployments.
- Fallback Cache: Configure `CIRCUIT_BREAKER_CACHE_URL` with Redis for cache-backed degraded responses.
- Log Level Alignment: Confirm `LOG_LEVEL` is set to `info` or `warning` for production stability.
- LLM Connectivity: Ensure at least one valid API key for an LLM provider is present in the `.env` file.
- Tracing Setup: Verify `OTLP_ENDPOINT` points to a valid collector if distributed tracing is required.
- Trusted Hosts: Replace `TRUSTED_HOSTS=*` with the real public hostnames for the deployment.
- CORS Policy: Restrict `CORS_ALLOW_ORIGINS` to the frontends that actually call the API.
- Dependency Checks: Keep `ENABLE_LLM_READINESS_CHECKS=true` when AI summarization is a required dependency.