Every adapter creates its own httpx.AsyncClient with ad-hoc error handling. Need a shared networking layer.
Problems
- No unified rate limiting — each adapter guesses independently
- 404s not handled gracefully (ProPublica blows up on special chars)
- No retry with backoff on transient failures
- No connection pooling across adapters
- DEMO_KEY shared across all api.data.gov users — we hit limits fast
- Crossref runs all entities even when dedup/confidence filters should reduce them
Solution
- Shared APIClient class wrapping httpx with per-host rate limiters
- Token bucket or leaky bucket rate limiting per domain
- Automatic retry on 429/503 with exponential backoff
- URL-encode entity names properly before sending
- Connection pooling via shared client instances
- Log warnings (not errors) for expected failures
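
A rough sketch of how the rate limiting, retry, and URL-encoding pieces above could fit together. This is stdlib-only and transport-agnostic, not the actual implementation. All names here (`TokenBucket`, `backoff_delay`, `encode_entity`, `request_with_retry`, and the `send` callable) are hypothetical, and the rates and delays are placeholder values. In a real APIClient, `send` would presumably wrap a request on a shared httpx.AsyncClient, with one bucket per host.

```python
import asyncio
import random
import time
from urllib.parse import quote

RETRYABLE_STATUSES = {429, 503}  # the transient statuses named in the list above


class TokenBucket:
    """Token bucket: refills at `rate` tokens/second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._tokens = capacity
        self._clock = clock  # injectable so the bucket can be tested without sleeping
        self._last = clock()

    def try_acquire(self) -> float:
        """Take one token if available. Returns 0.0 on success, else seconds to wait."""
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return 0.0
        return (1 - self._tokens) / self.rate

    async def acquire(self) -> None:
        """Wait until a token is available, then take it."""
        while (wait := self.try_acquire()) > 0:
            await asyncio.sleep(wait)


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter for a 0-based attempt number."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def encode_entity(name: str) -> str:
    """Percent-encode an entity name so it is safe as a single URL path segment."""
    return quote(name, safe="")


async def request_with_retry(send, bucket: TokenBucket,
                             max_attempts: int = 4, base: float = 0.5) -> int:
    """Call `send` (hypothetical: async, no args, returns an HTTP status code),
    rate-limited by `bucket`, retrying retryable statuses with backoff."""
    for attempt in range(max_attempts):
        await bucket.acquire()
        status = await send()
        if status not in RETRYABLE_STATUSES or attempt == max_attempts - 1:
            return status
        await asyncio.sleep(backoff_delay(attempt, base=base))
```

Keeping one bucket per host (for example in a dict keyed by hostname) would let each upstream API get its own limit while the adapters share a single client and connection pool.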