API rate limiting middleware with token bucket and per-endpoint tiers #120
Open
ekwe7 wants to merge 4 commits into
Closes #77
Summary
This PR introduces a configurable rate limiting framework for public API endpoints using a token bucket algorithm. The solution protects the platform from abusive traffic patterns while supporting subscription-based rate limits, burst handling, and distributed deployments through an optional Redis backend.
The implementation enables consistent traffic control across endpoints and provides standards-compliant responses when limits are exceeded.
Problem Statement
Public API endpoints currently lack rate limiting protections, allowing abusive or misconfigured clients to consume disproportionate resources and negatively impact service availability for other users.
Without request throttling:
Excessive traffic can degrade API performance.
Resource starvation can affect legitimate clients.
Traffic spikes can overwhelm backend services.
Distributed deployments cannot enforce shared rate limits consistently.
Solution
This PR adds a configurable token bucket rate limiter with:
Per-endpoint rate limit policies
Tier-based request quotas
Burst traffic handling
In-memory storage support
Optional Redis-backed distributed enforcement
Standards-compliant HTTP 429 responses
Features
Token Bucket Rate Limiter
Implemented a token bucket algorithm with configurable refill rates to provide predictable and efficient request throttling.
Capabilities include:
Dynamic token replenishment
Configurable capacity limits
Burst traffic support
Fair request allocation
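For reviewers unfamiliar with the algorithm, a minimal sketch of a token bucket follows. It is not the code in this PR: the class name, field names, and the injectable `clock` parameter are assumptions made for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TokenBucket:
    """Illustrative token bucket; all names here are hypothetical."""

    capacity: float                               # maximum tokens the bucket holds
    refill_rate: float                            # tokens added per second
    clock: Callable[[], float] = time.monotonic   # injectable so tests can fake time
    tokens: float = field(init=False)
    updated_at: float = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = self.capacity               # a new bucket starts full
        self.updated_at = self.clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last update, capped at capacity.
        now = self.clock()
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now

    def try_acquire(self, cost: float = 1.0) -> bool:
        # Spend `cost` tokens if available; otherwise reject without blocking.
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Refilling lazily on each call avoids a background timer, and a monotonic clock keeps the refill math safe from wall-clock adjustments.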
Tier-Based Limits
Added support for predefined service tiers:
| Tier | Limit |
|------|-------|
| Free | 10 requests/minute |
| Pro | 100 requests/minute |
| Enterprise | 1000 requests/minute |
Endpoint configurations can apply different tiers depending on API requirements.
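One way the tier limits listed above could be expressed in configuration is sketched below. The enum, the `TierLimit` type, the endpoint paths, and the `limit_for` helper are all hypothetical. Only the three per-minute limits come from this PR.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    FREE = "free"
    PRO = "pro"
    ENTERPRISE = "enterprise"


@dataclass(frozen=True)
class TierLimit:
    requests_per_minute: int

    @property
    def refill_rate(self) -> float:
        """Sustained rate expressed as tokens per second."""
        return self.requests_per_minute / 60.0


# Limits taken from the tier list in this PR description.
TIER_LIMITS = {
    Tier.FREE: TierLimit(10),
    Tier.PRO: TierLimit(100),
    Tier.ENTERPRISE: TierLimit(1000),
}

# Hypothetical per-endpoint policy; these paths are illustrative only.
ENDPOINT_TIERS = {
    "/api/search": Tier.FREE,
    "/api/export": Tier.PRO,
}


def limit_for(path: str, default: Tier = Tier.FREE) -> TierLimit:
    """Look up the limit for an endpoint, using `default` for unlisted paths."""
    return TIER_LIMITS[ENDPOINT_TIERS.get(path, default)]
```

Keeping limits in a lookup table rather than in code branches would leave room for the custom tier definitions mentioned later in this description.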
Burst Handling
Supports temporary bursts of:
Up to 2× the sustained request rate
Up to 10 seconds in duration
This allows short-lived spikes without unnecessarily rejecting valid traffic.
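This description does not spell out how the two burst parameters map to bucket capacity. The sketch below shows one possible reading, stated as an assumption: a client sending at 2× the sustained rate should drain a full bucket in 10 seconds. Under that reading the bucket empties at the burst rate minus the refill rate. The function name and the floor of one token are illustrative choices.

```python
def burst_capacity(requests_per_minute: float,
                   burst_multiplier: float = 2.0,
                   burst_seconds: float = 10.0) -> float:
    """Bucket size that sustains `burst_multiplier` x rate for `burst_seconds`.

    Assumption: this interpretation is not confirmed by the PR.
    """
    sustained = requests_per_minute / 60.0       # tokens refilled per second
    burst_rate = sustained * burst_multiplier    # tokens consumed per second in a burst
    net_drain = burst_rate - sustained           # net tokens lost per second
    # Floor at one token so a full bucket always admits at least one request.
    return max(1.0, net_drain * burst_seconds)
```

With the defaults, the result reduces to the sustained per-second rate multiplied by 10 seconds.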
Distributed Rate Limiting
Added an optional Redis backend to support:
Multi-instance deployments
Shared rate limit state
Consistent enforcement across servers
Horizontal scalability
When Redis is unavailable or disabled, the middleware falls back to the in-memory implementation.
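The fallback behavior could be structured roughly as follows. This is a sketch under assumptions: the `CounterStore` interface and both classes are hypothetical, and the primary store stands in for a Redis-backed implementation without calling any Redis client API.

```python
from typing import Dict, Optional, Protocol


class CounterStore(Protocol):
    """Hypothetical storage interface; the PR does not show the real one."""

    def get(self, key: str) -> float: ...
    def set(self, key: str, value: float) -> None: ...


class InMemoryStore:
    """Process-local state; not shared between instances."""

    def __init__(self) -> None:
        self._data: Dict[str, float] = {}

    def get(self, key: str) -> float:
        return self._data.get(key, 0.0)

    def set(self, key: str, value: float) -> None:
        self._data[key] = value


class FallbackStore:
    """Use the primary store when possible, otherwise local memory."""

    def __init__(self, primary: Optional[CounterStore], fallback: CounterStore) -> None:
        self._primary = primary      # None models "Redis disabled"
        self._fallback = fallback

    def _call(self, method: str, *args):
        if self._primary is not None:
            try:
                return getattr(self._primary, method)(*args)
            except (ConnectionError, TimeoutError):
                # Assumption: built-in exception types. A real client library
                # may raise its own errors, which would need catching here.
                pass
        return getattr(self._fallback, method)(*args)

    def get(self, key: str) -> float:
        return self._call("get", key)

    def set(self, key: str, value: float) -> None:
        self._call("set", key, value)
```

While the fallback is active, each instance enforces limits against its own local state, so the effective limit across a cluster can be higher than configured.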
Standards-Compliant Responses
Requests exceeding configured limits receive:
HTTP 429 (Too Many Requests)
Retry-After header indicating when requests may be retried
Example:
HTTP/1.1 429 Too Many Requests
Retry-After: 30
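A Retry-After value in delay-seconds form can be derived from the bucket state. The helper below is a sketch with hypothetical names: it assumes the limiter knows the current token count and the refill rate in tokens per second.

```python
import math


def retry_after_seconds(tokens: float, refill_rate: float, cost: float = 1.0) -> int:
    """Whole seconds until `cost` tokens are available, rounded up."""
    if refill_rate <= 0:
        raise ValueError("refill_rate must be positive")
    deficit = max(0.0, cost - tokens)         # tokens still missing
    return math.ceil(deficit / refill_rate)   # round up so clients never retry early
```

Rounding up matters because the header carries an integer number of seconds, and a value rounded down would invite a retry that is rejected again.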
Middleware Integration
Introduced configurable middleware capable of:
Applying endpoint-specific limits
Selecting rate limit tiers dynamically
Supporting future custom tier definitions
Enforcing consistent throttling behavior across APIs
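To illustrate how the pieces could fit together, here is a minimal middleware sketch. The PR description does not name the web framework, so this uses plain WSGI as an assumption. The `decide` callback, the `X-API-Key` header, and the class name are hypothetical.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical decision function: (client, path) -> (allowed, retry_after_seconds).
Decide = Callable[[str, str], Tuple[bool, int]]


class RateLimitMiddleware:
    """WSGI middleware sketch; not the implementation in this PR."""

    def __init__(self, app, decide: Decide) -> None:
        self.app = app
        self.decide = decide

    def __call__(self, environ, start_response) -> Iterable[bytes]:
        path = environ.get("PATH_INFO", "/")
        # Assumption: clients are identified by API key, falling back to remote address.
        client = environ.get("HTTP_X_API_KEY") or environ.get("REMOTE_ADDR", "unknown")
        allowed, retry_after = self.decide(client, path)
        if allowed:
            return self.app(environ, start_response)
        # Over the limit: short-circuit with 429 and tell the client when to retry.
        body = b"Too Many Requests\n"
        start_response("429 Too Many Requests", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
            ("Retry-After", str(retry_after)),
        ])
        return [body]
```

Passing the decision logic in as a callback keeps the middleware independent of the storage backend and of how tiers are resolved per endpoint.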
Testing
Added tests covering:
Burst Scenarios
Temporary traffic spikes
Burst capacity exhaustion
Burst recovery behavior
Cooldown and Refill Behavior
Token replenishment timing
Retry-after calculations
Sustained traffic patterns
Edge Cases
Empty buckets
Exact threshold boundaries
Rapid request sequences
Redis-backed enforcement
In-memory enforcement
Concurrent request handling
Technical Details
Algorithm
Token Bucket
Storage Backends
In-Memory
Redis (Optional)
Rate Limits
Free: 10 requests/minute
Pro: 100 requests/minute
Enterprise: 1000 requests/minute
Burst Capacity
2× sustained rate
Up to 10 seconds
Rate Limit Response
HTTP 429
Retry-After header included
Acceptance Criteria
Token bucket rate limiter implemented
Configurable refill rates supported
Middleware supports per-endpoint configuration
Free tier limits enforced
Pro tier limits enforced
Enterprise tier limits enforced
Burst traffic handling implemented
Redis backend added
In-memory backend retained
HTTP 429 responses returned when limits are exceeded
Retry-After header included
Tests added for burst, cooldown, and edge cases