API rate limiting middleware with token bucket and per-endpoint tiers #120

Open
ekwe7 wants to merge 4 commits into VeriNode-Labs:main from ekwe7:API-Rate-Limiting-Middleware-with-Token-Bucket-and-Per-Endp0int-Tiers

@ekwe7 ekwe7 commented Jun 27, 2026

Closes #77

Summary
This PR introduces a configurable rate limiting framework for public API endpoints using a token bucket algorithm. The solution protects the platform from abusive traffic patterns while supporting subscription-based rate limits, burst handling, and distributed deployments through an optional Redis backend.

The implementation enables consistent traffic control across endpoints and provides standards-compliant responses when limits are exceeded.

Problem Statement

Public API endpoints currently lack rate limiting protections, allowing abusive or misconfigured clients to consume disproportionate resources and negatively impact service availability for other users.

Without request throttling:

- Excessive traffic can degrade API performance.
- Resource starvation can affect legitimate clients.
- Traffic spikes can overwhelm backend services.
- Distributed deployments cannot enforce shared rate limits consistently.

Solution

This PR adds a configurable token bucket rate limiter with:

- Per-endpoint rate limit policies
- Tier-based request quotas
- Burst traffic handling
- In-memory storage support
- Optional Redis-backed distributed enforcement
- Standards-compliant HTTP 429 responses

Features

Token Bucket Rate Limiter

Implemented a token bucket algorithm with configurable refill rates to provide predictable and efficient request throttling.

Capabilities include:

- Dynamic token replenishment
- Configurable capacity limits
- Burst traffic support
- Fair request allocation
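
For reviewers less familiar with the algorithm, here is a minimal sketch of a token bucket with lazy refill. It is illustrative only and is not the code in this PR; the class name `TokenBucket`, the method `try_consume`, and the injectable `clock` parameter are assumptions made for the example.

```python
import time
from typing import Callable


class TokenBucket:
    """Illustrative token bucket (hypothetical names, not this PR's code)."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity        # maximum tokens the bucket can hold
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock             # injectable so tests can control time
        self.tokens = capacity          # start with a full bucket
        self._last = clock()

    def _refill(self) -> None:
        # Add the tokens earned since the last call, capped at capacity.
        now = self._clock()
        elapsed = now - self._last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self._last = now

    def try_consume(self, cost: float = 1.0) -> bool:
        # Admit the request only if enough tokens are available.
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Refilling lazily on each call avoids a background timer: the bucket only needs the timestamp of the previous call to know how many tokens have accrued.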

Tier-Based Limits

Added support for predefined service tiers:

| Tier | Limit |
| --- | --- |
| Free | 10 requests/minute |
| Pro | 100 requests/minute |
| Enterprise | 1000 requests/minute |

Endpoint configurations can apply different tiers depending on API requirements.
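
As a rough illustration of how tiers could be attached to endpoints, the sketch below maps paths to tier names and converts each per-minute quota to a per-second refill rate. The limits come from the table above; the names `TierPolicy`, `ENDPOINT_TIERS`, `policy_for`, and the example paths are hypothetical and may not match the configuration format in this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPolicy:
    requests_per_minute: int

    @property
    def refill_rate(self) -> float:
        """Sustained rate expressed as tokens per second."""
        return self.requests_per_minute / 60.0


# Limits taken from the tier table in this PR.
TIERS = {
    "free": TierPolicy(10),
    "pro": TierPolicy(100),
    "enterprise": TierPolicy(1000),
}

# Hypothetical endpoint-to-tier mapping; these paths are invented for illustration.
ENDPOINT_TIERS = {
    "/api/v1/search": "free",
    "/api/v1/reports": "pro",
}


def policy_for(path: str, default_tier: str = "free") -> TierPolicy:
    """Look up the tier policy for a path, falling back to a default tier."""
    return TIERS[ENDPOINT_TIERS.get(path, default_tier)]
```

Defaulting unknown paths to the most restrictive tier is a design choice made for this sketch, not something the PR specifies.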

Burst Handling

Supports temporary bursts:

- Up to 2× the sustained request rate
- For a maximum burst duration of 10 seconds

This allows short-lived spikes without unnecessarily rejecting valid traffic.
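
The PR text does not spell out how the burst figures translate into bucket capacity, so the following is only one possible reading. If a client sends at `m` times the sustained rate `r` while the bucket refills at `r`, a full bucket drains at a net `(m - 1) * r` tokens per second, so a capacity of `(m - 1) * r * t` lasts about `t` seconds. The function name `burst_capacity` and the floor of one token are assumptions for the example.

```python
def burst_capacity(requests_per_minute: float,
                   burst_multiplier: float = 2.0,
                   burst_seconds: float = 10.0) -> float:
    """Capacity that sustains `burst_multiplier` x the rate for `burst_seconds`.

    Assumption: the bucket starts full and refills at the sustained rate
    throughout the burst. This is an interpretation, not the PR's formula.
    """
    sustained_per_second = requests_per_minute / 60.0
    extra = (burst_multiplier - 1.0) * sustained_per_second * burst_seconds
    # Keep at least one token so a single request can always be admitted
    # from a full bucket, even on very low tiers.
    return max(1.0, extra)
```

Under this reading, a sustained rate of 60 requests/minute (1 per second) with the default 2× multiplier and 10 second window needs a capacity of 10 tokens.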

Distributed Rate Limiting

Added an optional Redis backend to support:

- Multi-instance deployments
- Shared rate limit state
- Consistent enforcement across servers
- Horizontal scalability

When Redis is unavailable or disabled, the middleware falls back to the in-memory implementation.
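
The fallback behavior could look roughly like the wrapper below. This is a sketch under assumptions: `BucketStore` and `FallbackStore` are hypothetical names, and the primary store is assumed to raise the built-in `ConnectionError` on an outage, whereas a real Redis client library may raise its own exception type that would need to be caught instead.

```python
from typing import Optional, Protocol


class BucketStore(Protocol):
    """Hypothetical storage interface shared by both backends."""

    def try_consume(self, key: str) -> bool: ...


class FallbackStore:
    """Use the primary (e.g. Redis-backed) store, fall back to a local one."""

    def __init__(self, primary: Optional[BucketStore], fallback: BucketStore) -> None:
        self._primary = primary    # None models "Redis disabled"
        self._fallback = fallback  # in-memory store, always available

    def try_consume(self, key: str) -> bool:
        if self._primary is not None:
            try:
                return self._primary.try_consume(key)
            except ConnectionError:
                # Assumption: outages surface as the built-in ConnectionError.
                pass
        return self._fallback.try_consume(key)
```

One consequence worth keeping in mind: while running on the in-memory fallback, each instance counts requests independently, so limits are no longer shared across servers.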

Standards-Compliant Responses

Requests exceeding configured limits receive:

- HTTP 429 (Too Many Requests)
- Retry-After header indicating when requests may be retried

Example:

HTTP/1.1 429 Too Many Requests
Retry-After: 30
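
A `Retry-After` value in seconds can be derived from the token deficit and the refill rate. The helper below is a sketch of that calculation, not necessarily how this PR computes it; the name `retry_after_seconds` and its parameters are assumptions.

```python
import math


def retry_after_seconds(tokens: float, refill_rate: float, cost: float = 1.0) -> int:
    """Whole seconds until `cost` tokens are available, rounded up.

    `tokens` is the current bucket level and `refill_rate` is in tokens
    per second (hypothetical parameter names).
    """
    if tokens >= cost:
        return 0  # request can be served now
    wait = (cost - tokens) / refill_rate
    # Round first so floating-point noise on an exact value (e.g. 6.000000000000001)
    # does not push the result up to the next whole second.
    return max(1, math.ceil(round(wait, 6)))
```

Rounding up keeps the hint conservative: a client that waits the advertised number of seconds should find at least one token available, assuming no other requests consume it first.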

Middleware Integration

Introduced configurable middleware capable of:

- Applying endpoint-specific limits
- Selecting rate limit tiers dynamically
- Supporting future custom tier definitions
- Enforcing consistent throttling behavior across APIs
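
To make the integration point concrete, here is a sketch of what such a middleware might look like as a WSGI wrapper. The PR does not state which framework it targets, so WSGI is an assumption, as are the class name `RateLimitMiddleware`, the policy format, the example path, and identifying clients by `REMOTE_ADDR`. The sketch keeps state in a plain dict and is not thread-safe.

```python
import math
import time

# Hypothetical per-endpoint policies: path -> (capacity, refill tokens per second).
POLICIES = {"/api/v1/search": (10.0, 10 / 60)}
DEFAULT_POLICY = (10.0, 10 / 60)


class RateLimitMiddleware:
    """WSGI middleware sketch; not the implementation in this PR."""

    def __init__(self, app, policies=None, clock=time.monotonic):
        self.app = app
        self.policies = POLICIES if policies is None else policies
        self.clock = clock
        self._state = {}  # (client, path) -> (tokens, last_refill_time)

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        client = environ.get("REMOTE_ADDR", "unknown")  # assumption: key by IP
        capacity, rate = self.policies.get(path, DEFAULT_POLICY)

        # Lazily refill this client's bucket for this endpoint.
        now = self.clock()
        tokens, last = self._state.get((client, path), (capacity, now))
        tokens = min(capacity, tokens + (now - last) * rate)

        if tokens >= 1.0:
            self._state[(client, path)] = (tokens - 1.0, now)
            return self.app(environ, start_response)

        # Over the limit: reject with 429 and a Retry-After hint in seconds.
        self._state[(client, path)] = (tokens, now)
        retry_after = max(1, math.ceil((1.0 - tokens) / rate))
        start_response("429 Too Many Requests", [
            ("Content-Type", "text/plain"),
            ("Retry-After", str(retry_after)),
        ])
        return [b"rate limit exceeded\n"]
```

Wrapping an existing application would then be a one-liner such as `app = RateLimitMiddleware(app)`.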

Testing

Added comprehensive test coverage for:

Burst Scenarios

- Temporary traffic spikes
- Burst capacity exhaustion
- Burst recovery behavior

Cooldown and Refill Behavior

- Token replenishment timing
- Retry-after calculations
- Sustained traffic patterns

Edge Cases

- Empty buckets
- Exact threshold boundaries
- Rapid request sequences
- Redis-backed enforcement
- In-memory enforcement
- Concurrent request handling
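
As an example of how the concurrent case can be checked deterministically, the sketch below hammers a lock-protected bucket from several threads with the clock frozen, so exactly `capacity` requests should be admitted. It is not taken from this PR's test suite; `LockedBucket` and `count_admitted` are hypothetical names.

```python
import threading
import time
from typing import Callable, List


class LockedBucket:
    """Token bucket guarded by a lock (illustrative, not this PR's code)."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._capacity = capacity
        self._rate = refill_rate
        self._clock = clock
        self._tokens = capacity
        self._last = clock()
        self._lock = threading.Lock()

    def try_consume(self) -> bool:
        with self._lock:  # refill and consume must happen as one atomic step
            now = self._clock()
            self._tokens = min(self._capacity,
                               self._tokens + (now - self._last) * self._rate)
            self._last = now
            if self._tokens >= 1.0:
                self._tokens -= 1.0
                return True
            return False


def count_admitted(bucket: LockedBucket, n_threads: int, per_thread: int) -> int:
    """Send requests from several threads and count how many were admitted."""
    admitted: List[int] = [0] * n_threads

    def worker(slot: int) -> None:
        for _ in range(per_thread):
            if bucket.try_consume():
                admitted[slot] += 1  # each thread writes only its own slot

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(admitted)
```

With a frozen clock, for example `LockedBucket(50, 1.0, clock=lambda: 0.0)`, no tokens are refilled during the run, so the admitted count does not depend on thread scheduling.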

Technical Details

Algorithm

- Token Bucket

Storage Backends

- In-Memory
- Redis (Optional)

Rate Limits

- Free: 10 requests/minute
- Pro: 100 requests/minute
- Enterprise: 1000 requests/minute

Burst Capacity

- 2× sustained rate
- Up to 10 seconds

Rate Limit Response

- HTTP 429
- Retry-After header included

Acceptance Criteria

- Token bucket rate limiter implemented
- Configurable refill rates supported
- Middleware supports per-endpoint configuration
- Free tier limits enforced
- Pro tier limits enforced
- Enterprise tier limits enforced
- Burst traffic handling implemented
- Redis backend added
- In-memory backend retained
- HTTP 429 responses returned when limits are exceeded
- Retry-After header included
- Tests added for burst, cooldown, and edge cases

Development

Successfully merging this pull request may close these issues.

API Rate Limiting Middleware with Token Bucket and Per-Endpoint Tiers
