"Instead of just writing APIs on top of servers like Nginx or Node.js, this project goes one level deeper β we built the HTTP server itself, from scratch, in C++."
It is lightweight, low-level, and designed to handle high concurrency with adaptive self-protection features.
### Custom HTTP Server in C++

- Built using raw sockets and Windows IOCP (I/O Completion Ports)
- Handles thousands of concurrent connections efficiently
### Fully Async I/O (WSASend + WSARecv)

- Non-blocking sends via overlapped I/O: worker threads never block on `send()`
- Partial send handling: automatically resumes if the socket buffer is full
- Keep-alive support: connections reset cleanly for the next request
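The partial-send bookkeeping above can be sketched with portable C++. This is an illustrative stand-in, not the project's actual code: `SendState` and `onSendComplete` are hypothetical names, and in the real server the "issue another send" step would be another overlapped `WSASend` from the current offset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-connection send state: tracks how much of the response
// buffer has actually gone out, so a short write can be resumed later.
struct SendState {
    std::vector<char> buffer;  // full response bytes
    std::size_t offset = 0;    // bytes confirmed sent so far
};

// Called when a send completion reports `bytesSent`. Returns true once the
// whole buffer is out; false means another send must be issued starting at
// buffer.data() + offset.
bool onSendComplete(SendState& s, std::size_t bytesSent) {
    s.offset += bytesSent;
    return s.offset >= s.buffer.size();
}
```

Because the offset lives in per-connection state, a worker thread can hand the connection back to the completion port after a short write instead of spinning on the socket.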
### Sharded Rate Limiter (Lock Striping)

- 32 independent shards: only 1/32 of workers contend on the same mutex
- Per-IP adaptive state with exponential moving average (EMA) thresholds
- Runtime enable/disable toggle: zero overhead when disabled
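The lock-striping idea can be sketched in a few lines, assuming the same 32-shard layout described above. Class and member names here are illustrative, not the project's actual API:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>

// Minimal lock-striping sketch: each IP hashes to one of 32 shards, and
// each shard has its own mutex, so only ~1/32 of threads ever contend on
// the same lock.
class ShardedRateLimiter {
    static constexpr std::size_t kShards = 32;
    struct Shard {
        std::mutex mtx;
        std::unordered_map<std::string, std::uint64_t> hits;  // per-IP counters
    };
    std::array<Shard, kShards> shards_;

    Shard& shardFor(const std::string& ip) {
        // Same key always maps to the same shard.
        return shards_[std::hash<std::string>{}(ip) % kShards];
    }

public:
    // Returns the updated hit count for this IP.
    std::uint64_t record(const std::string& ip) {
        Shard& s = shardFor(ip);
        std::lock_guard<std::mutex> lock(s.mtx);
        return ++s.hits[ip];
    }
};
```

The design choice is the classic trade-off of lock striping: slightly more memory for many small maps in exchange for near-linear scaling of the hot path with core count.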
### Optimized Request Parsing

- `std::string_view` throughout: eliminates ~80% of string allocations
- Single-pass header scanning: no repeated `find()` scans
- `std::from_chars` for numeric parsing: no temporary string creation
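To illustrate the allocation-free parsing style, here is a hedged sketch (not the project's actual parser) that scans a raw header block once using `std::string_view` slices and parses `Content-Length` with `std::from_chars`, so no temporary `std::string` is ever created:

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>

// Single-pass header scan: walk the block line by line, slicing views
// instead of copying substrings.
std::optional<std::size_t> contentLength(std::string_view headers) {
    constexpr std::string_view key = "Content-Length:";
    while (!headers.empty()) {
        std::size_t eol = headers.find("\r\n");
        std::string_view line = headers.substr(0, eol);  // a view, no copy
        if (line.substr(0, key.size()) == key) {
            std::string_view value = line.substr(key.size());
            while (!value.empty() && value.front() == ' ')
                value.remove_prefix(1);                  // trim leading spaces
            std::size_t n = 0;
            // std::from_chars parses directly from the view's bytes.
            auto [ptr, ec] = std::from_chars(value.data(),
                                             value.data() + value.size(), n);
            if (ec == std::errc{}) return n;
            return std::nullopt;
        }
        if (eol == std::string_view::npos) break;
        headers.remove_prefix(eol + 2);                  // advance: single pass
    }
    return std::nullopt;
}
```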
### Static File Preloading

- All static files loaded into RAM at startup
- Zero disk I/O during request handling: served from the in-memory cache
- Cache is read-only at runtime: no locks needed
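The preload-then-freeze pattern can be sketched as follows (names are illustrative): every file is read into a map before any worker thread starts, and because the map is never mutated afterwards, handlers can read it concurrently with no locking.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <unordered_map>
#include <vector>

class StaticCache {
    std::unordered_map<std::string, std::vector<char>> files_;

public:
    // Called once at startup, before the server accepts connections.
    void preload(const std::string& path) {
        std::ifstream in(path, std::ios::binary);
        files_[path] = std::vector<char>(std::istreambuf_iterator<char>(in), {});
    }

    // Read-only at runtime: safe to call from any thread concurrently,
    // because no writes happen after startup.
    const std::vector<char>* get(const std::string& path) const {
        auto it = files_.find(path);
        return it == files_.end() ? nullptr : &it->second;
    }
};
```

The thread-safety here comes entirely from immutability: concurrent reads of an `unordered_map` are safe as long as nothing writes to it.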
### IP-Based Limiter

- Tracks each IP address with its own adaptive state
- Fair admission for multiple clients from the same IP
- Periodic eviction of stale IP entries
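A hedged sketch of the per-IP adaptive state: each IP keeps an exponentially weighted moving average of its request rate plus a last-seen timestamp, and stale entries are swept periodically. The smoothing factor, field names, and eviction window below are assumptions for illustration, not the project's actual values.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <unordered_map>

struct IpState {
    double ema = 0.0;                                  // smoothed request rate
    std::chrono::steady_clock::time_point lastSeen{};  // for eviction
};

class IpLimiter {
    std::unordered_map<std::string, IpState> states_;
    static constexpr double kAlpha = 0.2;  // EMA smoothing factor (assumed)

public:
    // Fold one observation (requests seen this tick) into the IP's EMA.
    double observe(const std::string& ip, double sample,
                   std::chrono::steady_clock::time_point now) {
        IpState& st = states_[ip];
        st.ema = kAlpha * sample + (1.0 - kAlpha) * st.ema;
        st.lastSeen = now;
        return st.ema;
    }

    // Drop entries idle longer than maxAge; called periodically.
    void evictStale(std::chrono::steady_clock::time_point now,
                    std::chrono::seconds maxAge) {
        for (auto it = states_.begin(); it != states_.end();)
            it = (now - it->second.lastSeen > maxAge) ? states_.erase(it) : ++it;
    }

    std::size_t size() const { return states_.size(); }
};
```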
### Overload Admission Control

On `accept()`, if the server is overloaded, it:

- Gracefully responds with HTTP 429 Too Many Requests
- Closes the connection politely
- Prevents the server from being overwhelmed
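The accept-time decision can be sketched like this. The connection cap and response text are assumptions; the point is only that a rejected socket gets a well-formed 429 before being closed, instead of ever entering the IOCP pipeline.

```cpp
#include <cassert>
#include <string>

constexpr int kMaxConnections = 10000;  // assumed cap, for illustration

// Admission check performed right after accept().
bool admit(int activeConnections) {
    return activeConnections < kMaxConnections;
}

// Minimal 429 response, sent before closing the rejected connection.
// Content-Length and "Connection: close" keep the reply well-formed.
std::string tooManyRequests() {
    std::string body = "Too Many Requests";
    return "HTTP/1.1 429 Too Many Requests\r\n"
           "Content-Type: text/plain\r\n"
           "Content-Length: " + std::to_string(body.size()) + "\r\n"
           "Connection: close\r\n\r\n" + body;
}
```

Rejecting at `accept()` is the cheapest possible point: no parsing, no routing, and no worker-thread time is spent on load the server cannot serve anyway.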
### Separation of Concerns

- Request parsing, connection handling, and rate limiting are modular
- Easy to extend with future improvements
### Connection Lifecycle

- Client connects → checked by the IPLimiter and admission control
- If accepted → request handled asynchronously by IOCP worker threads
### Adaptive Control Layers

- Per-request adaptive limiter (per client)
- IP-based limiter (global across clients from the same IP)
- Admission control at `accept()` (rejects excess load early)
### Concurrency Handling

- Uses Windows IOCP threads to scale with CPU cores
- Fully async sends and receives: no blocking operations
```mermaid
flowchart LR
    Client --> AdmissionControl --> IOCPThreadPool --> HTTPParser --> Router --> AsyncWSASend --> Response
```
- Most developers never touch raw server internals; they just deploy APIs on existing frameworks.
- Here, the entire HTTP server core was implemented manually in C++, a low-level systems language.
- The design introduces adaptive mechanisms uncommon in scratch-built servers:
  - Adaptive per-client rate limiter
  - IP-aware fairness to prevent abuse
  - Overload admission control with graceful rejection

This isn't just another "toy server": it demonstrates how production-grade concepts (admission control, fairness, overload handling) can be built at the raw systems level while still offering an Express.js-like developer experience.
The server has been significantly optimized from the baseline:
- Before: blocking `send()` calls stalled worker threads
- After: fully overlapped `WSASend()` via IOCP; threads never block
- Impact: 2-3× throughput improvement under load
- Before: single global mutex; all workers serialized
- After: 32 independent shards; contention reduced by 32×
- Impact: scales linearly with core count
- Before: 10-20+ heap allocations per request (`split()`, `substr()`)
- After: `std::string_view` + single-pass scanning; 3-5 allocations per request
- Impact: ~80% fewer allocations, better cache utilization
- Before: Disk read on every static file request
- After: all files loaded into RAM at startup; cache-first, zero-lock reads
- Impact: Eliminates disk I/O latency entirely
This project takes a different route: we built the web server itself, from scratch, in pure C++.
- Producer-consumer scheduling with smart request pipelines
- Zero-copy file serving (`TransmitFile` on Windows)
- Self-tuning worker thread pool
- Advanced congestion control (dynamic backlog admission)
- Smart request scheduling, prioritizing fast GET requests over POST requests
- Language: C++20
- Concurrency Model: IOCP (Windows I/O Completion Ports)
- Networking: Winsock2 async I/O (WSASend, WSARecv)
- Build System: CMake
Note: The server runs on Windows only and requires a C++20 compiler with the Windows API. No extra setup is needed; just grab the latest MinGW compiler from winlibs.com.
```sh
git clone https://github.com/SrabanMondal/adaptive-cpp-http-server.git
cd adaptive-cpp-http-server
mkdir build && cd build
cmake ..
cmake --build .
cd ..
./build/adaptive-cpp-http-server.exe
```

The server starts on port 3000 by default; you can change it in `main.cpp`. Try it with:

```sh
curl http://localhost:3000/
```

This server isn't just sockets and bytes: it comes with a clean, high-level API inspired by popular web frameworks like Express.js.
Example: define routes in a few lines:
```cpp
Router router;
Server server(3000, router);

// Simple HTML route
router.addRoute("/", HttpMethod::GET, [](const HttpRequest& req) {
    return HtmlResponse("<h1>Hello from C++ HTTP server!</h1>");
});

// Serve binary data
router.addRoute("/favicon.ico", HttpMethod::GET, [&](const HttpRequest& req) {
    std::vector<char> data = ft.readBinaryFile("favicon.ico");
    return BinaryResponse(data, "image/x-icon", 200);
});

// Return JSON
router.addRoute("/getData", HttpMethod::GET, [](const HttpRequest& req) {
    std::string json = R"({"name":"C++ server","version":0.1})";
    return JsonResponse(json);
});

// Handle POST with JSON body
router.addRoute("/putData", HttpMethod::POST, [](const HttpRequest& req) {
    if (req.contentType != "application/json")
        return JsonResponse(R"({"message":"Need Json body"})", 400);
    auto bodyOpt = req.parseJsonBody();
    if (!bodyOpt) return JsonResponse(R"({"message":"Bad JSON"})", 400);
    json body = *bodyOpt;
    std::string name = body.value("name", "");
    return JsonResponse(R"({"message":"Name received )" + name + "\"}");
});
```

- Plug-and-play routes: add GET, POST, PUT, DELETE handlers with just lambdas.
- Multiple response types: JSON, HTML, text, binary.
- Extendable: CORS, middleware, authentication, and DB integration can be added on top easily.
- Scalable abstraction: beginner-friendly for API devs, yet built on low-level IOCP for performance.
The server was benchmarked with wrk to measure raw throughput.
- Server: Windows 11, Intel i5-12400, 16 GB RAM
- Client: Ubuntu WSL2 (wrk)
- Network: Private IP over WSL2 virtual adapter
- Build: C++20 (MinGW), Windows IOCP, fully async WSASend
| Connections | Threads | Avg Latency | Stdev | Max Latency | Req/Sec | Transfer/sec |
|---|---|---|---|---|---|---|
| 200 | 8 | 30.84ms | 162.89ms | 1.56s | 52,712 | 137.84 MB/s |
| 400 | 8 | 9.16ms | 9.72ms | 143.09ms | 52,838 | 138.17 MB/s |
| 800 | 8 | 16.26ms | 11.67ms | 341.06ms | 50,408 | 131.82 MB/s |
```sh
wrk -t8 -c400 -d30s http://127.0.0.1:3000/
```

Measured without `--latency` for maximum throughput accuracy.
| Connections | P50 | P75 | P90 | P99 | Avg Latency |
|---|---|---|---|---|---|
| 200 | 3.93ms | 7.18ms | 13.23ms | 30.90ms | 5.88ms |
| 400 | 7.21ms | 11.73ms | 18.70ms | 35.10ms | 8.92ms |
| 800 | 15.23ms | 19.91ms | 28.43ms | 46.05ms | 16.07ms |
```sh
wrk -t8 -c400 -d30s http://127.0.0.1:3000/ --latency
```

Measured with `--latency`; throughput is slightly lower due to measurement overhead.
The percentile table above summarizes the `wrk --latency` runs for each concurrency level, showing the full latency distribution and tail behavior (P99).
Observations:
- Latency remains tightly distributed at moderate concurrency (200-400 connections)
- Tail latency (P99) increases under higher load but remains under 50ms at 800 connections
- Throughput scales with concurrency, with predictable latency trade-offs
- Reported throughput here is slightly lower than the main results, as enabling `--latency` introduces measurement overhead
- Peak throughput: ~52,800 requests/sec at 400 connections
- Average latency: 9.16ms at 400 connections (P50: 7.21ms)
- Total transfer: ~138 MB/sec sustained
Contributions, issues, and feature requests are welcome! Feel free to fork the repo and submit a PR.
GNU License © 2025 Sraban Mondal
This project bridges the gap between low-level systems programming and high-level web development.


