
04 Event Loop Design

Solis Dynamics edited this page May 15, 2026 · 2 revisions

04-Event-Loop-Design: Mastering the Reactor Pattern and Building Scalable Reactive Systems

Keywords: Event Loop, Reactor Pattern, Proactor Pattern, Netty, Java NIO, Selector, Epoll, Kqueue, IOCP, Non-blocking I/O, Thread-Per-Request, Boss/Worker Threads, Dispatch Loop, Backpressure, Throughput, Latency, Tail Latency, Fairness, Starvation, Mechanical Sympathy, Event-Driven Architecture, Concurrency, Saturation


🔍 Introduction

An event loop is one of the most important architectural patterns in high-performance systems.

It is the mechanism that allows a small number of threads to coordinate a very large number of events efficiently.

Instead of asking:

Which thread should handle this request?

a reactive system asks:

What is ready right now?

This is the core idea behind:

  • Java NIO
  • Netty
  • Vert.x
  • Spring WebFlux
  • reactive servers
  • high-concurrency gateways
  • low-latency messaging systems
  • websocket platforms
  • proxy servers
  • streaming systems

The event loop exists because thread-per-request architectures eventually hit a wall:

  • thread explosion
  • context switching overhead
  • memory pressure
  • unstable tail latency
  • poor scalability under idle connections
  • excessive scheduler contention

A well-designed event loop is not just a loop.

It is a control plane for:

  • I/O coordination
  • readiness detection
  • task dispatch
  • overload protection
  • fairness
  • latency control
  • CPU efficiency
  • connection management

This page explains how to design event loops that are fast, safe, and production-grade.

The C10K Problem: Event loops are the definitive architectural answer to the challenge of handling 10,000+ concurrent connections efficiently on a single machine.


⚖️ 1. The Paradigm Shift: Thread-Per-Request vs Event Loop

To understand why the event loop exists, you must understand what it replaces.

Thread-Per-Request vs Event Loop


❌ The Thread-Per-Request Model

Traditional servlet-based systems often use this architecture:

1 Request
↓
1 Thread
↓
Blocking I/O
↓
Business Logic
↓
Response

This looks simple, but it becomes expensive at scale.

Why it breaks down

  • each thread consumes memory
  • each thread competes for CPU scheduling
  • blocked threads waste resources
  • idle connections still occupy thread slots
  • context switching becomes dominant
  • stack memory grows quickly
  • throughput drops under load

A single Java platform thread typically reserves around 1 MB of stack memory in real deployments.
10,000 concurrent idle connections can therefore tie up on the order of 10 GB of stack just to wait.


✅ The Event Loop Model

A non-blocking system uses a different architecture:

Connections
↓
Event Loop
↓
Ready Events
↓
Dispatch / Handoff
↓
Business Processing
↓
Response

Instead of waiting on every connection, the event loop monitors readiness and reacts only when there is actual work.

This creates several advantages:

  • fewer threads
  • lower context switching overhead
  • lower memory usage
  • better scalability with many idle connections
  • more predictable resource consumption

Comparison Table

| Feature | Thread-Per-Request | Event Loop |
|---|---|---|
| Blocking | Yes | No |
| Context switching | High | Very low |
| Memory footprint | Heavy | Lightweight |
| Scalability | Limited by threads | Limited by CPU, network, and downstream capacity |
| Fairness | Depends on OS scheduling | Explicitly designed |
| Tail latency | Often unstable under load | Can be tightly controlled |
| Complexity | Simpler linear code | More architectural discipline required |

⚙️ 2. The Reactor Pattern

The event loop is the implementation of the Reactor Pattern.

The Reactor responds to I/O events by dispatching them to the appropriate handler.

Architecture:

I/O Source
↓
Event Demultiplexer
↓
Event Loop / Reactor
↓
Handler
↓
Business Logic

The selector is the event demultiplexer.
The event loop reads the ready set and sends work to handlers.

This is the foundation of most high-performance non-blocking systems in Java.

Reactor vs Proactor Diagram
Visual 1.2: Reactor pattern (readiness-based) vs Proactor pattern (completion-based).

🔄 Reactor vs. Proactor: Two Sides of Async

While the Event Loop is the heart of the Reactor pattern, it's important to distinguish it from its cousin:

  • Reactor (Java NIO / Netty): The Loop waits for a resource to become ready (e.g., "data is available to read"). You perform the actual I/O.
  • Proactor (Windows IOCP / AIO): You tell the OS to perform the I/O in the background. The Loop is notified only when the operation is complete.

Note: Java's high-performance networking is almost entirely based on the Reactor pattern due to OS portability.


🔹 Single-Threaded Reactor

A single thread accepts connections, reads data, processes it, and writes the response.

This model is famously used in systems such as:

  • Redis
  • Node.js-style single-loop designs
  • some embedded or specialized high-performance services

Pros

  • zero lock contention inside the loop
  • simple state reasoning
  • predictable ordering

Cons

  • limited multi-core utilization
  • one slow task can freeze all connections assigned to the loop
  • not ideal for mixed I/O + CPU workloads

If processing one event takes 1 second, all other connections handled by that loop wait.


🔹 Multi-Threaded Reactor: Boss and Worker Groups

This is the architecture commonly used by Netty.

It separates the accepting phase from the processing phase.

Architecture flow:

Client
↓
Boss Event Loop Group
↓
Accept Connection
↓
Register Channel with Worker Event Loop Group
↓
Worker Handles Read / Write / Dispatch

Boss Event Loop Group

  • usually a very small pool
  • often just 1 thread
  • listens for incoming connections
  • handles OP_ACCEPT
  • hands accepted channels to workers

Worker Event Loop Group

  • handles actual read/write readiness
  • processes OP_READ, OP_WRITE, OP_CONNECT
  • usually sized based on CPU cores
  • keeps each channel bound to a stable loop for locality

🧠 3. Inside the Event Loop

An event loop is literally a repeating control cycle.

Simplified pseudo-code:

while (!isShutdown) {
    selector.select();                            // 1. block until channels are ready or a task is queued
    processSelectedKeys(selector.selectedKeys()); // 2. dispatch ready I/O events
    runAllTasks();                                // 3. run queued non-I/O tasks
}

(Note: select() returns an int count of ready channels; the ready keys themselves come from selector.selectedKeys().)

Event Loop Cycle Diagram
Visual 1.3: The high-performance cycle: Polling events → Dispatching handlers → Running scheduled tasks.

The loop does three major things:

  1. waits for events from the OS,
  2. processes ready I/O events,
  3. runs scheduled asynchronous tasks.

This is the mechanical core of the reactor.

⚖️ Scheduling, Fairness & The Epoll Bug

A production-grade Event Loop doesn't just process I/O; it must manage tasks fairly.

  • Task Slicing: To prevent one massive task from "starving" others, Netty uses an ioRatio. It limits how much time is spent on non-I/O tasks vs. I/O events in a single cycle.
  • The "JDK Epoll Bug" Fix: There is a famous bug where Selector.select() wakes up for no reason, causing 100% CPU usage. Netty detects this "spinning" and automatically rebuilds the Selector on the fly.

🔹 Mechanical Sympathy & Epoll/Kqueue

Event loops achieve mechanical sympathy by leveraging OS-level readiness mechanisms such as epoll (Linux) and kqueue (macOS/BSD), so idle connections cost almost no CPU.


🏗️ 4. How the Selector and Event Loop Work Together

The event loop usually sits on top of a Selector.

The selector detects readiness.
The event loop decides what to do with it.

Relationship:

Selector = readiness detection
Event Loop = dispatch and control

A selector by itself is not enough.
The event loop gives it policy, fairness, and lifecycle control.


🔄 5. Event Loop Lifecycle

A production event loop usually follows this lifecycle:


Step 1: Wait for Events

The loop blocks efficiently while waiting for readiness notifications.

In Java NIO, this is usually done through:

selector.select();

This means:

  • sleep without burning CPU
  • wake up when channels are ready
  • continue only when work exists

Step 2: Read Ready Keys

Once awakened, the loop retrieves ready events.

Example:

Set<SelectionKey> selectedKeys = selector.selectedKeys();

These keys represent channels that are ready for operation.


Step 3: Dispatch Events

Each key is inspected and sent to the correct handler.

Possible event types:

  • accept
  • read
  • write
  • connect

The event loop must make dispatch decisions quickly.


Step 4: Remove Processed Keys

Processed keys must be removed from the selected set.

Otherwise, the same event may be processed repeatedly.

This is one of the most common bugs in NIO-based systems.


Step 5: Repeat

The loop continues forever or until shutdown is requested.
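The five lifecycle steps above can be assembled into a minimal, runnable single-threaded echo server. This is an illustrative sketch, not production code: the class name and buffer size are arbitrary, and real code must also handle partial writes and per-connection state.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.util.Iterator;

public class MiniEventLoop implements Runnable {
    private final Selector selector;
    private final ServerSocketChannel server;

    public MiniEventLoop(int port) throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.configureBlocking(false);               // required before register()
        server.bind(new InetSocketAddress(port));      // port 0 = pick any free port
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    public int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    @Override public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                selector.select();                      // Step 1: block until events arrive
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {                  // Step 2: read the ready keys
                    SelectionKey key = it.next();
                    it.remove();                        // Step 4: remove, or it fires again
                    if (key.isAcceptable()) {           // Step 3: dispatch by event type
                        SocketChannel client = server.accept();
                        if (client == null) continue;
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        ByteBuffer buf = ByteBuffer.allocate(1024);
                        if (client.read(buf) == -1) { client.close(); continue; }
                        buf.flip();
                        client.write(buf);              // echo back (real code handles short writes)
                    }
                }
            }                                           // Step 5: repeat until shutdown
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Note that the dispatch logic does nothing heavy inline: anything slower than a buffer copy belongs in a worker pool, as later sections explain.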


⚡ 6. Event Types in Java NIO

Java NIO event loops usually work with these readiness events:

| Event | Meaning |
|---|---|
| OP_ACCEPT | A new incoming connection is ready to be accepted |
| OP_CONNECT | An outbound connection attempt has completed (or failed) |
| OP_READ | Data is available to read |
| OP_WRITE | The socket buffer can accept more outbound data |

Each event type requires different handling logic.
The loop must route each one correctly.


🧩 7. Readiness vs Work

This distinction is critical:

  • readiness means the channel can be operated on
  • work means the actual processing you do after readiness

The event loop detects readiness.
The handler performs work.

Do not confuse them.

If you mix readiness detection with heavy processing, your loop becomes slow and unstable.


🔄 8. The Dispatch Model

The event loop is not just a polling machine.
It is a dispatching machine.

Typical structure:

Selector
↓
Ready Key Set
↓
Event Dispatcher
↓
Handler
↓
Task Queue or Worker Pool

A good dispatch model separates:

  • I/O coordination
  • protocol parsing
  • business logic
  • persistence
  • downstream calls

This separation keeps the loop responsive.


🧠 9. Why Event Loops Scale Better

Event loops scale well because they reduce:

  • thread count
  • blocking time
  • scheduling overhead
  • memory usage
  • lock contention

Instead of many idle threads, you have a smaller number of active coordination threads.

This is especially valuable in workloads with:

  • many idle connections
  • bursty traffic
  • chat systems
  • gateways
  • proxy services
  • websocket servers
  • streaming platforms
  • SSE connections
  • IoT device fleets

⚖️ 10. Event Loops vs Thread Pools

These are not the same thing.


Event Loop

Purpose:

Manage readiness and dispatch

Best for:

  • I/O coordination
  • non-blocking sockets
  • readiness polling

Thread Pool

Purpose:

Execute independent tasks

Best for:

  • CPU-bound work
  • blocking work
  • background processing
  • parallel computation

A strong architecture usually combines both:

Event Loop → Worker Pool → Business Logic

🧩 11. Event Loop Architecture Pattern

A production-grade architecture often looks like this:

Client Connections ➔ Event Loop ➔ Dispatch ➔ Worker Thread Pool ➔ Business Logic ➔ Response

Event Loop Architecture Pattern Summary

This separation is critical.

If the event loop starts doing business logic itself, performance degrades quickly.

If the worker pool is unbounded, overload spreads.

If the dispatch layer is slow, latency grows.

Everything matters.

🎨 Visual Aids

  • Visual 1.1: Side-by-side comparison of Thread-Per-Request vs Event Loop.

  • Visual 1.2: Reactor pattern (readiness-based) vs Proactor pattern (completion-based).

  • Visual 1.3: The event loop cycle: polling events → dispatching handlers → running scheduled tasks.


⚙️ 12. The Cost of Blocking Inside an Event Loop

Blocking inside an event loop is catastrophic.

Examples of blocking operations:

  • database calls
  • file I/O
  • network calls to other services
  • long CPU tasks
  • sleep calls
  • synchronous remote APIs
  • slow logging sinks

Why it is dangerous:

  • the event loop cannot service other channels
  • tail latency increases
  • queue depth grows
  • throughput collapses
  • timeouts cascade
  • one connection can freeze thousands

This is one of the most common production mistakes in event-driven systems.

Blocking vs Offloading Diagram
Visual 1.4: Impact of blocking on the loop vs. isolating tasks to dedicated worker pools.

Anti-Pattern Example: EventLoop-1 handles 1,000 connections. Connection A performs a blocking JDBC call taking 5 seconds. Result: All other 999 connections freeze completely for those 5 seconds.

✅ Best Practice: Always offload blocking work to a dedicated, bounded worker pool. (An unbounded pool or queue only hides the overload until memory runs out.)
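A minimal sketch of that handoff, using a plain JDK ThreadPoolExecutor. The pool size, queue capacity, and the slowDatabaseCall placeholder are illustrative assumptions, not fixed recommendations.

```java
import java.util.concurrent.*;

public class BlockingOffload {
    // Dedicated, bounded worker pool: 4 threads, at most 100 queued tasks, reject beyond that.
    static final ExecutorService WORKERS = new ThreadPoolExecutor(
            4, 4, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(100),
            new ThreadPoolExecutor.AbortPolicy());

    // Called from the event loop thread: returns immediately, never blocks the loop.
    public static CompletableFuture<String> handleRequest(String query) {
        return CompletableFuture.supplyAsync(() -> slowDatabaseCall(query), WORKERS);
    }

    // Simulated blocking JDBC-style call (placeholder for the real dependency).
    static String slowDatabaseCall(String query) {
        try { Thread.sleep(50); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return "result:" + query;
    }
}
```

The loop thread would attach a continuation (e.g. `thenAccept`) that hands the response back for writing, so the event loop itself only ever coordinates.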


🚨 13. Fairness and Starvation

An event loop must handle work fairly.

If one connection produces too much work, it can monopolize the loop.

Problems include:

  • one client starving others
  • hot channels dominating the ready set
  • uneven latency
  • processing bias
  • unfair wakeup patterns

Good event loop design uses:

  • bounded work per iteration
  • fair dispatching
  • task slicing
  • handoff to workers when needed

🧠 14. Backpressure in Event Loops

Backpressure is essential.

Without it, the event loop can accept more work than it can handle.

Symptoms of missing backpressure:

  • queue growth
  • memory growth
  • buffer buildup
  • increased latency
  • collapse under load

Backpressure mechanisms include:

  • bounded queues
  • limited per-connection work
  • rejected tasks
  • write interest toggling
  • adaptive throttling
  • rate limiting
  • dropping low-priority work

Event loops must stay stable under overload.
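One concrete backpressure mechanism from the list above — a bounded handoff queue with an explicit rejection handler — can be sketched with the JDK's ThreadPoolExecutor. The sizes and sleep durations here are arbitrary, chosen only to make the rejection visible.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedHandoff {
    // Submits `tasks` slow jobs to a deliberately tiny pool and returns how many were rejected.
    public static int submitAll(int tasks) throws InterruptedException {
        AtomicInteger rejected = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2),             // bounded queue: at most 2 waiting tasks
                (r, e) -> rejected.incrementAndGet());   // rejection = explicit backpressure signal
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try { Thread.sleep(500); } catch (InterruptedException ie) { Thread.currentThread().interrupt(); }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return rejected.get();
    }
}
```

With 1 running slot and 2 queue slots, submitting 10 fast-arriving tasks accepts 3 and rejects 7; the caller sees the overload immediately instead of discovering it later as memory growth.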


⚡ 15. OP_WRITE Is Dangerous

OP_WRITE is often misunderstood.

A socket is frequently writable.

If you keep write interest enabled all the time, the selector may wake continuously even when there is no meaningful work.

This can lead to:

  • busy loops
  • CPU spikes
  • repeated wakeups
  • wasted scheduling cycles
  • self-inflicted overload

Correct strategy:

  • enable write interest only when there is queued outbound data
  • disable it after flushing the buffer

This is one of the most important optimization rules in NIO-based event loops.
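This toggling behavior can be observed with a plain JDK Selector and a Pipe, whose sink channel is writable whenever its internal buffer has space. This is a standalone demonstration, not production code.

```java
import java.io.IOException;
import java.nio.channels.*;

public class WriteInterestDemo {
    // Returns the ready-key counts [whileArmed, afterDisarm] for an always-writable channel.
    public static int[] demo() throws IOException {
        Pipe pipe = Pipe.open();                    // Pipe.sink() is a selectable channel that is
        pipe.sink().configureBlocking(false);       // writable whenever its buffer has space
        try (Selector selector = Selector.open()) {
            SelectionKey key = pipe.sink().register(selector, SelectionKey.OP_WRITE);

            int whileArmed = selector.selectNow();  // fires immediately: the channel is "ready"
            selector.selectedKeys().clear();

            // After flushing queued output, drop write interest to stop the wakeups.
            key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
            int afterDisarm = selector.selectNow(); // nothing to report anymore

            return new int[] { whileArmed, afterDisarm };
        }
    }
}
```

While armed, every poll reports the channel as ready even with no data queued; disarming write interest is what silences the selector.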


🧩 16. Work Slicing

A single loop iteration should not try to do everything.

A production event loop often slices work into smaller units.

Example:

  • accept a connection
  • read a fixed amount of data
  • queue the rest
  • return to the loop

Why this matters:

  • prevents starvation
  • keeps latency predictable
  • improves fairness
  • avoids monopolization by one channel

Large tasks should be chunked and offloaded.
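A sketch of per-iteration slicing: the loop drains at most a fixed budget of queued tasks before returning to I/O polling. Class and method names are illustrative.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class TaskBudget {
    private final Queue<Runnable> tasks = new ArrayDeque<>();

    public void submit(Runnable r) { tasks.add(r); }

    // Run at most `maxTasks` queued tasks, then yield back to the selector.
    public int runSlice(int maxTasks) {
        int ran = 0;
        Runnable task;
        while (ran < maxTasks && (task = tasks.poll()) != null) {
            task.run();
            ran++;
        }
        return ran;     // leftover tasks wait for the next iteration
    }

    public int pending() { return tasks.size(); }
}
```

Netty's ioRatio mechanism is the time-based version of the same idea: cap non-I/O work per cycle so no single producer can monopolize the loop.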


🧠 17. Event Loop State Management

The event loop itself is a stateful machine.

Common states:

| State | Meaning |
|---|---|
| Starting | Initial setup |
| Running | Normal operation |
| Draining | Finishing queued work |
| Stopping | No new work accepted |
| Terminated | Shutdown complete |

Good systems define clear transitions.

Without clear lifecycle control, shutdown becomes messy.


⚙️ 18. Shutdown Design

A proper event loop must stop cleanly.

Graceful shutdown should:

  1. stop accepting new events,
  2. finish or cancel existing work,
  3. close channels safely,
  4. release selector resources,
  5. terminate workers if needed.

A poor shutdown can leave:

  • half-open sockets
  • leaked selectors
  • lingering threads
  • resource leaks
  • inconsistent state

Shutdown is part of design, not an afterthought.

// Graceful shutdown of a Netty-style server
public void shutdown() {
    bossGroup.shutdownGracefully();   // stop accepting new connections
    workerGroup.shutdownGracefully(); // drain pending tasks, close channels, release selectors
}


🧠 19. Single Reactor vs Multi-Reactor

There are several event-loop topologies.


Single Reactor

One event loop handles all events.

Pros

  • simple
  • easy to understand
  • lower coordination overhead

Cons

  • limited scalability
  • can become a bottleneck
  • poor multi-core utilization

Multi-Reactor

Multiple event loops share the load.

Pros

  • better scaling on multi-core systems
  • higher throughput
  • more isolation
  • lower per-loop pressure

Cons

  • more complex
  • requires better coordination
  • more careful channel assignment

Large systems often use a multi-reactor model.


⚙️ 20. Event Loop Threading Models

There are different threading approaches around event loops.


Model 1: Single Loop, Worker Pool

  • one loop for readiness
  • worker pool for heavy tasks

This is common and practical.


Model 2: One Loop Per Core

  • each loop handles a subset of channels
  • better hardware utilization
  • more complexity
  • strong cache locality

This is common in high-performance frameworks.


Model 3: Event Loop + Specialized Pools

  • I/O loop
  • CPU pool
  • blocking pool
  • scheduled pool

This is often the most production-friendly design.


🧠 21. Reactor vs Proactor

The Reactor pattern is not the only event-based model.


Reactor

  • waits for readiness
  • dispatches when channels are ready
  • common in Java NIO and Netty

Proactor

  • starts asynchronous operations
  • receives completion notifications
  • common in completion-based I/O systems

The architectural difference is subtle but important:

  • Reactor: “tell me when it is ready”
  • Proactor: “tell me when it is done”

🧩 22. Mechanical Sympathy and OS Integration

The event loop achieves mechanical sympathy because it maps well to modern operating system capabilities.

Instead of asking 10,000 sockets:

Are you ready?
Are you ready?
Are you ready?

the loop uses OS-level mechanisms like:

  • epoll on Linux
  • kqueue on macOS / BSD
  • IOCP on Windows in completion-oriented models

This avoids wasting CPU cycles on idle connections.

The OS is asked to do readiness tracking efficiently.
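You can see which mechanism the JDK selected on your platform by inspecting the default SelectorProvider. The concrete class names are JDK internals and vary by OS and version, so treat the printed name as informational only.

```java
import java.nio.channels.spi.SelectorProvider;

public class ProviderCheck {
    public static void main(String[] args) {
        // The JDK picks the best OS facility automatically: typically an
        // epoll-based provider on Linux and a kqueue-based one on macOS/BSD.
        System.out.println(SelectorProvider.provider().getClass().getName());
    }
}
```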


⚙️ 23. Cache Locality and CPU Pinning

Why does Netty often keep a specific connection tied to the same event loop thread?

Because of CPU cache locality.

If a connection is processed by the same thread repeatedly:

  • data stays warm in L1/L2 caches
  • less cache invalidation
  • less memory traffic
  • better branch prediction
  • lower latency

If a connection bounces between threads:

  • caches are constantly invalidated
  • memory access becomes slower
  • overhead rises sharply

Stable thread-to-connection affinity is often a major performance win.

Zero-Copy and Cache Locality Diagram
Visual 1.5: Zero-Copy flow moving data from disk to NIC bypassing the JVM heap.

⚡ Advanced Throughput: Zero-Copy & Batching

To push performance to the absolute limit, the Event Loop utilizes:

  • Zero-Copy I/O: Using FileChannel.transferTo(), data moves directly from disk to the network buffer without ever entering the JVM Heap. This saves CPU cycles and memory bandwidth.
  • Event Batching: Instead of waking up for every single packet, the loop can gather multiple ready events in one poll() call, drastically reducing the cost of system calls (syscalls).
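A minimal sketch of the transferTo API, shown here channel-to-channel between files; over a socket the same call lets the kernel move bytes without copying them into the JVM heap. The looping is needed because transferTo may move fewer bytes than requested.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyDemo {
    // Copy src to dst via transferTo: the kernel moves the bytes, so the data
    // never passes through a heap-allocated buffer in user space.
    public static long transfer(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                                                StandardOpenOption.WRITE)) {
            long position = 0, size = in.size();
            while (position < size) {   // transferTo may transfer fewer bytes than asked
                position += in.transferTo(position, size - position, out);
            }
            return position;
        }
    }
}
```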

⚠️ 24. Common Event Loop Mistakes


❌ Doing blocking I/O in the loop

This is the most dangerous mistake.


❌ Doing too much work per event

This increases tail latency and harms fairness.


❌ Forgetting to remove selected keys

This leads to repeated handling and busy loops.


❌ Enabling OP_WRITE permanently

This can cause endless wakeups.


❌ Using unbounded queues for handoff

This leads to hidden overload and memory growth.


❌ Not separating protocol parsing from business logic

This makes the loop fragile and hard to scale.


❌ Having a single god-loop for the whole system

This wastes available cores and creates a bottleneck.


📊 25. Event Loop Metrics to Watch

A production event loop should be observable.

Important metrics:

| Metric | Meaning |
|---|---|
| Loop iteration time | How long each cycle takes |
| Ready key count | How many events are detected per select |
| Dispatch time | How long handling takes |
| Queue depth | How much work is waiting |
| Rejection count | How often overload happens |
| Wakeup count | How often the loop is interrupted |
| Tail latency | Worst-case response behavior |
| Busy loop rate | Indicator of accidental spinning |
| Idle time | Whether the loop is underutilized or sleeping appropriately |

Metrics reveal whether the loop is healthy.
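A sketch of how a loop might record a few of these metrics around each iteration; field and method names here are illustrative, and a real loop would use a blocking select() with a timeout rather than selectNow().

```java
import java.io.IOException;
import java.nio.channels.Selector;

public class LoopMetrics {
    long iterations, emptyWakeups, maxIterationNanos;

    // One instrumented loop pass.
    void runOnce(Selector selector) throws IOException {
        long start = System.nanoTime();
        int readyKeys = selector.selectNow();   // metric: ready key count
        if (readyKeys == 0) emptyWakeups++;     // metric: wakeups with nothing to do (spin indicator)
        // ... dispatch ready keys and run queued tasks here ...
        maxIterationNanos = Math.max(maxIterationNanos, System.nanoTime() - start);
        iterations++;                           // metric: loop iteration time / count
    }
}
```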


🧵 26. The Modern Era: Event Loops vs. Virtual Threads (Project Loom)

Java 21 introduced Virtual Threads, changing the landscape. How do they compare?

  • Event Loops: Still the gold standard for Network Proxies, Gateways, and Message Brokers where maximum throughput and fine-grained control over I/O are required.
  • Virtual Threads: The best choice for Standard CRUD/Business APIs. They allow you to write simple, blocking code that scales like an Event Loop.

The Hybrid Rule: Use Event Loops (Netty) for your infrastructure/networking layer and consider Virtual Threads for your heavy business logic layer.


💡 27. Real-World Case Study: The 100k WebSockets Collapse

Scenario

A chat application built on Spring Boot with a traditional thread-per-request server model needed to support 100,000 concurrent WebSockets.

The Crash

At around 8,000 users, the JVM threw:

OutOfMemoryError: unable to create new native thread

The system was using huge amounts of memory just to keep idle WebSocket threads alive.

The Fix

The team migrated to an event-driven architecture using Netty via a reactive stack.

Result

  • memory usage dropped dramatically
  • context switching overhead disappeared
  • throughput stabilized
  • the system could handle far more idle long-lived connections
  • the architecture became suitable for websocket-style workloads

Lesson

For long-lived, mostly idle connections like:

  • WebSockets
  • SSE
  • chat sessions
  • real-time dashboards

event loops are often the correct architectural choice.


🧠 28. Event Loop Design Principles

A strong event loop design usually follows these rules:

  • keep the loop lightweight
  • never block in the loop
  • bound work per iteration
  • hand off expensive work
  • use backpressure
  • make shutdown explicit
  • monitor queue growth
  • prioritize fairness
  • separate I/O from business logic
  • keep per-connection state minimal
  • preserve cache locality
  • avoid unnecessary wakeups
  • use specialized pools for blocking work

These rules are what make event-driven systems stable.


🛑 29. Event Loop Anti-Patterns Checklist

If your event-driven system is slow, check for these fatal mistakes:

  • Hidden blocking: using InputStream, URLConnection, JDBC, or other blocking APIs inside a reactive chain.
  • Synchronous logging: writing logs to a slow sink inside the event loop.
  • God threads: having one loop do everything instead of using available cores.
  • Lack of backpressure: accepting more work than downstream systems can process.
  • Unbounded handoff queues: hiding overload until memory fails.
  • Mixing business logic with readiness logic.
  • Enabling write readiness permanently.
  • Doing large CPU work inline.
  • Ignoring per-connection fairness.

🛠️ 30. Diagnostic Tools & Flags

If your Event Loop is struggling, use these professional diagnostic tools:

| Tool / Flag | Purpose | Key Metric |
|---|---|---|
| -Dio.netty.eventLoopThreads=N | Manual thread tuning | Compare throughput vs. core count |
| jcmd <pid> Thread.print | Thread dump | Look for BLOCKED states in nioEventLoop threads |
| async-profiler | CPU profiling | Check for selector spinning or high select() time |
| JFR (Flight Recorder) | Latency analysis | Look for "Socket Read" events exceeding your p99 targets |

Event Loop Profiling Tools
Visual 1.6: Visualizing bottlenecks using async-profiler flame graphs and JFR timelines.



🚀 31. Real-World Relevance

Event loops are foundational in:

  • Netty
  • Vert.x
  • reactive servers
  • websocket gateways
  • message brokers
  • low-latency trading systems
  • API gateways
  • proxy servers
  • streaming systems
  • high-concurrency microservices

If the system has many concurrent connections and a small number of active workers, event loops are often the right tool.


🔗 32. Related Deep Dives

Continue exploring:


💬 Final Thought

An event loop is not just a programming construct.

It is an architectural boundary.

It decides:

  • what happens immediately
  • what gets deferred
  • what gets handed off
  • what gets rejected
  • what gets delayed
  • what gets protected from overload
  • what gets kept hot in cache
  • what gets isolated into workers

The best engineers do not just write loops.

They design control systems that keep the system fast, fair, and stable under load.

