Complete documentation for the LLM Auto-Optimizer Stream Processor component.
The Stream Processor is an enterprise-grade event processing engine that transforms raw feedback events from Kafka into aggregated metrics for the LLM Auto-Optimizer. It provides real-time windowing, aggregation, and state management capabilities.
- Multi-Window Support: Tumbling, sliding, and session windows
- Rich Aggregations: Count, sum, average, min, max, percentiles (p50, p95, p99), standard deviation
- Watermarking: Handle late events with configurable allowed lateness
- State Persistence: RocksDB-backed state with checkpoint/recovery
- High Performance: 10,000+ events/sec throughput, <100ms p99 latency
- Fault Tolerance: Automatic recovery from failures
File: stream-processor-architecture.md
Comprehensive technical design covering:
- System architecture and components
- Module structure and file organization
- Core data structures (Window, Aggregator, State)
- Processing pipeline flow
- Watermarking and late event handling
- State management and persistence
- Error handling strategy
- Configuration structure
- API design
When to read: Start here for a complete understanding of the system architecture.
File: stream-processor-implementation-plan.md
Detailed 11-week implementation roadmap:
- Phase-by-phase development plan
- Task breakdown with checklists
- Dependencies and prerequisites
- Success metrics and KPIs
- Risk mitigation strategies
- Deployment strategy
- Monitoring and alerting setup
When to read: Use this as a project management guide for implementation.
File: stream-processor-api-reference.md
Quick reference guide with code examples:
- Quick start examples
- Window type usage
- Aggregation function APIs
- Watermarking strategies
- State management operations
- Kafka integration
- Pipeline building
- Error handling
- Configuration examples
- Performance tuning
- Troubleshooting guide
When to read: Reference this while coding or integrating with the stream processor.
File: stream-processor-diagrams.md
Visual representations of:
- System context and integration
- Processing pipeline flow
- Window type visualizations
- Watermark and late event handling
- State management architecture
- Checkpoint and recovery flow
- Aggregation engine details
- Parallelism and partitioning
- Error handling decision tree
- Deployment architecture
- Metrics and monitoring
When to read: Use for visual understanding and presentations.
Start with the Architecture Design to understand:
- How events flow through the system
- What data structures are used
- How windows and aggregations work
Check the Visual Diagrams to see:
- How components interact
- How data flows through the pipeline
- How windows work visually
Look at the API Reference for:
- Code examples
- Configuration options
- Common usage patterns
Use the Implementation Plan to:
- Understand development phases
- Track progress
- Identify dependencies
Time-based groupings of events for aggregation:
- Tumbling: Fixed-size, non-overlapping (e.g., hourly aggregations)
- Sliding: Fixed-size, overlapping (e.g., moving averages)
- Session: Dynamic, gap-based (e.g., user sessions)
Statistical functions computed over windows:
- Basic: Count, sum, average, min, max
- Advanced: Percentiles (p50, p95, p99), standard deviation
Mechanism for handling out-of-order events:
- Track progress in event time
- Allow late events within configured lateness
- Trigger window computations
Persistent storage of window accumulators:
- In-memory (development)
- RocksDB (production)
- Checkpointing for fault tolerance
Events (Kafka)
│
▼
┌───────────────────────────────────────┐
│ Stream Processor │
│ │
│ Watermark → Windows → Aggregation │
│ ↕ │
│ State (RocksDB) │
└───────────────────────────────────────┘
│
▼
Aggregated Metrics
│
├─→ Analyzer Engine
├─→ Decision Engine
└─→ Dashboard
The processor is organized into these modules:
crates/processor/src/
├── core/ # Event wrappers and core types
├── window/ # Window assignment and triggers
├── aggregation/ # Aggregation functions
├── watermark/ # Watermark generation
├── state/ # State persistence
├── pipeline/ # Processing pipeline
├── kafka/ # Kafka integration
├── metrics/ # Observability
├── error.rs # Error types
└── config.rs # Configuration
processor:
parallelism: 4
buffer_size: 10000
checkpoint_interval_secs: 60
windows:
default_type: "tumbling"
tumbling:
size_secs: 300 # 5 minutes
allowed_lateness_secs: 60
watermark:
strategy: "bounded_out_of_orderness"
max_out_of_orderness_secs: 30
aggregation:
enabled: ["count", "sum", "average", "min", "max", "percentile", "stddev"]
percentiles: [50, 95, 99]
kafka:
consumer:
bootstrap_servers: "localhost:9092"
group_id: "processor-group"
topics: ["feedback-events"]
producer:
result_topic: "aggregated-metrics"use llm_optimizer_processor::*;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Create pipeline
let pipeline = create_pipeline::<FeedbackEvent>()
.with_kafka_source(kafka_config)
.with_watermark_generator(
BoundedOutOfOrdernessWatermark::new(Duration::from_secs(30))
)
.with_tumbling_window(Duration::from_secs(300))
.with_state_backend(state_backend)
.with_sink(kafka_sink)
.build()?;
// Run
pipeline.run().await?;
Ok(())
}| Metric | Target | Notes |
|---|---|---|
| Throughput | 10,000+ events/sec | Per processor instance |
| Latency (p99) | <100ms | End-to-end processing |
| State Recovery | <30 seconds | From checkpoint |
| Memory Usage | <2GB | For 100k active windows |
| Availability | 99.9% | With proper deployment |
- Phase 1-2 (Weeks 1-3): Core foundation and window management
- Phase 3-4 (Weeks 3-5): Watermarking and aggregation engine
- Phase 5-6 (Weeks 5-7): State management and Kafka integration
- Phase 7-8 (Weeks 7-8): Pipeline and observability
- Phase 9-10 (Weeks 9-11): Testing and production hardening
Total: 11 weeks (2.5 months)
- Core types and utilities
- Window assignment logic
- Aggregation functions
- Watermark generation
- End-to-end processing
- Late event handling
- State recovery
- Multi-window scenarios
- Throughput benchmarks
- Latency benchmarks
- Memory profiling
- State backend performance
cargo run --bin processor -- --config config/processor-dev.yamlkubectl apply -f k8s/processor-deployment.yaml
kubectl apply -f k8s/processor-service.yaml
kubectl apply -f k8s/processor-pvc.yaml- Events processed per second
- Processing latency (p50, p95, p99)
- Kafka consumer lag
- State size (MB)
- Checkpoint duration
- Error rate
- Real-time processing metrics
- Window statistics
- State size trends
- Error tracking
- Consumer lag > 10,000 messages
- p99 latency > 1 second
- Error rate > 1%
- State size > 10GB
- Increase
parallelismin configuration - Increase
max_poll_recordsfor Kafka consumer - Scale horizontally (add more instances)
- Reduce
allowed_lateness_secs - More frequent checkpoints
- Review window sizes
- Increase
allowed_lateness_secs - Increase
max_out_of_orderness_secs - Review event timestamp accuracy
See the Implementation Plan for:
- Current development status
- Open tasks and issues
- How to contribute
- Architecture Design - Complete technical design
- Implementation Plan - Development roadmap
- API Reference - Code examples and usage
- Visual Diagrams - Architecture diagrams
- Apache Kafka
- RocksDB
- Apache Flink Concepts - Similar streaming concepts
- Dataflow Model - Streaming systems theory
Q: What's the difference between tumbling and sliding windows? A: Tumbling windows are non-overlapping (each event belongs to one window), while sliding windows overlap (each event can belong to multiple windows). Use tumbling for simple periodic aggregations, sliding for moving averages.
Q: How does watermarking work?
A: Watermarks track progress in event time. They're calculated as max_event_time - max_out_of_orderness. Events arriving before the watermark minus allowed lateness are dropped.
Q: Can I use multiple aggregation functions?
A: Yes! The AggregationState computes all enabled aggregations simultaneously. Configure which ones to enable in the config file.
Q: How do I scale the processor? A: Scale horizontally by adding more processor instances. Each instance consumes from different Kafka partitions, so parallelism is limited by the number of partitions.
Q: What happens if the processor crashes? A: The processor automatically recovers from the last checkpoint. It restores state from RocksDB and resumes Kafka consumption from the checkpointed offsets.
Q: How do I monitor the processor?
A: The processor exposes Prometheus metrics at /metrics. Use Grafana dashboards to visualize metrics and set up alerts in Alertmanager.
For questions or issues:
- Check this documentation
- Review the API Reference
- Check the Troubleshooting section
- Open an issue in the repository
Next Steps: Start with the Architecture Design to understand the complete system.