Releases: alityb/hotpath
hotpath 0.3.1
hotpath 0.3.1 is a small cleanup patch on top of 0.3.0.
Highlights:
- metrics-only serve-profile runs now say explicitly that no traffic file was provided and that per-request queue, prefill, and decode timing require requests during the capture window
- when a server log is found but a run observed zero requests, hotpath now reports that no requests were observed during the profiling run instead of blaming the vLLM log format
- logs.md was removed from tracked release contents and is local-only again
Focused verification run before release:
- cmake --build build --parallel && ctest --test-dir build -R 'test_serve_profiler|test_serve_report|test_cli|test_audit' --output-on-failure
- hotpath version -> hotpath 0.3.1
hotpath 0.3.0
hotpath 0.3.0 focuses on serving-timing correctness and the local profiling flow.
Highlights:
- serve-profile now honors configured concurrency and stops dispatching new requests when the requested duration window closes.
- client TTFT is now measured from the first streamed token chunk instead of HTTP first-byte timing.
- local vLLM DEBUG log autodiscovery now follows the real live listener stdout/stderr files instead of stale .hotpath/video-server/* logs.
- vLLM v1 parsing now merges exact request IDs from Added request ... lines with anonymous Running batch ... BatchDescriptor(...) lines.
- vLLM v1 server queue, prefill, and decode timings are now refined from Prometheus when the raw DEBUG timestamps are only second-resolution, and the report explicitly discloses that.
- serve-report no longer renders missing timing as fake 0.0 ms values.
- the live serve-profile dashboard redraw path was hardened for Ghostty-style terminals so updates happen in place instead of stacking duplicate frames.
- logs.md is now included in the repo with a detailed UTC log of this pass.
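
To illustrate the TTFT change in the 0.3.0 highlights, here is a minimal Python sketch of measuring time to first token from the first streamed chunk that actually carries token text, rather than from the first byte of the HTTP response. It is not hotpath's implementation. It assumes an OpenAI-style SSE stream in which each `data:` line holds a JSON chunk with `choices[0].delta.content`; the names `first_token_time` and `ttft_ms` and the `(timestamp, line)` input shape are hypothetical.

```python
import json
from typing import Iterable, Optional, Tuple


def first_token_time(events: Iterable[Tuple[float, str]]) -> Optional[float]:
    """Return the arrival time of the first SSE line that carries token text.

    `events` is a hypothetical sequence of (arrival_time_seconds, raw_line)
    pairs. Role-only chunks, keep-alives, and the [DONE] marker are skipped,
    so the result can be later than the HTTP first-byte time.
    """
    for arrived_at, line in events:
        if not line.startswith("data:"):
            continue  # SSE comments, blank separators, other fields
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":
            continue
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # this sketch simply ignores malformed chunks
        if not isinstance(chunk, dict):
            continue
        choices = chunk.get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta") or {}
        if delta.get("content"):
            return arrived_at
    return None


def ttft_ms(sent_at: float, events: Iterable[Tuple[float, str]]) -> Optional[float]:
    """TTFT in milliseconds, or None when no token chunk was observed."""
    first = first_token_time(events)
    return None if first is None else (first - sent_at) * 1000.0
```

Returning `None` when no token chunk arrives keeps a missing measurement distinct from a measured zero, which matches the serve-report change in the same release.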
Focused verification run before release:
- cmake --build build --parallel
- ctest --test-dir build -R 'test_log_parser|test_serve_profiler|test_serve_report|test_traffic_replayer|test_cli|test_audit' --output-on-failure
- live local smoke against Qwen/Qwen3.5-4B on localhost:8000
v0.2.9
serve-profile timing correlation and startup hardening.
Highlights:
- inject stable hotpath request IDs and canonicalize vLLM v1 randomized internal IDs back to those external IDs
- parse only the current run log tail so old requests do not contaminate timing correlation
- wait through a short external endpoint startup grace window before failing localhost demo runs
- start the Qwen demo server with --enable-log-requests and fail fast if port 8000 is already owned by another process
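
The "parse only the current run log tail" item can be sketched as follows: record the log's size before the run starts, then read only the bytes appended after that offset. This is an illustrative Python sketch, not hotpath's code. The helper names are hypothetical, and it assumes the server appends to a single log file; if the file is shorter than the recorded offset, the sketch treats it as truncated or replaced and reads it from the start.

```python
import os
from typing import List


def log_size(path: str) -> int:
    """Current size of the log in bytes, or 0 if it does not exist yet."""
    try:
        return os.path.getsize(path)
    except OSError:
        return 0


def read_tail_since(path: str, start_offset: int) -> List[str]:
    """Return only the lines appended after `start_offset` bytes."""
    if log_size(path) < start_offset:
        start_offset = 0  # file was truncated or replaced since the mark
    try:
        with open(path, "rb") as fh:
            fh.seek(start_offset)
            data = fh.read()
    except FileNotFoundError:
        return []
    return data.decode("utf-8", errors="replace").splitlines()
```

A caller would take `mark = log_size(path)` just before dispatching traffic and call `read_tail_since(path, mark)` afterwards, so requests logged by earlier runs never reach the correlation step.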
hotpath v0.2.8
serve-profile terminal redraw fix.
- replaced the live serve-profile dashboard DEC cursor save and restore path with explicit cursor-up redraws
- fixes duplicated dashboard frames in Ghostty and similar terminals during live profiling
- keeps the existing live dashboard behavior while using a more predictable ANSI redraw path
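
An explicit cursor-up redraw can be sketched like this. The Python below is an illustration of the general technique, not hotpath's dashboard code; the function name and the frame representation are hypothetical. It uses only two standard ANSI control sequences: `ESC [ n A` (cursor up n rows) and `ESC [ 2 K` (erase the entire line).

```python
from typing import List


def redraw_frame(lines: List[str], previous_height: int) -> str:
    """Build the text that overwrites the previously drawn frame in place.

    `previous_height` is the number of rows the last frame occupied
    (0 for the first frame). After writing the result, the caller should
    remember len(lines) as the new height.
    """
    out = []
    if previous_height > 0:
        out.append(f"\x1b[{previous_height}A")  # move up to the frame's first row
    for line in lines:
        out.append("\r\x1b[2K")  # go to column 0 and erase the old row
        out.append(line)
        out.append("\n")
    extra = previous_height - len(lines)
    if extra > 0:
        # The new frame is shorter: blank the leftover rows, then move back up
        # so the cursor again sits just below the new frame.
        out.append("\r\x1b[2K\n" * extra)
        out.append(f"\x1b[{extra}A")
    return "".join(out)
```

Because the cursor position is derived from a row count the program tracks itself, this approach does not depend on how a terminal implements cursor save and restore.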
hotpath v0.2.7
vLLM v1 timing correctness and reporting fixes.
- serve-profile now uses vLLM 0.19 queue, prefill, and decode histograms to refine ORDER-matched v1 logs
- reused --output directories now start from a fresh serve_profile.db, so old request traces do not leak into new runs
- serve-report now shows sub-millisecond latency values when needed, so tiny queue waits like 0.013 ms do not round down to 0.0
- fixed server_timing_match_method persistence and updated the report note for Prometheus-assisted v1 timing
- expanded regression coverage for vLLM 0.19 histogram parsing and serve-report formatting
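
The sub-millisecond formatting item can be illustrated with a short Python sketch. This is not the serve-report formatter; the function name, the precision thresholds, and the `n/a` placeholder for a missing value are all assumptions chosen for the example.

```python
from typing import Optional


def format_latency_ms(value_ms: Optional[float]) -> str:
    """Format a latency so small values do not collapse to 0.0 ms."""
    if value_ms is None:
        return "n/a"  # assumed placeholder for a missing measurement
    if value_ms == 0 or abs(value_ms) >= 1.0:
        return f"{value_ms:.1f} ms"  # ordinary values: one decimal place
    if abs(value_ms) >= 0.001:
        return f"{value_ms:.3f} ms"  # sub-millisecond: three decimal places
    return f"{value_ms:.3g} ms"  # tinier still: three significant digits
```

With this rule, a queue wait of 0.013 ms is rendered as `0.013 ms`, while 12.34 ms is rendered as `12.3 ms`.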
v0.2.6
serve-profile UX fixes for the local video flow.
Highlights:
- auto-discover a local vLLM debug log under .hotpath/video-server/ when --server-log is not passed
- prompt for Concurrency in the interactive serve-profile flow
- forward the auto-discovered log path from the interactive UI so queue, prefill, and decode timing work in the common localhost demo setup
- add a regression test for local server log autodiscovery
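
One way to picture the autodiscovery fallback described above is the Python sketch below. It is an assumption-laden illustration, not hotpath's logic: the function name is hypothetical, and both the `.log` suffix filter and the "newest modification time wins" rule are guesses about a reasonable selection policy. Only the `.hotpath/video-server/` directory and the precedence of an explicit `--server-log` path come from the notes.

```python
import os
from typing import Optional


def discover_server_log(
    explicit_path: Optional[str],
    search_dir: str = ".hotpath/video-server",
) -> Optional[str]:
    """Prefer an explicitly passed log; otherwise pick the newest *.log file."""
    if explicit_path:
        return explicit_path  # a user-supplied path always wins
    try:
        names = os.listdir(search_dir)
    except OSError:
        return None  # directory missing or unreadable: nothing to discover
    candidates = [
        os.path.join(search_dir, name)
        for name in names
        if name.endswith(".log") and os.path.isfile(os.path.join(search_dir, name))
    ]
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)
```

Returning `None` rather than raising lets the caller continue with a metrics-only run when no log is available.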