Releases: alityb/hotpath

hotpath

07 Apr 18:47

hotpath

06 Apr 07:23

frontend work

hotpath

06 Apr 02:09

KV Cache fix

hotpath

06 Apr 01:22

Fix log detection

hotpath 0.3.1

06 Apr 00:28

hotpath 0.3.1 is a small cleanup patch on top of 0.3.0.

Highlights:

  • metrics-only serve-profile runs now say explicitly that no traffic file was provided and that per-request queue, prefill, and decode timing require requests during the capture window
  • when a server log is found but a run observed zero requests, hotpath now reports that no requests were observed during the profiling run instead of blaming the vLLM log format
  • logs.md was removed from tracked release contents and is local-only again
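
The two diagnostics above amount to choosing a note based on what the run actually saw. A minimal sketch of that decision, assuming a hypothetical helper named `timing_gap_note`; the precedence of the checks, the message wording, and the final fallback message are illustrative assumptions, not hotpath's actual code or output:

```python
from typing import Optional


def timing_gap_note(traffic_file_given: bool, log_found: bool,
                    requests_seen: int) -> Optional[str]:
    """Pick a note explaining why per-request timing is missing (hypothetical)."""
    if requests_seen > 0:
        # Requests were captured, so there is nothing to explain.
        return None
    if not traffic_file_given:
        # Metrics-only run: nothing was replayed against the server.
        return ("No traffic file was provided; per-request queue, prefill, "
                "and decode timing require requests during the capture window.")
    if log_found:
        # The log exists and is readable; it simply contains no requests.
        return "No requests were observed during the profiling run."
    # Assumed fallback, not described in the release notes.
    return "No server log was found, so per-request timing is unavailable."
```

The point of the ordering is that an empty capture window is reported as such, and the log format is only ever suspected when requests were expected but could not be parsed.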

Focused verification run before release:

  • cmake --build build --parallel && ctest --test-dir build -R 'test_serve_profiler|test_serve_report|test_cli|test_audit' --output-on-failure
  • hotpath version (prints hotpath 0.3.1)

hotpath 0.3.0

06 Apr 00:18

hotpath 0.3.0 focuses on serving-timing correctness and the local profiling flow.

Highlights:

  • serve-profile now honors configured concurrency and stops dispatching new requests when the requested duration window closes.
  • client TTFT is now measured from the first streamed token chunk instead of HTTP first-byte timing.
  • local vLLM DEBUG log autodiscovery now follows the real live listener stdout/stderr files instead of stale .hotpath/video-server/* logs.
  • vLLM v1 parsing now merges exact request IDs from Added request ... lines with anonymous Running batch ... BatchDescriptor(...) lines.
  • vLLM v1 server queue, prefill, and decode timings are now refined from Prometheus when the raw DEBUG timestamps are only second-resolution, and the report explicitly discloses that.
  • serve-report no longer renders missing timing as fake 0.0 ms values.
  • the live serve-profile dashboard redraw path was hardened for Ghostty-style terminals so updates happen in place instead of stacking duplicate frames.
  • logs.md is now included in the repo with a detailed UTC log of this pass.
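
To illustrate the TTFT change above: HTTP first-byte timing can fire on response headers or on a role-only chunk, before any token has been generated. A minimal sketch, assuming an OpenAI-style server-sent-events stream (`data: {...}` lines with `choices[].text` or `choices[].delta.content`); the function names are hypothetical and this is not hotpath's implementation:

```python
import json
import time
from typing import Callable, Iterable, Optional


def has_token_text(sse_line: str) -> bool:
    """Return True if an SSE line carries non-empty generated text."""
    if not sse_line.startswith("data:"):
        return False
    payload = sse_line[len("data:"):].strip()
    if not payload or payload == "[DONE]":
        return False
    try:
        event = json.loads(payload)
    except json.JSONDecodeError:
        return False
    if not isinstance(event, dict):
        return False
    for choice in event.get("choices", []):
        # Completions-style chunks use "text"; chat-style chunks use "delta.content".
        text = choice.get("text") or (choice.get("delta") or {}).get("content")
        if text:
            return True
    return False


def measure_ttft(lines: Iterable[str],
                 clock: Callable[[], float] = time.monotonic) -> Optional[float]:
    """Seconds from the start of iteration to the first chunk containing token text."""
    start = clock()
    for line in lines:
        if has_token_text(line):
            return clock() - start
    return None  # The stream ended without producing any token text.
```

In real use, `lines` would be the decoded line iterator of a streaming HTTP response and `start` would be taken just before the request is sent; the injectable `clock` is only there to make the sketch easy to test.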

Focused verification run before release:

  • cmake --build build --parallel
  • ctest --test-dir build -R 'test_log_parser|test_serve_profiler|test_serve_report|test_traffic_replayer|test_cli|test_audit' --output-on-failure
  • live local smoke test against Qwen/Qwen3.5-4B on localhost:8000

v0.2.9

05 Apr 23:14

serve-profile timing correlation and startup hardening.

Highlights:

  • inject stable hotpath request IDs and canonicalize vLLM v1 randomized internal IDs back to those external IDs
  • parse only the current run log tail so old requests do not contaminate timing correlation
  • wait through a short external endpoint startup grace window before failing localhost demo runs
  • start the Qwen demo server with --enable-log-requests and fail fast if port 8000 is already owned by another process
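
One common way to parse only the current run's portion of a shared log is to remember the file size when the run starts and read from that offset afterwards. The sketch below shows that approach under stated assumptions: the helper names are hypothetical, and the release notes do not say whether hotpath uses byte offsets, timestamps, or something else to find the tail.

```python
import os
from typing import List


def mark_log_offset(path: str) -> int:
    """Record the current end of the log so later reads can skip older lines."""
    return os.path.getsize(path) if os.path.exists(path) else 0


def read_log_tail(path: str, offset: int) -> List[str]:
    """Return only the lines written after `offset` was recorded."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size < offset:
            # The file shrank (truncated or rotated), so start from the top.
            offset = 0
        f.seek(offset)
        return f.read().decode("utf-8", errors="replace").splitlines()
```

Calling `mark_log_offset` before dispatching traffic and `read_log_tail` after the run keeps requests from earlier runs out of the timing correlation.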

hotpath v0.2.8

05 Apr 23:01

serve-profile terminal redraw fix.

  • replaced the live serve-profile dashboard DEC cursor save and restore path with explicit cursor-up redraws
  • fixes duplicated dashboard frames in Ghostty and similar terminals during live profiling
  • keeps the existing live dashboard behavior while using a more predictable ANSI redraw path
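
The redraw change above can be sketched as follows. Instead of saving and restoring the cursor (`ESC 7` / `ESC 8`), the renderer remembers how many lines the previous frame used, moves up that many rows with `CSI n A`, erases from there to the end of the screen with `CSI J`, and prints the new frame. The class name `InPlaceFrame` is hypothetical and the sketch is not hotpath's code:

```python
import sys
from typing import List, TextIO


class InPlaceFrame:
    """Redraw a multi-line frame in place using explicit cursor-up moves."""

    def __init__(self, stream: TextIO = sys.stdout) -> None:
        self.stream = stream
        self.prev_lines = 0  # Number of lines written by the previous frame.

    def draw(self, lines: List[str]) -> None:
        parts = []
        if self.prev_lines:
            # CSI n A: move the cursor up to the first row of the old frame.
            parts.append(f"\x1b[{self.prev_lines}A")
        # Return to column 0 and erase from the cursor to the end of the screen.
        parts.append("\r\x1b[J")
        parts.extend(line + "\n" for line in lines)
        self.stream.write("".join(parts))
        self.stream.flush()
        self.prev_lines = len(lines)
```

This relies on each frame line fitting within the terminal width; a line that wraps occupies more rows than the counter assumes, so a real implementation would truncate or measure lines first.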

hotpath v0.2.7

05 Apr 22:51

vLLM v1 timing correctness and reporting fixes.

  • serve-profile now uses vLLM 0.19 queue, prefill, and decode histograms to refine ORDER-matched v1 logs
  • reused --output directories now start from a fresh serve_profile.db, so old request traces do not leak into new runs
  • serve-report now shows sub-millisecond latency values when needed, so tiny queue waits like 0.013 ms do not round down to 0.0
  • fixed server_timing_match_method persistence and updated the report note for Prometheus-assisted v1 timing
  • expanded regression coverage for vLLM 0.19 histogram parsing and serve-report formatting
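
The sub-millisecond formatting fix can be illustrated with a small formatter that widens its precision for tiny values and keeps missing values distinct from zero. This is a sketch with a hypothetical function name; the 0.1 ms threshold, the three-decimal precision, and the `n/a` placeholder are assumptions rather than serve-report's actual rules:

```python
from typing import Optional


def format_latency_ms(value_ms: Optional[float]) -> str:
    """Format a latency in milliseconds without hiding tiny or missing values."""
    if value_ms is None:
        # Missing timing is reported as missing, never as a fake zero.
        return "n/a"
    if value_ms == 0:
        return "0.0 ms"
    if abs(value_ms) < 0.1:
        # Sub-millisecond values get extra decimals so they do not round to 0.0.
        return f"{value_ms:.3f} ms"
    return f"{value_ms:.1f} ms"
```

Values smaller than about 0.0005 ms would still print as `0.000 ms` with this precision; a stricter version could switch units or use significant digits instead.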

v0.2.6

05 Apr 22:08

serve-profile UX fixes for the local video flow.

Highlights:

  • auto-discover a local vLLM debug log under .hotpath/video-server/ when --server-log is not passed
  • prompt for Concurrency in the interactive serve-profile flow
  • forward the auto-discovered log path from the interactive UI so queue, prefill, and decode timing work in the common localhost demo setup
  • add a regression test for local server log autodiscovery
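
Log autodiscovery of this kind can be sketched as "use the explicit path if one was given, otherwise pick the newest file in the known directory". The function name is hypothetical, and choosing by modification time with no filename filter is an assumption; the release notes only state that a log under .hotpath/video-server/ is discovered when --server-log is not passed.

```python
from pathlib import Path
from typing import Optional


def discover_server_log(root: Path, explicit: Optional[str] = None) -> Optional[Path]:
    """Resolve the server log path, preferring an explicitly supplied one."""
    if explicit:
        return Path(explicit)
    log_dir = root / ".hotpath" / "video-server"
    if not log_dir.is_dir():
        return None
    candidates = [p for p in log_dir.iterdir() if p.is_file()]
    if not candidates:
        return None
    # Assume the most recently modified file belongs to the running server.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Picking by modification time is also where this approach can go wrong: a stale file touched after the server started would win, which is the kind of problem the 0.3.0 notes describe fixing by following the live listener's stdout/stderr files.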