Skip to content

Server-side histogram binning + grid/sort perf fixes#221

Open
AlexGrayBox wants to merge 2 commits into
v1.2.3---UI-Optimization-and-new-featuresfrom
histogram-server-binning
Open

Server-side histogram binning + grid/sort perf fixes#221
AlexGrayBox wants to merge 2 commits into
v1.2.3---UI-Optimization-and-new-featuresfrom
histogram-server-binning

Conversation

@AlexGrayBox

Copy link
Copy Markdown
Member

What

Moves histogram binning to the server behind a typed GetHistogram RPC, and fixes grid reader/sort stalls during active training.

Histogram (server-side binning)

  • New GetHistogram RPC + typed HistogramBin/HistogramSubBar/HistogramRequest/HistogramResponse proto messages (regenerated _pb2/_pb2_grpc).
  • DataService.GetHistogram: equal-population positional binning over the view df; returns <= max_bins typed bins with per-bin min/max/avg/count + origin/discarded sub-bars.
  • Bit-identical to the previous client-side binning (cross-check ~1e-7, float32 ε); ~24KB payload vs MBs; ~59x faster than the client fallback.

Grid / sort perf

  • ApplyDataQuery sort fast-path: sort-only queries skip the forced full _slowUpdateInternals rebuild (~7.5–15.5s → ~0.5s under training).
  • Non-blocking background view refresh (_bg_view_refresh on a WL-ViewRefresh thread + _refresh_in_flight lock): readers no longer block on the 2–4s rebuild (p95 ~3000ms → ~130ms).

Example config

  • Loosened eval/dump cadence and bumped test batch size so the MNIST example isn't eval-bound.

Notes

  • The UI counterpart is GrayboxTech/weights_studio# (server path primary, client binning kept as fallback).
  • Lint job only checks changed .py; data_service.py/trainer_services.py are excluded by the CI path filter, test_grpc_user_actions.py (the one lintable change) is clean.

🤖 Generated with Claude Code

- GetHistogram RPC: bin one column server-side into <=512 typed bins
  (min/max/avg/count + per-(origin,discarded) sub-bars) instead of the client
  pulling every row and binning in the browser. Bit-identical to the client
  binning; ~116x smaller payload, ~50ms warm. Adds the proto messages + RPC,
  regenerated pb2/pb2_grpc, DataService.GetHistogram, and servicer delegation.
- ApplyDataQuery: skip the forced full-view rebuild for SORT-ONLY operations
  (a sort just re-orders the existing snapshot). Global sort ~7.5s -> ~0.5s.
- _slowUpdateInternals: run the view rebuild on a background thread for
  reader-triggered (non-force) refreshes, so grid/histogram reads never block
  on the multi-second collapse+combine. Reader p95 ~3000ms -> ~130ms. Filters/
  resets still refresh inline (need fresh data).
- ws-classification example: loosen eval (100->500) / checkpoint (25->250)
  cadence and use a bigger eval batch (16->128) so eval stops dominating
  wall-clock.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@AlexGrayBox AlexGrayBox force-pushed the histogram-server-binning branch from 59a5776 to 59318e9 Compare June 20, 2026 15:42
@guillaume-byte guillaume-byte changed the base branch from dev to v1.2.3---UI-Optimization-and-new-features June 22, 2026 08:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants