feat(rust): steering, capture, and named steering modules in the rust frontend #187
Open
RhizoNymph wants to merge 3 commits into
What
Adds per-request activation steering and end-to-end activation capture to the Rust frontend, plus named steering modules (startup load + runtime management), across both inbound surfaces: OpenAI HTTP (`/v1/completions` + `/v1/chat/completions`) and the served gRPC `Generate` service.
Per-request steering
- `steering_vectors` / `prefill_steering_vectors` / `decode_steering_vectors` (packed base64 wire format matching the Python OpenAI API), `steering_name`.
- The packed fields (`dtype` / `shape` / `layer_indices` / `data` / `scales`) are decoded into the inline `SteeringVectorSpec` that engine-core resolves (float32/float16/bfloat16/float64, per-row scales). Layer keys serialize as integer msgpack keys (Python `dict[int, ...]`). `steering_name` lowers to `steering_module_ref = (name, 1.0)`.
Named steering modules
- `--steering-modules name=path` loads module JSON at startup and broadcasts it to the engine workers (`register_steering_modules` + `pre_materialize_steering_module`).
- Runtime management via `GET`/`POST`/`DELETE /v1/steering/modules`. Per-request `steering_name` is validated up front (unknown → `invalid_request` / `NotFound`); mutations are serialized by a lock.
Activation capture (end-to-end)
- `capture` spec accepted on HTTP + gRPC, forwarded verbatim into `SamplingParams.capture`; engine-core's offline admission resolves prefix-cache flags.
- `EngineCoreOutput` is an `array_like` tuple with `capture_results` at index 7; the Rust `EngineCoreOutput` predated that field, so capture-enabled outputs misaligned or failed to decode. Added a `CaptureResult` type and inserted `capture_results` at the correct tuple index, threaded it (alongside `kv_transfer_params`) through `llm` → `text` → `chat`, and surfaced it on the non-streaming completion/chat responses and the gRPC `FinishInfo`, coercing each consumer payload to an object (mirror of Python's `_capture_result_to_response_payload`). Streaming responses omit `capture_results`, matching Python (capture still executes; consumers write out of band).
- The `python_compat.py` fixture mirror was updated to include `capture_results`, keeping the Rust↔Python wire-format test faithful; a new wire test pins the index-7 position.
Why
The Rust frontend replaces vLLM's Python OpenAI entrypoint; these are the steering/capture surfaces clients rely on. Per-request resolution/admission stays in the engine-core; the frontend decodes the wire format, forwards, manages the named-module registry, and surfaces capture results.
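For reviewers unfamiliar with the packed format, the "decodes the wire format" step might look roughly like the sketch below. It is not the code in this PR: `PackedSteering` and `unpack_f32` are hypothetical names, it only covers `float32`, it starts from bytes that have already been base64-decoded, and it assumes little-endian row-major data with one `layer_indices` entry per row. Applying the per-row scale at decode time is also an assumption; the real frontend may forward scales untouched for engine-core to apply.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory form of one packed steering entry.
/// Field names follow the PR description; the types are assumptions.
#[derive(Debug, PartialEq)]
pub struct PackedSteering {
    pub shape: (usize, usize),    // (rows, hidden_size)
    pub layer_indices: Vec<u32>,  // assumed: one layer index per row
    pub data: Vec<u8>,            // raw bytes, already base64-decoded
    pub scales: Option<Vec<f32>>, // optional per-row scales
}

/// Decode float32 rows into a map keyed by integer layer index.
/// Assumes little-endian, row-major layout (an assumption, not confirmed by the PR).
pub fn unpack_f32(p: &PackedSteering) -> Result<BTreeMap<u32, Vec<f32>>, String> {
    let (rows, hidden) = p.shape;
    if p.layer_indices.len() != rows {
        return Err(format!(
            "layer_indices has {} entries, shape expects {rows}",
            p.layer_indices.len()
        ));
    }
    let expected = rows
        .checked_mul(hidden)
        .and_then(|n| n.checked_mul(4))
        .ok_or_else(|| "shape overflows usize".to_string())?;
    if p.data.len() != expected {
        return Err(format!("data has {} bytes, expected {expected}", p.data.len()));
    }
    if let Some(scales) = &p.scales {
        if scales.len() != rows {
            return Err(format!("scales has {} entries, expected {rows}", scales.len()));
        }
    }

    let row_bytes = hidden * 4;
    let mut out = BTreeMap::new();
    for (row, &layer) in p.layer_indices.iter().enumerate() {
        // Missing scales are treated as 1.0 (assumption).
        let scale = p.scales.as_ref().map_or(1.0, |s| s[row]);
        let start = row * row_bytes;
        let values: Vec<f32> = p.data[start..start + row_bytes]
            .chunks_exact(4)
            .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]) * scale)
            .collect();
        if out.insert(layer, values).is_some() {
            return Err(format!("duplicate layer index {layer}"));
        }
    }
    Ok(out)
}
```

Keying the result by `u32` mirrors the integer-keyed map the engine side expects (Python `dict[int, ...]`); validating lengths before slicing keeps a malformed request from panicking the frontend.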
Notes / deferred
`_admit_capture` spec validation is deferred to the engine-core here, so a malformed capture spec surfaces as an engine error rather than a `400` up front.
See `rust/docs/features/steering_capture.md` for the full data flow (incl. the capture return path), file map, and invariants.
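To illustrate why the tuple position matters for the capture return path, here is a minimal sketch of positional decoding. It is a toy model, not the PR's `EngineCoreOutput`: `Value` and `capture_results` are hypothetical stand-ins for a msgpack value type and an accessor, the other tuple slots are left as placeholders because their names are not given here, and treating a short array or a nil slot as "no capture results" is an assumption.

```rust
/// Hypothetical stand-in for a decoded msgpack value.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Nil,
    Str(String),
    Array(Vec<Value>),
}

/// The position stated in the PR description.
pub const CAPTURE_RESULTS_INDEX: usize = 7;

/// Read `capture_results` from an array-like output by position.
/// An array-like struct has no field names on the wire, so a decoder that
/// does not know about index 7 would read a later field from that slot.
pub fn capture_results(fields: &[Value]) -> Option<&Value> {
    match fields.get(CAPTURE_RESULTS_INDEX) {
        // Assumption: a short array or a nil slot both mean "no results".
        None | Some(Value::Nil) => None,
        Some(v) => Some(v),
    }
}
```

A wire test in this style would build an array with a marker value at index 7 and assert that the accessor returns it, which is the kind of check that pins the position against future field insertions.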