Add a min-requests constraint by sjmonson · Pull Request #700 · vllm-project/guidellm

sjmonson · 2026-04-20T20:49:58Z

Summary

Adds a --min-requests constraint that acts like --max-requests but keeps scheduling requests until the last request under the threshold completes.

Details

The normal --max-requests variant can have unexpectedly low per-request throughput / latency due to request trail-off at the end of the benchmark. The current workaround is to set --max-requests so high that the proportion of trail-off time to total benchmark time is small. If we continue to schedule requests even after hitting the constraint, we ensure that the requested rate is maintained for the entire duration of measurement.

Note that --min-requests is a bit of a misnomer when combined with other constraints, since any other active constraints can trigger the benchmark to end before --min-requests. Other name suggestions are welcome.
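
The trail-off described above can be illustrated with a toy model. The sketch below is not GuideLLM code: `simulate`, its parameters, and the fixed one-second service time are assumptions made purely for illustration. It models a closed loop with a fixed number of concurrent slots and records how many requests are in flight during each wave until the threshold number of requests has completed.

```python
import heapq


def simulate(concurrency, threshold, keep_scheduling, service_time=1.0):
    """Toy closed-loop benchmark (hypothetical, not GuideLLM's scheduler).

    keep_scheduling=False mimics a max-style limit: stop creating requests
    once `threshold` have been created.
    keep_scheduling=True mimics a min-style limit: keep creating requests
    until `threshold` have completed.
    Returns (total requests created, in-flight count observed per wave).
    """
    in_flight = []  # min-heap of completion times
    created = 0
    completed = 0
    now = 0.0
    levels = []
    while completed < threshold:
        # Fill free slots while scheduling is still allowed.
        while len(in_flight) < concurrency and (
            keep_scheduling or created < threshold
        ):
            heapq.heappush(in_flight, now + service_time)
            created += 1
        levels.append(len(in_flight))
        # Jump to the next completion time and retire everything due then.
        now = in_flight[0]
        while in_flight and in_flight[0] == now:
            heapq.heappop(in_flight)
            completed += 1
    return created, levels


max_style_run = simulate(concurrency=30, threshold=50, keep_scheduling=False)
min_style_run = simulate(concurrency=30, threshold=50, keep_scheduling=True)
```

With 30 slots and a threshold of 50 (the numbers used in the test plan below), the max-style run ends with a wave of only 20 in-flight requests, while the min-style run keeps all 30 slots busy and finishes 60 requests in total.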

Test Plan

Here is an example benchmark which should cause --min-requests to behave differently from --max-requests:

guidellm benchmark run \
    --target http://127.0.0.1:8000 \
    --request-format /v1/completions \
    --profile concurrent \
    --rate 30 \
    --data prompt_tokens=50,output_tokens=50 \
    --min-requests 50

"I certify that all code in this PR is my own, except as noted below."

Use of AI

Includes AI-assisted code completion
Includes code generated by an AI application
Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

Signed-off-by: Samuel Monson <smonson@redhat.com>

jaredoconnell

I'm not 100% sure of the arg name. I can see arguments both ways.

But I think this would be a good opportunity to add a markdown file detailing all of the constraints.

I added two comments.

jaredoconnell · 2026-04-20T22:01:03Z

+    """
+    Constraint that limits execution based on minimum request counts.
+
+    Like MinNumberConstraint but instead of stopping request generation after reaching


Mistake: It should say "Like MaxNumberConstraint"

I think this wording doesn't emphasize the nuances of this implementation enough. Maybe clarify generation and processing, and why this may be helpful. It's identical except that it doesn't stop queueing until the max processed quantity is reached.
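
One way to picture the distinction between generation (queueing) and processing is as two separate yes/no decisions. This is a minimal sketch based only on the description in this comment, not the actual constraint classes; `Counts`, `max_style`, and `min_style` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    created: int    # requests handed to the scheduler so far
    processed: int  # requests that have finished so far


def max_style(limit, counts):
    """Return (keep_queueing, keep_processing) for a max-style limit.

    Queueing stops as soon as `limit` requests have been created;
    processing continues until `limit` requests have finished.
    """
    return counts.created < limit, counts.processed < limit


def min_style(limit, counts):
    """Return (keep_queueing, keep_processing) for a min-style limit.

    Both decisions depend only on the processed count, so new requests
    keep being queued until `limit` requests have finished.
    """
    keep_going = counts.processed < limit
    return keep_going, keep_going


# 50 requests created, 35 finished, limit of 50.
snapshot = Counts(created=50, processed=35)
max_decision = max_style(50, snapshot)  # queueing already stopped
min_decision = min_style(50, snapshot)  # queueing still active
```

The only difference between the two functions is which counter gates queueing, which is the nuance the docstring would need to spell out.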

jaredoconnell · 2026-04-20T22:09:21Z



+@ConstraintsInitializerFactory.register(  # type: ignore[arg-type]
+    ["min_number", "min_num", "min_requests", "min_req"]


It may make sense to instead rename this to max-processed. I think this would be less confusing. But I can see the argument for min, since it's going to keep scheduling past that until the max-processed is reached. So I'm not sure what should be done.

sjmonson · 2026-04-21

Yeah... max-processed is both a little too vague and also incorrect, since we can end up processing more requests than set. I think min is fine actually. I'll just add some notes to the docs that clarify constraints are OR, not AND. Maybe in the future we can support AND constraint combinations.
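
The OR behavior mentioned here can be sketched as follows. This is an illustration under assumptions, not the project's implementation: the predicate helpers, the `state` dictionary keys, and `benchmark_done` are all hypothetical.

```python
from typing import Callable, Mapping, Sequence

State = Mapping[str, float]
Constraint = Callable[[State], bool]


def requests_reached(limit: int) -> Constraint:
    """Hypothetical predicate: true once `limit` requests have finished."""
    return lambda state: state["processed"] >= limit


def seconds_reached(limit: float) -> Constraint:
    """Hypothetical predicate: true once `limit` seconds have elapsed."""
    return lambda state: state["elapsed"] >= limit


def benchmark_done(constraints: Sequence[Constraint], state: State) -> bool:
    """OR semantics: the benchmark ends when any one constraint is met."""
    return any(check(state) for check in constraints)


constraints = [requests_reached(50), seconds_reached(10.0)]

# Only 12 requests finished, but the time limit has passed: the run ends.
ended_early = benchmark_done(constraints, {"processed": 12, "elapsed": 10.5})

# Neither limit reached yet: the run continues.
still_running = not benchmark_done(constraints, {"processed": 12, "elapsed": 3.0})
```

Under these semantics a request-count minimum combined with a time limit can end before the count is reached, which is why the name is a bit of a misnomer. An AND combination would replace `any` with `all`.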

sjmonson added 3 commits April 20, 2026 16:20

Add min requests constraint

5948c17

Signed-off-by: Samuel Monson <smonson@redhat.com>

Add unit tests

e265432

Signed-off-by: Samuel Monson <smonson@redhat.com>

Split constraint tests into correct paths

fdbf50b

Signed-off-by: Samuel Monson <smonson@redhat.com>

sjmonson requested a review from jaredoconnell April 20, 2026 20:50

sjmonson self-assigned this Apr 20, 2026

sjmonson added the priority-low and feature labels Apr 20, 2026

jaredoconnell reviewed Apr 20, 2026

sjmonson marked this pull request as draft April 22, 2026 15:34

sjmonson wants to merge 3 commits into main from feat/min_constraints
