Add Presence Penalty support to create_generator (#319)
Adds presence_penalty and presence_context_size parameters to create_generator, _sequential_generation, and _batched_generation, mirroring the existing repetition_penalty interface. Internally refactors RepetitionPenaltyProcessor into a generic TokenPenaltyProcessor that handles KV cache awareness for any mlx_lm penalty function (repetition, presence, frequency). This removes the now-redundant RepetitionPenaltyProcessor class and setup_repetition_penalty helper. The Node bridge in LM Studio needs to map presencePenalty to presence_penalty in kwargs to expose this via the UI.
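The refactor at the heart of this could be sketched as follows — a minimal, hypothetical version assuming mlx_lm's logits-processor call signature of `(tokens, logits) -> logits`, with the penalty function stubbed so the sketch runs standalone (the real closures come from `mlx_lm.sample_utils`):

```python
class TokenPenaltyProcessor:
    """Sketch of a generic, KV-cache-aware penalty wrapper (names assumed).

    Logits processors only see tokens generated in the current turn; tokens
    already held in the KV cache would otherwise escape the penalty window.
    This wrapper prepends those cached prefix tokens before delegating to
    any mlx_lm-style penalty function (repetition, presence, frequency).
    """

    def __init__(self, penalty_fn, cached_prefix_tokens=()):
        self.penalty_fn = penalty_fn  # e.g. result of make_repetition_penalty(...)
        self.prefix = list(cached_prefix_tokens)

    def __call__(self, tokens, logits):
        # Extend the visible history with the cached prefix, then delegate.
        return self.penalty_fn(self.prefix + list(tokens), logits)


# Stand-in penalty function: subtract 1 from every token id seen in history.
def stub_penalty(tokens, logits):
    return [l - 1 if i in tokens else l for i, l in enumerate(logits)]

proc = TokenPenaltyProcessor(stub_penalty, cached_prefix_tokens=[0])
print(proc([2], [3, 5, 7]))  # → [2, 5, 6]: token 0 (cached) and 2 (new) penalized
```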
All contributors have signed the CLA ✍️ ✅

I have read the CLA Document and I hereby sign the CLA
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bbcfc3295d
```python
if repetition_penalty and repetition_penalty != 0.0:
    logits_processors.append(
        TokenPenaltyProcessor(
            make_repetition_penalty(repetition_penalty, repetition_context_size or 20),
```
Preserve explicit zero context size values
Using repetition_context_size or 20 silently rewrites an explicit 0 to 20, so callers cannot pass 0 through to mlx_lm anymore (the same pattern is repeated for presence_context_size). This is a behavioral regression from the previous implementation, where provided values were forwarded verbatim, and it can change generation outputs in experiments that intentionally set context size to zero. Please switch to an explicit None check (e.g. 20 if repetition_context_size is None else repetition_context_size) so only missing values default.
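The pitfall is that Python's `or` falls back for every falsy value, not just `None`:

```python
explicit_zero = 0  # caller intentionally requests a zero-token penalty window

# Buggy: `or` treats 0 (and any other falsy value) as "missing"
buggy = explicit_zero or 20          # → 20, silently overriding the 0

# Fixed: only an absent (None) value triggers the default
fixed = 20 if explicit_zero is None else explicit_zero  # → 0
```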
Fixed in 655d788 using an explicit is not None check.
(In practice a context size of 0 makes little semantic sense for a penalty, but the explicit check is still cleaner.)
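With stand-ins for the mlx_lm factory functions, the corrected wiring would look roughly like this (function names follow the diff above; the bodies here are stubs for illustration, not the real closures):

```python
def make_repetition_penalty(penalty, context_size=20):
    return ("repetition", penalty, context_size)  # stand-in for the real closure

def make_presence_penalty(penalty, context_size=20):
    return ("presence", penalty, context_size)    # stand-in

def build_penalty_processors(repetition_penalty=None, repetition_context_size=None,
                             presence_penalty=None, presence_context_size=None):
    processors = []
    if repetition_penalty and repetition_penalty != 0.0:
        # Explicit None check: an intentional context size of 0 passes through
        size = 20 if repetition_context_size is None else repetition_context_size
        processors.append(make_repetition_penalty(repetition_penalty, size))
    if presence_penalty and presence_penalty != 0.0:
        size = 20 if presence_context_size is None else presence_context_size
        processors.append(make_presence_penalty(presence_penalty, size))
    return processors

print(build_penalty_processors(repetition_penalty=1.1, repetition_context_size=0))
# → [('repetition', 1.1, 0)]
```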
Thank you very much for your work, everyone is waiting for this. Is there a human here to start the workflows and review? https://unsloth.ai/docs/models/qwen3.6#qwen3.6-27b
@reneleonhardt haha what times we live in 😅
`mlx_lm` already implements `make_presence_penalty` in `sample_utils.py`, but `mlx_engine` did not expose it through `create_generator`. This makes it impossible for callers to apply a presence penalty, which is useful for reducing token repetition in long generations.

Changes
Added `presence_penalty` and `presence_context_size` to `create_generator`, `_sequential_generation`, and `_batched_generation`, following the existing `repetition_penalty`/`repetition_context_size` interface. Both default to `None`.

Refactor: `TokenPenaltyProcessor`
The existing `RepetitionPenaltyProcessor` was a custom wrapper whose sole purpose was to prepend cached prefix tokens to the penalty window (since logits processors only receive tokens generated in the current turn, not those already in the KV cache). This logic is now extracted into a generic `TokenPenaltyProcessor` that works with any `mlx_lm` penalty function.

Testing
Added `test_presence_penalty_applies`, mirroring the existing `test_repetition_penalty_applies`.

This PR exposes `presence_penalty` at the Python level. To make it available in the LM Studio UI, the Node bridge needs to map `presencePenalty` to `presence_penalty` in the kwargs passed to `create_generator` (analogous to how `llm.prediction.llama.presencePenalty` is already handled for the llama.cpp backend).

See lmstudio-bug-tracker#1604, lmstudio-bug-tracker#1842.
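As background on what such a test asserts: a presence penalty subtracts a flat amount from the logit of every token that has appeared at least once, regardless of how often — which is what distinguishes it from a frequency penalty. A toy version, unrelated to the actual mlx_lm implementation details:

```python
def apply_presence_penalty(logits, generated_tokens, penalty):
    # Presence penalty: each token that has appeared is penalized exactly once,
    # no matter how many times it occurred (unlike a frequency penalty).
    out = list(logits)
    for tok in set(generated_tokens):
        out[tok] -= penalty
    return out

logits = [1.0, 1.0, 1.0, 1.0]
penalized = apply_presence_penalty(logits, [1, 2, 2], penalty=0.5)
# token 2 appeared twice but is penalized once: presence, not frequency
print(penalized)  # → [1.0, 0.5, 0.5, 1.0]
```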