DMR/openai_adapter sends no repetition penalty — Linux/CUDA personas verbatim-echo each other #958

@joelteply

Bug

src/workers/continuum-core/src/ai/openai_adapter.rs:619-625: the POST body sent to the local DMR (Docker Model Runner) includes only model, messages, temperature, max_tokens, and stream. It sets neither frequency_penalty nor presence_penalty, so DMR falls back to the OpenAI defaults of 0.0 for both and applies zero repetition pressure during sampling.

Meanwhile, llamacpp_adapter.rs:391 and llamacpp_scheduler.rs:648 set sampling.repeat_penalty = 1.1 for the in-process Mac path. So the platforms diverge: the Mac in-process path applies a mild penalty, while the Linux DMR path applies none.
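To make the divergence concrete, here is a minimal sketch of the two penalty styles acting on a single token's logit. The additive form follows OpenAI's documented formula for frequency_penalty and presence_penalty; the multiplicative form follows llama.cpp's repeat penalty. The function names are hypothetical, and it is an assumption that DMR's backend applies the additive penalties exactly this way.

```rust
/// OpenAI-style additive penalties: subtract from the logit according to
/// how many times the token has already appeared in the text so far.
fn openai_penalized(logit: f32, count: u32, frequency_penalty: f32, presence_penalty: f32) -> f32 {
    // The presence penalty is a flat amount, applied once if the token appeared at all.
    let presence = if count > 0 { presence_penalty } else { 0.0 };
    logit - (count as f32) * frequency_penalty - presence
}

/// llama.cpp-style multiplicative repeat penalty, applied only to tokens
/// seen in the recent window: positive logits are divided by the penalty,
/// non-positive logits are multiplied by it.
fn llamacpp_penalized(logit: f32, seen: bool, repeat_penalty: f32) -> f32 {
    if !seen {
        return logit;
    }
    if logit <= 0.0 {
        logit * repeat_penalty
    } else {
        logit / repeat_penalty
    }
}
```

With both OpenAI-style penalties at 0.0, as in the current DMR request, `openai_penalized` returns the logit unchanged no matter how often the token has repeated, whereas `llamacpp_penalized` with 1.1 always lowers the logit of a recently seen token.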

Symptom: Linux/CUDA chat shows personas verbatim-echoing each other's prior responses — e.g. multiple personas in turn outputting the same "Sentinel: dev/build-feature ... please continue to support the team and provide any necessary updates" block. Conversation degrades into a self-reinforcing echo loop within 3-4 turns.

Repro

# Linux/CUDA Carl
./jtag collaboration/chat/send --room=general --message='introduce yourselves'
./jtag collaboration/chat/export --room=General --limit=20

Look for verbatim repetition of multi-paragraph text across personas.
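The manual check above could be automated roughly as follows. This is a sketch only: the format of the chat export is not shown in this issue, so the code assumes the export has already been parsed into (persona, text) pairs, and `find_cross_persona_echoes` and `min_len` are hypothetical names.

```rust
use std::collections::HashMap;

/// Returns every message text (at least `min_len` bytes after trimming)
/// that was posted verbatim by more than one persona, in order of first echo.
fn find_cross_persona_echoes<'a>(messages: &[(&'a str, &'a str)], min_len: usize) -> Vec<&'a str> {
    // Maps each text to the persona that posted it first.
    let mut first_author: HashMap<&'a str, &'a str> = HashMap::new();
    let mut echoes: Vec<&'a str> = Vec::new();

    for &(persona, text) in messages {
        let text = text.trim();
        // Skip short messages such as greetings, which repeat legitimately.
        if text.len() < min_len {
            continue;
        }
        // Look up the first author, recording this persona if the text is new.
        let first = *first_author.entry(text).or_insert(persona);
        if first != persona && !echoes.contains(&text) {
            echoes.push(text);
        }
    }
    echoes
}
```

A non-empty result on a fresh room would indicate the echo loop described in the symptom. A persona repeating only itself is deliberately not counted, since the symptom is about personas copying each other.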

Fix shape

Add to openai_adapter.rs:619 body construction:

"frequency_penalty": 0.4,
"presence_penalty": 0.4,

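The 0.4 defaults are chosen to feel about as mild as llamacpp's repeat_penalty=1.1: strong enough to break verbatim repetition without forcing odd vocabulary. The two scales are not numerically equivalent (frequency_penalty and presence_penalty are subtracted from logits, while repeat_penalty scales them), so treat 0.4 as a starting point to tune against the repro above.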
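As an illustration of the fix shape, the sketch below renders the sampling fields of the request, including the two new penalties. It uses only the standard library to stay self-contained; the real body construction at openai_adapter.rs:619 is not shown in this issue, so `SamplingFields` and `sampling_fields_json` are hypothetical names. The clamp to [-2.0, 2.0] reflects the range the OpenAI API documents for both penalties.

```rust
/// Hypothetical container for the sampling settings sent to DMR;
/// not the adapter's real type.
struct SamplingFields {
    temperature: f32,
    max_tokens: u32,
    frequency_penalty: f32,
    presence_penalty: f32,
}

/// Renders the sampling settings as JSON key/value pairs, intended to sit
/// alongside "model", "messages" and "stream" in the POST body.
fn sampling_fields_json(s: &SamplingFields) -> String {
    // Keep both penalties inside the range documented by the OpenAI API.
    let frequency = s.frequency_penalty.clamp(-2.0, 2.0);
    let presence = s.presence_penalty.clamp(-2.0, 2.0);
    format!(
        "\"temperature\": {}, \"max_tokens\": {}, \"frequency_penalty\": {}, \"presence_penalty\": {}",
        s.temperature, s.max_tokens, frequency, presence
    )
}
```

Keeping the two penalties as named fields rather than inline literals would also make it easier to tune them per platform if the Mac and Linux paths still differ after this change.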

We should also evaluate whether to bump llamacpp_adapter's default repeat_penalty to 1.15 if echo loops persist on Mac after the Linux gap closes.

Linked

PR #950 — coherent personas merge gate (RULE 1: Mac vs Linux paths must converge in behavior).

Labels: bug (Something isn't working), pr-950-blocker (Blocking PR #950 merge)
