Pull requests: openinfer-project/openinfer
fix(qwen35): load the untied lm_head instead of reusing embed_tokens
#544
opened Jul 4, 2026 by
FeathBow
Collaborator
1 task done
perf(glm52): whole-step decode CUDA graph + weight-only GEMV — 200 → 25.3 ms/step
#543
opened Jul 3, 2026 by
xiaguan
Collaborator
feat(qwen3): vLLM-prefill P/D — cross-engine KV compat, verified end-to-end
#540
opened Jul 3, 2026 by
xiaguan
Collaborator
fix(kernels): catch C++ exceptions at the FFI boundary and surface the message
#539
opened Jul 3, 2026 by
FeathBow
Collaborator
1 task done
fix(qwen35): guard baked head dims, not the runtime value-head count
#536
opened Jul 3, 2026 by
FeathBow
Collaborator
1 task done
feat(kernels): verify-side target probs + chain rejection sampling primitives (#512)
#534
opened Jul 3, 2026 by
n-WN
Contributor
perf(glm52): MLA decode arena + CUDA graph capture on top of #535 (−36%/layer) + kernel bench
#533
opened Jul 3, 2026 by
n-WN
Contributor
feat(qwen35): add batched DFlash speculative decoding
#507
opened Jul 2, 2026 by
CAICAIIs
Collaborator
feat(qwen3): EAGLE-3 Chain Speculative decoding
#454
opened Jun 24, 2026 by
scatyf3
Contributor
docs(qwen35): draft TP2 phased design
#450
opened Jun 24, 2026 by
Mrtroll486
Contributor
4 of 8 tasks
feat(qwen3-4b-dflash): add Qwen3-4B-DFlash draft model crate
#439
opened Jun 23, 2026 by
kitty-eu-org
5 tasks done
docs(qwen35): add prefix cache design document
#423
opened Jun 18, 2026 by
Ke-Wng
Contributor
5 tasks done
PR: feat(qwen3): n-gram speculative decoding
#349
opened Jun 11, 2026 by
wjinxu
Contributor
docs: update DeepSeek README status section
#348
opened Jun 11, 2026 by
mvanhorn
Contributor
feat(kernels): scaffold TVM FFI bidirectional interop test
#202
opened Jun 2, 2026 by
peter941221
Contributor