A bounded, source-linked orientation memory plus a shared state / evidence / permission contract across terminal, server, editor, bridge, and agents. Handoffs become first-class artifacts: the workbench knows where it left off, what it owes, what it changed, and how to resume. Provider abstraction lands so that model selection is invisible to the rest of the system, and Hugging Face Inference Providers becomes a first-class provider alongside DeepSeek and OpenRouter, anchoring the harness's open-model story.

In scope

Evidence ledger. Every receipt from v0.8.43 + every decision card + every tool inspection + every memory entry compounds into a per-session evidence ledger. Inspectable, exportable, queryable from /evidence.
Handoff artifacts. Closing a workbench session writes a handoff artifact (goal, last state, blockers, decisions, evidence). Opening one resumes the workbench — "Resume previous workbench" surfaces matching artifacts.
Orientation cache. Bounded, source-linked, evidence-tagged. Decays as freshness drops. First-class fact source from codewhale.net/api/state.json (latest release, install commands per platform, published crates, known-bad version ranges).
Provider abstractions. Unified Provider trait in codewhale-agent consolidating env-var precedence, secret resolution, base-URL normalization, and auth-header construction (currently scattered across crates/config, crates/secrets, crates/tui/src/client.rs). ProviderKind registry becomes configurable; model selection is provider-agnostic.
Hugging Face as a first-class provider. New [providers.huggingface] config block with api_key (default HF_TOKEN, alias HUGGINGFACE_API_KEY), base_url (default https://router.huggingface.co/v1), and provider = "auto" (or a specific Inference Provider). OpenAI-compatible route. Model picker pulls model passport metadata from the HF Hub API (license, base model, context length, chat template, tool-call support, reasoning support, gated / private status). Distinct from the Hugging Face Workset (#1977) which adds Hub registry / datasets / adapters / Jobs — the two share auth but ship through different surfaces.
Cross-surface alignment. Consistent command names, output formats, and error messages across CLI (codewhale), TUI, runtime API (codewhale serve --http), bridges, and web.
VS Code extension beta. Scaffold, local runtime detection, chat webview. Ship as VSIX attached to GitHub Release; not Marketplace-published until beta feedback.
Protocol contract in crates/protocol carries the provider-auth shape explicitly so external clients don't have to special-case individual providers.
Per-tool migration PRs. Start ExternalTool migrations one tool at a time (git, gh, python, node, rust, cargo) with Windows CI green per step.
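
A minimal sketch of what the unified Provider trait described above might look like, using Hugging Face as the example implementation. The trait's method names and the `HuggingFace` struct are hypothetical and not taken from codewhale-agent; only the env-var names and the default base URL come from the scope items. Checking HF_TOKEN before HUGGINGFACE_API_KEY and using a bearer-token header are assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical unified provider interface (method names are illustrative,
/// not the actual codewhale-agent API).
trait Provider {
    /// Environment variables checked for the API key, in precedence order.
    fn api_key_env_vars(&self) -> &'static [&'static str];

    /// Base URL used when the config block does not override it.
    fn default_base_url(&self) -> &'static str;

    /// Return the first non-empty key found, following the precedence order.
    fn resolve_api_key(&self, env: &HashMap<String, String>) -> Option<String> {
        self.api_key_env_vars()
            .iter()
            .filter_map(|name| env.get(*name))
            .find(|value| !value.trim().is_empty())
            .cloned()
    }

    /// Normalize the base URL: trim whitespace and any trailing slashes.
    fn base_url(&self, configured: Option<&str>) -> String {
        configured
            .unwrap_or(self.default_base_url())
            .trim()
            .trim_end_matches('/')
            .to_string()
    }

    /// Build the auth header as a (name, value) pair.
    /// Assumption: bearer-token auth on the OpenAI-compatible route.
    fn auth_header(&self, api_key: &str) -> (String, String) {
        ("Authorization".to_string(), format!("Bearer {api_key}"))
    }
}

/// Hypothetical provider type for the [providers.huggingface] block.
struct HuggingFace;

impl Provider for HuggingFace {
    fn api_key_env_vars(&self) -> &'static [&'static str] {
        // Assumed precedence: the default name first, then the alias.
        &["HF_TOKEN", "HUGGINGFACE_API_KEY"]
    }

    fn default_base_url(&self) -> &'static str {
        "https://router.huggingface.co/v1"
    }
}
```

Passing the environment in as a map, rather than reading `std::env` directly, keeps precedence rules testable per provider. A new provider would then only need to supply its env-var list and default URL.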

Out of scope

New providers beyond the HF Inference Providers integration (the rest stay as they are).
Cloud-hosted runtime API.
Marketplace publish of VS Code extension.
Plugin tool runtime implementation (still gated on v0.8.46 RFC).
Model Lab workset implementation (post-v0.9.0; see #1977). The HF Workset specifically depends on this milestone's provider work landing first.

Definition of done

Switching providers mid-session is one config change with no surrounding code change.
Hugging Face Inference Providers works end-to-end against Qwen/*, deepseek-ai/*, and meta-llama/* model IDs without per-model special-casing in the engine.
Model picker surfaces HF model passport metadata (license, context length, gated status) before selection.
/evidence returns the per-session ledger.
Closing and reopening a session restores the workbench state (active task, last decision, pending blockers).
Orientation cache surfaces the latest published release within the freshness window after restart.
VS Code beta VSIX attached to v0.8.47 GitHub Release; smoke-tested against local runtime API.
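
The session-restore criterion above can be illustrated with a small sketch of a handoff artifact and the state rebuilt from it. The struct and field names are hypothetical: they mirror the fields listed for handoff artifacts (goal, last state, blockers, decisions, evidence) and the restored state named in the definition of done (active task, last decision, pending blockers). Mapping the goal to the active task and treating the final decision as the "last decision" are assumptions.

```rust
/// Hypothetical shape of a handoff artifact written when a session closes.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct HandoffArtifact {
    goal: String,
    last_state: String,
    blockers: Vec<String>,
    decisions: Vec<String>,
    /// Assumed representation: identifiers of evidence-ledger entries.
    evidence: Vec<String>,
}

/// The state a reopened session is expected to restore.
#[derive(Debug, PartialEq)]
struct WorkbenchState {
    active_task: String,
    last_decision: Option<String>,
    pending_blockers: Vec<String>,
}

impl HandoffArtifact {
    /// Rebuild the resumable workbench state from a stored artifact.
    fn resume(&self) -> WorkbenchState {
        WorkbenchState {
            // Assumption: the artifact's goal becomes the active task.
            active_task: self.goal.clone(),
            // Assumption: decisions are stored oldest-first.
            last_decision: self.decisions.last().cloned(),
            pending_blockers: self.blockers.clone(),
        }
    }
}
```

A smoke test for this criterion could write an artifact, drop the session, call something like `resume()`, and compare the three restored fields against what was recorded at close.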

Release gate

Parity gates green.
CHANGELOG.md [0.8.47] entry calls out HF as a first-class provider and provider abstractions as the model-neutral lever.
README provider matrix + "Bring your own open-weight model" section updated; HF Inference Providers and OpenRouter framed as the open-model discovery+routing layer.

v0.8.47

v0.8.47 tracker: continuity layer

Token consumption is extremely high

show_thinking=false still wastes tokens on non-English reasoning_content

resume from a session will send 'auto' as a model name in the request

Cache hit problem

ORCA Lab compatibility: connect DeepSeek-TUI as an ORCA agent connector

✨ [Feature Request] Add configuration interface to set sub-agent model in parallel execution scenarios

Can we learn from Reasonix's high KV cache hit rate feature?

There still seem to be some problems with cache hits

Feature: DeepSeek cache-aware prompt diagnostics and wire payload optimization

Feat/english thinking when hidden

fix(tui): restore auto model state on session load

Slash commands: PEEK-backed command receipts and continuity

Refactor continuity layer into orientation-cache modules

To be fair: a comparison of the ClaudeCode and DeepSeek-TUI applications

Improve prefix cache inspection and warmup

feat: add DEEPSEEK.md as project context for harness integrations like superpowers

feat: configurable auto-compact threshold with Ctrl+L keybinding

Token consumption has increased a lot

feat: add DEEPSEEK.md as project context file

Editor context bridge: send selections, diagnostics, and diffs into CodeWhale

Session context: cap raw tool-output replay and keep details behind handles

Session logs: classify environment/tool failures before blaming the model

RLM/log workbench: route large local logs and structured payloads out of the parent transcript

`parent_entry_id` + `message_type` on the SQLite message table
