Merged
92 commits
a74b9d5
Fix: upload Tauri updater .sig sidecars (tauri-action 0.6.2 rename)
cryptopoly May 1, 2026
8e86c6c
Phase 1 chat uplift: highlighting, search, export, real cancel, effor…
cryptopoly May 1, 2026
959545e
Hide MLX-only catalog variants on non-Apple platforms
cryptopoly May 1, 2026
613e3c9
Fix Windows CUDA detection + post-install runtime probe
cryptopoly May 1, 2026
2a7cdfd
Phase 2.0 chat uplift: prompt-processing feedback + TTFT
cryptopoly May 1, 2026
dd7d20c
Preserve Windows GPU runtime on uninstall + lock extras path
cryptopoly May 1, 2026
f1d4d8a
Phase 2.0.5 watchdogs: prompt-eval timeout + memory gate + runaway gu…
cryptopoly May 1, 2026
dd284c8
Phase 2.0.5 hardening: tok/s floor, repetition guard, panic + thermal…
cryptopoly May 1, 2026
8cd4cd0
Phase 2.1 decompose ChatTab.tsx into ChatSidebar / ChatHeader / ChatT…
cryptopoly May 1, 2026
59894fd
Phase 2.2 full sampler exposure: top_p / top_k / min_p / repeat_penal…
cryptopoly May 1, 2026
90e4fc5
Phase 2.11 model capability declarations + composer auto-gating
cryptopoly May 1, 2026
0793282
Phase 2.12 mid-thread model swap with one-turn override
cryptopoly May 1, 2026
72ab7c4
Hotfix: relax memory-gate ceilings + gate vision capability by engine
cryptopoly May 1, 2026
fbb168a
Hotfix v2: visionEnabled flag gates image attach across all runtimes
cryptopoly May 1, 2026
174f47b
Phase 2.6 cross-platform RAG: semantic embedding via llama-embedding …
cryptopoly May 1, 2026
260c64e
Wire --mmproj for llama.cpp vision: sibling detection + visionEnabled…
cryptopoly May 1, 2026
91965e5
Phase 2.10 MCP client: stdio JSON-RPC + tool adapter + provenance
cryptopoly May 1, 2026
ce53f28
Phase 2.8 structured tool output: tools render as table / code / mark…
cryptopoly May 1, 2026
07dd06c
Phase 2.4 conversation branching: fork from any assistant message
cryptopoly May 1, 2026
f583d42
Phase 2.5 in-thread compare: sibling variants under assistant bubble
cryptopoly May 2, 2026
b26e58d
Phase 2.11 capability badges: typed flags surface across all model pi…
cryptopoly May 2, 2026
3a37e77
Phase 2.2 close-out: JSON-schema constrained-output opt-in
cryptopoly May 2, 2026
db1acce
Phase 2.7 prompt presets + variables: fill-form before Use in Chat
cryptopoly May 2, 2026
e294021
Phase 2.13 OpenAI-compatible server: full sampler chain + embeddings
cryptopoly May 2, 2026
8907709
Phase 2.14 catalog browser: VRAM-fit hints on Discover variants
cryptopoly May 2, 2026
26bc0b7
Reasoning panel: collapsible streaming preview + close first-paragrap…
cryptopoly May 2, 2026
0d8b7f2
Phase 3.4 substrate routing inspector: per-turn badge above metrics
cryptopoly May 2, 2026
7c369ff
Phase 3.2 KV strategy chip: per-turn cache override in composer
cryptopoly May 2, 2026
e343fbe
Phase 3.8 chat-template inspection: detect Gemma + ChatML quirks
cryptopoly May 2, 2026
c510b4d
Phase 3.5 cross-platform perf telemetry: per-turn host strip
cryptopoly May 2, 2026
f969a4f
Phase 3.6 Delve mode: critic-pass on assistant messages
cryptopoly May 2, 2026
7207113
Phase 3.7 workspace knowledge stacks: shared RAG corpus across sessions
cryptopoly May 2, 2026
67807b5
Phase 3.3 logprobs viz (advanced-mode gated): per-message confidence …
cryptopoly May 2, 2026
9237355
Phase 3.1 DDTree accepted-token overlay: substrate truth view
cryptopoly May 2, 2026
1723a38
KV chip + DFlash UX hotfixes from smoke test feedback
cryptopoly May 2, 2026
db861fa
Phase 3.1 + 3.8 follow-ups: DDTree-tree spans + llama.cpp chat-templa…
cryptopoly May 2, 2026
e4f44c2
Phase 3.3 follow-up: MLX logprobs passthrough on streaming path
cryptopoly May 2, 2026
e25824b
Trigger build workflow on staging too
cryptopoly May 2, 2026
1241ca7
Surface Windows discover bugs and add Qwen 3.6 catalog entry
cryptopoly May 2, 2026
a43edb9
FU-015..FU-021: image+video perf bundle (FBCache, SDXL VAE fp16, dist…
cryptopoly May 3, 2026
2401c78
Wire STG slider through to mlx-video subprocess + preset-row-pair styles
cryptopoly May 3, 2026
23447c7
Bump version to 0.7.4
cryptopoly May 3, 2026
80c0874
KV cache chip: harmonize filter with launch-settings modal
cryptopoly May 3, 2026
af61e82
FU-001 close-out: bump turboquant-mlx-full to >=0.3.0
cryptopoly May 3, 2026
676ebd8
Audit phases 1-4 + multimodal images + Gemma 4 channel filter
cryptopoly May 4, 2026
1110e6f
Phase 5 frontend UX: previewVae toggles + kvBudget schema
cryptopoly May 4, 2026
3e40152
Bug 2.1 + CLI runner: Gemma 4 asymmetric channel filter
cryptopoly May 4, 2026
f5684aa
Phase 7 v1: mlx-video Wan convert foundation (FU-025)
cryptopoly May 4, 2026
9d959a4
Phase 8: mlx-video Wan runtime routing (FU-025 closeout)
cryptopoly May 4, 2026
6bb562b
Phase 9: GUI install action for Wan MLX runtime (FU-025 fully closed)
cryptopoly May 4, 2026
e8e1c27
Restore pre-aec1975 card layout for Image/Video Discover + My Models
cryptopoly May 4, 2026
1017ccb
[mlx-vlm] add torchvision dep for Qwen2.5-VL processor build
cryptopoly May 4, 2026
e228e41
Restore catalog tabs to v0.7.2 layout exactly + drop duplicate Wan panel
cryptopoly May 4, 2026
bcf88de
FU-009 close-out: live Wan2.1 MLX smoke + status_for upstream-layout fix
cryptopoly May 4, 2026
9d15842
FU-018 part 1 close-out: preview VAE swap validated end-to-end
cryptopoly May 4, 2026
15b3fe5
FU-006 quarterly re-verify: hold at f825ffb (v0.1.4.1)
cryptopoly May 4, 2026
412d7a6
FU-018 part 2: live denoise thumbnails via callback_on_step_end
cryptopoly May 4, 2026
f08e45c
FU-022: LLM-based prompt enhancer (Apple Silicon)
cryptopoly May 4, 2026
fe34a2c
Restore Wan MLX runtime install UX surface (FU-025 part 9)
cryptopoly May 5, 2026
ddec20d
FU-006 close-out: dflash-mlx pin bump f825ffb -> 8d8545d (v0.1.4.1 ->…
cryptopoly May 5, 2026
bc12d5c
FU-023 + FU-024 + FU-027: CUDA quantization foundations
cryptopoly May 5, 2026
7c0dbc2
FU-024: Studio FP8 layerwise toggle in Image + Video Studio
cryptopoly May 5, 2026
9c62887
Add Windows PowerShell ports of build-llama-turbo + build-sdcpp
cryptopoly May 5, 2026
d0d4f3c
Windows ps1: replace em-dash with ASCII -- so PowerShell parses cleanly
cryptopoly May 5, 2026
f5ef002
Pick a CMake generator explicitly in build-llama-turbo.ps1
cryptopoly May 5, 2026
ee1e3a4
Wipe stale CMake cache when build-llama-turbo switches generator
cryptopoly May 5, 2026
40f8640
Drop -SimpleMatch from CMake cache generator probe
cryptopoly May 5, 2026
861a81a
Detect missing MSVC up front in build-llama-turbo.ps1
cryptopoly May 5, 2026
ee49c4e
Accept VS Build Tools installs that report isComplete=0
cryptopoly May 5, 2026
3a89cf7
Append version= to CMAKE_GENERATOR_INSTANCE for unregistered installs
cryptopoly May 5, 2026
f6c4aea
Auto-sync CUDA VS integration before cmake configure
cryptopoly May 5, 2026
313dd8e
Fix CUDA-integration elevated copy and invalidate stale CMake cache
cryptopoly May 5, 2026
a8a360d
Extract Windows MSVC/CUDA helpers and apply to build-sdcpp.ps1
cryptopoly May 5, 2026
2ce995b
Use python -m pip in build.ps1 to dodge Windows self-upgrade refusal
cryptopoly May 5, 2026
74a1fa6
Diagnose T5EncoderModel error and right-size CogVideoX footprints
cryptopoly May 5, 2026
b352258
Surface CPU torch on CUDA host + raise chat default maxTokens to 4096
cryptopoly May 5, 2026
e6aa419
Fix Studio cache preview returning 0 GB on chat model selection
cryptopoly May 5, 2026
4c5cd79
Make chat cache-fit warning VRAM-aware on CUDA hosts
cryptopoly May 5, 2026
a77f738
Merge branch 'feature/chat-level-up' of https://github.com/cryptopoly…
cryptopoly May 5, 2026
94c6bf0
Run T5 lazy-import diagnostic on generate paths too
cryptopoly May 5, 2026
25bbe0c
Fix Video Studio dropping GPU warning + add inline Install button
cryptopoly May 5, 2026
d78aaa4
Add expandable per-attempt log under Install CUDA torch button
cryptopoly May 5, 2026
5e016fe
Make Install CUDA torch self-debugging + add Restart prompt
cryptopoly May 5, 2026
a047896
Remove Convert Model action + nudge My Models row icons left
cryptopoly May 5, 2026
65f807e
Fix Windows diffusion runtime readiness
cryptopoly May 5, 2026
4ce8b48
Merge pull request #32 from cryptopoly/feature/chat-level-up
cryptopoly May 6, 2026
59cd5a6
Merge pull request #33 from cryptopoly/merge/pr-32-into-staging
cryptopoly May 6, 2026
e60a85b
Revert "Merge pull request #33 from cryptopoly/merge/pr-32-into-staging"
cryptopoly May 6, 2026
ff3ea9e
Merge pull request #32 from cryptopoly/feature/chat-level-up
cryptopoly May 6, 2026
cea9084
Merge pull request #34 from cryptopoly/merge/pr-32-into-staging-redo
cryptopoly May 6, 2026
f019b8b
Release prep v0.7.4: fix sys.path shim shadowing + changelog
cryptopoly May 6, 2026
dd59178
Merge pull request #35 from cryptopoly/release/v0.7.4-prep
cryptopoly May 6, 2026
79 changes: 79 additions & 0 deletions .gitattributes
@@ -0,0 +1,79 @@
# Pin line endings on text files so cross-platform contributors don't
# see phantom "modified" diffs from autocrlf-driven CRLF<->LF flips.
#
# Background: Windows users with `core.autocrlf=true` (the Git for
# Windows default) see Cargo.toml / tauri.conf.json / etc. as modified
# the moment they `git checkout` because the working-tree copy gets
# rewritten with CRLF while origin's blobs are LF. Without this file,
# every status check on Windows lights those up as dirty even though
# no real change was made. With this file, git normalizes them on the
# way in and out and the status stays clean.

# Default: treat as text, normalize to LF in the index. The working
# tree gets the platform's native line ending on checkout (LF on
# macOS/Linux, LF on Windows-with-`core.eol=lf`, CRLF on
# Windows-with-default-config).
* text=auto

# Repo-shape files MUST stay LF in the working tree everywhere -- the
# Tauri / Cargo / npm toolchains all read them with LF assumptions
# even on Windows, and a CRLF-shaped tauri.conf.json caused real
# parse failures earlier in the project history (see the patch-
# tauri-conf.mjs script's "self-heal an empty/corrupt JSON" branch).
*.toml text eol=lf
*.json text eol=lf
*.yml text eol=lf
*.yaml text eol=lf
*.md text eol=lf

# Source files: LF everywhere. Vite + tsc handle either, but pinning
# avoids whitespace-only diffs in PRs.
*.ts text eol=lf
*.tsx text eol=lf
*.js text eol=lf
*.jsx text eol=lf
*.mjs text eol=lf
*.cjs text eol=lf
*.py text eol=lf
*.rs text eol=lf
*.css text eol=lf
*.html text eol=lf

# Shell scripts: LF (would otherwise silently break on macOS / Linux
# with "bad interpreter" errors when bash sees \r in the shebang).
*.sh text eol=lf

# PowerShell: CRLF. The PS 5.1 parser handles either but PowerShell
# scripts authored on Windows traditionally ship CRLF, and Windows
# editors would otherwise rewrite them on save and produce noise.
*.ps1 text eol=crlf
*.psm1 text eol=crlf
*.psd1 text eol=crlf

# Binary blobs that Git would otherwise try to diff/normalize. Mark
# them explicitly so a `text=auto` heuristic mistake can't corrupt
# them on a cross-platform clone.
*.png binary
*.jpg binary
*.jpeg binary
*.gif binary
*.webp binary
*.ico binary
*.icns binary
*.woff binary
*.woff2 binary
*.ttf binary
*.otf binary
*.zip binary
*.gz binary
*.tar binary
*.7z binary
*.exe binary
*.dll binary
*.so binary
*.dylib binary
*.pyd binary
*.safetensors binary
*.gguf binary
*.bin binary
*.onnx binary
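The `*.sh` rule above guards against the sharpest failure mode in the file: with CRLF endings, the kernel resolves the shebang interpreter as `/bin/sh` followed by a carriage return and refuses to exec. A minimal POSIX-only reproduction (file names are illustrative; on Windows the failure mode differs):

```python
import os
import subprocess
import tempfile

# Reproduce the "bad interpreter" failure the *.sh eol=lf rule prevents:
# with CRLF endings the kernel looks for an interpreter literally named
# "/bin/sh\r", which does not exist.
with tempfile.TemporaryDirectory() as d:
    crlf = os.path.join(d, "crlf.sh")
    lf = os.path.join(d, "lf.sh")
    with open(crlf, "wb") as f:
        f.write(b"#!/bin/sh\r\necho ok\r\n")
    with open(lf, "wb") as f:
        f.write(b"#!/bin/sh\necho ok\n")
    for p in (crlf, lf):
        os.chmod(p, 0o755)

    # LF script runs normally.
    print(subprocess.run([lf], capture_output=True, text=True).stdout.strip())

    # CRLF script never reaches the shell: exec fails with ENOENT.
    try:
        subprocess.run([crlf], capture_output=True)
        print("crlf script executed")
    except OSError:
        print("crlf script rejected: bad interpreter")
```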
6 changes: 3 additions & 3 deletions .github/workflows/build.yml
```diff
@@ -3,12 +3,12 @@ name: Build Desktop App
 # Tests run on every push and PR (quick feedback), but the expensive
 # 3-platform desktop build matrix only runs on manual trigger —
 # `workflow_dispatch` from the Actions tab or `gh workflow run`.
-# Pushes to main no longer kick off a full cross-platform build.
+# Pushes to main / staging no longer kick off a full cross-platform build.
 on:
   push:
-    branches: [main]
+    branches: [main, staging]
   pull_request:
-    branches: [main]
+    branches: [main, staging]
   workflow_dispatch:

 env:
```
59 changes: 59 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,64 @@
# Changelog

## v0.7.4 - 2026-05-06

### Cache strategies & generation quality (FU-015 → FU-021, FU-026)
- **First Block Cache** (cross-platform diffusion cache hook, registry id `fbcache`) backed by `diffusers.hooks.apply_first_block_cache`. Applies to image + video DiTs (FLUX, SD3.5, Wan2.1/2.2, HunyuanVideo, LTX-Video, CogVideoX, Mochi). Default threshold 0.12 (≈1.8× speedup on FLUX.1-dev with imperceptible drift). Closes the FU-007 Wan TeaCache deferral by replacing per-model vendoring with a model-agnostic hook.
- **TaylorSeer / MagCache / PyramidAttentionBroadcast / FasterCache** strategies wired against the diffusers 0.38 native `enable_cache(<Config>)` API (registry ids `taylorseer`, `magcache`, `pab`, `fastercache`). MagCache is FLUX-only without calibration UX; other DiTs raise a "calibration required" message.
- **SDXL VAE fp16 fix on MPS / CUDA** (FU-017) — probes `madebyollin/sdxl-vae-fp16-fix` via `local_files_only=True` and swaps `pipeline.vae` so SDXL on Apple Silicon stays in fp16 instead of falling back to fp32.
- **Distill LoRA + transformer support** (FU-019) — Hyper-SD-8step + Turbo-Alpha for FLUX.1-dev, CausVid for Wan2.1 1.3B/14B, plus full distilled transformer swap (`distillTransformer*` fields) for Wan 2.2 A14B I2V × lightx2v 4-step distill (bf16 + fp8_e4m3 variants). Distill takes precedence over LoRA when both are pinned.
- **AYS (Align Your Steps) sampler** (FU-020) for SD/SDXL — new `ays_dpmpp_2m_sd15` / `ays_dpmpp_2m_sdxl` samplers using NVIDIA's hardcoded timestep arrays. Flow-match models continue to be gated out.
- **Image-runtime CFG decay parity** (FU-021) with the video runtime — opt-in `cfgDecay` field, linear ramp from initial guidance down to a 1.5 floor inside `callback_on_step_end`. Gated to flow-match repos.
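The `cfgDecay` ramp above reduces to a one-line interpolation. A sketch, assuming step indices run 0..total_steps-1 (the function name is invented; the linear shape and the 1.5 floor come from the entry):

```python
def cfg_for_step(step: int, total_steps: int, initial_cfg: float,
                 floor: float = 1.5) -> float:
    # Linear ramp: full guidance on the first step, the floor on the last.
    if total_steps <= 1:
        return initial_cfg
    t = step / (total_steps - 1)
    return initial_cfg + t * (floor - initial_cfg)

# e.g. 28 steps at guidance 5.0: 5.0 on step 0, 1.5 on step 27,
# applied per denoise step from inside callback_on_step_end.
```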

### CUDA quantization foundations (FU-023, FU-024, FU-027)
Backend wiring landed for Windows / Linux CUDA validation; Apple Silicon dev box can't exercise these paths live.
- **Nunchaku / SVDQuant transformer load** (FU-023) — `_try_load_nunchaku_transformer` helper preferred over NF4 / int8wo on CUDA when `nunchakuRepo` pinned + `nunchaku>=1.2.1` importable. Catalog rows for FLUX.1-dev × svdq-int4 + FLUX.1-schnell × svdq-int4.
- **FP8 layerwise casting for non-FLUX DiTs** (FU-024) — `_maybe_enable_fp8_layerwise` helper on both image + video runtimes. Family-correct fp8 dtype (E5M2 for HunyuanVideo per upstream, E4M3 elsewhere). Compute capability gate refuses pre-Ada GPUs (SM <8.9). Studio toggle exposed in both Image + Video Studio.
- **NVIDIA/kvpress install action** (FU-027) — `kvpress>=0.5.3` registered in `_INSTALLABLE_PIP_PACKAGES` so the Setup tab can pre-stage the wheel ahead of integration code.
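The FP8 gating described above amounts to two small checks. A hedged sketch (helper names are illustrative; the SM 8.9 cutoff and the HunyuanVideo E5M2 exception come from the entries):

```python
def fp8_layerwise_allowed(cc_major: int, cc_minor: int) -> bool:
    # Layerwise FP8 casting needs Ada-class hardware or newer:
    # compute capability (SM) >= 8.9. Pre-Ada GPUs are refused.
    return (cc_major, cc_minor) >= (8, 9)

def fp8_dtype_for(family: str) -> str:
    # HunyuanVideo uses E5M2 per upstream; every other family gets E4M3.
    return "float8_e5m2" if family == "hunyuanvideo" else "float8_e4m3fn"

# With torch available, the capability pair would come from
# torch.cuda.get_device_capability().
```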

### MLX video runtime (FU-009 close-out, FU-025 Phases 7 → 9)
- **mlx-video Wan one-shot convert pipeline** under `~/.chaosengine/mlx-video-wan/<slug>/` (override via `CHAOSENGINE_MLX_VIDEO_WAN_DIR`). Helper `backend_service/mlx_video_wan_convert.py` wraps the upstream `python -m mlx_video.models.wan_2.convert` subprocess with `slug_for` / `output_dir_for` / `status_for` / `list_converted` / `run_convert`.
- **Runtime routing for `Wan-AI/Wan2.{1,2}-*`** through `mlx_video_runtime.py` — `_REPO_ENTRY_POINTS["Wan-AI/"] = "mlx_video.models.wan_2.generate"`, `_build_wan_cmd` produces the Wan-shaped CLI (`--model-dir`, `--guide-scale` string, `--scheduler`).
- **GUI install panel under Video Discover** — `WanInstallPanel.tsx` lists every supported Wan repo with raw-size hint + converted badge / install button + live `InstallLogPanel`. Setup endpoints `POST /api/setup/install-mlx-video-wan` + status + inventory mirror the longlive install pattern.
- **Live Wan2.1 MLX smoke validation** — 19.6s end-to-end at 480×272, 5 frames, 4 steps; surfaced + fixed a `status_for` filename gap (mlx-video upstream emits root-level `model.safetensors` + `t5_encoder.safetensors`, not the legacy `transformer*.safetensors` pattern).
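The per-repo directory scheme above can be sketched in a few lines (the exact slug rule is an assumption; the base directory and env override come from the entry):

```python
import os

def slug_for(repo_id: str) -> str:
    # Hypothetical slug rule: collapse a HF repo id into one
    # filesystem-safe path segment.
    return repo_id.lower().replace("/", "--")

def output_dir_for(repo_id: str) -> str:
    # ~/.chaosengine/mlx-video-wan/<slug>/ by default, overridable
    # via CHAOSENGINE_MLX_VIDEO_WAN_DIR.
    base = os.environ.get(
        "CHAOSENGINE_MLX_VIDEO_WAN_DIR",
        os.path.expanduser("~/.chaosengine/mlx-video-wan"),
    )
    return os.path.join(base, slug_for(repo_id))
```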

### Preview & enhancement UX (FU-018 parts 1+2, FU-022)
- **TAESD / TAEHV preview VAE swap** (FU-018 part 1) — `maybe_apply_preview_vae(pipeline, repo, enabled)` maps repo → tiny VAE id (FLUX.1/2 → taef1/taef2, SD3 → taesd3, SDXL → taesdxl, Wan2.x → taew2_2, LTX-Video / LTX-2 → taeltx2_3_wide, HunyuanVideo → taehv1_5, CogVideoX → taecogvideox, Mochi → taemochi, Qwen-Image → taeqwenimage). Mirrors the stock VAE's dtype + device.
- **Per-step thumbnails via `callback_on_step_end`** (FU-018 part 2) — decodes `callback_kwargs["latents"]` through the swapped tiny VAE, scales to ≤192 px, base64-encodes a PNG, publishes to `IMAGE_PROGRESS.set_thumbnail` / `VIDEO_PROGRESS.set_thumbnail`. Stride caps emit count at ~8 (image) / ~6 (video) per gen. Frontend renders inside `LiveProgress`. Handles standard 4D `(B, C, H, W)` and FLUX's packed 3D `(B, seq_len, 64)` shapes.
- **MLX-native LLM prompt enhancer** (FU-022) — replaces the deterministic per-family template-suffix enhancer. Helper `backend_service/helpers/prompt_enhancer.py` wraps `mlx_lm.load` + `mlx_lm.generate` against `mlx-community/Qwen2.5-0.5B-Instruct-4bit` (~700 MB on disk, ~3s cold load + sub-second per call). Per-family system prompts (`wan` / `ltx` / `hunyuan` / `flux` / `sdxl` / `sd3` / `default`) anchor the rewrite to the DiT's training distribution. Endpoint `POST /api/prompt/enhance`. Apple Silicon only — CUDA / Linux fall back to the legacy template suffix.
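The repo-to-tiny-VAE mapping behind `maybe_apply_preview_vae` can be sketched as a first-match substring table (the tiny-VAE ids come from the entry above; the matching heuristic itself is an assumption):

```python
# First matching family substring wins; order puts the more specific
# needles ahead of the generic ones.
_PREVIEW_VAE_BY_FAMILY = [
    ("flux.1", "taef1"),
    ("flux.2", "taef2"),
    ("sd3", "taesd3"),
    ("sdxl", "taesdxl"),
    ("wan2.", "taew2_2"),
    ("ltx", "taeltx2_3_wide"),
    ("hunyuanvideo", "taehv1_5"),
    ("cogvideox", "taecogvideox"),
    ("mochi", "taemochi"),
    ("qwen-image", "taeqwenimage"),
]

def preview_vae_id(repo: str):
    # Returns the tiny-VAE id for a repo, or None when no preview
    # VAE is known for the family.
    r = repo.lower()
    for needle, vae_id in _PREVIEW_VAE_BY_FAMILY:
        if needle in r:
            return vae_id
    return None
```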

### Speculative decoding
- **`dflash-mlx` pin bump** (FU-006) f825ffb → 8d8545d (v0.1.4.1 → v0.1.5.1). 0.1.5+ moved every primitive `backend_service/ddtree.py` consumed off the runtime top-level onto a per-family `target_ops` adapter. Adapter resolved once at the top of `generate_ddtree_mlx` via `resolve_target_ops(target_model)`. Gains: draft model quantization with Metal MMA kernels, branchless Metal kernels + fused draft KV projections, long-context runtime diagnostics. Live smoke validated against `mlx-community/Qwen2.5-0.5B-Instruct-4bit`.
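The adapter indirection described above has a simple shape: one registry lookup at generation start instead of reaching for runtime top-level primitives on every call. A hypothetical sketch (class and registry names are invented):

```python
# Stand-in adapter classes; the real per-family target_ops carry the
# sampling / KV primitives the 0.1.5+ runtime moved off the top level.
class Qwen2Ops:
    family = "qwen2"

class LlamaOps:
    family = "llama"

_TARGET_OPS = {"qwen2": Qwen2Ops, "llama": LlamaOps}

def resolve_target_ops(model_family: str):
    # Resolved once, up front, so an unsupported family fails fast
    # instead of mid-generation.
    try:
        return _TARGET_OPS[model_family]()
    except KeyError:
        raise ValueError(f"no target_ops adapter for family {model_family!r}")
```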

### Windows / CUDA stability
- PowerShell ports of `build-llama-turbo` + `build-sdcpp` for Windows builds.
- MSVC + CUDA detection helpers, CMake generator handling — accept VS Build Tools installs that report `isComplete=0`, append `version=` to `CMAKE_GENERATOR_INSTANCE` for unregistered installs, fix CUDA-integration elevated copy + invalidate stale CMake cache.
- CUDA torch self-debugging install button with expandable per-attempt log + Restart prompt.
- Restored the GPU warning the Video Studio had been dropping on CUDA hosts; the warning now surfaces an inline Install button.
- T5 lazy-import diagnostic runs on generate paths (not just startup) to catch missing-dep failures before kicking off long generations.

### Studio polish & chat
- Restored pre-aec1975 card layout for Image / Video Discover + My Models, dropped the duplicate Wan panel that had been leaking through the catalog tabs.
- KV cache chip filter harmonized with the launch-settings modal so toggle states stay consistent across surfaces.
- Chat cache-fit warning is now VRAM-aware on CUDA hosts; chat default `maxTokens` raised to 4096; a CPU-only torch install on a CUDA host is now surfaced, and CogVideoX memory footprints were right-sized.
- Fixed Studio cache preview returning 0 GB on chat model selection.

### Test infrastructure & runtime safety
- **`backend_service/runtime_paths.py` — append extras to `sys.path`** instead of `insert(1, ...)`. Prepending broke repo-local adapter shims (notably `turboquant_mlx`, which wraps the upstream `turboquant-mlx-full` install in extras): the raw upstream package shadowed the shim, hiding the shim's exported helpers (`_find_pip_turboquant_path`, `make_adaptive_cache`, `apply_patch`). Surfaced as a pytest collection failure on `tests/test_cache_strategies.py`; was also a latent runtime bug after a user clicked Setup → Install turboquant-mlx-full.

### Packaging
- Bumped the application version to `0.7.4` across the npm, Python, and Tauri package metadata.

## v0.7.3 - 2026-05-04

- Bumped the application version 0.6.0 → 0.7.3 across the npm, Python, and Tauri package metadata. No tagged GitHub Release; superseded by v0.7.4.

## v0.7.2 - 2026-05-02

- Wired the STG (Spatial Temporal Guidance) slider through to the mlx-video subprocess for LTX-2 generations.
- Added preset-row-pair styles for the Studio preset chooser.
- Harmonized the KV cache chip filter with the launch-settings modal so toggle states stay consistent across surfaces.

## v0.6.0 - 2026-04-19

- Renamed the local `compression/` package to `cache_compression/` so it no longer shadows Python 3.14's PEP 784 stdlib `compression` namespace package. Fixes a `ModuleNotFoundError: No module named 'compression._common'` surfacing on Windows with Python 3.14 when PyTorch's import chain reached into the shadowed package.