
Releases: cryptopoly/ChaosEngineAI

ChaosEngineAI v0.7.4

06 May 18:47
cfa1d53


v0.7.4 — chat uplift + image/video gen polish

Chat experience (the headline)

Phase 1 — UX foundations

  • Syntax highlighting in code blocks, in-thread search, conversation export, real cancel (mid-stream abort), reasoning-effort levels.
  • Reasoning panel: collapsible streaming preview, fixed first-paragraph gap.

Phase 2.0 — perf surface

  • Prompt-processing feedback + TTFT (time-to-first-token) live indicator.
  • Watchdogs: prompt-eval timeout, memory gate, runaway guards (token rate floor, repetition guard), panic + thermal banners, image/video gates that block kicking off a generation when VRAM/RAM headroom is unsafe.
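The runaway guards above can be sketched as a token-rate floor: abort any stream whose sustained rate drops below a minimum after a grace window. Names, defaults, and the abort mechanism here are illustrative, not the app's actual implementation.

```python
import time

class TokenRateGuard:
    """Abort a stream whose sustained token rate falls below a floor.

    Hypothetical sketch: floor_tok_per_s and window_s are stand-in defaults.
    """

    def __init__(self, floor_tok_per_s: float = 0.5, window_s: float = 10.0):
        self.floor = floor_tok_per_s
        self.window = window_s
        self.start = time.monotonic()
        self.count = 0

    def on_token(self) -> bool:
        """Record one token; return False when the stream should be aborted."""
        self.count += 1
        elapsed = time.monotonic() - self.start
        if elapsed < self.window:   # grace period before enforcing the floor
            return True
        return (self.count / elapsed) >= self.floor
```

A repetition guard would sit alongside this, tracking n-gram recurrence rather than rate; both feed the same mid-stream abort path as the real cancel feature.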

Phase 2.1 — refactor

  • Decomposed monolithic ChatTab.tsx into ChatSidebar / ChatHeader / ChatThread / ChatComposer.

Phase 2.2 — sampler control

  • Full sampler exposure: top_p, top_k, min_p, repeat_penalty, seed, mirostat, reasoning_effort.
  • JSON-schema constrained-output opt-in (json_schema field).
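A request using the json_schema opt-in might be shaped like this. The OpenAI-style response_format nesting is an assumption based on convention; check the server's actual field layout.

```python
import json

def build_constrained_request(prompt: str, schema: dict) -> dict:
    """Build a chat payload that opts into JSON-schema constrained output.

    The response_format shape mirrors the OpenAI structured-output convention;
    the server's exact nesting may differ.
    """
    return {
        "model": "local",
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "answer", "schema": schema},
        },
    }

payload = build_constrained_request(
    "List three colors.",
    {
        "type": "object",
        "properties": {"colors": {"type": "array", "items": {"type": "string"}}},
        "required": ["colors"],
    },
)
```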

Phase 2.4–2.5 — message-tree workflows

  • Conversation branching: fork from any assistant message into a sibling thread.
  • In-thread compare: render sibling variants side-by-side under the assistant bubble.

Phase 2.6–2.7 — context & prompts

  • Cross-platform RAG: semantic embedding via llama-embedding + cosine retrieval over local docs.
  • Prompt presets + variables: fill-form before "Use in Chat" so reusable prompts can take inputs.
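The cosine-retrieval step of the RAG pipeline can be sketched in a few lines. Pure Python for clarity; the real code would score llama-embedding vectors from a persisted index rather than toy lists.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, doc_vecs, k=3):
    """Return (doc_index, score) pairs, best match first."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:k]
```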

Phase 2.8 — structured tool output

  • Tool call results render as table / code / markdown / image based on returned shape, not raw JSON.
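Shape-based rendering amounts to a small classifier over the returned value. The heuristics below are illustrative guesses at the dispatch rules, not the app's actual logic.

```python
def classify_tool_result(result):
    """Pick a render mode from the shape of a tool call's return value."""
    # A uniform list of records renders as a table.
    if isinstance(result, list) and result and all(isinstance(r, dict) for r in result):
        return "table"
    # Strings that look like source or markup render as code.
    if isinstance(result, str) and result.lstrip().startswith(("def ", "{", "<")):
        return "code"
    # A payload carrying an image MIME type renders inline.
    if isinstance(result, dict) and result.get("mime", "").startswith("image/"):
        return "image"
    return "markdown"
```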

Phase 2.10 — MCP client

  • Stdio JSON-RPC transport + tool adapter so any local MCP server is callable from chat. Provenance shown per tool result.
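At the wire level, the stdio transport writes JSON-RPC 2.0 messages to the server's stdin. The newline-delimited framing shown here is how MCP stdio servers commonly read messages, but treat it as an assumption and defer to the MCP spec; the tool name and arguments are placeholders.

```python
import json

def make_jsonrpc_request(req_id: int, method: str, params: dict) -> bytes:
    """Encode one newline-delimited JSON-RPC 2.0 message for a stdio server."""
    msg = {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}
    return (json.dumps(msg) + "\n").encode("utf-8")

# Hypothetical tools/call frame; the real adapter also handles initialize
# and tools/list before calling anything.
frame = make_jsonrpc_request(
    1, "tools/call",
    {"name": "read_file", "arguments": {"path": "README.md"}},
)
```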

Phase 2.11–2.12 — model-aware composer

  • Typed capability declarations (vision / tools / json_schema / reasoning) surface as badges in every model picker.
  • Composer auto-gating (e.g. attach-image button hidden when active model has no vision).
  • Mid-thread model swap with one-turn override (try a different model for a single response, then revert).
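The capability-to-UI gating can be sketched as a simple lookup. The model names and field set below are illustrative, not the app's actual schema.

```python
# Stand-in capability declarations; the real app types these per model.
CAPS = {
    "llava-1.6": {"vision": True, "tools": False, "json_schema": True},
    "qwen2.5-7b": {"vision": False, "tools": True, "json_schema": True},
}

def composer_features(model: str) -> dict:
    """Derive composer UI gating from a model's declared capabilities.

    Unknown models get everything gated off, matching the conservative
    default described above (e.g. no attach-image button without vision).
    """
    caps = CAPS.get(model, {})
    return {
        "show_attach_image": caps.get("vision", False),
        "show_tool_picker": caps.get("tools", False),
    }
```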

Phase 2.13 — OpenAI-compatible server

  • Full sampler chain + embeddings parity. Apps that talk to /v1/chat/completions no longer lose advanced sampler params on the way through.
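As a rough illustration, a client request exercising the sampler chain might look like the payload below. Whether fields like min_p and repeat_penalty ride top-level (llama.cpp style) or under a vendor extension key is server-specific, so treat the exact shape as an assumption.

```python
# Hypothetical body for POST /v1/chat/completions against the local server.
payload = {
    "model": "local",
    "messages": [{"role": "user", "content": "hi"}],
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,            # extended sampler params: the parity fix above
    "min_p": 0.05,          # means these survive the compatibility layer
    "repeat_penalty": 1.1,
    "seed": 42,
}
```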

Phase 2.14 — catalog browser

  • VRAM-fit hints on every Discover variant card so you see at a glance what'll actually run on your machine.

Phase 3.x — substrate transparency

  • KV strategy chip in composer: per-turn cache override (native / chaosengine / rotorquant / turboquant / triattention) without touching launch settings.
  • DDTree accepted-token overlay: substrate truth view of which speculative draft tokens were accepted.
  • Logprobs viz (advanced-mode gated): per-message confidence summary, MLX logprobs streaming passthrough.
  • Substrate routing inspector: per-turn badge above the metrics row showing which engine + binary served the response.
  • Per-turn host strip: cross-platform perf telemetry (CPU / GPU / RAM / temp).
  • Delve mode: critic-pass on assistant messages.
  • Workspace knowledge stacks: shared RAG corpus across sessions.
  • Chat-template inspection: detect Gemma + ChatML quirks, llama.cpp chat-template fix.

Image generation

  • First Block Cache cross-platform diffusion cache hook (diffusers.hooks.apply_first_block_cache). Default threshold 0.12, ≈1.8× speedup on FLUX.1-dev with imperceptible drift. Replaces the per-model TeaCache vendoring deferral.
  • TaylorSeer / MagCache / PyramidAttentionBroadcast / FasterCache strategies wired against diffusers 0.38 native API.
  • SDXL VAE fp16 fix on MPS / CUDA — keeps SDXL on Apple Silicon in fp16 instead of the slow fp32 fallback.
  • Distill LoRA support — Hyper-SD-8step + Turbo-Alpha for FLUX.1-dev.
  • AYS (Align Your Steps) sampler for SD/SDXL.
  • CFG decay parity with the video runtime (opt-in cfgDecay field).
  • Live denoise thumbnails via callback_on_step_end — TAESD/TAEHV preview VAE swap decodes per-step latents into ≤192 px PNG thumbnails streamed to the UI. Handles 4D (B, C, H, W) and FLUX's packed 3D (B, seq_len, 64) shapes.
  • MLX-native LLM prompt enhancer (Apple Silicon) — mlx-community/Qwen2.5-0.5B-Instruct-4bit rewrites your prompt into the active DiT's training distribution. Per-family system prompts for FLUX / Wan / LTX / HunyuanVideo / SDXL / SD3.
  • Vision attach gating — visionEnabled flag gates image attach across all runtimes; --mmproj wired for llama.cpp vision with sibling detection.
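The First Block Cache entry above wires diffusers' own hook; the toy below only illustrates the skip criterion behind it: run the first transformer block each step, and when its output has barely moved (relative L1) since the previous step, reuse the cached result of the remaining blocks. The 0.12 threshold matches the stated default; everything else is a simplified stand-in.

```python
def rel_l1(a, b):
    """Relative L1 distance between two flat activation vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(abs(y) for y in b) or 1.0
    return num / den

class FirstBlockCache:
    """Toy first-block cache: skip the deep blocks when the first block's
    output changed less than `threshold` since the previous denoise step."""

    def __init__(self, threshold: float = 0.12):
        self.threshold = threshold
        self.prev_first = None
        self.cached_tail = None

    def step(self, first_block_out, run_tail):
        """run_tail: callable computing the remaining blocks' output."""
        if (self.prev_first is not None and self.cached_tail is not None
                and rel_l1(first_block_out, self.prev_first) < self.threshold):
            out = self.cached_tail            # cache hit: skip deep blocks
        else:
            out = run_tail(first_block_out)   # cache miss: full forward
            self.cached_tail = out
        self.prev_first = first_block_out
        return out
```

The ≈1.8× speedup comes from how often consecutive diffusion steps land in the "hit" branch at low thresholds.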

Video generation

  • mlx-video Wan runtime end-to-end (Apple Silicon):
    • One-shot convert pipeline for Wan-AI/Wan2.{1-T2V-1.3B,1-T2V-14B,2-TI2V-5B,2-T2V-A14B,2-I2V-A14B} — wraps python -m mlx_video.models.wan_2.convert subprocess.
    • Runtime routing through mlx_video_runtime.py with Wan-shaped CLI (--model-dir, --guide-scale, --scheduler).
    • GUI install panel under Video Discover with per-repo install buttons + live install log.
    • Live Wan2.1 MLX smoke validated: 19.6s end-to-end at 480×272, 5 frames, 4 steps.
  • Distill transformer support for Wan 2.2 A14B I2V (lightx2v 4-step, bf16 + fp8_e4m3 variants) — full transformer swap via _swap_distill_transformers.
  • STG (Spatial Temporal Guidance) slider wired through to mlx-video subprocess for LTX-2.
  • CogVideoX footprints right-sized + T5EncoderModel error diagnosed.
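The one-shot convert pipeline boils down to building a subprocess command line around the mlx_video convert module. Only the module path comes from the notes above; the flag names and output layout here are placeholders.

```python
import sys

def wan_convert_cmd(repo_id: str, out_dir: str) -> list:
    """Assemble the (hypothetical) CLI for converting a Wan checkpoint to MLX."""
    return [
        sys.executable, "-m", "mlx_video.models.wan_2.convert",
        "--hf-repo", repo_id,     # assumed flag names: check the module's --help
        "--out-dir", out_dir,
    ]

cmd = wan_convert_cmd("Wan-AI/Wan2.1-T2V-1.3B", "models/wan21-1.3b-mlx")
# subprocess.run(cmd, check=True)  # left commented: requires mlx-video installed
```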

CUDA quantization (Windows / Linux foundation)

  • Nunchaku / SVDQuant transformer load (FLUX.1-dev + FLUX.1-schnell svdq-int4 variants). Preferred over NF4/int8wo when nunchaku>=1.2.1 is installed.
  • FP8 layerwise casting for non-FLUX DiTs (Wan / Qwen-Image / SD3 / LTX). Family-correct fp8 dtype (E5M2 for HunyuanVideo, E4M3 elsewhere). Compute capability gate refuses pre-Ada GPUs (SM <8.9).
  • NVIDIA/kvpress install action staged (kvpress>=0.5.3 registered).
  • Studio FP8 layerwise toggle in both Image + Video Studio.
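The family-correct dtype choice and the Ada-or-newer gate reduce to a few lines of pure logic; the real code hands the chosen storage dtype to torch/diffusers layerwise casting (torch.float8_e5m2 vs torch.float8_e4m3fn).

```python
def fp8_storage_dtype(family: str) -> str:
    """HunyuanVideo wants E5M2 storage; the other families use E4M3."""
    return "float8_e5m2" if family == "hunyuanvideo" else "float8_e4m3fn"

def supports_fp8(compute_capability) -> bool:
    """Refuse pre-Ada GPUs: fp8 needs compute capability (SM) >= 8.9."""
    major, minor = compute_capability
    return (major, minor) >= (8, 9)
```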

Speculative decoding

  • dflash-mlx pin bump f825ffb → 8d8545d (v0.1.4.1 → v0.1.5.1) — target_ops adapter pattern, draft model quantization with Metal MMA kernels, branchless Metal kernels, fused draft KV projections.

Windows / CUDA stability

  • PowerShell ports of build-llama-turbo + build-sdcpp.
  • MSVC + CUDA detection helpers: accept VS Build Tools installs that report isComplete=0, append version= to CMAKE_GENERATOR_INSTANCE for unregistered installs, fix CUDA-integration elevated copy + invalidate stale CMake cache, auto-sync CUDA VS integration before cmake configure.
  • Install CUDA torch: self-debugging button with expandable per-attempt log + Restart prompt.
  • Windows CUDA detection fix + post-install runtime probe.
  • Preserve Windows GPU runtime on uninstall + lock extras path.
  • Video Studio's dropped-GPU warning now surfaces an inline Install button.
  • T5 lazy-import diagnostic on generate paths (catches missing-dep failures before kicking off long generations).

Studio polish

  • Restored pre-aec1975 card layout for Image / Video Discover + My Models. Dropped the duplicate Wan panel.
  • KV cache chip filter harmonized with launch-settings modal so toggle states stay consistent across surfaces.
  • Chat cache-fit warning VRAM-aware on CUDA hosts.
  • Warn when CPU-only torch is detected on a CUDA host. Raised chat default maxTokens to 4096.
  • Fixed Studio cache preview returning 0 GB on chat model selection.
  • Hide MLX-only catalog variants on non-Apple platforms.
  • Qwen 3.6 catalog entry.

Test infrastructure

  • backend_service/runtime_paths.py — append extras to sys.path instead of insert(1, ...). Repo-local adapter shims (notably turboquant_mlx) keep import authority across pytest, dev .venv, and Tauri-bundled launches. This also fixes a latent runtime bug that masked the shim's adapter hooks after Setup → Install turboquant-mlx-full on the desktop app.

Bundles below: macOS aarch64 dmg + Linux x64 AppImage / deb + Windows x64 setup.exe. latest.json for Tauri auto-updater.

ChaosEngineAI v0.7.2

02 May 22:28
bc5c3b0


ChaosEngineAI v0.7.2 packages everything that landed since v0.5.2 — 133 commits over two weeks. It is the first stable release of the video generation line, alongside major work on diffusion cache compression, the Apple Silicon mlx-video runtime, the Windows installer pipeline, and a brand-new in-app Diagnostics panel. Includes ten post-tag smoke-test fixes (PRs #21–#30) that landed before the final rebuild.

Highlights

  • Video Studio — full video generation tab with model picker, prompt input, runtime status overlay, real-time progress, and cancellable generation.
  • Multi-engine video catalog — Wan 2.1 T2V (1.3B / 14B), Wan 2.2, Lightricks LTX-Video 2.0/2.0-distilled/2.3/2.3-distilled, HunyuanVideo, CogVideoX, Mochi, plus FLUX.1 for image.
  • TeaCache diffusion cache compression — five DiT families wired (FLUX, HunyuanVideo, LTX-Video, CogVideoX, Mochi) with per-model rescale coefficients.
  • In-app Diagnostics panel under Settings, with per-section error fallback and platform-aware repair actions.
  • Windows installer pipeline stabilised — embedded Python sidecar + llama-server now ship in the NSIS bundle; PowerShell 5.1 build path hardened; CUDA torch DLL lock fixed; GGUF downloads scoped to lightweight base + selected quant.
  • macOS Python framework switched to python-build-standalone (Astral) for reliable framework relocation.

Smoke-test fixes (post-tag, in this rebuild)

  • #21 — MLX-only catalog variants (FLUX.1 Dev · mflux, LTX-2 MLX entries) hidden on Linux / Windows where mlx is not installable. New backend_service/helpers/platform_filter.py + 15 unit tests.
  • #22 — Windows CUDA detection no longer reports system RAM as VRAM. Reads torch.cuda.get_device_properties(0).total_memory first; falls back to nvidia-smi; returns vram_total_gb=None when neither answers. Image-runtime probe calls importlib.invalidate_caches() so newly-installed GPU bundle packages are picked up without a full backend restart.
  • #23 — Tauri NSIS installer hook (src-tauri/installer.nsh) documents the contract that %LOCALAPPDATA%\ChaosEngineAI\extras\cp{maj}{min}\site-packages must survive uninstall + reinstall. InstallLogPanel CSS establishes a stacking context so the streaming pip output renders above the Prompt + Recent Outputs cards on Windows during a long GPU bundle install.
  • #24 — CI test reliability: imageDiscoverMemoryEstimate describe block pins navigator.userAgentData.platform to macOS so the MPS-calibrated expectations pass on Linux runners.
  • #25 — torch.cuda probe now runs in a short-lived subprocess so the backend never holds locks on torch/lib/*.dll. Without this, pip's --target install of a fresh torch fails with [WinError 5] Access is denied. InstallLogPanel background switched to opaque var(--surface) + contain: layout so it can never visually overlap sibling Prompt / Recent Outputs cards.
  • #26 — test_gpu.py accepts vram_total_gb=None as the no-GPU response (the orphaned PR #24 fix landed directly).
  • #27 / #28 — Tauri 2.10.3 → 2.11.0; tauri-build 2.5.6 → 2.6.0 (Dependabot Cargo bumps + Windows package version alignment).
  • #29 — HunyuanVideo NF4 + 1280×720 × 33 frames CUDA estimate no longer crosses the danger ratio. Discount sliced attention to 60% of the dense fp16 8-slab estimate when CUDA + runtime override are present (real fp8-KV / attention-slicing behaviour).
  • #30 — test_legacy_tauri_app_data_extras_are_import_candidates patches sys.platform = "win32" so the Linux CI runner exercises the LOCALAPPDATA branch.
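The #22 probe order is worth sketching: try torch.cuda first, fall back to nvidia-smi, and report None (not zero) when neither answers. The probes are injected as callables here so the logic is testable without a GPU; the real code reads torch.cuda.get_device_properties(0).total_memory and parses nvidia-smi output.

```python
def detect_vram_gb(torch_probe, smi_probe):
    """Return total VRAM in GB, or None when no probe can answer.

    torch_probe / smi_probe: callables returning total bytes, or raising /
    returning a falsy value when unavailable (injection point for tests).
    """
    for probe in (torch_probe, smi_probe):
        try:
            total_bytes = probe()
        except Exception:
            continue
        if total_bytes:
            return round(total_bytes / (1024 ** 3), 1)
    return None  # no GPU: callers must treat None as "unknown", never as 0
```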

Video Generation

New: Video Studio

  • New Video Studio tab with model picker, prompt input, runtime status overlay, and live progress.
  • Cancellable image and video generation across the app.
  • Async library scan with persisted cache (library_cache.json) — startup no longer blocks on filesystem walks.
  • Tooltip portal — info hovers now render via a top-level portal with viewport-clamped placement.
  • Platform-aware mlx-video reinstall button in Diagnostics.
  • Auto-recover Video Studio after a backend sidecar crash.
  • Variant-specific video download status (sibling Q4/Q6/Q8 rows no longer marked active just because they share repos).
  • Cleaned-up Video Studio runtime action layout; GPU runtime installs now recover from CPU-only torch wheels.

New: Video catalog

  • Wan 2.1 T2V 1.3B / 14B (with Wan 2.2 catalog fix).
  • Lightricks LTX-Video 2.0 / 2.0-distilled / 2.3 / 2.3-distilled (via mlx-video on Apple Silicon).
  • HunyuanVideo.
  • CogVideoX.
  • Mochi.
  • FLUX.1 (image).

New: Apple Silicon mlx-video LTX-2 engine

  • LTX-2 (prince-canuma/LTX-2-{distilled,dev,2.3-distilled,2.3-dev}) routed through a subprocess engine — backend_service/mlx_video_runtime.py.
  • Spatial upscaler resolver, distilled-pipeline accounting, dist-info cleanup.
  • Real LTX-2 module path + corrected CLI flags.
  • LTX-2 MLX download false-positive fix + LTX prompt-length hint.

New: stable-diffusion.cpp engine scaffold (cross-platform)

New: LongLive scaffold (CUDA only)

  • Real-time causal long video generation for Wan 2.1 T2V 1.3B.
  • Install path — collapsible terminal panel for streaming progress, hang fix on Windows.
  • Discover Install CTA in Video Studio.

Quality + safety

  • Phase E1 — auto-enhance short video prompts with model-tuned suffixes.
  • Phase E2 — CFG decay schedule + extend prompt enhancer to LTX-2 family.
  • Three Phase E2 regressions caught on real Mac runs.
  • Tuned video safety estimator + LTX-2 subprocess error visibility.
  • Improved video gen quality across LTX / Wan / HunyuanVideo + new engine scaffolds.
  • Scale video gen safety by device memory and show Studio capacity.
  • Detect corrupt diffusers snapshots, add output-folder pickers, MPS-safe defaults.
  • In-app install for mp4 encoder deps; isolated video output dir in test harness.
  • HunyuanVideo NF4 + 1280×720 × 33 frames now lands on caution rather than danger on a 4090; long-clip danger warnings still trigger for genuinely risky configs.

Cache compression for diffusion DiTs (TeaCache)

  • TeaCache integration with vendored teacache_forward patches under cache_compression/_teacache_patches/.
  • Five model families wired: FLUX, HunyuanVideo, LTX-Video, CogVideoX, Mochi.
  • Per-model rescale coefficients pulled from upstream calibration tables.
  • Quality knob rel_l1_thresh (default 0.4).
  • TeaCache strategies are filtered out of the LLM RuntimeControls picker via the appliesTo domain field.
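The rel_l1_thresh knob governs a running skip decision: accumulate the rescaled relative-L1 change of the timestep-modulated input, reuse the cached residual while the total stays under the threshold, and recompute (resetting the accumulator) once it crosses. The toy below uses a stand-in linear rescale where TeaCache applies per-model polynomial calibration coefficients.

```python
class TeaCacheGate:
    """Toy TeaCache skip rule; coeffs stand in for the calibrated polynomial."""

    def __init__(self, rel_l1_thresh: float = 0.4, coeffs=(1.0, 0.0)):
        self.thresh = rel_l1_thresh
        self.coeffs = coeffs          # stand-in linear rescale a*x + b
        self.acc = 0.0
        self.prev = None

    def should_skip(self, modulated_inp) -> bool:
        """True when the transformer forward can reuse the cached residual."""
        if self.prev is None:
            self.prev = modulated_inp
            return False              # always compute the first step
        num = sum(abs(x - y) for x, y in zip(modulated_inp, self.prev))
        den = sum(abs(y) for y in self.prev) or 1.0
        a, b = self.coeffs
        self.acc += a * (num / den) + b
        self.prev = modulated_inp
        if self.acc >= self.thresh:
            self.acc = 0.0
            return False              # drift built up: recompute this step
        return True                   # still close: skip the forward pass
```

Raising rel_l1_thresh trades quality for more skipped steps, which is why it is surfaced as the single quality knob.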

Memory + warm pool

  • Memory-budgeted warm pool + library-only chat picker.
  • Improved model catalog UX and memory estimates.
  • Base LLM library RAM estimate on real on-disk size.
  • Include model footprint in video gen memory estimate.
  • Fix TurboQuant install detection + warm pool memory guard.
  • Show release dates and accurate on-disk sizes across model listings.

Diagnostics + Settings UI

  • New in-app Diagnostics panel under Settings with per-section error fallback.
  • Reject load_model for models not on disk; disable chat Send until loaded.
  • Rename dashboard engine label from "No backend" to "Idle".
  • Split Settings page into logical sections with sub-navigation.
  • Drop the side-menu Settings layout, always use the tab bar.
  • Redesign the Storage section and surface resolved paths.
  • Diagnostics: catch find_spec on missing namespace + scroll body.
  • Fix Diagnostics panel scroll on tall viewports + long path wrapping.
  • Tabs-mode sidebar, SubtabBar, and Settings → Appearance toggle.
  • Group sidebar tabs into Models / Images / Benchmarks / Tools with SVG icons.

Windows pipeline

  • Ship embedded Python + llama-server in the installer — no source checkout needed at first launch.
  • Fix Windows restart deadlock.
  • PowerShell 5.1 hardening — strip non-ASCII, drop &&, avoid backslash-before-paren, strip apostrophes / ampersands.
  • Stop the video runtime probe timing out on Windows.
  • Stabilise the Windows dev/build loop and surface CUDA-vs-CPU torch detection.
  • Unblock Windows video / image studios during first-boot torch import.
  • Warn on CPU fallback and stop Video Studio stalling on Windows.
  • Stop build.ps1 dying on git checkout's success message.
  • Fix LongLive install on Windows.
  • Real CUDA VRAM detection via torch.cuda (was reporting 12 GB on a 24 GB RTX 4090).
  • Image-runtime probe re-checks importable packages after a GPU bundle install.
  • CUDA torch clobber + DLL lock during GPU bundle install fixed (subprocess probe).
  • Persist installed runtime path across backend restarts; auto-restart when required.
  • Logical install package counts reported correctly.
  • GGUF video downloads fetch only the lightweight base pipeline + the selected quantized transformer file.
  • Windows test + runtime edge cases for directory sizing, library fingerprints, path formatting, optional GGUF import failures.
  • Tauri 2.10.3 → 2.11.0; tauri-build 2.5.6 → 2.6.0 alignment.

macOS Python framework

  • Switched to python-build-standalone (Astral) so @loader_path / @rpath references are baked in — avoids the install_name_tool regression with Xcode 16.4 + actions/setup-python that previously crashed at launch with Library not loaded: Python.framework.
  • Persistent extras dir namespaced by Python ABI ...

ChaosEngineAI v0.5.2

18 Apr 16:31


ChaosEngineAI v0.5.2 packages everything that landed since the first v0.5.0 release, with a focus on stability, inference improvements, release hardening, and cleaner operator-facing behavior.

Highlights

  • Improved backend and engine reliability with fixes for backend load failures, auth token cache poisoning, reasoning split handling, MLX profile application, DFlash resolution, engine issues, and stale orphaned-worker notifications.
  • Expanded inference functionality with a CLI inference runner, broader inference and model updates, benchmark model-selection and caching fixes, and additional inference test coverage.
  • Hardened release and tooling workflows with cross-platform updater artifact fixes, stronger sidecar/local auth surfaces, Image Studio installed-model filtering, manual-dispatch CI build gating, documentation/discovery updates, and release workflow improvements.

Commit Summary Since v0.5.0

  • 1d50fb5 Initial release.
  • 0e63748 Fixed reasoning split, MLX profile application, and DFlash resolution.
  • 245cddf Applied broader binary-related fixes and cleanup.
  • 6f0a158 Fixed engine issues.
  • 8138b18 Added inference tests.
  • b030283 Added the CLI inference runner.
  • b39c0bf Improved inference flows and related tests.
  • e9039fc Updated models.
  • b70a10b Fixed test modules.
  • f3d2c3e Fixed benchmark model selection, caching strategies, and DFlash behavior.
  • d4c00b8 Improved discovery behavior and updated the README.
  • db54bd4 Addressed the Copilot licensing/auth error path.
  • 7eef544 Hardened sidecar auth and local tool surfaces.
  • a189863 Hardened the release workflow during the v0.5.1 bump.
  • a9d619a Fixed cross-platform updater release artifacts.
  • 3100386 Bumped to v0.5.2 and fixed backend load failures plus auth token cache poisoning.
  • 78a729c Limited Image Studio to installed models and made the CI build manual-dispatch only.
  • d041efa Stopped stale or false-positive orphaned-worker notifications.

ChaosEngineAI Launch v0.5.0

15 Apr 19:21


Welcome to ChaosEngineAI