Merged
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,20 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

## [Unreleased]

### Changed

- build: add `profiling` and `sandbox` to default Cargo features — tracing spans are compiled by
default for diagnostics; macOS Seatbelt / Linux Landlock sandbox is available without
`--features sandbox` (still runtime-disabled unless `tools.sandbox.enabled = true`)
- build: consolidate `self-check`, `env-vault`, and `task-metrics` as always-on — these were pure
behavioral markers with no optional deps, violating the feature flag decision rule (spec 029 §2)

### Fixed

- docs(specs): update spec 001 §9 and spec 029 §3.1/§4/§5.3 to reflect actual default feature set
(was documenting `default = ["scheduler", "sqlite"]` since v0.18 while reality had 5 features
since v0.20)

## [0.20.2] - 2026-05-06

### Added
7 changes: 2 additions & 5 deletions Cargo.toml
@@ -195,19 +195,17 @@ description = "Lightweight AI agent with hybrid inference, skills-first architec
readme = "README.md"

[features]
default = ["scheduler", "sqlite", "task-metrics", "self-check", "env-vault"]
default = ["scheduler", "sqlite"]

# === Use-case bundles ===
desktop = ["tui"]
ide = ["acp", "acp-http"]
server = ["gateway", "a2a", "otel", "prometheus"]
chat = ["discord", "slack"]
ml = ["candle", "pdf"]
full = ["desktop", "ide", "server", "chat", "pdf", "scheduler", "classifiers", "profiling", "task-metrics", "sandbox", "self-check", "gonka", "testing"]
full = ["desktop", "ide", "server", "chat", "pdf", "scheduler", "classifiers", "profiling", "sandbox", "gonka", "sqlite", "testing"]
testing = ["zeph-llm/testing"]
env-vault = ["zeph-core/env-vault"]
sandbox = ["zeph-tools/sandbox"]
self-check = ["zeph-core/self-check"]
bench = ["dep:zeph-bench"]

# === Individual feature flags ===
@@ -242,7 +240,6 @@ profiling = [
]
profiling-alloc = ["profiling", "zeph-core/profiling-alloc"]
profiling-pyroscope = ["profiling", "otel", "dep:pprof"]
task-metrics = ["zeph-core/task-metrics"]
# Database backend selection — mutually exclusive. Default includes sqlite.
# NOTE: --all-features activates both, triggering compile_error! in zeph-db.
# Use --features full or --features full,postgres instead.
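The comment above mentions that enabling both database backends triggers `compile_error!` in `zeph-db`. A minimal sketch of that kind of guard follows; it is not the actual `zeph-db` source, and the `selected_backend` helper is a hypothetical name added only for illustration.

```rust
// Sketch only: a mutual-exclusion guard for two Cargo features.
// The feature names mirror the diff; the helper function is hypothetical.

// Fails the build if both backends are requested at once
// (for example via `--all-features`).
#[cfg(all(feature = "sqlite", feature = "postgres"))]
compile_error!("features `sqlite` and `postgres` are mutually exclusive; enable exactly one");

/// Reports which backend feature was selected at compile time.
pub fn selected_backend() -> &'static str {
    if cfg!(feature = "postgres") {
        "postgres"
    } else if cfg!(feature = "sqlite") {
        "sqlite"
    } else {
        "none"
    }
}
```

Because the check runs at compile time, a misconfigured build fails early with a readable message instead of producing a binary with two conflicting backends.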
8 changes: 4 additions & 4 deletions book/src/advanced/observability.md
@@ -93,18 +93,18 @@ Zeph uses a `TaskSupervisor` to manage background tasks (embedding, memory conso

### Enabling Task Metrics

Enable the optional `task-metrics` feature (included in `full`):
Task metrics compile unconditionally — no feature flag needed. Build normally:

```bash
cargo build --release --features task-metrics
cargo build --release
```

When enabled, each supervised task records:
Each supervised task records:

- **Wall-time**: elapsed time from spawn to completion
- **CPU-time**: actual CPU cycles spent (OS-level thread time measurement)

Zero overhead when disabled — the feature gate compiles out the measurement code.
Note: the `task-metrics` feature flag was consolidated as always-on in v0.20.x.

### Viewing Task Metrics

2 changes: 1 addition & 1 deletion book/src/changelog.md
@@ -10,7 +10,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

### Added

- **TaskSupervisor observability** — CPU and wall-time metrics for supervised tasks, visible in Jaeger traces and tokio-console. Optional `task-metrics` feature (included in `full`). See [Observability & Cost](advanced/observability.md#task-supervisor-metrics).
- **TaskSupervisor observability** — CPU and wall-time metrics for supervised tasks, visible in Jaeger traces and tokio-console. See [Observability & Cost](advanced/observability.md#task-supervisor-metrics). (The `task-metrics` feature flag was consolidated as always-on in v0.20.x — no feature flag required.)
- **TUI task registry panel** — new `/tasks` command displays live table of all supervised tasks (name, state, uptime, restart count). See [TUI Dashboard](advanced/tui.md#command-palette).
- **Per-chunk code indexing supervision** — `CodeIndexer` now integrates with `TaskSupervisor` for fine-grained visibility of concurrent embedding tasks. Each chunk operation is registered as a separate task (`chunk_file_{N}`) in the supervisor registry.
- **Bootstrap TaskSupervisor migration** — 7 memory background loops (eviction, tier promotion, consolidation, forgetting, compression, tree consolidation) migrated to `TaskSupervisor` with restart policies.
4 changes: 0 additions & 4 deletions crates/zeph-agent-context/Cargo.toml
@@ -31,11 +31,7 @@ zeph-sanitizer.workspace = true
zeph-skills.workspace = true

[features]
# `default = []` — opt-in only. Both features must be enabled explicitly.
default = []
# Gates the retrieved-memory mirror types and the `quality` field on view structs.
# Forward this feature to `zeph-context` when needed.
self-check = []
# Enables `zeph-index` integration via `IndexAccess` in context assembly views.
index = ["dep:zeph-index"]

6 changes: 3 additions & 3 deletions crates/zeph-agent-context/README.md
@@ -86,13 +86,13 @@ let window = MessageWindowView {

| Feature | Default | Description |
|---|---|---|
| `self-check` | off | Retrieved-memory mirror types for the MARCH self-check pipeline |
| `index` | off | `zeph-index` integration via `IndexAccess` in assembly views |

Both features must be enabled explicitly. `zeph-core` enables them where needed.
The `self-check` feature was consolidated as always-on in v0.20.x — retrieved-memory mirror types
compile unconditionally. Only `index` remains optional.

```toml
zeph-agent-context = { version = "0.20", workspace = true, features = ["self-check", "index"] }
zeph-agent-context = { version = "0.20", workspace = true, features = ["index"] }
```

## License
5 changes: 0 additions & 5 deletions crates/zeph-agent-context/src/lib.rs
@@ -25,14 +25,11 @@
//!
//! # Features
//!
//! - `self-check` — gates retrieved-memory mirror types for the MARCH self-check pipeline.
//! - `index` — enables `zeph-index` integration via the `IndexAccess` trait.

pub mod compaction;
pub mod error;
pub mod helpers;
#[cfg(feature = "self-check")]
#[cfg_attr(docsrs, doc(cfg(feature = "self-check")))]
pub mod retrieved;
pub mod service;
pub mod state;
@@ -53,6 +50,4 @@ pub use state::{
ToolOutputArchive, TrustGate,
};

#[cfg(feature = "self-check")]
#[cfg_attr(docsrs, doc(cfg(feature = "self-check")))]
pub use retrieved::{RetrievedContext, collect_retrieved_context};
2 changes: 0 additions & 2 deletions crates/zeph-agent-context/src/retrieved.rs
@@ -7,8 +7,6 @@
//! (recall, graph facts, cross-session, summaries) for one turn.
//! [`collect_retrieved_context`] walks the turn's message list and populates
//! the four buckets without allocating beyond the [`Vec`]s themselves.
#![cfg_attr(docsrs, doc(cfg(feature = "self-check")))]

use zeph_llm::provider::{Message, MessagePart, Role};

use crate::helpers::{CROSS_SESSION_PREFIX, GRAPH_FACTS_PREFIX, RECALL_PREFIX, SUMMARY_PREFIX};
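The module docs above describe `collect_retrieved_context` as walking a turn's messages and sorting them into four buckets by prefix. The sketch below illustrates that idea on plain strings. It is an assumption-laden stand-in: the real function operates on `Message` / `MessagePart` values, and the prefix values, the `Buckets` type, and `collect_buckets` are all hypothetical.

```rust
// Hypothetical prefix values; the real constants live in `crate::helpers`.
const RECALL_PREFIX: &str = "[recall] ";
const GRAPH_FACTS_PREFIX: &str = "[graph] ";
const CROSS_SESSION_PREFIX: &str = "[cross-session] ";
const SUMMARY_PREFIX: &str = "[summary] ";

/// Hypothetical stand-in for the four retrieved-memory buckets.
#[derive(Debug, Default, PartialEq)]
pub struct Buckets<'a> {
    pub recall: Vec<&'a str>,
    pub graph_facts: Vec<&'a str>,
    pub cross_session: Vec<&'a str>,
    pub summaries: Vec<&'a str>,
}

/// Walks the messages once and borrows each matching body into its bucket.
/// Nothing is allocated beyond the four `Vec`s themselves.
pub fn collect_buckets<'a>(messages: &[&'a str]) -> Buckets<'a> {
    let mut out = Buckets::default();
    for &msg in messages {
        if let Some(body) = msg.strip_prefix(RECALL_PREFIX) {
            out.recall.push(body);
        } else if let Some(body) = msg.strip_prefix(GRAPH_FACTS_PREFIX) {
            out.graph_facts.push(body);
        } else if let Some(body) = msg.strip_prefix(CROSS_SESSION_PREFIX) {
            out.cross_session.push(body);
        } else if let Some(body) = msg.strip_prefix(SUMMARY_PREFIX) {
            out.summaries.push(body);
        }
        // Messages without a known prefix are not retrieved context; skip them.
    }
    out
}
```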
5 changes: 2 additions & 3 deletions crates/zeph-common/Cargo.toml
@@ -13,7 +13,6 @@ description = "Shared utility functions and security primitives for Zeph crates"
readme = "README.md"

[features]
task-metrics = ["dep:cpu-time", "dep:metrics"]
treesitter = [
"dep:tree-sitter",
"dep:tree-sitter-go",
@@ -25,8 +24,8 @@ treesitter = [

[dependencies]
blake3.workspace = true
cpu-time = { workspace = true, optional = true }
metrics = { workspace = true, optional = true }
cpu-time.workspace = true
metrics.workspace = true
parking_lot.workspace = true
serde = { workspace = true, features = ["derive", "rc"] }
serde_json.workspace = true
32 changes: 4 additions & 28 deletions crates/zeph-common/src/task_supervisor.rs
@@ -253,13 +253,10 @@ pub enum TaskStatus {
/// | Field | tokio-console | Jaeger / OTLP | TUI | `metrics` histogram |
/// |-------|--------------|--------------|-----|---------------------|
/// | `name` | span name | span name | task list | label `"task"` |
/// | `task.wall_time_ms` | — | span field (`task-metrics`) | — | `zeph.task.wall_time_ms` |
/// | `task.cpu_time_ms` | — | span field (`task-metrics`) | — | `zeph.task.cpu_time_ms` |
/// | `task.wall_time_ms` | — | span field | — | `zeph.task.wall_time_ms` |
/// | `task.cpu_time_ms` | — | span field | — | `zeph.task.cpu_time_ms` |
/// | `status` | — | — | task list | — |
/// | `restart_count` | — | — | task list | — |
///
/// The `task.wall_time_ms` and `task.cpu_time_ms` fields are only populated when
/// the crate is compiled with the `task-metrics` feature.
pub struct TaskSnapshot {
/// Task name.
pub name: Arc<str>,
@@ -511,15 +508,12 @@ impl TaskSupervisor {
R: Send + 'static,
{
let (tx, rx) = oneshot::channel::<Result<R, BlockingError>>();
#[cfg(feature = "task-metrics")]
let span = tracing::info_span!(
"supervised_blocking_task",
task.name = %name,
task.wall_time_ms = tracing::field::Empty,
task.cpu_time_ms = tracing::field::Empty,
);
#[cfg(not(feature = "task-metrics"))]
let span = tracing::info_span!("supervised_blocking_task", task.name = %name);

let semaphore = Arc::clone(&self.inner.blocking_semaphore);
let inner = Arc::clone(&self.inner);
@@ -1042,11 +1036,7 @@ impl TaskSupervisor {

// ── Task metrics helpers ──────────────────────────────────────────────────────

/// Run `f` and record wall-time and CPU-time metrics when `task-metrics` is enabled.
///
/// When the feature is disabled this is a zero-overhead identity wrapper —
/// no `cpu-time` or `metrics` crates are linked.
#[cfg(feature = "task-metrics")]
/// Run `f` and record wall-time and CPU-time metrics via `metrics` crate.
#[inline]
fn measure_blocking<F, R>(name: &str, f: F) -> R
where
@@ -1065,18 +1055,6 @@ where
result
}

/// Identity wrapper when `task-metrics` feature is disabled.
///
/// Compiles to a direct call to `f()` with no overhead.
#[cfg(not(feature = "task-metrics"))]
#[inline]
fn measure_blocking<F, R>(_name: &str, f: F) -> R
where
F: FnOnce() -> R,
{
f()
}

// ── BlockingSpawner impl ──────────────────────────────────────────────────────

impl BlockingSpawner for TaskSupervisor {
@@ -1691,13 +1669,11 @@ mod tests {
handle.await.expect("task should complete");
}

/// Verify that `measure_blocking` emits wall-time and CPU-time histograms when
/// the `task-metrics` feature is enabled.
/// Verify that `measure_blocking` emits wall-time and CPU-time histograms.
///
/// `measure_blocking` calls `metrics::histogram!` on the current thread.
/// We test it directly using a `DebuggingRecorder` installed as the thread-local
/// recorder via `metrics::with_local_recorder`.
#[cfg(feature = "task-metrics")]
#[test]
fn test_measure_blocking_emits_metrics() {
use metrics_util::debugging::DebuggingRecorder;
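With the identity wrapper removed, `measure_blocking` always times the closure it runs. The wall-time half of that pattern can be sketched with the standard library alone. This is a simplified illustration, not the function in the diff: the name `measure_wall` is hypothetical, and the CPU-time reading (via the `cpu-time` crate) and the histogram emission (via the `metrics` crate) are deliberately left out.

```rust
use std::time::{Duration, Instant};

/// Runs `f` and returns its result together with the elapsed wall time.
///
/// Hypothetical helper: the real `measure_blocking` returns only `R` and
/// records its measurements as histograms instead of returning them.
pub fn measure_wall<F, R>(f: F) -> (R, Duration)
where
    F: FnOnce() -> R,
{
    let start = Instant::now();
    let result = f();
    (result, start.elapsed())
}
```

Returning the duration keeps the sketch free of global state. Recording into a metrics sink, as the diff does, keeps the caller's signature unchanged instead.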
17 changes: 7 additions & 10 deletions crates/zeph-core/Cargo.toml
@@ -13,23 +13,20 @@ description = "Core agent loop, configuration, context builder, metrics, and vau
readme = "README.md"

[features]
profiling = ["dep:tracing-subscriber"]
profiling-alloc = ["profiling"]
sysinfo = ["dep:sysinfo"]
task-metrics = ["dep:cpu-time", "dep:metrics", "zeph-common/task-metrics"]
default = ["sqlite"]
candle = ["zeph-llm/candle"]
classifiers = ["zeph-llm/classifiers", "zeph-sanitizer/classifiers"]
gonka = ["zeph-llm/gonka"]
cuda = ["zeph-llm/cuda"]
env-vault = ["zeph-vault/env-vault"]
gonka = ["zeph-llm/gonka"]
index = ["zeph-agent-context/index"]
metal = ["zeph-llm/metal"]
mock = ["zeph-vault/mock"]
postgres = ["zeph-db/postgres"]
profiling = ["dep:tracing-subscriber"]
profiling-alloc = ["profiling"]
scheduler = []
index = ["zeph-agent-context/index"]
self-check = ["zeph-agent-context/self-check"]
sqlite = ["zeph-db/sqlite"]
sysinfo = ["dep:sysinfo"]

[dependencies]
base64.workspace = true
@@ -61,8 +58,8 @@ toml_edit.workspace = true
tracing.workspace = true
tracing-subscriber = { workspace = true, features = ["parking_lot", "registry"], optional = true }
sysinfo = { workspace = true, optional = true }
cpu-time = { workspace = true, optional = true }
metrics = { workspace = true, optional = true }
cpu-time.workspace = true
metrics.workspace = true
tree-sitter.workspace = true
uuid = { workspace = true, features = ["v4", "serde"] }
zeroize.workspace = true
3 changes: 0 additions & 3 deletions crates/zeph-core/src/agent/builder.rs
@@ -2096,8 +2096,6 @@ impl<C: Channel> Agent<C> {
/// response and appends a flag marker to the channel output if assertions are contradicted
/// or unsupported by retrieved evidence.
///
/// Calling this method without the `self-check` feature compiled in is a no-op.
///
/// # Examples
///
/// ```no_run
@@ -2109,7 +2107,6 @@ impl<C: Channel> Agent<C> {
/// // agent_builder.with_quality_pipeline(Some(pipeline));
/// ```
#[must_use]
#[cfg(feature = "self-check")]
pub fn with_quality_pipeline(
mut self,
pipeline: Option<std::sync::Arc<crate::quality::SelfCheckPipeline>>,
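Since `with_quality_pipeline` is no longer feature-gated, callers can chain it unconditionally. The sketch below shows the consuming-builder shape the signature implies. `AgentBuilder`, the `SelfCheckPipeline` stand-in, and `has_quality_pipeline` are hypothetical simplifications, not the types in `zeph-core`.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `crate::quality::SelfCheckPipeline`.
pub struct SelfCheckPipeline {
    pub name: String,
}

/// Hypothetical, heavily reduced builder holding only the quality pipeline.
#[derive(Default)]
pub struct AgentBuilder {
    quality: Option<Arc<SelfCheckPipeline>>,
}

impl AgentBuilder {
    /// Consumes the builder and stores the optional pipeline.
    /// Passing `None` leaves self-checking disabled at runtime.
    #[must_use]
    pub fn with_quality_pipeline(mut self, pipeline: Option<Arc<SelfCheckPipeline>>) -> Self {
        self.quality = pipeline;
        self
    }

    /// Reports whether a pipeline has been configured.
    pub fn has_quality_pipeline(&self) -> bool {
        self.quality.is_some()
    }
}
```

Taking `Option<Arc<...>>` rather than a bare value lets the same call site cover both the enabled and disabled cases, so the choice becomes a runtime decision instead of a compile-time one.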
11 changes: 1 addition & 10 deletions crates/zeph-core/src/agent/memcot/metrics.rs
@@ -1,29 +1,23 @@
// SPDX-FileCopyrightText: 2026 Andrei G <bug-ops>
// SPDX-License-Identifier: MIT OR Apache-2.0

//! Feature-gated metrics helpers for the `MemCoT` distillation pipeline.
//!
//! All public functions are no-ops when the `task-metrics` feature is disabled.
//! This pattern matches the existing `agent_supervisor.rs` convention.
//! Metrics helpers for the `MemCoT` distillation pipeline.

/// Increment the `memcot_distill_total` counter.
#[inline]
pub fn distill_total() {
#[cfg(feature = "task-metrics")]
metrics::counter!("memcot_distill_total").increment(1);
}

/// Increment the `memcot_distill_timeout_total` counter.
#[inline]
pub fn distill_timeout() {
#[cfg(feature = "task-metrics")]
metrics::counter!("memcot_distill_timeout_total").increment(1);
}

/// Increment the `memcot_distill_error_total` counter.
#[inline]
pub fn distill_error() {
#[cfg(feature = "task-metrics")]
metrics::counter!("memcot_distill_error_total").increment(1);
}

@@ -32,8 +26,5 @@ pub fn distill_error() {
/// `reason` should be `"interval"` or `"session_cap"`.
#[inline]
pub fn distill_skipped(reason: &'static str) {
#[cfg(feature = "task-metrics")]
metrics::counter!("memcot_distill_skipped_total", "reason" => reason).increment(1);
#[cfg(not(feature = "task-metrics"))]
let _ = reason;
}
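After this change the MemCoT helpers increment their counters unconditionally, so the `let _ = reason;` fallback for the disabled case is no longer needed. The sketch below mimics the always-on shape using standard-library atomics as a stand-in for `metrics::counter!`. The statics and the `skipped_count` reader are hypothetical and exist only so the behavior can be observed without a metrics recorder.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical in-process counters standing in for the `metrics` crate.
static SKIPPED_INTERVAL: AtomicU64 = AtomicU64::new(0);
static SKIPPED_SESSION_CAP: AtomicU64 = AtomicU64::new(0);

/// Increments the skip counter for `reason` (`"interval"` or `"session_cap"`).
/// There is no `cfg` gate: the increment compiles into every build.
#[inline]
pub fn distill_skipped(reason: &'static str) {
    match reason {
        "interval" => {
            SKIPPED_INTERVAL.fetch_add(1, Ordering::Relaxed);
        }
        "session_cap" => {
            SKIPPED_SESSION_CAP.fetch_add(1, Ordering::Relaxed);
        }
        // Unknown reasons are ignored in this sketch.
        _ => {}
    }
}

/// Reads the current count for `reason`; unknown reasons report zero.
pub fn skipped_count(reason: &str) -> u64 {
    match reason {
        "interval" => SKIPPED_INTERVAL.load(Ordering::Relaxed),
        "session_cap" => SKIPPED_SESSION_CAP.load(Ordering::Relaxed),
        _ => 0,
    }
}
```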
3 changes: 0 additions & 3 deletions crates/zeph-core/src/agent/mod.rs
@@ -38,7 +38,6 @@ mod persistence;
mod plan;
mod policy_commands;
mod provider_cmd;
#[cfg(feature = "self-check")]
mod quality_hook;
pub(crate) mod rate_limiter;
#[cfg(feature = "scheduler")]
@@ -281,7 +280,6 @@ impl<C: Channel> Agent<C> {
sidequest: sidequest::SidequestState::default(),
tool_state: ToolState::default(),
goal_accounting: None,
#[cfg(feature = "self-check")]
quality: None,
proactive_explorer: None,
promotion_engine: None,
@@ -1606,7 +1604,6 @@ impl<C: Channel> Agent<C> {
tracing::debug!("turn timing: process_response done");

// MARCH self-check hook: runs after every successful response, including cache-hit path.
#[cfg(feature = "self-check")]
if let Some(pipeline) = self.services.quality.clone() {
self.run_self_check_for_turn(pipeline, turn.id().0).await;
}
1 change: 0 additions & 1 deletion crates/zeph-core/src/agent/state/services.rs
@@ -39,7 +39,6 @@ pub(crate) struct Services {
pub(crate) goal_accounting: Option<std::sync::Arc<crate::goal::GoalAccounting>>,

/// MARCH self-check pipeline, built at startup and rebuilt on provider swap.
#[cfg(feature = "self-check")]
pub(crate) quality: Option<std::sync::Arc<crate::quality::SelfCheckPipeline>>,
/// Proactive world-knowledge explorer (#3320).
///
9 changes: 3 additions & 6 deletions crates/zeph-core/src/lib.rs
@@ -84,7 +84,6 @@ pub mod notifications;
pub mod pipeline;
pub mod project;
pub mod provider_factory;
#[cfg(feature = "self-check")]
pub mod quality;
pub mod redact;
#[cfg(feature = "sysinfo")]
@@ -137,15 +136,13 @@ pub mod vault {
default_vault_dir,
};

/// Environment-variable backed vault provider, available only when the
/// `env-vault` feature is enabled.
/// Environment-variable backed vault provider.
///
/// # Security
///
/// This provider reads secrets from process environment variables and is
/// intended **exclusively for development and testing**. Never enable this
/// feature in production builds. Use [`AgeVaultProvider`] instead.
#[cfg(feature = "env-vault")]
/// intended **exclusively for development and testing**. Never use in
/// production builds. Use [`AgeVaultProvider`] instead.
pub use zeph_vault::EnvVaultProvider;

#[cfg(any(test, feature = "mock"))]