Introduce a unified config/composition layer for all agent components #224

## Summary

The agent today wires components together ad-hoc. Each component reaches into globals (`asyncio`, env vars, module-level singletons, `nest_asyncio.apply()`, hard-coded paths, hard-coded timeouts) and makes its own assumptions about the OS, Python version, and installed packages. That coupling makes the whole system fragile: a single environment mismatch (Python 3.14 vs `nest_asyncio`) can break unrelated subsystems with no clear signal, and any per-deployment change requires code edits across many files.

I want to move the agent to a **composition-based architecture with a unified config layer**: every component is a pure, version-agnostic, OS-agnostic unit that receives a config object at construction time. All environmental decisions - Python version checks, platform detection, package availability, feature flags, timeouts, paths, credentials, model selection, logging - are resolved by the config layer at a **single entry point**, not scattered inside the components.

The `asyncio.wait_for` / `nest_asyncio` / Python 3.14 bug (see compat shim at the top of `agent_core/core/impl/action/manager.py`) is the concrete example that forced this issue, but it's a symptom, not the problem. The same class of fragility applies to MCP setup, LLM provider switching, sandboxed action execution, scheduler wiring, interface mode (browser/cli/tui), and more.

## Context: how we got here

Frankie hit a blocker on Python 3.14.x where every `asyncio.wait_for(...)` call raised `RuntimeError: Timeout should be used inside a task`, breaking MCP stdio startup and action execution. Root cause: `nest_asyncio.apply()` doesn't propagate Python 3.14's task context variable, so `asyncio.timeout()` can't find the current task.

Debugging was painful because:
- No Python version is recorded in logs.
- The trigger consumer swallowed the failure silently (`except Exception: pass` with no log), so the agent looked dead with zero signal.
- The traceback surfaces inside stdlib with no hint that `nest_asyncio` is involved.
- `nest_asyncio` itself is only needed because ~10 places in the codebase call `asyncio.run()` / `loop.run_until_complete()` from inside an already-running event loop.
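
The first two pain points can be addressed with very little code. Below is a minimal sketch, not the current implementation: `log_runtime_environment`, `consume_triggers`, the `None` sentinel, and the logger name are all hypothetical names chosen for illustration. It records the interpreter version at startup and logs trigger failures instead of discarding them.

```python
import asyncio
import logging
import platform
import sys

# Hypothetical logger name; the real project may use a different hierarchy.
logger = logging.getLogger("agent.diagnostics")


def log_runtime_environment() -> str:
    """Log the interpreter and platform once at startup and return the summary."""
    summary = (
        f"python={platform.python_version()} "
        f"impl={platform.python_implementation()} "
        f"platform={sys.platform}"
    )
    logger.info("runtime: %s", summary)
    return summary


async def consume_triggers(queue: asyncio.Queue, handle) -> int:
    """Process triggers until a None sentinel arrives; return the failure count.

    Failures are logged with a traceback rather than swallowed, so a broken
    handler no longer makes the agent look dead.
    """
    failures = 0
    while True:
        trigger = await queue.get()
        if trigger is None:
            logger.info("trigger consumer exiting (failures=%d)", failures)
            return failures
        try:
            await handle(trigger)
        except Exception:
            failures += 1
            logger.exception("trigger %r failed", trigger)
```

Catching `Exception` rather than `BaseException` lets cancellation and interpreter shutdown propagate normally.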

I've shipped a **band-aid shim** that monkey-patches `asyncio.wait_for`. It works today but silently rewrites a stdlib function, swallows `BaseException` during cleanup, and hides the real architectural problem. It needs to go away as part of this refactor.

## The proposal

### 1. Components become pure + version-agnostic

Every component - trigger consumer, action manager, action executor, MCP client, LLM interface, memory manager, scheduler, external comms, UI adapters, state manager - is rewritten to:
- Take all of its dependencies via constructor / DI.
- Hold no module-level globals, no `nest_asyncio.apply()`, no direct env-var reads.
- Be testable in isolation without spinning up the whole agent.
- Not care about OS, Python version, or package availability.
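
To make the four rules above concrete, here is a hedged sketch of what a "pure" component might look like. `ActionExecutorConfig`, `ActionExecutor`, and the `run_action` callable are illustrative names and do not reflect the existing classes in `agent_core`.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class ActionExecutorConfig:
    """Hypothetical per-component slice of the config; no env reads here."""
    default_timeout_s: float = 30.0


class ActionExecutor:
    """Everything the executor needs arrives through the constructor."""

    def __init__(
        self,
        config: ActionExecutorConfig,
        run_action: Callable[[str], Awaitable[str]],
    ) -> None:
        self._config = config
        self._run_action = run_action

    async def execute(self, name: str) -> str:
        # The timeout comes from config, not from a hard-coded constant.
        return await asyncio.wait_for(
            self._run_action(name),
            timeout=self._config.default_timeout_s,
        )
```

Because the executor has no globals, a unit test can pass a fake `run_action` coroutine and a short timeout without starting the rest of the agent.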

### 2. Unified config layer

A single `AgentConfig` (or similar) object owns *everything* environmental:
- Python version + runtime capability checks (does `asyncio.timeout` work? do we need a shim? is `nest_asyncio` needed?).
- Platform detection (win32/darwin/linux branching).
- Package availability probes (Node/npm for MCP, tesseract, playwright, etc.).
- Paths (data dir, chroma dir, agent FS, workspace root).
- Timeouts, retry budgets, rate limits.
- LLM providers, models, API keys, base URLs.
- Feature flags (gui_mode, slow_mode, experimental toggles).
- Interface mode and adapter selection.
- Logging setup (level, sinks, format).

Config is built once at startup from `settings.json` + CLI args + env detection, then handed to the composition root.
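
A rough sketch of how such an object could be shaped and built follows. The field names, the settings keys, the `AGENT_LOG_LEVEL` variable, the default values, and the exact condition behind `needs_wait_for_shim` are all assumptions made for illustration; the real schema would be decided in Phase 1.

```python
import shutil
import sys
from dataclasses import dataclass
from importlib.util import find_spec
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class RuntimeCapabilities:
    python_version: tuple[int, int]
    platform: str
    has_node: bool
    has_nest_asyncio: bool

    @property
    def needs_wait_for_shim(self) -> bool:
        # Assumption: mirrors the failure reported in this issue
        # (Python 3.14 combined with nest_asyncio).
        return self.python_version >= (3, 14) and self.has_nest_asyncio


@dataclass(frozen=True)
class AgentConfig:
    runtime: RuntimeCapabilities
    data_dir: Path
    action_timeout_s: float
    llm_provider: str
    log_level: str


def probe_runtime() -> RuntimeCapabilities:
    """Run all environment probes once, in one place."""
    return RuntimeCapabilities(
        python_version=sys.version_info[:2],
        platform=sys.platform,
        has_node=shutil.which("node") is not None,
        has_nest_asyncio=find_spec("nest_asyncio") is not None,
    )


def build_config(settings: Mapping[str, object], env: Mapping[str, str]) -> AgentConfig:
    """Merge settings and environment; env wins over settings for log level."""
    return AgentConfig(
        runtime=probe_runtime(),
        data_dir=Path(str(settings.get("data_dir", "./data"))),
        action_timeout_s=float(settings.get("action_timeout_s", 30.0)),
        llm_provider=str(settings.get("llm_provider", "example-provider")),
        log_level=env.get("AGENT_LOG_LEVEL", str(settings.get("log_level", "INFO"))).upper(),
    )
```

Freezing the dataclasses means components cannot mutate shared config after startup, and printing the object at boot gives the full environment picture in one log line.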

### 3. Single composition entry point

One place - likely a replacement/extension of `app/main.py::main_async` - builds the config, instantiates every component with it, wires them together, and hands control to the interface. No component constructs its dependencies itself; they all come from the composition root.

This is where version-specific workarounds live - exactly once - gated by the config's capability flags. The `asyncio.wait_for` shim, for example, becomes `if config.needs_wait_for_shim: install_shim()` at the composition root and nowhere else.
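
The shape of the composition root could look roughly like the sketch below. The classes here are stand-ins with simplified constructors (the real `ActionManager` and trigger consumer take more dependencies), and `compose` and `install_shim` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AgentConfig:
    # Reduced to the two fields this sketch needs.
    needs_wait_for_shim: bool
    action_timeout_s: float


class ActionManager:
    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s


class TriggerConsumer:
    def __init__(self, actions: ActionManager) -> None:
        # The consumer receives its collaborator; it never constructs one.
        self.actions = actions


@dataclass
class Agent:
    actions: ActionManager
    triggers: TriggerConsumer


def compose(config: AgentConfig, install_shim: Callable[[], None]) -> Agent:
    """Single place where workarounds are applied and components are wired."""
    if config.needs_wait_for_shim:
        install_shim()  # the only call site for the workaround
    actions = ActionManager(timeout_s=config.action_timeout_s)
    triggers = TriggerConsumer(actions=actions)
    return Agent(actions=actions, triggers=triggers)
```

Passing `install_shim` in as a callable keeps the composition root itself testable: a test can supply a recorder and assert whether the shim would have been installed.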

### 4. Eliminate nest_asyncio

As part of the component rewrite, every `asyncio.run()` / `loop.run_until_complete()` call inside a running loop (`app/data/action/task_end.py`, `app/data/action/send_message_with_attachment.py`, `app/data/action/integration_management.py`, `agent_core/core/impl/llm/interface.py:319`, `agent_core/core/impl/config/watcher.py:240`, `agent_core/core/impl/skill/manager.py:90`, and others) is converted to proper `await` / `asyncio.create_task` / `asyncio.to_thread` patterns. Once those are gone, `nest_asyncio` can be dropped from `requirements.txt` / `environment.yml` - and with it, the shim.
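
For reference, the three conversion patterns might look like this. It is an illustrative sketch: `fetch_status`, `blocking_cleanup`, `task_end`, and `start_in_background` are made-up stand-ins, not the actual functions in the files listed above.

```python
import asyncio
import time


async def fetch_status() -> str:
    """Stand-in for an async call that used to be wrapped in asyncio.run()."""
    await asyncio.sleep(0)
    return "ok"


def blocking_cleanup() -> str:
    """Stand-in for synchronous work that would block the event loop."""
    time.sleep(0.01)
    return "cleaned"


# Before: fails inside a running loop unless nest_asyncio has patched it.
#     def task_end():
#         return asyncio.run(fetch_status())


async def task_end() -> tuple[str, str]:
    # Pattern 1: make the caller async and await the coroutine directly.
    status = await fetch_status()
    # Pattern 2: push blocking work to a worker thread.
    cleaned = await asyncio.to_thread(blocking_cleanup)
    return status, cleaned


async def start_in_background() -> str:
    # Pattern 3: schedule concurrent work as a task on the current loop.
    # Keep a reference to the task so it is not garbage-collected early.
    task = asyncio.create_task(fetch_status())
    return await task
```

Each converted function forces its callers to become `async` as well, which is why this phase touches call chains rather than isolated lines.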

## Wins

- **One place** to do environment/version checks instead of scattered runtime surprises.
- **Swap-ability**: changing LLM provider, interface mode, or storage backend is a config change, not a code change.
- **Testability**: every component can be unit-tested with a fake config.
- **Debuggability**: config object dump at startup = full picture of the runtime environment, no more guessing Python versions from tracebacks.
- **Portability**: same component code runs on 3.10, 3.11, 3.12, 3.13, 3.14, and whatever comes next - only the config resolver changes.
- **The `nest_asyncio` / `asyncio.wait_for` bug disappears** as a free side-effect, not as a targeted fix.

## Scope / phasing suggestion

This is a multi-week refactor, not a weekend PR. Rough phasing:

1. **Phase 0 - diagnostics (already partially done)**: log Python version at startup, log trigger consumer exits, log component init.
2. **Phase 1 - config skeleton**: define `AgentConfig` schema, build it from `settings.json` + env + CLI at a single point, pass it down. No component rewrites yet - just make sure everything flows through one config object.
3. **Phase 2 - remove `asyncio.run()` inside running loop**: convert the ~10 offending call sites to proper async. Drop `nest_asyncio` + shim.
4. **Phase 3 - component extraction**: one subsystem at a time (start with action executor or LLM interface), move globals → constructor args, wire via composition root.
5. **Phase 4 - docs**: `docs/architecture.md` describing components + config + composition root.

## Out of scope

- Behavioral changes to the agent itself. This is purely structural.
- Breaking the public config surface (`settings.json` schema can evolve but shouldn't break existing users in phase 1–2).
- Possibly later: make all configs dependent on typed models rather than on raw JSON files. For example, the onboarding config and `settings.json` structure shouldn't depend on the file already existing; a model-backed config would allow creating missing files with defaults instead of the agent just crashing.
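
As a sketch of that last idea, assuming a simple dataclass-backed model: `Settings`, `load_settings`, and the `llm_provider` field with its default value are hypothetical, while `gui_mode` and `slow_mode` are borrowed from the feature flags mentioned earlier.

```python
import json
from dataclasses import asdict, dataclass, fields
from pathlib import Path


@dataclass
class Settings:
    """The model is the source of truth; the JSON file is just persistence."""
    llm_provider: str = "example-provider"  # hypothetical default
    gui_mode: bool = False
    slow_mode: bool = False


def load_settings(path: Path) -> Settings:
    """Load settings, creating the file from model defaults if it is missing."""
    if not path.exists():
        settings = Settings()
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(asdict(settings), indent=2), encoding="utf-8")
        return settings
    raw = json.loads(path.read_text(encoding="utf-8"))
    # Ignore unknown keys so older or newer files do not crash startup.
    known = {f.name for f in fields(Settings)}
    return Settings(**{k: v for k, v in raw.items() if k in known})
```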
