add file upload support for agent debug mode by jeffwu-1999 · Pull Request #3300 · ModelEngine-Group/nexent

jeffwu-1999 · 2026-06-25T12:04:45Z

Add file attachment upload/preview/remove UI in debug panel
Upload files to MinIO and pass minio_files in agent run params
Support file attachments in both debug and compare modes
Include attachment info in conversation history
Update data_process_service to return img_info alongside chunks
Make object_name/presigned_url optional in conversationService types

for example:

* ✨ Feat: add AIDP search tool
* 🗑️ Remove: Delete the standalone AIDP mock server implementation from the project.
* 🐛 Bugfix: Update AIDP API endpoint parameters and enhance error logging
* 🔧 Refactor: Implement autouse fixture for supabase mock to ensure structured attributes are preserved during test execution
* 🔧 Refactor: Enhance stubbing of file management service in tests to ensure compatibility with LLM model retrieval and configuration management
* 🐛 Fix stub for file_management_service: look up patched names from sys.modules

  The previous stub captured `backend_file_management_module` (the stub itself) in `_stub_get_llm_model`, so `@patch` decorators modifying `sys.modules['backend.services.file_management_service']` were never visible. This caused `TestGetLlmModel` tests to return an unpatched MagicMock instead of the expected mock_model_instance.

  Two changes:
  1. `_stub_get_llm_model` now looks up all dependencies from `sys.modules['backend.services.file_management_service']` so that runtime patches from `@patch(...)` decorators are respected.
  2. The stub module provides MagicMock defaults for all attributes that `@patch` needs to call `get_original()` on (tenant_config_manager etc.).
* 🔧 Refactor: Update test_get_llm_model to improve patching and ensure consistent behavior across environments. Simplified test structure by directly patching `get_llm_model` and its dependencies, enhancing clarity and reliability of test cases.
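A minimal sketch of the early-binding versus `sys.modules` lookup difference described in the stub fix, assuming the divergence shows up when the `sys.modules` entry is swapped for a different module object. The module name `demo_file_management_service` and both getter functions are hypothetical and are not taken from the repository.

```python
import sys
import types
from unittest.mock import MagicMock, patch

MODULE_NAME = "demo_file_management_service"  # hypothetical module name

# Register a stub module with a default attribute, as a test setup might do.
stub = types.ModuleType(MODULE_NAME)
stub.tenant_config_manager = MagicMock(name="default_manager")
sys.modules[MODULE_NAME] = stub


def get_manager_captured(_module=stub):
    # Early binding: always reads from the module object captured at definition time.
    return _module.tenant_config_manager


def get_manager_late():
    # Late binding: resolves the module through sys.modules on every call.
    return sys.modules[MODULE_NAME].tenant_config_manager


# Swap the sys.modules entry for a different module object (assumed scenario).
replacement = types.ModuleType(MODULE_NAME)
replacement.tenant_config_manager = "patched"
with patch.dict(sys.modules, {MODULE_NAME: replacement}):
    captured_result = get_manager_captured()  # still the original default mock
    late_result = get_manager_late()          # sees the swapped-in module
```

Only the late-binding getter observes the replacement, which is the behavior the fix relies on.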

* 🐛 Bugfix: Adjust agent detail UI layout to accommodate newly added "self-verification" field (#3246)
* Move non-shadcn ui component to other folder
* Bugfix: Fix incomplete display of tenant resources page after window resize
* Bugfix: Fix incomplete display of tenant resources page after window resize
* Bugfix: Fix inability to select agent from agent space to edit
* Bugfix: Display correct version info when viewing agent details
* Bugfix: Adjust agent detail UI layout to accommodate newly added "self-verification" field
* Add supplementary SQL (#3248)
* Add supplementary SQL
* Raise the limit cap
* 🐛 Bugfix: Fixed an issue where the MCP service failed to start in a Kubernetes container. (#3254) [Specification Details] 1. Modify the pod naming logic to convert all non-compliant characters to -. 2. Modify test cases.
* 🐛 Bugfix: knowledge_base_search_tool called with TypeError: argument of type 'FieldInfo' is not iterable (#3259)
* 🐛 Bugfix: Fixed an issue where the one-click rename function failed after importing an agent. (#3258) [Specification Details] 1. The frontend does not pass `agent_id` when calling the `regenerate_name` API.
* Bugfix: Exclude attachments from assistant when saving conversation history (#3261)
* Bump APP_VERSION from v2.2.0 to v2.2.1 (#3268) The default setting for client-side self-validation is "False".

---------

Co-authored-by: xuyaqi <xuyaqist@gmail.com>
Co-authored-by: hhhhsc701 <56435672+hhhhsc701@users.noreply.github.com>
Co-authored-by: Xia Yichen <iamjasonxia@126.com>
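The pod naming fix in #3254 (convert all non-compliant characters to `-`) could look roughly like the sketch below. It assumes the target is a Kubernetes RFC 1123 label (lowercase alphanumerics and `-`, at most 63 characters, alphanumeric at both ends). The function name and the `"pod"` fallback are hypothetical and are not the repository's code.

```python
import re


def sanitize_pod_name(raw: str, max_len: int = 63) -> str:
    """Map an arbitrary string to an RFC 1123 label-style name."""
    # Lowercase first, then replace every character outside [a-z0-9-] with '-'.
    name = re.sub(r"[^a-z0-9-]", "-", raw.lower())
    # Enforce the length limit, then drop leading/trailing dashes.
    name = name[:max_len].strip("-")
    # Hypothetical fallback for inputs that contain no usable characters.
    return name or "pod"
```

For example, `sanitize_pod_name("My_MCP Server.v2")` yields `my-mcp-server-v2`.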

This reverts commit 20af495.

* 🐛 Bugfix: Adjust agent detail UI layout to accommodate newly added "self-verification" field (#3246)
* Move non-shadcn ui component to other folder
* Bugfix: Fix incomplete display of tenant resources page after window resize
* Bugfix: Fix incomplete display of tenant resources page after window resize
* Bugfix: Fix inability to select agent from agent space to edit
* Bugfix: Display correct version info when viewing agent details
* Bugfix: Adjust agent detail UI layout to accommodate newly added "self-verification" field
* Add supplementary SQL (#3248)
* Add supplementary SQL
* Raise the limit cap
* 🐛 Bugfix: Fixed an issue where the MCP service failed to start in a Kubernetes container. (#3254) [Specification Details] 1. Modify the pod naming logic to convert all non-compliant characters to -. 2. Modify test cases.
* 🐛 Bugfix: knowledge_base_search_tool called with TypeError: argument of type 'FieldInfo' is not iterable (#3259)
* 🐛 Bugfix: Fixed an issue where the one-click rename function failed after importing an agent. (#3258) [Specification Details] 1. The frontend does not pass `agent_id` when calling the `regenerate_name` API.
* Bugfix: Exclude attachments from assistant when saving conversation history (#3261)
* Bump APP_VERSION from v2.2.0 to v2.2.1 (#3268) The default setting for client-side self-validation is "False".

---------

Co-authored-by: xuyaqi <xuyaqist@gmail.com>
Co-authored-by: hhhhsc701 <56435672+hhhhsc701@users.noreply.github.com>
Co-authored-by: Xia Yichen <iamjasonxia@126.com>

* 111 * issue_solve * testcase_fix * test_fix * Remove unrelated unstructured filename metadata change

…3285) * fix: parallel unit test runner with file-level subprocess isolation - Rewrite test/run_all_test.py as file-level parallel runner using ThreadPoolExecutor with configurable workers (NEXENT_PYTEST_WORKERS) and per-file timeout (NEXENT_PYTEST_FILE_TIMEOUT) - Add pytest-xdist to backend test extras - Fix test_mcp_service.py: clear proxy env vars (socks://) in fixture to prevent httpx.AsyncClient ValueError - Fix test_remote_mcp_service.py: mock check_runtime_host_port_available to prevent port conflict in container enable test - Fix test_openai_llm.py: reduce memory leak from repeated module imports - Update CI workflow: default to parallel mode, add dispatch inputs for worker count and per-file timeout Serial: 229/229 pass (7m7s). Parallel: 229/229 pass (1m1s, ~7x speedup). * chore: remove unused pytest-xdist dependency The parallel runner uses ThreadPoolExecutor with per-file subprocess isolation, not pytest-xdist. The xdist package was added but never used due to sys.modules mock conflicts during pytest collection. --------- Co-authored-by: Jinglong Wang <wangjinglong8@huawei.com>
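A rough sketch of the file-level parallel runner idea: a thread pool where each worker launches one test file in its own subprocess. The environment variable names come from the commit message. The default values, function names, and the injectable `runner` parameter are assumptions made for illustration and do not reproduce `test/run_all_test.py`.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Variable names are from the commit message; the defaults are assumptions.
WORKERS = int(os.environ.get("NEXENT_PYTEST_WORKERS", "4"))
FILE_TIMEOUT = float(os.environ.get("NEXENT_PYTEST_FILE_TIMEOUT", "300"))


def run_file(path: str, timeout: float = FILE_TIMEOUT) -> tuple[str, bool]:
    """Run one test file in a fresh interpreter so sys.modules mocks cannot leak."""
    try:
        proc = subprocess.run(
            [sys.executable, "-m", "pytest", "-q", path],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        return path, proc.returncode == 0
    except subprocess.TimeoutExpired:
        # A file that exceeds its time budget is reported as failed.
        return path, False


def run_all(paths: list[str], workers: int = WORKERS, runner=run_file) -> dict[str, bool]:
    """Fan test files out over a thread pool; each thread only waits on a subprocess."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(runner, paths))
```

Threads are sufficient here because the heavy work happens in child processes, which is also why pytest-xdist is not needed for this design.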

…creation should be "False" (#3284)

* Move non-shadcn ui component to other folder
* Bugfix: Fix incomplete display of tenant resources page after window resize
* Bugfix: Fix incomplete display of tenant resources page after window resize
* Bugfix: Fix inability to select agent from agent space to edit
* Bugfix: Display correct version info when viewing agent details
* Bugfix: Adjust agent detail UI layout to accommodate newly added "self-verification" field
* Refactor: update left navigation menu
* Delete the quick configuration page
* Remove comments
* Update i18n

… when publishing an agent version. (#3287)

* 🐛 Bugfix: Update HTTP client settings to increase timeout and disable SSL verification in aidp_service and aidp_search_tool (#3280) * 🐛 Bugfix: Fix page show

…3209)

* fix: resolve skills not exposed to agents and LogLevel enum errors
  - Fix LogLevel.WARNING AttributeError by replacing with LogLevel.ERROR (smolagents LogLevel enum only has OFF/ERROR/INFO/DEBUG, no WARNING) at core_agent.py lines 417 and 804
  - Increase skills token budget from 1000 to 4000 in summary_config.py to accommodate the verbose 6-step skill usage process (~2500-3500 chars) that was being silently dropped by TokenBudgetStrategy
  - Add skills sections to English prompt templates (manager + managed) mirroring the Chinese template structure with <available_skills> block and skill usage requirements section
  - Add diagnostic logging in create_agent_info.py and core_agent.py to track skills count and component assembly for debugging
  - Improve exception handling in _get_skills_for_template() with ERROR level logging and full stack trace for better observability
  - Add comprehensive test suite (test_context_component_types.py) with 38 tests covering component types, assembly validation, and semantic equivalence between Jinja2 templates and component assembly path

  All 104 tests pass (38 backend + 66 SDK), zero regressions.
* fix: resolve dual ContextManager bug and enable context manager by default
  - Add atomic replace_components() method to ContextManager to prevent race conditions when swapping components on conversation-level CM
  - Fix run_agent.py to re-register components on surviving CM after overwrite (both MCP and non-MCP paths)
  - Guard CM creation in nexent_agent.py with enabled check to avoid creating useless CM when context management is disabled
  - Change enable_context_manager default from False to True
  - Fix numbering consistency: tools and skills always show 1./3. prefix
  - Fix indentation in manager_system_prompt_template_en.yaml (6→5 spaces)
  - Add tests for replace_components() and component survival after overwrite
* fix: remove invalid time_str arg and deduplicate test helpers

  Remove time_str keyword argument from 12 test calls that caused TypeError since build_context_components() and build_skeleton_header_component() do not accept this parameter. Extract shared mock classes (_MockTool, _MockManagedAgent, _MockExternalAgent) to module level and introduce _base_kwargs() and _full_kwargs() helpers to eliminate duplicated blocks, reducing SonarCloud duplication density below the quality gate.
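One way an atomic `replace_components()` could be structured is sketched below, assuming a simple lock-guarded list. The class name, the `snapshot()` helper, and the internal storage are hypothetical; only the method name `replace_components` appears in the commit message.

```python
import threading


class ContextManagerSketch:
    """Hypothetical stand-in for a conversation-level context manager."""

    def __init__(self):
        self._lock = threading.Lock()
        self._components = []

    def replace_components(self, components):
        # Build the new list outside the lock, then swap it in with a single
        # assignment so readers never observe a half-updated component set.
        new_components = list(components)
        with self._lock:
            self._components = new_components

    def snapshot(self):
        # Return a copy so callers cannot mutate the shared list.
        with self._lock:
            return list(self._components)
```

Swapping the whole list, rather than clearing and appending, is what keeps concurrent readers from seeing an empty or partial set.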

* Doc: Add design for upgrading context management in nexent with 16 works to do. * docs: complete context management production review * feat(W1): add type skeleton for ModelCapacityResolver and tokenizer registry Introduces the contract surface for W1 (Correct Model Token-Capacity Configuration) so W2/W3 development can begin against stable types. No runtime behaviour change — resolver/registry implementations land in the follow-up PR. New modules: - sdk/nexent/core/models/capacity_resolver.py: CapabilityProfile and ModelCapacitySnapshot (Pydantic v2, frozen), typed ResolverError hierarchy, compute_fingerprint() implementing the SHA-256/canonical-JSON contract from W1 ADR Decision 3, RESOLVER_VERSION constant, and a resolve_capacity() stub. - sdk/nexent/core/models/tokenizer_registry.py: TokenizerAdapter Protocol, empty REGISTRY, FallbackEstimator (char/4 heuristic that always returns counting_mode='estimated'), and resolve() function. Family-name validation pattern enforces the naming convention fixed in the ADR. - backend/consts/capability_profiles.py: CATALOG with eight approved day-one entries (openai/gpt-4o, openai/gpt-4.1, dashscope/qwen-plus, qwen-turbo, glm-5.1, silicon DeepSeek-V4-Flash, Qwen3.6-27B, Kimi-K2.6) plus CATALOG_REVISION. Design reference: doc/working/context-management-workstreams/ W1_ADR_Capability_Catalog_Storage_and_Fingerprint.md (locally hosted; team sharing channel separate from this repo per doc/.gitignore policy). Smoke-tested: fingerprint is deterministic and order-independent across unknown_capabilities and field_sources; ModelCapacitySnapshot rejects mutation; tokenizer resolve() falls back to estimated for unknown families; resolve_capacity stub raises NotImplementedError; CATALOG imports cleanly with all 8 entries. 
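The SHA-256/canonical-JSON fingerprint and the char/4 fallback estimator mentioned above can be illustrated as follows. This is a sketch under assumptions: the canonicalization rules (sorted keys, sorted list values, compact separators) and the ceiling rounding are guesses, and the function names are hypothetical rather than the SDK's `compute_fingerprint()` or `FallbackEstimator`.

```python
import hashlib
import json


def compute_fingerprint_sketch(contract: dict) -> str:
    """Hash a contract so that key order and list order do not affect the result."""
    normalized = {}
    for key, value in contract.items():
        # Sort list-like values (e.g. unknown_capabilities) for order independence.
        if isinstance(value, (list, tuple, set)):
            normalized[key] = sorted(value)
        else:
            normalized[key] = value
    # sort_keys also orders nested dicts such as field_sources.
    canonical = json.dumps(normalized, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def estimate_tokens(text: str) -> tuple[int, str]:
    """char/4 heuristic; rounding up is an assumption. Always flagged 'estimated'."""
    return (len(text) + 3) // 4, "estimated"
```

Two contracts that differ only in the ordering of `unknown_capabilities` or `field_sources` produce the same digest under this scheme.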
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(W1): add capacity columns to model_record_t (additive migration) Adds seven nullable capacity fields to model_record_t so the ModelCapacityResolver can read operator overrides per W1 ADR: - context_window_tokens - max_input_tokens - max_output_tokens - default_output_reserve_tokens - tokenizer_family - capacity_source - capability_profile_version All columns are nullable, no defaults that change semantics. Legacy max_tokens is left untouched and continues to behave as a deprecated output-cap alias until consumers migrate (separate follow-up). Touchpoints: - docker/sql/v2.2.0_0615_add_capacity_fields_to_model_record_t.sql: idempotent upgrade with ALTER TABLE ... ADD COLUMN IF NOT EXISTS + COMMENT ON COLUMN. - docker/init.sql: fresh-install CREATE TABLE inline plus COMMENT ON COLUMN. - k8s/helm/nexent/charts/nexent-common/files/init.sql: same for k8s deploys. - backend/database/db_models.py: ModelRecord ORM columns. - backend/consts/model.py: ModelRequest Pydantic schema fields so CRUD round-trips the new values. Design reference: doc/working/context-management-workstreams/ W1_ADR_Capability_Catalog_Storage_and_Fingerprint.md (Decision 1, schema). Verification: - ORM exposes all 7 columns - Pydantic ModelRequest exposes all 7 fields - All three SQL files contain 14 occurrences (column + COMMENT per field) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs: move W1 ADR to dedicated ADRs directory Move W1_ADR_Capability_Catalog_Storage_and_Fingerprint.md from context-management-workstreams to context-management-workstream/ADRs for better organization. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai> * feat(W1): implement resolve_capacity with catalog + operator override Replaces the resolve_capacity NotImplementedError stub with the real ModelCapacityResolver per W1 ADR. 
The resolver: - Looks up the (provider, model_name) entry in the capability profile catalog passed by the caller. - Merges operator overrides over the profile (operator wins). - Validates that hard capacity is known and not impossible (output cap cannot exceed combined window; capacities must be positive). - Defaults requested_output_tokens to the profile's default_output_reserve_tokens; rejects requests that exceed max_output_tokens. - Derives provider_input_limit_tokens as min(max_input_tokens, context_window_tokens - requested_output_tokens) using only the limits that are defined. - Asks tokenizer_registry for (adapter, counting_mode); records capability gaps in unknown_capabilities. - Computes the deterministic SHA-256/canonical-JSON fingerprint from the resolved contract and builds an immutable ModelCapacitySnapshot. The resolver stays pure: the SDK never reads DB or env; backend callers supply the capability_profiles dict and operator_overrides. This matches CLAUDE.md's SDK layer rules. Typed failures raised on invalid input: - ProviderCapabilityUnknown (no hard capacity) - InvalidCapacityConfiguration (non-positive values, output > window, derived input limit non-positive) - RequestedOutputExceedsCap (request above max_output_tokens) Tests (15, all passing): - Catalog lookup + override precedence - Uncataloged with operator-supplied capacity - Rejection: missing capacity, impossible values, negative values, requested-output overflow - Default requested_output behavior - Separate-input-limit path (synthetic, no day-one model uses it) - Combined window + separate input limit takes minimum - Snapshot immutability (Pydantic ValidationError on mutation) - Fingerprint determinism and sensitivity to request changes - Tokenizer estimated-mode flag appears in unknown_capabilities Design reference: doc/working/context-management-workstreams/ W1_ADR_Capability_Catalog_Storage_and_Fingerprint.md. 
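The derivation of `provider_input_limit_tokens` and the typed rejections listed above can be sketched as below. The exception names match the commit message but are redefined locally here as stand-ins. The function name, its signature, and the exact order of checks are assumptions.

```python
from typing import Optional


class ProviderCapabilityUnknown(ValueError):
    """Local stand-in: no hard capacity is known."""


class InvalidCapacityConfiguration(ValueError):
    """Local stand-in: impossible or non-positive capacity values."""


class RequestedOutputExceedsCap(ValueError):
    """Local stand-in: request is above max_output_tokens."""


def derive_input_limit(
    context_window_tokens: Optional[int],
    max_input_tokens: Optional[int],
    max_output_tokens: Optional[int],
    requested_output_tokens: int,
) -> int:
    if context_window_tokens is None and max_input_tokens is None:
        raise ProviderCapabilityUnknown("no hard capacity is known")
    for value in (context_window_tokens, max_input_tokens, max_output_tokens,
                  requested_output_tokens):
        if value is not None and value <= 0:
            raise InvalidCapacityConfiguration("capacities must be positive")
    if (context_window_tokens is not None and max_output_tokens is not None
            and max_output_tokens > context_window_tokens):
        raise InvalidCapacityConfiguration("output cap exceeds combined window")
    if max_output_tokens is not None and requested_output_tokens > max_output_tokens:
        raise RequestedOutputExceedsCap("request above max_output_tokens")
    # Use only the limits that are defined, then take the minimum.
    candidates = []
    if max_input_tokens is not None:
        candidates.append(max_input_tokens)
    if context_window_tokens is not None:
        candidates.append(context_window_tokens - requested_output_tokens)
    limit = min(candidates)
    if limit <= 0:
        raise InvalidCapacityConfiguration("derived input limit is non-positive")
    return limit
```

With a 128000-token combined window, no separate input limit, and 4096 requested output tokens, the derived input limit is 128000 - 4096 = 123904.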
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(W1 step 4): extend SDK ModelConfig with capacity fields, rename LLM output cap ModelConfig (sdk/nexent/core/agents/agent_model.py): - Add max_output_tokens as the preferred name per W1 ADR. - Keep max_tokens as a deprecated alias; a model_validator backfills the unset side so old and new callers both work during migration. - Add the remaining capacity-snapshot fields so a ModelConfig can carry the resolved values from backend service down to the SDK: context_window_tokens, max_input_tokens, default_output_reserve_tokens, tokenizer_family, capacity_source, capability_profile_version. OpenAIModel (sdk/nexent/core/models/openai_llm.py): - Accept max_output_tokens (preferred) and max_tokens (deprecated). If only the legacy name is passed, log a debug and remap to max_output_tokens. - Internal attribute renamed to self.max_output_tokens; self.max_tokens is kept as an alias for any reader. - chat.completions.create still receives wire field max_tokens; only the internal name changed. NexentAgent.create_model (sdk/nexent/core/agents/nexent_agent.py): - Construct OpenAIModel with max_output_tokens=model_config.max_output_tokens so the new name flows through end-to-end. Backward compatibility: - Existing callers that set ModelConfig.max_tokens see no behavior change (validator copies it into max_output_tokens; the wire payload is identical). - Existing callers reading OpenAIModel.max_tokens see no behavior change (alias attribute returns the same value). Verified by table-driven smoke test of all four (max_tokens, max_output_tokens) combinations on ModelConfig. Design reference: doc/working/context-management-workstreams/W1_*.md and W1 ADR. Provider adapters (step 3) and create_agent_info (step 6) follow. 
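The `max_tokens` / `max_output_tokens` backfill in step 4 can be illustrated without Pydantic. The sketch below swaps in a plain dataclass with `__post_init__` in place of the `model_validator` the commit describes, so it shows the aliasing idea only. The class name is hypothetical, and leaving both values untouched when both are set is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelConfigSketch:
    max_output_tokens: Optional[int] = None  # preferred name
    max_tokens: Optional[int] = None         # deprecated alias

    def __post_init__(self):
        # Backfill whichever side was left unset so old and new callers agree.
        if self.max_output_tokens is None and self.max_tokens is not None:
            self.max_output_tokens = self.max_tokens
        elif self.max_tokens is None and self.max_output_tokens is not None:
            self.max_tokens = self.max_output_tokens
        # If both are set, both are kept as given (assumed behavior).
```

The four combinations (neither, legacy only, preferred only, both) are the same cases the commit says were covered by its table-driven smoke test.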
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(W1 step 6): wire ModelCapacityResolver in create_agent_info, drop legacy max_tokens Replaces the long-standing bug where `model_info['max_tokens']` (a deprecated output cap, semantically wrong) was assigned to ContextManagerConfig.token_threshold (an input/context budget). The fix wires ModelCapacityResolver into the runtime path so the context manager receives a real input budget derived from the capacity snapshot. Changes in backend/agents/create_agent_info.py: - Add _resolve_input_budget(model_info): pulls operator overrides from the new model_record_t capacity columns, calls resolve_capacity(...) with the CATALOG from backend.consts.capability_profiles, and returns snapshot.provider_input_limit_tokens. - On ProviderCapabilityUnknown (uncataloged model with no operator-supplied hard capacity), falls back to a safe constant _TOKEN_THRESHOLD_LEGACY_FALLBACK (8192) so the migration window doesn't break existing setups. Logged prominently so admins know to backfill. - create_agent_config: stops reading model_info['max_tokens'] and passes the resolved input_budget into ContextManagerConfig.token_threshold. - create_model_config_list: passes all seven new capacity columns (context_window_tokens, max_input_tokens, max_output_tokens, default_output_reserve_tokens, tokenizer_family, capacity_source, capability_profile_version) through to the SDK ModelConfig so end-to-end capacity flow works. This is the end of the legacy max_tokens-as-context-threshold confusion. ModelConfig.max_tokens stays as a deprecated alias per W1 step 4; this commit removes its only known misuse from the runtime path. The fallback constant is intentionally conservative — it kicks compression early for unmigrated models so behavior degrades gracefully rather than overflowing provider context. W2 will subtract its 10% uncertainty reserve on top of the resolver's output once enforcement phase begins. 
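The fallback path in step 6 might be shaped like the following sketch. The constant name and value (8192) come from the commit message. The function signature, the injected `resolver` callable, the log wording, and the locally defined exception are assumptions and do not reproduce `_resolve_input_budget`.

```python
import logging
from typing import Callable, Mapping

logger = logging.getLogger("capacity_sketch")

_TOKEN_THRESHOLD_LEGACY_FALLBACK = 8192  # value stated in the commit message


class ProviderCapabilityUnknown(Exception):
    """Local stand-in for the SDK's typed error."""


def resolve_input_budget(model_info: Mapping, resolver: Callable[[Mapping], int]) -> int:
    """Return the resolved input budget, or a conservative constant if unknown."""
    try:
        return resolver(model_info)
    except ProviderCapabilityUnknown:
        # Log so operators know the capacity columns still need backfilling.
        logger.warning(
            "No hard capacity for %s; falling back to %d tokens",
            model_info.get("model_name"),
            _TOKEN_THRESHOLD_LEGACY_FALLBACK,
        )
        return _TOKEN_THRESHOLD_LEGACY_FALLBACK
```

A small fallback makes compression start early for unmigrated models instead of risking a request that overflows the provider's context.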
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(loop-engineering): add comprehensive insight report on Loop Engineering methodology and recommendations for Nexent's evolution * docs: add W1 ADR to ADRs directory Restore W1_ADR_Capability_Catalog_Storage_and_Fingerprint.md from doc/context-management-upgrade branch to context-management-workstreams/ADRs directory. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai> * feat(W1 step 8): emit capacity snapshot fields in monitoring Persist resolved model capacity snapshot metadata on model monitoring records so per-request telemetry can report total window, output reserve, safe input budget, source, tokenizer mode, unknown capabilities, and fingerprint. - add nullable monitoring columns to ORM, fresh-install SQL, and idempotent upgrade migration - bind resolved capacity snapshots from agent creation into SDK monitoring context - enrich LLM, client-level, and record_model_call monitoring rows with snapshot fields - cover enqueue and ORM payload behavior in SDK monitoring tests Verification: - env PYTHONPATH=/home/feiran/nexent/sdk:/home/feiran/nexent:/home/feiran/nexent/backend uv run --project /home/feiran/nexent/backend pytest --rootdir=/home/feiran/nexent --import-mode=importlib /home/feiran/nexent/test/sdk/monitor/test_monitoring.py - env PYTHONPATH=/home/feiran/nexent/sdk:/home/feiran/nexent:/home/feiran/nexent/backend uv run --project /home/feiran/nexent/backend pytest --rootdir=/home/feiran/nexent --import-mode=importlib /home/feiran/nexent/test/sdk/core/models/test_capacity_resolver.py - env PYTHONPATH=/home/feiran/nexent/sdk:/home/feiran/nexent:/home/feiran/nexent/backend uv run --project /home/feiran/nexent/backend python -m py_compile backend/agents/create_agent_info.py backend/database/db_models.py sdk/nexent/core/agents/agent_model.py sdk/nexent/core/agents/run_agent.py sdk/nexent/monitor/monitoring.py 
sdk/nexent/monitor/__init__.py Co-Authored-By: Codex <codex@openai.com> * feat(W1 step 3): surface provider-discovery capacity hints as candidates Expose provider-supplied token-capacity metadata as advisory candidate fields in discovery responses without promoting them into persisted model records. - add shared candidate extraction for common context, output, input, reserve, and tokenizer aliases - wire SiliconFlow, DashScope, TokenPony, and ModelEngine adapters to attach provider_candidate hints when present - keep prepare_model_dict from persisting provider_candidate fields automatically - cover positive and no-hint paths for provider discovery Verification: - env PYTHONPATH=/home/feiran/nexent/sdk:/home/feiran/nexent:/home/feiran/nexent/backend uv run --project /home/feiran/nexent/backend pytest --rootdir=/home/feiran/nexent --import-mode=importlib /home/feiran/nexent/test/backend/services/providers/test_silicon_provider.py /home/feiran/nexent/test/backend/services/providers/test_dashscope_provider.py /home/feiran/nexent/test/backend/services/providers/test_tokenpony_provider.py /home/feiran/nexent/test/backend/services/providers/test_modelengine_provider.py /home/feiran/nexent/test/backend/services/test_model_provider_service.py::test_prepare_model_dict_does_not_persist_provider_capacity_candidates - env PYTHONPATH=/home/feiran/nexent/sdk:/home/feiran/nexent:/home/feiran/nexent/backend uv run --project /home/feiran/nexent/backend python -m py_compile backend/services/providers/base.py backend/services/providers/silicon_provider.py backend/services/providers/dashscope_provider.py backend/services/providers/tokenpony_provider.py backend/services/providers/modelengine_provider.py Co-Authored-By: Codex <codex@openai.com> * feat(W1 step 7): expose capacity fields in Add/Edit Model forms Add explicit model-capacity controls to model management so operators can promote known capacity values through the existing model create and update flows. 
- extend frontend model types and service request/response mappings for capacity fields - add shared capacity form controls with tokenizer autocomplete, source badge, profile version text, and legacy max_tokens warning - wire capacity validation and operator payloads into Add/Edit Model dialogs - localize labels, tooltips, source names, and validation messages in en/zh Verification: - npm run type-check - node -e "const fs=require('fs'); for (const f of ['frontend/public/locales/en/common.json','frontend/public/locales/zh/common.json']) { JSON.parse(fs.readFileSync(f,'utf8').replace(/^\uFEFF/,'')); } console.log('locale json ok')" Co-Authored-By: Codex <codex@openai.com> * docs: review 5 findings (CM-017, CM-018, CM-021, CM-024, CM-025) Review and accept decisions for 5 findings: - CM-018: structural validation blocks commit, semantic quality routes to W15 SLO - CM-021: source lineage + mandatory presence validation blocks, semantic coverage to W15 - CM-024: use claim-scoped production readiness terminology - CM-017: finite initial conflict set with explicit unresolved failure - CM-025: subagent as independent agent with parent_session_id, async tool delegation, no recursion Updated: finding-review-decisions.md, findings-registry.md (20/26 complete), W4, W6, W10, W11, W12, W13, parent plan. Added: pending-findings-decision-sheet.md for decision tracking. Remaining 6 findings (CM-009, CM-010, CM-014, CM-015, CM-022, CM-026) pending individual discussion. * docs: accept CM-026 decision — exclude unsupported modalities from Release 1 gates Remove multimodal testing from Release 1 SLO gates. W15 covers text modality only; add modality contracts when specific product requirements emerge. Updated: finding-review-decisions.md, findings-registry.md (21/26 complete), W15, W3, pending-findings-decision-sheet.md. * docs: retire W7, merge checkpoints into W5 as compression.snapshot events Architectural simplification: checkpoints are no longer an independent subsystem (W7). 
Compression results are stored as compression.snapshot events within the W5 execution event log. Recovery finds the latest compression.snapshot event and replays subsequent events. Eliminates: - Independent checkpoint table and CAS concurrency control - Redis checkpoint cache layer - W8 checkpoint-specific validation - CM-014 checkpoint schema migration (covered by CM-005) - W7 publication outbox for cross-system consistency Updated: W5 (compression.snapshot event type, recovery flow, dirty-state flush), W6, W8, W9, W13, W14, W15, parent plan, README, review artifacts. Deleted: W7_Durable_Multi_Worker_Context_State.md. CM-014 marked N/A (22/26 findings complete). * fix(W1): clarify optional capacity fields * docs: accept CM-009 decision — defer workload envelopes until post-implementation measurement Do not pre-define workload envelopes. After W1-W16 implementation, use W15 measurement infrastructure to collect real performance data and define envelopes based on observed data. No production-scale claim until envelopes are defined. Aligns with CM-004 (measure before optimizing) and CM-011 (evidence-based gates). Progress: 23/26 findings complete. * docs: accept CM-010 decision — defer numeric targets until post-implementation measurement Do not pre-define numeric availability, RPO, RTO, rebuild time, queue lag, or storage capacity targets. After W1-W16 implementation, use W15 measurement infrastructure to collect real recovery/availability data per topology and define targets based on observed data. No production-scale claim until targets are defined. Aligns with CM-009 (measure before defining envelopes) and CM-011 (evidence-based gates). Progress: 24/26 findings complete. * docs: accept CM-015 decision — remove content hashing, use O(1) metadata validation W7 retirement eliminates the primary O(history) hashing consumer. Replace content hashing with metadata-based validation at three points: 1. compression.snapshot: partial_after_erasure + version fields 2. 
W6 materialized cache: snapshot validity + event count + version fields 3. Physical erasure: one-time partial_after_erasure flag No Merkle trees or segmented hashing needed. Storage-layer integrity handled by database checksums, not W8. Progress: 25/26 findings complete. * fix(web): bind production server to all interfaces * docs: accept CM-022 decision — consolidate decision traces into unified OpenTelemetry spec Consolidate all decision trace requirements (W5, W6, W10, W15) into a single unified telemetry/observability specification (low priority, post-core). Use OpenTelemetry-style spans/attributes/events collected by external observability infrastructure, not product-internal persistence. Updated: W15 (replace decision trace persistence with OTel output), parent plan (replace decision trace references with unified telemetry spec), finding-review-decisions.md, findings-registry.md (26/26 complete), pending-findings-decision-sheet.md. All 26 findings now reviewed and decided. * fix(W1 step 7): expose capacity fields in ProviderConfigEditDialog Step 7 added capacity controls to ModelEditDialog (the OpenAI-API-Compatible "custom model" edit path) but missed ProviderConfigEditDialog, the dialog opened by the per-model gear icon under provider-categorized sections (SiliconFlow / DashScope / TokenPony / ModelEngine). For any model whose model_factory matches a recognized provider — including the W1 catalog keys 'dashscope' / 'silicon' / 'tokenpony' — that gear icon was the only edit path, leaving operators no way to set context_window_tokens et al. Changes: - ProviderConfigEditDialog: accept optional initialCapacity and hideCapacityFields props; render ModelCapacityFields when supported; include capacity payload in onSave callback shape. 
- modelService.updateBatchModel: accept and forward the 6 capacity fields (context_window_tokens, max_input_tokens, max_output_tokens, default_output_reserve_tokens, tokenizer_family, capacity_source) to the existing batch_update_models endpoint, which already pass-throughs arbitrary update_data per backend/services/model_management_service.py line 347. - ModelDeleteDialog single-model gear path: pass current capacity values from selectedSingleModel as initialCapacity, and forward saved capacity fields into the updateBatchModel call. - ModelDeleteDialog provider-level "Edit Config" path: pass hideCapacityFields={true} since handleProviderConfigSave applies settings batch-wise to all models from one provider and per-model capacity is not a batch concept. No behavior change for callers that don't pass initialCapacity (backward compatible). Verified with npm run type-check. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test: stabilize test_model_provider_service against dual-import sys.modules pollution Two tests (test_get_models_llm_success, test_get_models_embedding_success) failed intermittently when test_model_provider_service.py ran after test_capacity_resolver.py or test_silicon_provider.py. Root cause: silicon_provider is loaded under two distinct sys.modules keys — `services.providers.silicon_provider` (the path production code uses) and `backend.services.providers.silicon_provider` (the path some test files use). Each binding gets its own `SILICON_GET_URL` attribute because `silicon_provider.py` does `from consts.provider import SILICON_GET_URL`, which copies the value into the importing module's namespace. When both keys are present, mock.patch targeting only the `backend.` path silently fails to override the value used by the production code path that SiliconModelProvider.get_models executes. Fix: introduce _patch_provider_module_constant context manager that patches the named attribute on every loaded copy of the module. 
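A context manager that patches a constant on every loaded copy of a module could be sketched as below. The helper name, its signature, and the demo module keys are hypothetical. The repository's `_patch_provider_module_constant` may differ in detail.

```python
import contextlib
import sys
import types
from unittest import mock


@contextlib.contextmanager
def patch_module_constant(module_names, attr, value):
    """Patch `attr` on each named module that is currently loaded and has it."""
    with contextlib.ExitStack() as stack:
        for name in module_names:
            module = sys.modules.get(name)
            # Skip keys that are absent so the helper is order-independent.
            if module is not None and hasattr(module, attr):
                stack.enter_context(mock.patch.object(module, attr, value))
        yield


# Demo: the same source loaded under two sys.modules keys (hypothetical names).
prod_copy = types.ModuleType("demo_services.silicon_provider")
test_copy = types.ModuleType("backend_demo.services.silicon_provider")
prod_copy.SILICON_GET_URL = "https://real.invalid/models"
test_copy.SILICON_GET_URL = "https://real.invalid/models"
sys.modules[prod_copy.__name__] = prod_copy
sys.modules[test_copy.__name__] = test_copy

NAMES = [prod_copy.__name__, test_copy.__name__, "not_loaded.silicon_provider"]
```

Inside `with patch_module_constant(NAMES, "SILICON_GET_URL", "https://mock.invalid")`, both copies see the mocked URL, and both are restored on exit.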
Apply to all four SILICON_GET_URL mock.patch sites in this file. Verification: - 289 tests pass under the previously-failing combined order: test/sdk/core/models/test_capacity_resolver.py + test/sdk/monitor/test_monitoring.py + test/backend/services/providers/ + test/backend/services/test_model_provider_service.py The helper is order-independent and safe even when one of the two sys.modules paths is absent. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(W1): record post-acceptance known limitations and open W17 for capacity-suggestion UX W1 ADR additions: - KL-1: catalog miss for default model_factory='OpenAI-API-Compatible'. Manual-add LLM rows skip the embedding-only _infer_model_factory path, fall through to ProviderCapabilityUnknown, and lose catalog values. Documented with the end-to-end workaround verified on 2026-06-15 for glm-5.1 (catalog hit confirmed via direct SQL UPDATE). - KL-2: provider-level batch Edit Config dialog hides capacity controls because they are per-model. Per-model gear icon path exposes them (fix landed 2026-06-16). New W17 workstream proposal: - POST /api/v1/models/suggest-capacity endpoint and frontend wiring. - Catalog fuzzy match + provider discovery, returns placeholders for the capacity form. Operator accepts → saved with capacity_source='operator'. - Subsumes the LLM gap in _infer_model_factory by replacing it with a shared host-to-provider map. - Phased rollout behind a feature flag, with SLO target of >=70% match rate on new manual-add LLM rows. Workstream README updated to index W17 under Model Capacity and Request Safety, with a dependency note linking to KL-1. The ADR remains Accepted. KL-1/KL-2 are post-acceptance discoveries that trigger the new workstream rather than reopen the ADR. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs: update W3 with dispatch path analysis and bypass elimination plan Add current dispatch path analysis: 1 chokepoint (openai_llm.py:186), 9 trusted paths, 2 production bypasses (B1: llm_utils.py, B2: conversation_management_service.py). Split step 9 into sub-steps: - 9a: Fix B1 (system prompt generation bypass) - 9b: Fix B2 (title generation bypass) - 9c: Credential isolation (architecture layer) Add bypass files to repository touchpoints. Add bypass elimination tests. * docs(W17): integrate post-acceptance workstream into both production plans Per classification decision (Option A): W17 sits in the existing "Model Capacity and Request Safety" module — same owners as W1-W3 — but is marked Medium / post-acceptance to distinguish it from the Blocker-level original freeze. This avoids creating a new module table for a single workstream while keeping the design-freeze boundary intact. Both plans: - §1.2 (en) / §1.1 (zh) per-workstream table: add W17 row labeled "Medium (post-acceptance)" / "中 (落地后增加)" linking to its spec. - New §1.4 (en) / §1.3 (zh) "Post-Acceptance Additions" section: explain that W17 was opened after the 2026-06-12 design freeze, triggered by KL-1 surfaced during the glm-5.1 end-to-end test. Document the KL- vs CM- finding prefix convention. - §2.3.1 module section: add a full W17 entry after W3 with status, problem, solution, proof, acceptance criteria, and the "post-acceptance, unscheduled" schedule note. - §3 Phase plan table: add a sixth row "Post-acceptance follow-ups" / "落地后增加" decoupled from Phase 0-5, with a clarifying paragraph that W17 and future KL-triggered work do not move the August 7 milestone. 
Frozen design-phase documents are NOT modified to avoid rewriting history: - context-management-weekly-design-summary-zh.md (2026-06-08 to 06-12 status) - review/findings-registry.md (26 CM- findings closed) - review/over-engineering-secondary-review.md ("no new unconditional workstream"; W17 is conditional on observed KL-1) - All review/phase*-review.md per-W reviews - W1_HANDOFF_remaining_steps_3_7_8.md (historical handoff, steps closed) The over-engineering guardrail still applies: W17 is conditional on the specific named limitation KL-1, not a new unconditional workstream. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(W1 step 7): unify max_tokens with capacity panel and migrate legacy on edit Frontend UX corrections discovered during W1 end-to-end testing: 1. Add Model dialog (single model) The standalone "Max Tokens *" field has the same semantic meaning as max_output_tokens in the capacity panel (W1 step 4 makes them aliases on the SDK side). Showing both is confusing and forced operators to type the same number twice. For LLM/VLM types the legacy field is now removed: - ModelCapacityFields gains a `formMode` prop. In 'add' mode the panel renders as a flat labelled section (no Collapse, no "empty hint" alert) and hides defaultOutputReserveTokens; required fields render a red asterisk and are enforced through validateCapacityForm. - ModelAddDialog passes formMode='add' with requiredFields=['contextWindowTokens', 'maxInputTokens']. The legacy Max Tokens input renders only when supportsCapacityFields is false (voice/rerank types still use it). - isFormValid drops isValidMaxTokens(form.maxTokens) when supportsCapacityFields is true; capacity validation is the source of truth. - The connectivity-verify config now reads form.maxOutputTokens for LLM/VLM (with parseMaxTokens fallback) since the standalone field is gone. 
- buildCapacityPayload mirrors maxOutputTokens into the deprecated maxTokens column so legacy readers that haven't been migrated yet still see the value, removing an implicit dependency on the SDK Pydantic alias firing on every backend code path. 2. Edit Model dialog yellow deprecation warning The warning "max_tokens 已废弃，请使用 max_output_tokens" ("max_tokens is deprecated; use max_output_tokens") fired even after the user typed a new max_output_tokens value, because the trigger read the model.maxTokens / model.maxOutputTokens props instead of the live form state. capacityFormFromModel now auto-promotes a legacy model.maxTokens value into the form's maxOutputTokens on load so the operator sees the value pre-populated, and the warning condition adds a "&& !form.maxOutputTokens" check so it disappears as soon as the form has a value. Saving from there writes to the max_output_tokens column, so the warning no longer appears the next time the row is loaded. Both invocations of ModelCapacityFields in ModelEditDialog (ModelEditDialog and ProviderConfigEditDialog) got the same correction. ProviderConfigInitialCapacity now exposes maxTokens so the helper can auto-migrate from the per-model gear path too; ModelDeleteDialog forwards selectedSingleModel.max_tokens. Locale strings added: - model.dialog.capacity.error.requiredMissing (en/zh) Verified: npm run type-check passes; locale JSON parses. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(W1 step 7): Add panel description gone; tokenizer shares row; Edit drops legacy max_tokens Two more UX corrections from W1 end-to-end testing: 1. Add Model panel cosmetic The "Optional Capacity Settings — used to override or confirm model capacity; leaving it empty will not block adding the model" header text sat above the capacity inputs. In 'add' mode the fields are part of the required form, so the "optional" framing was misleading, and the body label/description duplicated info already on each input. Drop the header block in add mode; render content directly. 
Layout had four numeric inputs in a 2-column grid then a full-width tokenizer field underneath. That made row 1 = (context, input), row 2 = (output, ___), row 3 = tokenizer alone — an awkward orphan slot in row 2. In add mode the tokenizer now slots into the grid next to maxOutputTokens (no defaultOutputReserveTokens shown here), giving two tidy rows. Edit mode is unchanged: defaultOutputReserveTokens takes the fourth slot and tokenizer renders full-width below. 2. Edit Custom Model still showed both max_output_tokens and max_tokens Step 7 only stopped rendering the legacy maxTokens field in Add Dialog. The Edit Dialog continued to render it alongside the capacity panel's maxOutputTokens, defeating the merge the Add fix made. ModelEditDialog now hides the standalone maxTokens field when supportsCapacityFields is true, drops the corresponding isValidMaxTokens validation from isFormValid, and falls back to form.maxOutputTokens for the connectivity-probe maxTokens parameter (with parseMaxTokens(form.maxTokens) fallback so any pre-existing legacy value still works). Verified npm run type-check; locale untouched this commit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs: clarify W4 step 4 and step 6 implementation details Step 4: Clarify that W4 verifies W5 schemas include identity columns rather than adding them (W5 owns the schema definition). Step 6: Keep deprecated APIs with deprecation notice for next version removal, rather than immediate removal. * fix(W1 step 7): required = context_window + max_output; drop Collapse; consistent across Add/Edit Corrections after the previous round's UX review: 1. Required fields were wrong. Previous commit required (contextWindowTokens, maxInputTokens). The correct W1 requirement is (contextWindowTokens, maxOutputTokens) — the two values that bound the request budget end-to-end. 
max_input_tokens stays optional because almost no real provider exposes a distinct hard input limit; the resolver falls back to context_window - requested_output when it's null. Updated three call sites: - ModelAddDialog: requiredFields and validateCapacityForm both ['contextWindowTokens', 'maxOutputTokens']. - ModelEditDialog inner panel: same requiredFields + same validation set. - ProviderConfigEditDialog inner panel: same. 2. Edit dialogs no longer Collapse the capacity panel. With context_window and max_output now required for both add and edit, hiding the inputs behind a Collapse hides the red asterisks until the user clicks the title. ModelCapacityFields drops the Collapse entirely and renders flat in both modes. The 'add' vs 'edit' formMode prop now only differentiates whether default_output_reserve_tokens is shown (it stays in edit, hidden in add) and where the tokenizer field sits (beside max_output in add, full-width in edit). 3. Empty-state hint suppressed when requiredFields is non-empty. The locale string `capacity.emptyHint` advised "you can fill these later", which contradicts required asterisks. Hide it whenever any requiredFields are passed; show only for the legacy advisory case. Verified npm run type-check. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs: refine W5 implementation plan with sub-steps and clarifications - Split step 1 into 3 ADR sub-steps (taxonomy/schema, ordering/idempotency, evolution) - Split step 3 into 4 code path sub-steps (agent loop, tool execution, error/cancel, answer) - Add 4-phase migration plan to step 7 (shadow, read switch, write switch, remove direct writes) - Clarify new event-log database module responsibilities in Repository Touchpoints - Add performance baseline test requirement * docs(W17): close three self-review gaps before implementation Applied the W1 retrospective checklist to W17 (which I wrote after the retrospective and which still hit the same lessons). Three corrections: 1. 
Repository touchpoints missed sibling frontend components. The original list named ModelAddDialog, ModelEditDialog, and ModelCapacityFields but omitted ProviderConfigEditDialog (the per-model gear icon dialog) and ModelDeleteDialog (the provider browser). Both are valid model-add entry points and the suggestion logic must reach them, or W17 reproduces W1 step 7's "only ModelEditDialog got the new fields" miss. 2. Frontend implementation plan was 3 items hiding 7 concerns. Expanded into 7 numbered items grouped by concern: service layer (4), form state machine with suggested/operator distinction (5), debounce trigger and no-match graceful fallback (6), match_explanation Alert rendering (7), coverage of all three add paths including provider browser (8), error-mode contract (9), and locale strings (10). 3. No operational dependencies section. Added a table covering which containers need rebuilding (nexent-runtime + nexent-northbound + nexent-config + nexent-mcp for backend; nexent-web for frontend; nexent-postgresql untouched), new env var CAPACITY_SUGGESTION_ENABLED, optional per-tenant flag in tenant_config_t for staged rollout, monitoring dashboards to add, rollout sequence (staging → one internal tenant → paid → all), and rollback procedure (env var off → no schema cleanup needed). These three corrections come from the W1 spec review checklist that this commit was the trigger to formalize. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(W2 review): formalize six-item checklist from W1 retrospective; apply to W2 Two new documents: SPEC_REVIEW_CHECKLIST.md — the reusable artifact. Codifies the W1 post-acceptance retrospective's six lessons as a checklist with concrete sub-questions per item: 1. User Journey — who sees what change end to end 2. Frontend Step Decomposition — ≥3 sub-items covering state / visual / service / validation / migration / siblings 3. End-to-End Demo Script in Acceptance — concrete, copy-pasteable, with negative path 4. 
Operational Dependencies — containers / migrations / env vars / flags / runbook / monitoring 5. Sibling Components Enumerated — every dialog / function / column / module-key sibling named or explicitly out of scope 6. Reverse-Test "Can the user actually use this" — operator can know feature is active, can reach values from UI, can observe fallback W2_REVIEW.md — applies the checklist to W2 + the four reader-surfaced issues the user spotted independently: Item 1: User Journey — 🔴 missing Operator-Visible Effects section Item 2: Frontend Decomposition — 🔴 no decision on UI for soft_limit_ratio / per-agent override Item 3: End-to-End Demo — 🟡 abstract, demo script proposed Item 4: Operational Dependencies — 🟡 nothing-to-do but unstated Item 5: Sibling Components — 🔴 six current local-reserve sites in agent_context.py not enumerated; W2→compaction handoff missing Item 6: Reverse Test — 🟡 no operator-visible activity indicator Issue A: soft_limit_ratio default unspecified — recommend 0.8 Issue B: requested_output_tokens override location undefined — per-agent (DB column + agent-edit UI) vs per-request (API body) are two distinct contracts buried in one sentence Issue C: W2 ↔ W13 compaction-model relationship undefined — each model call needs its own W1→W2 chain; W2 spec must say snapshots are per-model, not shared (same defect class as the W1 catalog problem) Issue D: Step 5 "consistent" semantics ambiguous — clarify it's the CM-013 trusted-dispatch enforcement contract, not a rename Verdict: W2 spec is not Ready to Implement; 7 of 10 items need updates. None invalidate the architecture — they are under-specifications that would reproduce W1-style post-acceptance surprises if shipped to implementation as-is. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(review): convert W2 post-acceptance review to CM-NNN format under review/ Removed W2_REVIEW.md from the workstreams folder — wrong location and wrong format, did not follow the established phase2-w*-review.md convention (concise per-W file + central findings-registry.md). Re-published in the correct shape: - review/findings-registry.md: added CM-027 through CM-030 with Severity / Delivery classification / Affected documents / Description / Minimum non-over-engineered response columns matching the existing 26 design-phase entries. Severity Summary updated (was 4/10/7/5 = 26, now 4/12/9/5 = 30). - review/phase6-w2-review.md: new file in the same concise format as phase2-w*-review.md. Phase 6 is defined here as the post-acceptance review track opened after the W1 retrospective, distinct from Phase 2 (design-phase per-W reviews) — same numbering convention, different trigger. The four findings translate the W1 retrospective lessons + user-surfaced W2 issues into CM-style entries: CM-027 Medium — soft_limit_ratio default unspecified; min response set default 0.8 with per-tenant override path. CM-028 Medium — per-agent vs per-request override are two contracts in one sentence; min response specify both and decide W2 scope. CM-029 High — per-model snapshot rule unstated; W13 compaction call needs its own W1->W2 chain (same defect class as W1 KL-1). CM-030 High — Step 5 "consistently" is the CM-013 trusted-dispatch enforcement contract, not a rename; min response add server-side assertion + negative test. The W17 follow-up workstream's KL-1/KL-2 references in W1 ADR and the production plans remain in the KL- namespace for now; migrating those to CM- can happen in a separate consistency pass if desired. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs: refine W6 with projection priority, ContextItem scope, and implementation clarifications - Add projection implementation priority (Release 1 required/optional/deferred) - Clarify which projections produce full ContextItem vs simple records - Define 'zero semantic mismatch' criteria for chat shadow comparison - Clarify W8 validation call pattern in Phase 3 step 3 - Add performance baseline test requirement in Phase 4 - Clarify backend projection registry responsibilities * docs: update W8 to align with CM-015 decision (remove content hashing) Replace content-based hashing with O(1) metadata-based validation: - compression.snapshot: partial_after_erasure flag + version field comparison - W6 materialized projections: snapshot validity + event count + version fields - Physical erasure: one-time partial_after_erasure flag propagation Updates: - Validity Contract: remove content hash, add metadata validation inputs - Implementation Plan step 2: replace streaming hashing with metadata validation - Implementation Plan step 4: use DerivedStateValidator (not CheckpointValidator) - Implementation Plan step 7: 'derived state' instead of 'checkpoint' - Validation and Invalidation Delivery: remove canonical serialization/hash algorithm - Add CM-015 finding reference * docs: unify finding namespace (KL-* → CM-*), close 9 review decisions, fix W13 dep stale W7 Three coordinated cleanups in one commit: 1. KL-* → CM-* migration (consistency with established review namespace) The KL- prefix was a one-off I introduced earlier to mark post-acceptance findings as distinct from the 26 design-phase CM- findings. Per the established review-folder convention (see review/findings-registry.md + review/finding-review-decisions.md), all findings should share one CM-NNN namespace regardless of when they were discovered. 
Renames: KL-1 → CM-031 (catalog miss for default model_factory); KL-2 → CM-032 (provider-level batch dialog cannot host per-model capacity). Updated references in: W1 ADR (Known Limitations section, kept the "formerly KL-1/KL-2" parenthetical as an audit trail), W17 spec, context-management-production-plan.md and -zh.md (§1.4 / §1.3), README workstream index W17 row, SPEC_REVIEW_CHECKLIST.md, and review/phase6-w2-review.md. Removed the "落地后局限使用 KL-N 前缀" ("post-acceptance limitations use the KL-N prefix") explanation from both production plans since the namespace is now unified. 2. CM-027 through CM-032 added to review/finding-review-decisions.md Six new finding-decision sections written in the same format the team established for CM-001 through CM-026: Decision / Approved minimum / Rationale / Explicitly out of scope / Updated documents. Covers: CM-027 W2 soft_limit_ratio default = 0.8; CM-028 requested_output_tokens override = per-agent column + per-request API field, two distinct contracts; CM-029 Per-model snapshot rule for secondary model dispatch (W13); CM-030 W2 Step 5 = CM-013 trusted-dispatch enforcement, not rename; CM-031 catalog miss for default model_factory (formerly KL-1); CM-032 provider-level batch dialog cannot host per-model capacity (formerly KL-2). 3. README W13 dependency W7 → W5 After the team's W7 retirement merge, README line 49 still listed W13's dependencies as "W2, W3, W7". Updated to "W2, W3, W5" since W7's checkpoint/snapshot responsibilities are now W5 compression.snapshot events. 4. findings-registry.md Severity Summary updated Was 4/12/9/5 = 30 after merge. After adding CM-031 (Medium) and CM-032 (Low), now 4/12/10/6 = 32. 5. English production-plan W7 residuals checked The four W7 mentions remaining in context-management-production-plan.md (workstream-table row, w7 anchor, retired heading, retirement-context bullet listing what is NOT being adopted from W7) are intentional historical markers in the W7 retirement section and were left in place. 
Net change: ~20 lines across 9 files, no code, no migration. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs: update W9 with terminology fixes, resolve_ambiguous_effect, and subagent conflict check - Replace 'checkpoint' with 'compression.snapshot' throughout - Add resolve_ambiguous_effect to implementation order (step 4) - Add subagent conflict check: reject mutating lifecycle operations when parent session has pending subagent sessions, even after parent run's active_run_id is cleared (async subagent scenario) - Add subagent conflict test - Add subagent session query to repository touchpoints * docs: refine W10 with deprecation notice, subagent policy independence, and performance tests - Step 7: Mark bypass paths as deprecated (not immediate removal) - Add Subagent Policy Independence section: subagents resolve their own W10 policy; parent policy governs subagent result integration - Add performance baseline test requirement for policy resolution and context selection latency * docs: refine W11 with subagent reducer independence and step 3 clarification - Step 3: Clarify deterministic reducers (structured, pointer) generate on demand; semantic reducers (compressed) cache at creation/update since regeneration involves LLM calls - Add Subagent Reducer Independence section: subagents use their own reducer chain; parent reducers do not apply to subagent internal context - Add performance baseline tests to tests section (lower priority, after functional implementation is stable) * docs: refine W12 with offload threshold clarification, subagent artifact isolation, and performance tests - Step 6: Replace 'observation limits' with 'offload thresholds' — outputs exceeding threshold are stored as artifacts with pointers (full content preserved), not truncated. Context space decisions remain with W10/W3. - Add Subagent Artifact Isolation section: subagent artifacts scoped to subagent session; parent cannot directly access subagent artifacts. 
- Add performance baseline tests (lower priority, after functional implementation is stable). * docs: update W13 with current state gap analysis and implementation refinements - Add Current State and Gap Analysis section: maps current agent_context.py implementation against W13 requirements, identifies 21 gaps (16 critical) and 5 existing strengths - Add Compression Trigger Conditions: W2 soft_limit_ratio as primary trigger, two-phase thresholds as implementation details - Add Fallback Model Selection Strategy: primary → fallback → W11 hard reduction cascade - Step 4: Add measurable progress criteria (compressed tokens < source tokens, reject with no_progress if not) - Add Subagent Compression Independence section: subagent sessions use own CompactionPolicy independently - Add performance baseline tests (lower priority, after functional implementation is stable) * docs: refine W14 with deprecation notice, subagent governance, and performance tests - Step 9: Mark raw/direct write paths as deprecated (not immediate removal) - Add Subagent Governance section: subagent sessions apply W14 internally using their own agent configuration; subagent final answer is already governed output; parent W10 policy governs integration; W14 does not re-redact already-redacted content - Add performance baseline tests for redaction latency and deletion propagation latency (lower priority, after functional implementation) * docs: clarify W15 step 1 baseline timing and performance coordination - Step 1: Clarify that baseline measurements should be established before W1-W14 implementation starts (required to quantify improvement) - Required Deliverables: Add note that W15 coordinates performance baseline tests across W5, W6, W10, W11, W12, W13, and W14 (lower priority but W15 defines measurement standards and targets) * docs: add W16 subagent cache optimization and performance baseline priority - Add Subagent Cache Optimization section: subagent sessions apply W16 independently using 
their own agent configuration; cache partition plan scoped to subagent session - Add note that repeated-turn performance baseline tests are lower priority (after functional implementation is stable) * docs: renumber W-IDs to match new development sequence Renumbered all W-ID documents to follow the optimized development order: Original → New mapping: - W1 (Capacity Config) → W1 (unchanged) - W2 (Safety Reserve) → W2 (unchanged) - W4 (Tenant Isolation) → W3 - W5 (Event Log) → W4 - W6 (History Separation) → W5 - W8 (Cache Validation) → W6 - W9 (Lifecycle APIs) → W7 - W10 (Unified Policy) → W8 - W11 (Progressive Reduction) → W9 - W12 (Output Control) → W10 - W14 (Trust/Redaction) → W11 - W13 (Reliable Compaction) → W12 - W15 (Quality SLOs) → W13 - W16 (Cache-Aware Assembly) → W14 - W3 (Guaranteed Fit) → W15 This reordering ensures: - No forward dependencies (each W-ID only depends on earlier W-IDs) - W15 (Guaranteed Fit) comes after W14 (Cache-Aware Assembly) which it consumes - W12 (Reliable Compaction) comes after W11 (Trust/Redaction) which it depends on - W3 (Tenant Isolation) comes before W15 (Guaranteed Fit) which needs it Updated all internal W-ID references across all documents. 
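A renumbering like the one above has to be applied in a single pass, because several new IDs are also old IDs (W4 → W3 while W3 → W15), so sequential find-and-replace would chain renames. The sketch below shows one way to do that; the mapping is copied from the commit message, while the `renumber` function and the choice to leave unmapped IDs such as W7 or W17 untouched are assumptions made for illustration.

```python
import re

# Original -> new W-ID numbers, copied from the mapping in the commit message.
RENUMBER = {
    1: 1, 2: 2, 4: 3, 5: 4, 6: 5, 8: 6, 9: 7, 10: 8,
    11: 9, 12: 10, 14: 11, 13: 12, 15: 13, 16: 14, 3: 15,
}

W_ID = re.compile(r"\bW(\d+)\b")


def renumber(text: str) -> str:
    """Rewrite every W-ID in one regex pass so renames cannot chain."""
    def swap(match: re.Match) -> str:
        old = int(match.group(1))
        # Assumption: IDs missing from the mapping (e.g. W7, W17) stay as-is.
        return f"W{RENUMBER.get(old, old)}"
    return W_ID.sub(swap, text)


# Sanity check: the 15 mapped IDs land exactly on W1..W15 with no collisions.
mapping_is_bijection = sorted(RENUMBER.values()) == list(range(1, 16))

example = renumber("W13 depends on W2, W3 and W14")
```

Because each match is looked up against the original number only, a reference to the old W4 becomes W3 and is not rewritten again to W15.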
* docs: update production plan with new W-ID order and phase structure - Update Section 1.1: 16→15 workstreams, module table W-IDs - Update Section 2.1.2: Checkpoint→Compression Snapshot terminology - Update Section 2.2: Architecture diagram (Checkpoints→Compression Snapshots) - Update Section 2.3: Workstream descriptions with all refinements - W15: Add dispatch bypass elimination (B1, B2) - W10: Clarify offload threshold vs truncation - W12: Add current state gap analysis reference - W14: Add subagent cache optimization - Update Section 3.1: Phased delivery plan for new W-ID order - Phase 1: W1, W2, W3 (Foundation) - Phase 2: W4, W5, W6 (Event Infrastructure) - Phase 3: W7, W8, W9, W10, W11 (Lifecycle and Policy) - Phase 4: W12, W14 (Compaction and Assembly) - Phase 5: W13, W15 (Quality and Fit) - Update Section 3.2: Gantt chart for new timeline - Update Section 3.3: Dependency diagram for new order * docs: fix all W-ID anchor links in production plan Fixed 52 incorrect anchor links throughout the production plan document. All [W\d+](#w\d+) links now correctly match the new W-ID numbering: - W1-W15 links now point to correct anchors (#w1-#w15) - Updated Section 0.1-0.3 comparison tables - Updated Section 1.2 detailed improvement table - Updated Section 2.3 memory control capabilities table - Updated Section 2.4 ClawVM adoption table - Updated Section 3.1 phase table All anchor links now follow the pattern [Wn](#wn) where n matches. 
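A check for the `[Wn](#wn)` pattern mentioned above could look like the following sketch. The function name `mismatched_links` and the sample text are hypothetical; the regular expression mirrors the `[W\d+](#w\d+)` shape quoted in the commit message.

```python
import re

# Matches links of the form [W12](#w12) and captures both numbers.
LINK = re.compile(r"\[W(\d+)\]\(#w(\d+)\)")


def mismatched_links(markdown: str) -> list[tuple[str, str]]:
    """Return (label, anchor) pairs where the label and anchor numbers differ."""
    return [
        (f"W{label}", f"#w{anchor}")
        for label, anchor in LINK.findall(markdown)
        if label != anchor
    ]


sample = "See [W3](#w3), [W12](#w6) and [W15](#w15)."
bad = mismatched_links(sample)
```

Running a check of this kind over the plan after a renumbering pass would list every link whose anchor still points at the old ID.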
* docs: revise W17 capacity suggestion spec * docs: rewrite Chinese production plan with new W-ID numbering - Translate updated English version (1296 lines → 1208 lines Chinese) - Move from doc/working/ to doc/working/context-management-workstreams/ - Update all W-ID references to new numbering (W1-W15) - W7 marked as retired (compression.snapshot merged into W4) - New phase structure (5 phases with correct W-ID groupings) - Professional terms kept in English where appropriate - Mermaid diagrams preserved in English - Old file deleted from previous location * docs(W2): add ADR for budget snapshot overrides and dispatch enforcement Add W2_ADR_Budget_Snapshot_Overrides_and_Dispatch_Enforcement.md defining: - Override precedence: operator column > model default > resolver fallback - Fingerprint algorithm: SHA-256 over W1 fingerprint + W2-specific fields - DB column: ag_tenant_agent_t.requested_output_tokens nullable positive int - SDK dispatch assertion: max_tokens must equal snapshot.requested_output_tokens This ADR formalizes the contracts identified in CM-028, CM-029, CM-030 and provides the design anchor for W2 implementation steps 3-5. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(W2): absorb CM-027-CM-030 findings into spec and production plan W2 spec updates: - CM-027: soft_limit_ratio default 0.8, per-tenant override via tenant_config_t - CM-028: two distinct override contracts (per-agent column + per-request API field) - CM-029: snapshots are per-model; W13 must invoke W1→W2 chain for compaction model - CM-030: CM-013 trusted-dispatch enforcement at provider call (assert max_tokens == snapshot.requested_output_tokens) Production plan updates: - Per-agent column and per-request API field documented - soft_limit_ratio default and override path - per-model snapshot chain for compaction (W13 dependency) - dispatch assertion contract All four findings from W2 post-acceptance review now integrated into the spec. 
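The three contracts named in the W2 ADR (override precedence, a SHA-256 fingerprint, and the dispatch assertion) can be sketched as below. This is an illustration under assumptions, not the ADR's implementation: the `BudgetSnapshot` shape, the function names, the JSON serialization, and the choice of which W2-specific fields feed the hash are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


def resolve_requested_output(operator_value: Optional[int],
                             model_default: Optional[int],
                             resolver_fallback: int) -> int:
    """Precedence from the ADR: operator column > model default > resolver fallback."""
    for candidate in (operator_value, model_default):
        if candidate is not None:
            return candidate
    return resolver_fallback


def w2_fingerprint(w1_fingerprint: str, requested_output_tokens: int,
                   soft_limit_ratio: float) -> str:
    """SHA-256 over the W1 fingerprint plus W2 fields (field set is an assumption)."""
    payload = json.dumps(
        {"w1": w1_fingerprint, "out": requested_output_tokens, "ratio": soft_limit_ratio},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class BudgetSnapshot:
    w1_fingerprint: str
    requested_output_tokens: int
    soft_limit_ratio: float
    fingerprint: str


def build_snapshot(w1_fingerprint: str, requested_output_tokens: int,
                   soft_limit_ratio: float = 0.8) -> BudgetSnapshot:
    fp = w2_fingerprint(w1_fingerprint, requested_output_tokens, soft_limit_ratio)
    return BudgetSnapshot(w1_fingerprint, requested_output_tokens, soft_limit_ratio, fp)


def assert_dispatch(snapshot: BudgetSnapshot, max_tokens: int) -> None:
    """Reject a provider call whose max_tokens or fingerprint disagrees with the snapshot."""
    expected = w2_fingerprint(snapshot.w1_fingerprint,
                              snapshot.requested_output_tokens,
                              snapshot.soft_limit_ratio)
    if snapshot.fingerprint != expected:
        raise ValueError("budget snapshot fingerprint mismatch")
    if max_tokens != snapshot.requested_output_tokens:
        raise ValueError("max_tokens must equal snapshot.requested_output_tokens")


requested = resolve_requested_output(operator_value=None, model_default=4096,
                                     resolver_fallback=1024)
snapshot = build_snapshot("w1-abc", requested)
assert_dispatch(snapshot, max_tokens=4096)  # passes silently
```

Recomputing the hash at dispatch time, rather than trusting the stored value, is what lets the boundary detect a snapshot that was edited after it was built.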
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Add W2 capacity budget skeleton * docs: remove retired W7 strikethrough row from Chinese production plan table * Add W2 reserve policy configuration * Implement W2 safe input budget calculator * docs: add Chinese translations for all W-ID specification documents (W1-W17) * Resolve W2 request safe input budget * Apply W2 safe budgets to context manager * Enforce W2 output tokens at dispatch * Emit W2 budget snapshots to monitoring * Surface W2 uncertainty reserve warning * Verify W2 budget fingerprint at dispatch * Verify W1 capacity identity at W2 dispatch Defense-in-depth check per CM-013: the trusted dispatch boundary now rejects a W2 safe-input-budget snapshot whose `w1_fingerprint`, `provider`, or `model_name` disagrees with the active W1 capacity snapshot threaded alongside it. This closes the model-swap mid-flight, stale-cache, and cross-tenant snapshot-reuse failure modes that the prior self-only fingerprint check would silently let through. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Backfill W2 capacity from W1 catalog for legacy deployments W1 step 7 made context_window_tokens and max_output_tokens required at the Add/Edit forms, but pre-existing model_record_t rows in production deployments still have NULL capacity columns and silently disable W2's CM-030 dispatch enforcement. This migration auto-fills the eight W1 day-one catalog entries on rows where (LOWER(model_factory), model_name) matches and capacity is still NULL. It is idempotent (re-runs are no-ops) and ships as a regular docker/sql migration so every downstream deployment picks it up on upgrade. Rows whose model_factory does not match a catalog provider key (commonly the manual-add default 'OpenAI-API-Compatible' per CM-031) are left untouched; the resolver fallback log is upgraded to WARNING with an actionable remediation message so operators can identify exactly which models still need attention before W17 ships. 
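The idempotency property of the backfill can be illustrated with a small in-memory sketch. It uses `sqlite3` only to stay self-contained; the real migration ships as a docker/sql file, and the table layout here is reduced to the columns named in this log. The single catalog entry uses the glm-5.1 values reported elsewhere in this PR, and treating "capacity is still NULL" as both columns being NULL is an assumption.

```python
import sqlite3

# One catalog row: (provider key, model name, context window, max output).
CATALOG = [("dashscope", "glm-5.1", 200000, 131072)]

BACKFILL_SQL = """
UPDATE model_record_t
   SET context_window_tokens = ?, max_output_tokens = ?
 WHERE LOWER(model_factory) = ?
   AND model_name = ?
   AND delete_flag = 'N'
   AND context_window_tokens IS NULL
   AND max_output_tokens IS NULL
"""


def backfill(conn: sqlite3.Connection) -> int:
    """Fill capacity on matching bare rows; return how many rows changed."""
    changed = 0
    for factory, name, window, max_out in CATALOG:
        cursor = conn.execute(BACKFILL_SQL, (window, max_out, factory, name))
        changed += cursor.rowcount
    conn.commit()
    return changed


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE model_record_t (model_factory TEXT, model_name TEXT, "
    "context_window_tokens INTEGER, max_output_tokens INTEGER, "
    "delete_flag TEXT DEFAULT 'N')"
)
conn.executemany(
    "INSERT INTO model_record_t (model_factory, model_name) VALUES (?, ?)",
    [("DashScope", "glm-5.1"), ("OpenAI-API-Compatible", "glm-5")],
)

first_run = backfill(conn)   # fills the catalog-eligible row
second_run = backfill(conn)  # re-run touches nothing: the NULL guard no longer matches
```

The `IS NULL` guard is what makes the re-run a no-op, and it also means operator-entered values are never overwritten; the 'OpenAI-API-Compatible' row stays bare because its factory does not match a catalog provider key.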
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs: add codebase gap analysis, reorder priorities, mark deferred workstreams - Add §1.5 Codebase Gap Analysis to both EN/ZH production plans - Update §1.2 improvement table with Status column and new priority order - Move W14 (prompt cache) to Phase 1: high value, zero dependencies - Mark W5, W6(full), W8(full), W10(artifact), W11(full) as tentatively deferred - Update Phase table, descriptions, Gantt chart, and dependency diagram - Add gap analysis notes to W3, W4, W6, W8, W10, W11, W12, W14 docs - Restructure README workstream index: Active / Deferred / Retired sections * Make missing-capacity warning operator-friendly and dedup it Two fixes to the WARNING surfaced when a model has no capacity configured: 1. Drop internal design-doc jargon. The previous message mentioned CM-030, CM-013, and W17 — none of which are meaningful to an operator reading backend container logs. Replaced with plain English that names what is disabled (output token cap + budget consistency check) and the exact UI path to fix it. 2. Deduplicate per process per model_id. Without this, every agent run logged the same line, so a tenant with 1k daily messages on a bare model would emit 1k duplicate warnings per day and drown real signal. A module-level set tracks already-warned model_ids; the warning fires once per process per model and is cleared only on process restart. Includes the ResolverError branch which previously had a separate WARNING line — both branches now route through the same dedup helper. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(W17): add visibility surfaces for existing bare-capacity models W17's original scope was preventing new bare rows at add/edit time. 
It did not address the complementary problem: rows that already exist in a bare state silently disable W2 enforcement, and the only signal today is a backend WARNING that the people who can fix it (model administrators, agent authors) never see. Adds a new "Visibility for Existing Bare-Capacity Models" section specifying three UI touchpoints — model management list badge, agent-edit selector warning, and an operator dashboard widget — backed by a small read-only GET /api/v1/models/capacity-coverage endpoint. The visibility work is phase-tagged as 1.5 so it can ship behind a separate small flag without waiting for the connectivity-integration and provider-discovery work in later phases. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs: renumber W-IDs by priority, rename deferred to P-IDs Active workstreams renumbered by implementation priority: W1 (token capacity), W2 (output reserve) - unchanged W3 (prompt cache, was W14) - moved to Phase 1 W4 (tenant isolation, was W3) W5 (event log, was W4) W6 (compaction reliability, was W12) W7 (lifecycle APIs) - unchanged W8 (progressive reduction, was W9) W9 (quality SLOs, was W13) W10 (guaranteed fit, was W15) W11 (capacity suggestion, was W17) Deferred workstreams renamed W→P: P1 (history separation, was W5) P2 (cache validation, was W6) P3 (context policy, was W8) P4 (pollution control, was W10) P5 (trust/redaction, was W11) 58 files updated: spec files, translations, production plans, README, ADR, review documents, weekly summary. * Fix soft-delete column name in W2 catalog backfill migration The migration filtered on a non-existent column `deleted_flag = 0`, which never matched any row, so the backfill silently no-op'd on every deployment. The model_record_t soft-delete column is `delete_flag` (String(1), default 'N') per backend/database/db_models.py. 
  Verified on the local cluster: with the corrected filter, the migration matched the one catalog-eligible row (glm-5.1 on dashscope) and populated context_window_tokens=200000, max_output_tokens=131072. Remaining bare rows on the cluster all carry model_factory='OpenAI-API-Compatible' (CM-031), confirming W17 as the remediation path for the default-factory population.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(W17): add bare-row production evidence and scope to LLM/VLM only

  Two additions to the W17 'Visibility for Existing Bare-Capacity Models' section:

  1. Production evidence: a 2026-06-17 snapshot of model_record_t on a live dev cluster showed 6 of 7 non-deleted rows carrying the manual-add default model_factory ('OpenAI-API-Compatible'), and the W2 catalog backfill matched only 1 row, leaving the model the operator was actively chatting with (glm-5) bare. This grounds the workstream's motivation in a concrete observation rather than a projected concern.
  2. Scope clarification: embedding, STT, and TTS rows share the same capacity columns but never traverse the W1/W2 path, so a NULL on those rows is not a missed enforcement. The badge, agent-edit selector notice, dashboard widget, and /capacity-coverage endpoint all apply a model_type IN ('llm', 'vlm') filter at the data layer to prevent noise on non-LLM rows.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Raise legacy fallback threshold to 81920 and explain output reserve in UI

  Two coordinated changes that both came out of W2 end-to-end validation against a bare-capacity model (glm-5):

  1. Bump the W1/W2 unknown-capacity fallback from 8192 to 81920 in both backend (_TOKEN_THRESHOLD_LEGACY_FALLBACK) and frontend (TokenUsageIndicator.DEFAULT_THRESHOLD). 8192 was so small that any non-trivial conversation triggered compression almost immediately, masking real usage signal.
     81920 fits the input budget of any modern 32K+ LLM; if the actual model is smaller and bare, the provider returns a clear token-overflow error at request time rather than the system silently truncating. Both sides match so the indicator denominator and the backend compression trigger stay in sync when the snapshot path is not available.
  2. Add a tooltip on the agent-edit "Output Reserve" form item so model admins and agent authors understand the field's physical meaning: it carves output space out of the context window, and the trade-off between longer replies versus more retained history is explicit. Tooltip strings live in both zh and en common.json.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Retune legacy capacity fallback from 81920 to 32768

  After bumping the bare-capacity fallback up from 8192 to 81920 in commit 689e3ec52, 81920 was on the optimistic side: it presumes most unknown models can absorb ~80K tokens of input. Many production deployments still rely on the 32K-context band (GPT-3.5 Turbo 16K, GLM-4 32K, Qwen2 32K, Llama 3 32K, Mistral 32K, etc.), and an 80K input on a 32K model produces a provider-side token-overflow rejection.

  32768 is the conservative compromise: it covers the majority of production LLMs without inviting overflow on the still-common 32K class. Models with larger windows lose only a few extra compression cycles, which is the correct cost direction (slightly more work over silent overflow).

  Backend (_TOKEN_THRESHOLD_LEGACY_FALLBACK) and frontend (TokenUsageIndicator.DEFAULT_THRESHOLD) stay in sync so the indicator denominator matches the backend compression trigger when the W2 snapshot path is unavailable.
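
As a rough illustration of the fallback described in the commit above, the sketch below picks the compression threshold for a model. Only the constant name `_TOKEN_THRESHOLD_LEGACY_FALLBACK` and the value 32768 come from the commit message; the function name, its parameters, and the "context window minus output reserve" budget are assumptions made for the example, not the project's actual code.

```python
from typing import Optional

# Value named in the commit message; used only when capacity is unknown.
_TOKEN_THRESHOLD_LEGACY_FALLBACK = 32768


def resolve_token_threshold(
    context_window_tokens: Optional[int],
    output_reserve_tokens: int = 0,
) -> int:
    """Return the token count at which compression triggers (hypothetical helper)."""
    if context_window_tokens is None:
        # Bare-capacity model: no configured window, so use the legacy fallback.
        return _TOKEN_THRESHOLD_LEGACY_FALLBACK
    # Assumed budget: what the window leaves after reserving output space.
    return max(context_window_tokens - output_reserve_tokens, 0)


print(resolve_token_threshold(None))                                # bare model
print(resolve_token_threshold(200000, output_reserve_tokens=4096))  # configured model
```

Keeping the same constant on the frontend indicator is what lets the displayed denominator agree with the backend trigger when no capacity snapshot is available.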

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: add capacity values explainer covering W1/W2/W3 number flow

  Single-file reference doc walking from UI-visible capacity columns (context_window, max_output, default_reserve) through W1 resolver output (provider_input_limit, fingerprint), W2 calculator output (soft / hard input budget, uncertainty reserve), and the four-tier override chain for requested_output_tokens (CM-028). Includes worked examples for the standard configuration, agent-level override, the RequestedOutputExceedsCap failure mode, and the bare-capacity fallback path. Intended audience: model admins, agent authors, and engineers reviewing W1/W2/W3 specs.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Enforce output reserve ceiling at the agent-edit form

  Closes the UX gap where 'Output Reserve' accepted values exceeding the selected model's max_output_tokens. The capacity resolver caught the violation only at agent run time, raising RequestedOutputExceedsCap and failing the conversation with no surface signal to the agent author.

  Three additions on AgentGenerateDetail:

  - A conditional Form.Item rule that pins the field's max to the currently selected model's maxOutputTokens. The rule is omitted on bare-capacity models (maxOutputTokens undefined) where the resolver cannot enforce anything anyway.
  - A matching `max` prop on the InputNumber so the stepper UI also blocks the value, not just the validator.
  - A useEffect that re-runs validation on requestedOutputTokens whenever the selected model's maxOutputTokens changes, so switching from a 32K-output model down to an 8K-output one immediately surfaces the conflict rather than waiting until save.

  New i18n key agent.requestedOutputTokens.maxError interpolates the actual ceiling so the error message names the number.
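
The run-time check that the new form rule mirrors might look roughly like the sketch below. `RequestedOutputExceedsCap` is the exception name used in the commit message; the function name and signature are hypothetical, and the real resolver may be structured differently.

```python
from typing import Optional


class RequestedOutputExceedsCap(ValueError):
    """Requested output reserve is larger than the model's max_output_tokens."""


def check_output_reserve(
    requested_output_tokens: int,
    max_output_tokens: Optional[int],
) -> int:
    """Validate the reserve against the model ceiling (hypothetical helper)."""
    if max_output_tokens is None:
        # Bare-capacity model: there is no ceiling to enforce.
        return requested_output_tokens
    if requested_output_tokens > max_output_tokens:
        raise RequestedOutputExceedsCap(
            f"requested_output_tokens={requested_output_tokens} exceeds "
            f"max_output_tokens={max_output_tokens}"
        )
    return requested_output_tokens
```

Under these assumptions, a reserve of 16384 passes against a 32768-output model and raises once the agent is switched to an 8192-output model, which is the conflict the form now surfaces at edit time.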

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Reject max_input_tokens > context_window_tokens on both ends

  Closes the audit gap noticed alongside the W2 UX fix: an operator fills max_input_tokens above context_window_tokens, the save succeeds, and the override is silently clipped at runtime because the resolver computes provider_input_limit = min(max_input, context_window - requested_output). The administrator's value never takes effect and no error or log surfaces.

  Backend fix in capacity_resolver: raise InvalidCapacityConfiguration with a message that names the silent-clipping mechanism so the operator understands why the override was rejected. The check sits right next to the sibling max_output_tokens > context_window check, keeping all cross-field invariants in one place.

  Frontend fix in validateCapacityForm: add the same cross-field check with a matching i18n key (model.dialog.capacity.error.inputExceedsWindow, zh + en). Surfaces inside the existing ModelEditDialog and ModelAddDialog save flow that already wires validateCapacityForm.

  Tests: two new cases on test_capacity_resolver: rejection of max_input above the window, and acceptance of the equality boundary (max_input == context_window is legal).

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Raise SDK requested_output_tokens fallback from 1024 to 4096

  The four-tier override chain for requested_output_tokens ends with a hard-coded SDK constant when neither the agent ('Output Reserve' field) nor the model record (default_output_reserve_tokens column) provides a value. The model-add UI does not render default_output_reserve_tokens at all (only edit mode does), so newly added rows always carry NULL in that column and most agents reach the SDK fallback at runtime.

  1024 was too small in practice.
  Tool-using agents emit a few-hundred-token JSON tool call plus a few hundred tokens of thought per step; 1024 frequently truncated the JSON mid-emission, which then surfaced as a tool-call failure instead of a capacity-config issue. The W2 fingerprint chain stays green and the indicator denominator looks healthy, but replies and tool calls get silently chopped.

  4096 covers the median single-turn output for tool chains, short reports, and modest code generation. Models with a smaller max_output_tokens are still safe: the existing RequestedOutputExceedsCap check at capacity_resolver.py:276-283 (and the matching agent-edit Form.Item rule from the prior commit) catches the violation explicitly rather than silently truncating.

  No tests assumed 1024; the full test_capacity_resolver suite stays green (17 passing).

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: refresh Capacity Values Explainer after UX gap fixes

  Sync the explainer with the just-landed capacity changes so the doc stops describing the older silent-failure behavior:

  - Override chain (§3) now names the SDK fallback as 4096 (was 1024) and includes a short note o…
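
Two of the capacity rules described in the commits above can be sketched together: the fallback for requested_output_tokens and the cross-field check on max_input_tokens. The formula `min(max_input, context_window - requested_output)`, the exception name `InvalidCapacityConfiguration`, and the value 4096 come from the commit messages. Everything else is an assumption for illustration: the function and constant names are hypothetical, and although the commits call the override chain "four-tier", only the three tiers they name (agent field, model record column, SDK constant) are modelled here.

```python
from typing import Optional

# Fallback value from the commit message; the constant name is hypothetical.
SDK_DEFAULT_REQUESTED_OUTPUT_TOKENS = 4096


class InvalidCapacityConfiguration(ValueError):
    """A cross-field capacity invariant is violated."""


def resolve_requested_output(
    agent_output_reserve: Optional[int],
    model_default_output_reserve: Optional[int],
) -> int:
    """Take the first configured value, else the SDK constant."""
    for candidate in (agent_output_reserve, model_default_output_reserve):
        if candidate is not None:
            return candidate
    return SDK_DEFAULT_REQUESTED_OUTPUT_TOKENS


def provider_input_limit(
    context_window_tokens: int,
    requested_output_tokens: int,
    max_input_tokens: Optional[int] = None,
) -> int:
    """Compute the input limit, rejecting an override larger than the window."""
    window_budget = context_window_tokens - requested_output_tokens
    if max_input_tokens is None:
        return window_budget
    if max_input_tokens > context_window_tokens:
        # Without this check the override would be silently clipped by min().
        raise InvalidCapacityConfiguration(
            f"max_input_tokens={max_input_tokens} exceeds "
            f"context_window_tokens={context_window_tokens}"
        )
    return min(max_input_tokens, window_budget)
```

The equality case (max_input_tokens equal to context_window_tokens) is accepted, matching the boundary test mentioned in the commit.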

* Move non-shadcn ui component to other folder
* Bugfix: Fix incomplete display of tenant resources page after window resize
* Bugfix: Fix inability to select agent from agent space to edit
* Bugfix: Display correct version info when viewing agent details
* Bugfix: Adjust agent detail UI layout to accommodate newly added "self-verification" field
* Refactor: update left navigation menu
* Delete the quick configuration page
* Delete comments
* Update i18n
* Bugfix: Fix i18n translation issues in navigation sidebar

* 🐛 Bugfix: Update HTTP client settings to increase timeout and disable SSL verification in aidp_service and aidp_search_tool (#3280)
* 🐛 Bugfix: Fix page show
* 🐛 Bugfix: Prevent saving null values in tool parameters across backend and frontend components. Ensure only defined values are used when merging and updating tool configurations.
* 🐛 Bugfix: Ensure `useSaveGuard` returns true upon successful save and update unit tests to reflect changes in return type for tool instance creation and update.
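
The "only defined values" rule from the null-value bugfix above can be sketched as a merge that ignores `None` on both sides. The function name and the sample parameter names are hypothetical; the actual backend and frontend code is not shown in the commit message.

```python
from typing import Any, Dict


def merge_tool_params(saved: Dict[str, Any], incoming: Dict[str, Any]) -> Dict[str, Any]:
    """Merge tool parameters, never storing or overwriting with None (hypothetical helper)."""
    # Drop nulls that were already saved.
    merged = {key: value for key, value in saved.items() if value is not None}
    # Apply only the incoming values that are actually defined.
    merged.update({key: value for key, value in incoming.items() if value is not None})
    return merged


saved = {"top_k": 5, "index_name": None}
incoming = {"top_k": None, "timeout": 30}
print(merge_tool_params(saved, incoming))
```

Here the undefined incoming `top_k` leaves the saved value in place instead of replacing it with a null.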

* Refactor prompt and skill assets
* Add unified uninstall entrypoints and image build selection
* Expand image build script with interactive selection
* Simplify image build defaults and remove deprecated deploy scripts
* Refactor prompt and agent infrastructure
* Make SQL migrations idempotent
* Ignore legacy env when config values are loaded
* Add secret rotation and Elasticsearch key refresh support
* Remove obsolete init SQL comments
* Update NEXENT_SQL_STARTUP_MODE to 'off' and enhance deployment scripts
* Add shared hostPath storage for workspace and skills
* Refactor image builds for variant-specific dependencies
* Refactor prompt handling and improve agent workflow
* fix: remove obsolete comment on skill configuration parameters in migration file
* fix: update offline package build process to create zip instead of tar.gz

---------

Co-authored-by: hhhhsc <name>

* Release/v2.2.1 (#3269)
* add_greeting_fields_to_agent-develop
* feat(knowledge-base): add preserve_source_file and post-index source cleanup

  Let knowledge bases opt out of keeping uploaded MinIO copies after indexing while retaining Elasticsearch chunks for retrieval. Default behavior remains preserve_source_file=true for backward compatibility.

  - Add preserve_source_file column (init.sql + v2.2.0_0601 migration)
  - Accept preserve_source_file on create/update and northbound/vector APIs
  - Support document DELETE scope=source_only and source_available in listings
  - Run cleanup_source Celery task when preserve_source_file is false
  - UI: create-KB toggle, list tag, knowledge-base preview when copy is missing
  - Update vector-database SDK docs and backend tests
* test(data_process): stub knowledge_db, redis_service, and redis in test_worker

  Align setup_mocks_for_worker with test_tasks so importing backend.data_process.worker loads package __init__ without real DB/redis deps.
* test(data_process): shim cleanup_source for submit_process_forward_chain tests
* remove duplicate import
* fix: update unit tests for greeting_message and example_questions fields
* add init.sql to sonar.properties
* ♻️ Improvement: API to MCP conversion service supports configuring headers. (#3194)
* ♻️ Improvement: API to MCP conversion service supports configuring headers.

  [Specification Details]
  1. Front-end and back-end modifications
* ♻️ Improvement: API to MCP conversion service supports configuring headers.

  [Specification Details]
  1. Modify the frontend, after adding, set the HTTP headers to empty.
  2. Modify test cases.
* ♻️ Improvement: Enhance processing of ES index names in memory banks. (#3196)

  [Specification Details]
  1. Replace all symbols in the index name that do not meet the rules with "_".
  2. Modify test cases.
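
A minimal sketch of the index-name cleanup described in the last item above, assuming a deliberately strict allow-list (lowercase letters, digits, `_`, and `-`). The commit only says that non-compliant symbols are replaced with "_"; the function name, the allow-list, and the handling of leading characters are assumptions, and the project's real rule set may be looser.

```python
import re

# Assumed allow-list; anything else is treated as non-compliant.
_DISALLOWED = re.compile(r"[^a-z0-9_-]")


def sanitize_index_name(raw: str) -> str:
    """Return a lowercase index name with disallowed symbols replaced by "_"."""
    name = _DISALLOWED.sub("_", raw.lower())
    # Elasticsearch rejects index names that start with "-", "_" or "+",
    # so drop such a prefix. An all-symbol input yields an empty string,
    # which the caller would still need to handle.
    return name.lstrip("_-")


print(sanitize_index_name("Memory/Bank #1"))
print(sanitize_index_name("_Tenant:42"))
```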
* feat: add active memory tools (StoreMemoryTool, SearchMemoryTool) (#3197)

  - Implement StoreMemoryTool for explicit memory storage during agent reasoning
  - Implement SearchMemoryTool for on-demand memory retrieval during conversations
  - Integrate tools into agent creation flow (create_agent_info.py)
  - Register tools in nexent_agent.py and tools/__init__.py
  - Add MEMORY_OPERATION tool sign for proper categorization
  - Fix memory_core.py cache key to include event loop ID (prevents cross-loop conflicts)
  - Add comprehensive test coverage for both tools
  - Add procedural memory verification documentation

  Tools follow existing patterns: lazy imports, observer integration, error handling, and respect user memory preferences (agent_share_option, disabled_agent_ids).

  Co-authored-by: Dallas98 <40557804+Dallas98@users.noreply.github.com>
* 🐛 Bugfix: skill names and descriptions never load to context (#3205)
* 🐛 Bugfix: skill names and descriptions never load to context
* 🐛 Bugfix: official skills not copied to target directory
* Feat: add selected count badges to tool/skill pool labels (#3206)

  Co-authored-by: chase <byzhangxin11@126.com>
* 🐛 Bugfix: Fix attribution error when tool calling error (#3208)
* ✨ Feat: Add support for Word document generation, preview, and download (#3191)
* Feat: Add support for Word document generation, preview, and download
* Potential fix for pull request finding

  Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* Restrict uploads to a known safe workspace/output directory
* Modify unit tests
* Fix unit tests
* Bugfix: Store uploaded files in Minio for conversation messages to enable file visibility in history

---------

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* ✨ Feat: Enhance prompt optimization by integrating openjiuwen and fix related bugs (#3190)
* ✨ Feat: add prompt optimization
* 🐛 Bugfix: dockerbuild failed when running pipefail in python3_11
* 🔨 Optimize: Optimize prompt optimization display page and interaction methods
* 🐛 Bugfix: fix dependencies replication
* 🎨 Optimize frontend prompts and loading interface
* 🔧 Refactor: Update imports and remove redundant ENABLE_JIUWEN_SDK import in prompt_service.py
* 🔧 Refactor: Correct import path for NexentCapabilityError and enhance test coverage for prompt optimization service
* 🔧 Refactor: Update import paths for exception handling and improve logging formatting in prompt_service.py
* 🔧 Refactor: Simplify lazy imports in jiuwen_sdk_adapter.py and update import paths in prompt_service.py
* 🔧 Refactor: Enhance Jiuwen SDK adapter handling and improve test stubs in prompt_service.py and related test files
* 🧪 Test: Pydantic model for PromptTemplateRequest in test_prompt_template_app.py
* 🔧 Refactor: Remove unnecessary dependency exclusions from pyproject.toml
* 🔧 Update: Upgrade huggingface_hub dependency version in pyproject.toml
* 🔧 Update: Exclude unnecessary transitive dependencies and adjust huggingface_hub version in pyproject.toml
* 🔧 Test: Add mock modules for unstructured inference and set up package paths in test files
* 🔧 Test: Enhance test setup by adding optional SDK mocks and cleaning up module imports in data processing tests
* 🔧 Test: Consolidate mock module setup for unstructured inference across multiple test files
* 🔧 Test: Remove unused optional SDK mocks from test configuration
* 🔧 Refactor: Clean up imports and enhance dynamic loading of fastmcp components in Docker client
* 📦 Update: SDK dependency update
* Add CAS SSO integration and improve logout handling (#3072)
* feat: add CAS SSO integration
* Skip CAS logout when CAS_LOGOUT_URL is unset
* Remove escaping
* Improve CAS logout handling and confirm user logout
* Disable account deletion for CAS users
* Add CAS session init SQL and k8s config
* clean code
* Remove agent guardrails design doc from tracking
* Add documentation

---------

Co-authored-by: hhhhsc <name>

* 🐛 Bugfix: Remove unnecessary dependency exclusions and upgrade huggingface_hub version in pyproject.toml (#3211)
* refactor: move current time from system prompt to user message for prompt cache stability (#3203)

  Remove {{time}} from all 4 prompt YAML templates (manager/managed × en/zh) and strip time_str from the context_utils pipeline (_format_app_context, build_skeleton_header_component, build_context_components, build_app_context_string). Also remove time from create_agent_info render kwargs and build_context_components call.

  In CoreAgent.run, prepend [Current time: ...] to self.task so the timestamp travels with the user message instead of being baked into the system prompt. This makes the rendered system prompt fully deterministic per (agent_id, tenant_id, version_no, language), enabling prompt/KV cache hits across requests for the same agent config.

  Sync test_context_utils.py: drop time_str= from 3 test cases. Remove unused datetime imports from context_utils.py and create_agent_info.py.
* 🐛 Bugfix: Fixed the issue of being unable to add MCP services via containerization. (#3213)

  [Specification Details]
  1. Modify the DEFAULT_NETWORK_NAME when starting the MCP service in the container to match the name in docker-compose.
  2. Modify the parameters passed to the add_mcp_service method; custom_headers defaults to None.
* 🐛 Bugfix: Fixed the issue where uploaded text files could not be parsed during a session. (#3219)
* 🐛 Bugfix: Fixed the issue where uploaded text files could not be parsed during a session.

  [Specification Details]
  1. The return parameter of the file_process method has changed and needs to be unpacked.
* 🐛 Bugfix: Fixed the issue where uploaded text files could not be parsed during a session.

  [Specification Details]
  1. Modify test case.
* 🐛 Bugfix: Fixed an issue where the MCP service could not be added correctly after updating the FastMCP version. (#3222)

  [Specification Details]
  1. Add `kwargs` to the `create_httpx_client` function to accept all additional parameters.
* 🐛 Bugfix: Fix incomplete display of tenant resources page after window resize (#3215)
* Move non-shadcn ui component to other folder
* Bugfix: Fix incomplete display of tenant resources page after window resize
* Add agent marketplace repository and version pinning for sub-agents (#3239)
* feat: add agent marketplace repository and pin sub-agent versions at publish

  Introduce ag_agent_repository_t with list/status/publish/import APIs for frozen agent snapshots. Pin selected_agent_version_no on agent relations when publishing so sub-agents resolve to a fixed version at runtime. Extend agent export/import to bundle skills in ZIP payloads and add embedding model fallback when no model name is provided.
* feat(agent): add verification configuration for agents and update related components (#3174)
* feat(agent): add verification configuration for agents and update related components
* feat(model): update model type labels and add monitoring dashboard translations
* 🐛 Bugfix: Fix inability to select agent from agent space to edit (#3240)
* Move non-shadcn ui component to other folder
* Bugfix: Fix incomplete display of tenant resources page after window resize
* Bugfix: Fix inability to select agent from agent space to edit
* Bugfix: Display correct version info when viewing agent details
* Update data agent and ME CAS integration documentation (#3242)
* Add dataagent integration documentation
* Add ME CAS integration documentation

---------

Co-authored-by: hhhhsc <name>

* ✨ Add several northbound apis (#3223)
* ✨ Add several northbound apis
* refactor: simplify deployment script by removing unused variables and functions (#3245)
* feat(agent): add verification configuration for agents and update related components
* feat(model): update model type labels and add monitoring dashboard translations
* refactor(build_offline_package): simplify deployment script by removing unused variables and functions
* 🐛 Bugfix: Adjust agent detail UI layout to accommodate newly added "self-verification" field (#3246)
* Move non-shadcn ui component to other folder
* Bugfix: Fix incomplete display of tenant resources page after window resize
* Bugfix: Fix inability to select agent from agent space to edit
* Bugfix: Display correct version info when viewing agent details
* Bugfix: Adjust agent detail UI layout to accommodate newly added "self-verification" field
* Add SQL (#3248)
* Add SQL
* Raise the limit cap
* 🐛 Bugfix: Fixed an issue where the MCP service failed to start in a Kubernetes container. (#3254)

  [Specification Details]
  1. Modify the pod naming logic to convert all non-compliant characters to -.
  2. Modify test cases.
* 🐛 Bugfix: knowledge_base_search_tool called with TypeError: argument of type 'FieldInfo' is not iterable (#3259)
* 🐛 Bugfix: Fixed an issue where the one-click rename function failed after importing an agent. (#3258)

  [Specification Details]
  1. The frontend does not pass `agent_id` when calling the `regenerate_name` API.
* Bugfix: Exclude attachments from assistant when saving conversation history (#3261)
* Bump APP_VERSION from v2.2.0 to v2.2.1 (#3268)

  The default setting for client-side self-validation is "False".
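
The pod-naming fix in #3254 above (convert all non-compliant characters to "-") can be sketched as follows. The commit does not list the rules it applies, so the ones used here are assumptions based on Kubernetes DNS label names: lowercase alphanumerics and "-", starting and ending with an alphanumeric, at most 63 characters. The function name is hypothetical.

```python
import re


def to_pod_name(raw: str, max_length: int = 63) -> str:
    """Turn an arbitrary service name into a DNS-label-style pod name."""
    # Replace every character outside [a-z0-9-] with "-".
    name = re.sub(r"[^a-z0-9-]", "-", raw.lower())
    # Truncate first, then trim so the result starts and ends alphanumeric.
    return name[:max_length].strip("-")


print(to_pod_name("MCP_Service.v2"))
print(to_pod_name("_demo_"))
```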

---------

Co-authored-by: chase <byzhangxin11@126.com>
Co-authored-by: Chenlifeng <174292121+Lifeng-Chen@users.noreply.github.com>
Co-authored-by: Dallas98 <40557804+Dallas98@users.noreply.github.com>
Co-authored-by: Jason Wang <56037774+JasonW404@users.noreply.github.com>
Co-authored-by: Xia Yichen <iamjasonxia@126.com>
Co-authored-by: JeffWu <45140512+jeffwu-1999@users.noreply.github.com>
Co-authored-by: WMC001 <46217886+WMC001@users.noreply.github.com>
Co-authored-by: xuyaqi <xuyaqist@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: DongJiBao2001 <120021235+DongJiBao2001@users.noreply.github.com>
Co-authored-by: hhhhsc701 <56435672+hhhhsc701@users.noreply.github.com>
Co-authored-by: Dallas98 <990259227@qq.com>
Co-authored-by: frr <64584192+wuyuanfr@users.noreply.github.com>

* Revert "Release/v2.2.1 (#3269)" (#3272)

  This reverts commit 9ff420e.
* ✨ Feature: add agent repository page and APIs

  Introduce Agent Repository backend APIs, database/service support, frontend views, client services, and tests. Migrate Agent Space navigation and permissions to /agent-repository with updated SQL and localization.

---------

Co-authored-by: panyehong <91180085+YehongPan@users.noreply.github.com>
Co-authored-by: chase <byzhangxin11@126.com>
Co-authored-by: Dallas98 <40557804+Dallas98@users.noreply.github.com>
Co-authored-by: Jason Wang <56037774+JasonW404@users.noreply.github.com>
Co-authored-by: Xia Yichen <iamjasonxia@126.com>
Co-authored-by: JeffWu <45140512+jeffwu-1999@users.noreply.github.com>
Co-authored-by: WMC001 <46217886+WMC001@users.noreply.github.com>
Co-authored-by: xuyaqi <xuyaqist@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: DongJiBao2001 <120021235+DongJiBao2001@users.noreply.github.com>
Co-authored-by: hhhhsc701 <56435672+hhhhsc701@users.noreply.github.com>
Co-authored-by: Dallas98 <990259227@qq.com>
Co-authored-by: frr <64584192+wuyuanfr@users.noreply.github.com>

* refactor context manager assembly for W3
* test: align W3 context runtime unit tests
* fix: mount conversation context manager in runtime
* fix: address sonarcloud context quality issues
* fix: reduce OpenAIModel constructor parameter count
* test: reduce duplicated context setup
* test: cover input budget resolver handoff
* fix: isolate managed context runtime state

…ions (#3306)

* Add offline package compression and pull skipping
* ✨ Update installation and deployment instructions for Docker and Kubernetes

---------

Co-authored-by: hhhhsc <name>

- Add file attachment upload/preview/remove UI in debug panel
- Upload files to MinIO and pass minio_files in agent run params
- Support file attachments in both debug and compare modes
- Include attachment info in conversation history
- Update data_process_service to return img_info alongside chunks
- Make object_name/presigned_url optional in conversationService types
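
To make the shape of the request concrete, the sketch below builds agent run params that carry uploaded attachments. The PR implements this in the frontend; the Python here only illustrates the payload. `minio_files`, `object_name`, and `presigned_url` are the names used in the description above, while `MinioFile`, `build_agent_run_params`, and the `name`, `query`, `agent_id`, and `is_debug` fields are hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional


@dataclass
class MinioFile:
    """One uploaded attachment; both storage fields are optional, as in the PR."""
    name: str
    object_name: Optional[str] = None
    presigned_url: Optional[str] = None


def build_agent_run_params(
    query: str,
    agent_id: int,
    files: List[MinioFile],
    is_debug: bool = True,
) -> Dict[str, Any]:
    """Assemble run params, omitting attachment fields that are not set."""
    return {
        "query": query,
        "agent_id": agent_id,
        "is_debug": is_debug,
        "minio_files": [
            {key: value for key, value in asdict(item).items() if value is not None}
            for item in files
        ],
    }


params = build_agent_run_params(
    "Summarize the attached file",
    agent_id=1,
    files=[MinioFile(name="notes.txt", object_name="attachments/notes.txt")],
)
print(params["minio_files"])
```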

…f_add_support_for_uploading_files

DongJiBao2001 and others added 16 commits June 18, 2026 10:20

Revert "Release/v2.2.1 (#3270)" (#3274)

2b1ae47

This reverts commit 20af495.

🐛 Bugfix: Multimodal tools support user model selection (#3249)

068b418

* 111
* issue_solve
* testcase_fix
* test_fix
* Remove unrelated unstructured filename metadata change

♻️ Improvement: The default setting for self-verification upon agent creation should be "False" (#3284)

3bf2760

🐛 Bugfix: Fixed an issue where the created_by field was not written when publishing an agent version. (#3287)

9d2ef87

🧪Test: aidp interface test and bugfix (#3290)

d103d17

* 🐛 Bugfix: Update HTTP client settings to increase timeout and disable SSL verification in aidp_service and aidp_search_tool (#3280)
* 🐛 Bugfix: Fix page show

Fix OpenAI LLM test memory exhaustion (#3291)

f95e6d1

Bugfix: Fix inability to copy content to clipboard in http (#3292)

89039de

jeffwu-1999 requested review from Dallas98 and WMC001 as code owners June 25, 2026 12:04

jeffwu-1999 force-pushed the wzf_add_support_for_uploading_files branch from f29d489 to ea1d235 Compare June 25, 2026 12:16

hhhhsc701 and others added 5 commits June 26, 2026 10:28

Add offline package compression and update Docker/Kubernetes instructions (#3306)

f28bae8

* Add offline package compression and pull skipping
* ✨ Update installation and deployment instructions for Docker and Kubernetes

---------

Co-authored-by: hhhhsc <name>

jeffwu-1999 force-pushed the wzf_add_support_for_uploading_files branch from ea1d235 to a73b04f Compare June 26, 2026 07:40

Merge branch 'develop' of github.com:ModelEngine-Group/nexent into wzf_add_support_for_uploading_files

87cc4bf

jeffwu-1999 closed this Jun 26, 2026

add file upload support for agent debug mode #3300

jeffwu-1999 wants to merge 22 commits into main from wzf_add_support_for_uploading_files

jeffwu-1999 commented Jun 25, 2026


11 participants
