[pull] main from microsoft:main #1412
Merged
The configuration "js/ts.tsdk.path" has an incorrect description, as it referenced the deprecated "typescript.tsdk" description.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
It means localization will be kept intact.
…rk invocation policy
Previously the background todo agent would only fire after 3 *mutating*
tool calls (edits, terminal runs, etc). Pure-exploration sessions — where
the agent reads dozens of files before writing anything — never produced a
todo list because context-only calls were excluded from the threshold.
Changes:
- Collapse ToolCategory from 'context' | 'meaningful' | 'excluded' to
'substantive' | 'excluded'. All non-infrastructure tool calls (reads,
edits, searches, terminal, subagents, browser, GitHub) now count as
substantive progress signals.
- Remove the CONTEXT_TOOLS allowlist and the unused CONTEXT_TOOL_CALL_THRESHOLD.
- Replace MEANINGFUL_TOOL_CALL_THRESHOLD with two named thresholds:
- INITIAL_SUBSTANTIVE_THRESHOLD = 1: fire on the first substantive call
when no todo list exists yet (fast path for exploration sessions).
- SUBSEQUENT_SUBSTANTIVE_THRESHOLD = 3: subsequent passes require 3
new substantive calls so the plan isn't re-rendered after every grep.
- Policy now uses _hasCreatedTodos to pick which threshold applies.
- Decision reasons renamed: meaningfulActivity → initialActivity /
substantiveActivity; contextOnlyWaiting → belowThreshold.
- IBackgroundTodoDeltaMetadata: meaningfulToolCallCount +
contextToolCallCount → substantiveToolCallCount.
- All debug log strings updated accordingly.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
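The two-threshold policy described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual policy class: the constant names, the category names and the reason strings come from the commit message (with the values this commit set, before later commits raised them), while `decide`, `PolicyDecision` and the parameter shapes are hypothetical.

```typescript
// Sketch only. Constant names and reason strings are taken from the commit
// message; the function, the interface and the parameters are hypothetical.
type ToolCategory = 'substantive' | 'excluded';

const INITIAL_SUBSTANTIVE_THRESHOLD = 1;
const SUBSEQUENT_SUBSTANTIVE_THRESHOLD = 3;

interface PolicyDecision {
    run: boolean;
    reason: 'initialActivity' | 'substantiveActivity' | 'belowThreshold';
}

function decide(categories: ToolCategory[], hasCreatedTodos: boolean): PolicyDecision {
    // Only non-infrastructure calls count as progress signals.
    const substantiveToolCallCount = categories.filter(c => c === 'substantive').length;

    // The first pass uses the low threshold; later passes use the higher one.
    const threshold = hasCreatedTodos
        ? SUBSEQUENT_SUBSTANTIVE_THRESHOLD
        : INITIAL_SUBSTANTIVE_THRESHOLD;

    if (substantiveToolCallCount < threshold) {
        return { run: false, reason: 'belowThreshold' };
    }
    return { run: true, reason: hasCreatedTodos ? 'substantiveActivity' : 'initialActivity' };
}
```

Under these assumptions a single read-only call triggers the first pass, while a delta made up only of excluded tools never does.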
…ew policy
- classifyTool test: update expected values from 'context'/'meaningful'
to 'substantive' for all non-excluded tools.
- backgroundTodoPolicy.spec.ts:
- Remove context-only-waiting test suite (contextOnlyWaiting no longer
exists; reads now count as substantive).
- Add 'runs on first read-only call when no todos exist yet' test to
cover the key new behaviour: pure-exploration sessions get an
initialActivity pass on the first substantive call.
- Add 'waits when delta contains only excluded tools' to verify
infrastructure-only deltas still don't trigger a pass.
- Add subsequent-pass tests: waits below threshold=3, runs at threshold=3,
with mixed substantive calls.
- Update all dummyMeta literals: meaningfulToolCallCount/contextToolCallCount
→ substantiveToolCallCount; decisions/reasons updated to match.
- backgroundTodoProcessor.spec.ts: update makeDelta metadata literal.
- backgroundTodoHistory.spec.ts: update category literals in IBackgroundTodoHistoryRound
fixtures from 'context'/'meaningful' to 'substantive'.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Increase INITIAL_SUBSTANTIVE_THRESHOLD from 1 to 3 and SUBSEQUENT_SUBSTANTIVE_THRESHOLD from 3 to 7 to reduce spurious background passes triggered by short bursts of read-only activity.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fix four classes of incorrect behaviour in the background todo agent:
1. Completed items dropped from the list
- Add explicit rule: never silently drop existing items, especially
completed ones; the current todo list is authoritative for status.
- Repeat this rule in the final-review message.
2. Completed items re-marked as in-progress
- Remove the rules that forced exactly one 'in-progress' item at all
times ('if unfinished, one must be in-progress', 'never zero
in-progress with unfinished items').
- Replace with: zero 'in-progress' is valid both before work starts
and after all work is done.
- Strengthen the no-regression rule with explicit wording.
3. Completed items appearing after unfinished items
- Add explicit display-order rule: completed → in-progress →
not-started.
4. Forced sequential processing
- Remove 'Sequential state rules' section that required items to be
completed in list order.
- Replace with 'State rules' that allow work in any order.
Additionally:
- Tighten creation threshold: distinguish between volume of activity
(many tool calls, many files read) and genuinely multi-step work.
Add 'Primary signal is the NATURE of the work' section.
- Strengthen silence enforcement: reframe the opening instruction as an
explicit pre-call self-check question; add 'most common case' framing
to set the expectation that calling the tool is the exception.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
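This commit changes prompt wording rather than code, so the rules are enforced by the model, not by a function. Purely as an illustration of the invariants the prompt asks for (never drop existing items, never regress a completed item, display completed → in-progress → not-started), a hypothetical merge helper might look like this. `TodoItem`, `mergeTodos` and `DISPLAY_RANK` are invented names.

```typescript
// Illustration only: all names here are hypothetical and this is not how the
// background todo agent is implemented (the rules live in its prompt).
type TodoStatus = 'completed' | 'in-progress' | 'not-started';

interface TodoItem {
    id: number;
    title: string;
    status: TodoStatus;
}

const DISPLAY_RANK: Record<TodoStatus, number> = {
    'completed': 0,
    'in-progress': 1,
    'not-started': 2,
};

function mergeTodos(current: TodoItem[], proposed: TodoItem[]): TodoItem[] {
    const proposedById = new Map<number, TodoItem>();
    for (const item of proposed) {
        proposedById.set(item.id, item);
    }

    const merged = current.map((item): TodoItem => {
        const next = proposedById.get(item.id);
        if (next === undefined) {
            return item; // never silently drop an existing item
        }
        if (item.status === 'completed') {
            return { id: next.id, title: next.title, status: 'completed' }; // no regression
        }
        return next;
    });

    // Append genuinely new items.
    const knownIds = new Set(current.map(item => item.id));
    for (const item of proposed) {
        if (!knownIds.has(item.id)) {
            merged.push(item);
        }
    }

    // Array.prototype.sort is stable, so items keep their relative order within a status.
    return merged.sort((a, b) => DISPLAY_RANK[a.status] - DISPLAY_RANK[b.status]);
}
```

Note that nothing here requires an in-progress item to exist: a list with zero in-progress items passes through unchanged, which matches the relaxed state rules.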
INITIAL_SUBSTANTIVE_THRESHOLD was raised from 1 to 3 and SUBSEQUENT_SUBSTANTIVE_THRESHOLD from 3 to 7. Tests that hardcoded specific call counts now derive counts from the static constants so they stay correct regardless of future threshold changes.
- Replace single-round threshold tests with constant-driven rounds.
- Fix round ID collision in 'subsequent threshold' test (round IDs used in the initial-pass simulation were reused in the follow-up check, causing the delta tracker to skip them as already processed).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
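The round ID collision is easier to see with a toy model. The sketch below assumes a delta tracker that skips any round ID it has already processed. `Round`, `DeltaTracker` and `makeRounds` are hypothetical stand-ins for the real test fixtures; only the two constant names and their values come from the commit message.

```typescript
// Toy model: only the constant names and values come from the commit message.
const INITIAL_SUBSTANTIVE_THRESHOLD = 3;
const SUBSEQUENT_SUBSTANTIVE_THRESHOLD = 7;

interface Round {
    id: string;
}

class DeltaTracker {
    private readonly seen = new Set<string>();

    // Returns only rounds that have not been processed before.
    takeNew(rounds: Round[]): Round[] {
        const fresh = rounds.filter(round => !this.seen.has(round.id));
        for (const round of fresh) {
            this.seen.add(round.id);
        }
        return fresh;
    }
}

// Builds `count` rounds whose IDs start at `firstIndex`, so callers can keep
// the initial-pass and follow-up rounds from sharing IDs.
function makeRounds(count: number, firstIndex: number): Round[] {
    return Array.from({ length: count }, (_, i) => ({ id: `round-${firstIndex + i}` }));
}
```

Deriving both the round count and the starting index from the constants keeps a test valid when the thresholds change, and avoids the case where reused IDs are silently skipped.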
…15722)
Fire _onDisposed before disposing xterm so that contributions clean up their xterm addons while the raw terminal is still alive. Previously, xterm was disposed first, which caused AddonManager to remove addons from its internal list. When contributions subsequently tried to dispose their own addons, _wrappedAddonDispose could not find them in the list and threw 'Could not dispose an addon that has not been loaded'.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
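The ordering problem can be reproduced with small stand-ins. The classes below are hypothetical and are not the real xterm.js AddonManager or the VS Code terminal instance; they only mimic the behaviour the commit message describes (an addon list that is cleared when the terminal is disposed, and a dispose call that throws when the addon is no longer in that list).

```typescript
// Hypothetical stand-ins that mimic the behaviour described in the commit message.
class FakeAddonManager {
    private readonly loaded = new Set<string>();

    load(name: string): void {
        this.loaded.add(name);
    }

    disposeAddon(name: string): void {
        if (!this.loaded.delete(name)) {
            throw new Error('Could not dispose an addon that has not been loaded');
        }
    }

    disposeAll(): void {
        this.loaded.clear();
    }
}

type DisposalOrder = 'listenersFirst' | 'xtermFirst';

class FakeTerminalInstance {
    readonly addons = new FakeAddonManager();
    private readonly disposeListeners: Array<() => void> = [];

    onDisposed(listener: () => void): void {
        this.disposeListeners.push(listener);
    }

    dispose(order: DisposalOrder): void {
        if (order === 'xtermFirst') {
            this.addons.disposeAll(); // old order: addon list is emptied first
        }
        for (const listener of this.disposeListeners) {
            listener(); // contributions dispose their own addons here
        }
        if (order === 'listenersFirst') {
            this.addons.disposeAll(); // new order: raw terminal goes last
        }
    }
}

// Returns the error message, or undefined if disposal succeeded.
function tryDispose(order: DisposalOrder): string | undefined {
    const terminal = new FakeTerminalInstance();
    terminal.addons.load('contributionAddon');
    terminal.onDisposed(() => terminal.addons.disposeAddon('contributionAddon'));
    try {
        terminal.dispose(order);
        return undefined;
    } catch (e) {
        return e instanceof Error ? e.message : String(e);
    }
}
```

In this model `tryDispose('xtermFirst')` reproduces the error and `tryDispose('listenersFirst')` completes cleanly.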
* Ship stable symbol tool descriptions
Make the cache-stable symbol tool behavior the default by always registering rename and usages tools with static descriptions. Remove the experimental setting now that the treatment is shipping.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Trigger CI rerun
Empty commit to rerun PR checks after an infrastructure setup failure.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The serve-web command's websocket proxy spawned the client-side hyper connection without `.with_upgrades()`, so `hyper::upgrade::on(&mut res)` rejected the upgrade with "upgrade expected but low level API in use" and the websocket failed to establish.
- Spawn `connection.with_upgrades()` in `forward_ws_req_to_server` to mirror the server side and the equivalent agent-host proxy.
Fixes #315448
(Commit message generated by Copilot)
* agents: add keyboard-interactive SSH auth fallback
When connecting to a configured SSH host in Agent mode, ssh2 only attempted SSH agent + default identity files. For hosts that require password / 2FA via keyboard-interactive (working fine via the OpenSSH CLI), all attempts failed with 'All configured authentication methods failed'.
This appends a 'keyboard-interactive' attempt as the final fallback in Agent mode. When ssh2 picks it, the main process emits a request event that the renderer bridges to one IQuickInputService prompt per server challenge, masked when echo is false. Cancellation (user dismissal or underlying connect failure) sends empty responses so ssh2 surfaces a proper auth failure instead of hanging.
(Written by Copilot)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* address review: cancel pending kbi prompt on connect abort
- Main side: cancelLiveKbiRequests now also invokes ssh2's stored finish callback with an empty array, so ssh2 stops waiting on this attempt instead of hanging until readyTimeout when a connect attempt is aborted mid-prompt.
- Renderer side: pass a CancellationToken into IQuickInputService.input so an in-flight prompt is dismissed immediately when the main side cancels, instead of relying on the user to interact with stale UI.
(Written by Copilot)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* refactor: split toAuthMethod and isMethodAllowedByServer out of makeAuthHandler
Extracts two pure helpers so the iteration loop reads top-to-bottom without nested if/else branches around callback construction:
- isMethodAllowedByServer encapsulates the agent->publickey aliasing
- toAuthMethod maps SSHAuthAttempt to ssh2's payload, including the unavoidable kbi prompt-bridge closure (now isolated, not tangled with iteration state)
No behavior change.
(Written by Copilot)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add GPT-5.5 prompt experiment flags
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Strip Copilot Memory (CAPI) feature entirely
Removes the CAPI-backed Copilot Memory that synced repository-scoped facts to GitHub. The local file-based MemoryTool with user/session/repo scopes remains as the sole memory mechanism.
- Delete AgentMemoryService and its test.
- Remove the github.copilot.chat.copilotMemory.enabled setting and its NLS string.
- Remove ConfigKey.CopilotMemoryEnabled.
- Strip all CAPI gating in memoryTool.tsx, memoryContextPrompt.tsx, tools.ts.
- Drop _dispatchRepoCAPI / _repoCreate / _sendRepoTelemetry.
- /memories/repo/ now always routes to local storage.
- Update memoryTool.spec.tsx: remove mock CAPI services and CAPI-only tests.
- Update simulationExtHostToolsService.ts for the new ToolsContribution arity.
…ion-suggestions fix: background todo agent prompt correctness and noise reduction
…te (#315853)
* cli: enable upgrades on proxied websocket client connection
The serve-web command's websocket proxy spawned the client-side hyper connection without `.with_upgrades()`, so `hyper::upgrade::on(&mut res)` rejected the upgrade with "upgrade expected but low level API in use" and the websocket failed to establish.
- Spawn `connection.with_upgrades()` in `forward_ws_req_to_server` to mirror the server side and the equivalent agent-host proxy.
Fixes #315448
(Commit message generated by Copilot)
* agentHost: derive active-session count from authoritative session state
The CLI agent host wasn't auto-updating because `--enable-remote-auto-shutdown` never fired: the active-session count in `AgentHostStateManager` could drift above zero and stay stuck there, keeping `ServerAgentHostManager`'s lifetime token held forever.
The drift came from `_activeTurnToSession`, a parallel `Map<turnId, sessionUri>` that ran alongside the reducer's authoritative `state.activeTurn`. The two could diverge:
- A `SessionTurnComplete` with a stale turn-id no-ops in `endTurn` but still decremented the map, under-counting active turns.
- A second `SessionTurnStarted` on a session whose previous turn never completed added a new map entry while the reducer just overwrote `activeTurn`, leaving the count permanently above the true number of active turns.
- `removeSession()` deleted from the map but never dispatched `RootActiveSessionsChanged`, so any eviction-with-active-turn path (e.g. `agentSideEffects.removeSubagentSessions`) silently stranded the count.

- Replace `_activeTurnToSession` with `_sessionsWithActiveTurn: Set<sessionUri>`, maintained by comparing `state.activeTurn` before/after the reducer runs. The count now only moves when the reducer actually transitions a session between "has active turn" and "no active turn", so mismatched-id and overwrite cases stay in sync with reality by construction.
- Have `removeSession` clean up the set and dispatch `RootActiveSessionsChanged` if it actually removed an entry, so eviction paths release the lifetime token.
- Regression tests for: stranded-on-eviction, stale `SessionTurnComplete`, and concurrent `SessionTurnStarted` on the same session.
Fixes #315587
(Commit message generated by Copilot)
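The before/after comparison can be sketched as below. This is a simplified model under stated assumptions, not the real `AgentHostStateManager`: the action names and the idea of a set of sessions with an active turn come from the commit message, while `ActiveSessionTracker`, `reduceSession` and the state shape are hypothetical.

```typescript
// Simplified model; class, function and state shapes are hypothetical.
interface SessionState {
    readonly activeTurn: { readonly id: string } | undefined;
}

type SessionAction =
    | { type: 'SessionTurnStarted'; session: string; turnId: string }
    | { type: 'SessionTurnComplete'; session: string; turnId: string };

function reduceSession(state: SessionState, action: SessionAction): SessionState {
    if (action.type === 'SessionTurnStarted') {
        return { activeTurn: { id: action.turnId } }; // a second start just overwrites
    }
    // A completion with a stale turn id is a no-op.
    if (state.activeTurn !== undefined && state.activeTurn.id === action.turnId) {
        return { activeTurn: undefined };
    }
    return state;
}

class ActiveSessionTracker {
    private readonly sessions = new Map<string, SessionState>();
    private readonly sessionsWithActiveTurn = new Set<string>();

    get activeCount(): number {
        return this.sessionsWithActiveTurn.size;
    }

    dispatch(action: SessionAction): void {
        const before = this.sessions.get(action.session) ?? { activeTurn: undefined };
        const after = reduceSession(before, action);
        this.sessions.set(action.session, after);

        // The count only moves when the reducer actually changed the state.
        const had = before.activeTurn !== undefined;
        const has = after.activeTurn !== undefined;
        if (!had && has) {
            this.sessionsWithActiveTurn.add(action.session);
        } else if (had && !has) {
            this.sessionsWithActiveTurn.delete(action.session);
        }
    }

    // Returns true when an active entry was removed, i.e. when the caller
    // would need to announce the change (RootActiveSessionsChanged in the source).
    removeSession(session: string): boolean {
        this.sessions.delete(session);
        return this.sessionsWithActiveTurn.delete(session);
    }
}
```

Because membership is derived from the reducer's own state, a stale completion or a repeated start cannot push the count out of step with the number of sessions that have an active turn.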
The Other Models section in the chat model picker grouped solely by vendor, so BYOK setups that register multiple user-configured groups under a single vendor (e.g. two customoai entries named 'OpenAI Compatible' and 'AWS Bedrock') collapsed into one section under the vendor's display name. This contradicted the model configuration view, which keys buckets on (vendor, group.name) via getProviderGroupId.
Mirror that grouping in buildModelPickerItems by walking ILanguageModelsService.getLanguageModelGroups for each vendor and bucketing models on (vendor, groupName). The promoted-section badge uses the same distinctness check, so the inline source label disambiguates BYOK groups too. When no user-configured group is registered (built-in vendors), fall back to the vendor display name so existing single-section behavior is preserved.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
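A rough sketch of bucketing on (vendor, groupName) with the vendor-name fallback is shown below. It does not use the real `buildModelPickerItems` or `ILanguageModelsService` signatures; `ModelEntry`, `Section` and `bucketModels` are hypothetical and only illustrate the keying.

```typescript
// Hypothetical shapes; the real picker works against ILanguageModelsService.
interface ModelEntry {
    name: string;
    vendor: string;
    vendorDisplayName: string;
    groupName: string | undefined; // undefined when no user-configured group exists
}

interface Section {
    label: string;
    models: string[];
}

function bucketModels(models: ModelEntry[]): Section[] {
    const buckets = new Map<string, Section>();
    for (const model of models) {
        // Key on the pair so two groups under one vendor stay separate.
        const key = JSON.stringify([model.vendor, model.groupName ?? null]);
        let section = buckets.get(key);
        if (section === undefined) {
            // Fall back to the vendor display name for built-in vendors.
            section = { label: model.groupName ?? model.vendorDisplayName, models: [] };
            buckets.set(key, section);
        }
        section.models.push(model.name);
    }
    return Array.from(buckets.values());
}
```

With two `customoai` groups named 'OpenAI Compatible' and 'AWS Bedrock', this yields two sections instead of one, while a vendor without groups still produces a single section under its display name.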
Add three tests that exercise the new (vendor, groupName) bucketing in buildModelPickerItems: a single vendor with multiple user-configured groups should produce per-group sections, a single-group vendor should not gain a header, and the promoted-section badge should carry the group name when groups disambiguate a vendor.
Introduces a createLanguageModelsServiceStub helper so tests can declare per-vendor groups without spinning up the real service, and extends callBuild to accept a per-test languageModelsService override.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ndorGrouop Fix BYOK model picker grouping
When CAPI ends a stream with `response.incomplete` or `response.failed`, the parser previously returned undefined and the downstream chatMLFetcher saw zero completions and fell to `ChatFetchResponseType.Unknown` ("Sorry, no response was returned.").
Map terminal events into ChatCompletions with the right FinishedCompletionReason so the existing chatMLFetcher switch handles them:
- response.incomplete + content_filter -> ContentFilter (+ FilterReason from CAPI content_filters labels, e.g. TextCopyright -> Copyright)
- response.incomplete + max_output_tokens -> Length
- response.failed -> ServerError
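The mapping above amounts to a small switch. The sketch below uses local stand-in types rather than the real `FinishedCompletionReason` enum or the parser's event types; `TerminalEvent`, `FinishReason` and `toFinishReason` are hypothetical, and the event is flattened for brevity.

```typescript
// Local stand-ins; the real parser uses FinishedCompletionReason and richer event objects.
type FinishReason = 'ContentFilter' | 'Length' | 'ServerError';

type TerminalEvent =
    | { type: 'response.incomplete'; reason: 'content_filter' | 'max_output_tokens' }
    | { type: 'response.failed' };

function toFinishReason(event: TerminalEvent): FinishReason {
    if (event.type === 'response.failed') {
        return 'ServerError';
    }
    // response.incomplete: the reason decides between filter and length.
    return event.reason === 'content_filter' ? 'ContentFilter' : 'Length';
}
```

Returning a typed reason for every terminal event is what lets the downstream switch report a specific outcome instead of falling through to the "no response" case.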
* Add agent host headless terminal mirror
* make things clearer
* test agent host DSR response loopback
Memory tool is now always enabled. Removes the preview gate, the config key, the now-unused DI params on MemoryTool/MemoryContextPrompt/ MemoryInstructionsPrompt, and isAnthropicMemoryToolEnabled (replaced by modelSupportsMemory at the BYOK call site).
…order-315722-832b3fbaaaeab7db fix: reorder terminal disposal to prevent xterm addon error (fixes #315722)
* agent host: drive subagent cleanup from SDK completion events
- Background subagents (e.g. Copilot's `task` tool with `mode: background`) continue running after their parent tool call returns. Previously we tore down the subagent session as soon as the parent tool's `SessionToolCallComplete` was dispatched, which dropped all later subagent events on the floor. Any `tool.execution_start` the subagent emitted afterwards (e.g. a `problems` call needing confirmation) got buffered indefinitely and never reached AHP, leaving the UI hung on confirmation.
- Introduces a `subagent_completed` agent signal fired from the SDK's `subagent.completed` and `subagent.failed` events, and routes that signal to `completeSubagentSession`. The parent tool completion path now only drops the pending pre-start signal buffer, which keeps the "subagent never started" cleanup path intact without prematurely closing a still-running subagent.
Fixes #314827
(Commit message generated by Copilot)
* test: mock agent emits subagent_completed after inner tool
The integration mock's `subagent` prompt previously relied on the parent `task` tool's `SessionToolCallComplete` to tear down the child session. With the subagent lifecycle now driven by the SDK's `subagent.completed` event, the mock must mirror that and emit `subagent_completed` itself, otherwise the child turn never finalizes and `turnExecution.integrationTest.ts` fails on `child subagent session should have at least one turn`.
fix: replace typescript.tsdk.desc with new js/ts.tsdk.path
* agent host: handle elicitation requests from copilot SDK
- Implements the new `onElicitationRequest` hook from the Copilot SDK so that the copilot agent can surface elicitations (free-form, schema, and URL) as session input requests, dispatching the agent's reply back through the SDK once the user responds.
- Renders URL-mode elicitations in the chat UI as a proper `ChatElicitationRequestPart` that opens the URL via `IOpenerService` and dispatches Accept/Decline/Cancel, instead of falling through to the question-carousel fallback which was not designed for URL approvals.
- Adds unit tests covering the new agent-side elicitation flow as well as the workbench-side URL elicitation rendering, opener integration, decline/cancel paths, and external-completion echoes.
Fixes <no issue>
(Commit message generated by Copilot)
* address review feedback for url elicitation handling
- settle() is idempotent without short-circuiting on cancellation
- map server-side SessionInputCompleted to ElicitationState before hide()
- check IOpenerService.open() return value; treat false as Decline
- tighten boolean/number text coercion in elicitationAnswerToFieldValue
- handle free-form (no schema) accept by returning { answer: text }
- add tests covering server-side dismissal (Cancel) and opener=false
…back
Address PR feedback and round out the content-filter mapping:
- Map response.error (string-coded per OpenAI SDK) onto APIErrorResponse so it propagates through ChatCompletion.error -> ChatFetchResult.streamError and BYOK callers see the underlying server-side reason. The numeric code field can't hold OpenAI's string enum, so the string is stashed in metadata.code (BYOK JSON.stringify's the whole struct).
- Add protected_material_text / protected_material_code to the structured content_filter_results fallback. These are the Azure REST spec's copyright detectors; today the wire emits the legacy CAPI 'content_filter_raw' shape with action=BLOCK + label=TextCopyright, but the spec'd path is what Azure is rolling toward.
- Document the wire-vs-spec distinction on CapiResponseTerminalEvent.
- Type the test's completions array as ChatCompletion[] (avoid any[]).
- Add a regression test asserting response.failed.error propagates.
Plumbs an optional 'resource' field through IAuthenticationService get/createSession options and the authIssuers proposal so MCP authentication can request audience-restricted tokens. mainThreadMcp now forwards authDetails.resourceMetadata.resource into both calls. In the microsoft-authentication extension, the resource is threaded into MSAL's acquireTokenInteractive, acquireTokenByDeviceCode, and acquireTokenSilent. Bumps @azure/msal-node and @azure/msal-node-extensions to ^5.1.5; adapts to ServerAuthorizationCodeResponse -> AuthorizeResponse and fromNativeBroker -> fromPlatformBroker renames. Adds tests verifying that getSessions/createSession forward 'resource' to the provider, and that each MSAL flow (default, protocol handler, device code) forwards 'resource' to the underlying MSAL call.
#315886) Tool search is now always enabled for gpt-5.4/gpt-5.5, matching the messages API path. Aligns the responses API on the same endpoint.supportsToolSearch capability flag. Also registers ToolSearchTool for gpt-5.4/gpt-5.5 and the claude-opus-4.7 variants so model-specific tool gating actually matches the supported endpoints.
Responses API: translate terminal events into typed completions
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)