
[pull] main from microsoft:main #1412

Merged
pull[bot] merged 44 commits into KingDEV95:main from microsoft:main
May 12, 2026

@pull pull Bot commented May 12, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

thernstig and others added 30 commits May 8, 2026 16:20
The configuration "js/ts.tsdk.path" had an incorrect description because it
referenced the description of the deprecated "typescript.tsdk" setting.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
It means localization will be kept intact.
…rk invocation policy

Previously the background todo agent would only fire after 3 *mutating*
tool calls (edits, terminal runs, etc). Pure-exploration sessions — where
the agent reads dozens of files before writing anything — never produced a
todo list because context-only calls were excluded from the threshold.

Changes:
- Collapse ToolCategory from 'context' | 'meaningful' | 'excluded' to
  'substantive' | 'excluded'. All non-infrastructure tool calls (reads,
  edits, searches, terminal, subagents, browser, GitHub) now count as
  substantive progress signals.
- Remove the CONTEXT_TOOLS allowlist and the unused CONTEXT_TOOL_CALL_THRESHOLD.
- Replace MEANINGFUL_TOOL_CALL_THRESHOLD with two named thresholds:
  - INITIAL_SUBSTANTIVE_THRESHOLD = 1: fire on the first substantive call
    when no todo list exists yet (fast path for exploration sessions).
  - SUBSEQUENT_SUBSTANTIVE_THRESHOLD = 3: subsequent passes require 3
    new substantive calls so the plan isn't re-rendered after every grep.
- Policy now uses _hasCreatedTodos to pick which threshold applies.
- Decision reasons renamed: meaningfulActivity → initialActivity /
  substantiveActivity; contextOnlyWaiting → belowThreshold.
- IBackgroundTodoDeltaMetadata: meaningfulToolCallCount +
  contextToolCallCount → substantiveToolCallCount.
- All debug log strings updated accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
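
The threshold selection described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual policy class: `decideBackgroundPass`, `PolicyDecision`, and the plain array used as the delta are hypothetical, and the sketch assumes `initialActivity` is the reason reported when no todo list exists yet. The constant values are the ones this commit introduces.

```typescript
// Illustrative sketch only; names other than the two constants and the
// decision reasons are assumptions made for this example.
type ToolCategory = 'substantive' | 'excluded';

type PolicyDecision =
	| { run: true; reason: 'initialActivity' | 'substantiveActivity' }
	| { run: false; reason: 'belowThreshold' };

// Values as introduced by this commit.
const INITIAL_SUBSTANTIVE_THRESHOLD = 1;
const SUBSEQUENT_SUBSTANTIVE_THRESHOLD = 3;

function decideBackgroundPass(delta: readonly ToolCategory[], hasCreatedTodos: boolean): PolicyDecision {
	// Excluded (infrastructure) calls never count towards either threshold.
	const substantiveToolCallCount = delta.filter(category => category === 'substantive').length;

	// Pick the threshold based on whether a todo list already exists.
	const threshold = hasCreatedTodos ? SUBSEQUENT_SUBSTANTIVE_THRESHOLD : INITIAL_SUBSTANTIVE_THRESHOLD;

	if (substantiveToolCallCount < threshold) {
		return { run: false, reason: 'belowThreshold' };
	}
	return { run: true, reason: hasCreatedTodos ? 'substantiveActivity' : 'initialActivity' };
}
```

Under these assumptions, a single read-only call triggers a pass when no list exists, while a delta containing only excluded tools never does.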
…ew policy

- classifyTool test: update expected values from 'context'/'meaningful'
  to 'substantive' for all non-excluded tools.
- backgroundTodoPolicy.spec.ts:
  - Remove context-only-waiting test suite (contextOnlyWaiting no longer
    exists; reads now count as substantive).
  - Add 'runs on first read-only call when no todos exist yet' test to
    cover the key new behaviour: pure-exploration sessions get an
    initialActivity pass on the first substantive call.
  - Add 'waits when delta contains only excluded tools' to verify
    infrastructure-only deltas still don't trigger a pass.
  - Add subsequent-pass tests: waits below threshold=3, runs at threshold=3,
    with mixed substantive calls.
  - Update all dummyMeta literals: meaningfulToolCallCount/contextToolCallCount
    → substantiveToolCallCount; decisions/reasons updated to match.
- backgroundTodoProcessor.spec.ts: update makeDelta metadata literal.
- backgroundTodoHistory.spec.ts: update category literals in IBackgroundTodoHistoryRound
  fixtures from 'context'/'meaningful' to 'substantive'.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Increase INITIAL_SUBSTANTIVE_THRESHOLD from 1 to 3 and
SUBSEQUENT_SUBSTANTIVE_THRESHOLD from 3 to 7 to reduce spurious
background passes triggered by short bursts of read-only activity.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fix four classes of incorrect behaviour in the background todo agent:

1. Completed items dropped from the list
   - Add explicit rule: never silently drop existing items, especially
     completed ones; the current todo list is authoritative for status.
   - Repeat this rule in the final-review message.

2. Completed items re-marked as in-progress
   - Remove the rules that forced exactly one 'in-progress' item at all
     times ('if unfinished, one must be in-progress', 'never zero
     in-progress with unfinished items').
   - Replace with: zero 'in-progress' is valid both before work starts
     and after all work is done.
   - Strengthen the no-regression rule with explicit wording.

3. Completed items appearing after unfinished items
   - Add explicit display-order rule: completed → in-progress →
     not-started.

4. Forced sequential processing
   - Remove 'Sequential state rules' section that required items to be
     completed in list order.
   - Replace with 'State rules' that allow work in any order.

Additionally:
- Tighten creation threshold: distinguish between volume of activity
  (many tool calls, many files read) and genuinely multi-step work.
  Add 'Primary signal is the NATURE of the work' section.
- Strengthen silence enforcement: reframe the opening instruction as an
  explicit pre-call self-check question; add 'most common case' framing
  to set the expectation that calling the tool is the exception.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
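
The rules in this commit live in the agent prompt, not in code. Purely as an illustration of what they ask for, the display order and the no-drop / no-regression rules could be expressed as the checks below. `TodoItem`, `sortForDisplay`, and `findViolations` are hypothetical names invented for this sketch.

```typescript
// Illustrative sketch only; the commit changes prompt wording, not code like this.
type TodoStatus = 'completed' | 'in-progress' | 'not-started';

interface TodoItem {
	id: number;
	title: string;
	status: TodoStatus;
}

// Display-order rule: completed, then in-progress, then not-started.
const DISPLAY_RANK: Record<TodoStatus, number> = {
	'completed': 0,
	'in-progress': 1,
	'not-started': 2,
};

function sortForDisplay(items: readonly TodoItem[]): TodoItem[] {
	// Array.prototype.sort is stable, so items keep their relative order within a status.
	return [...items].sort((a, b) => DISPLAY_RANK[a.status] - DISPLAY_RANK[b.status]);
}

// Reports items that were silently dropped or moved back out of 'completed'.
function findViolations(previous: readonly TodoItem[], next: readonly TodoItem[]): string[] {
	const violations: string[] = [];
	const nextById = new Map(next.map(item => [item.id, item] as const));
	for (const prev of previous) {
		const updated = nextById.get(prev.id);
		if (!updated) {
			violations.push(`dropped: ${prev.title}`);
		} else if (prev.status === 'completed' && updated.status !== 'completed') {
			violations.push(`regressed: ${prev.title}`);
		}
	}
	return violations;
}
```

Note that a list with zero 'in-progress' items passes both checks, which matches the relaxed state rules described above.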
INITIAL_SUBSTANTIVE_THRESHOLD was raised from 1 to 3 and
SUBSEQUENT_SUBSTANTIVE_THRESHOLD from 3 to 7. Tests that hardcoded
specific call counts now derive counts from the static constants so
they stay correct regardless of future threshold changes.

- Replace single-round threshold tests with constant-driven rounds.
- Fix round ID collision in 'subsequent threshold' test (round IDs
  used in the initial-pass simulation were reused in the follow-up
  check, causing the delta tracker to skip them as already processed).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…15722)

Fire _onDisposed before disposing xterm so that contributions clean up
their xterm addons while the raw terminal is still alive. Previously,
xterm was disposed first, which caused AddonManager to remove addons
from its internal list. When contributions subsequently tried to
dispose their own addons, _wrappedAddonDispose could not find them
in the list and threw 'Could not dispose an addon that has not been
loaded'.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
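
The ordering problem can be reproduced with a small stand-in. The sketch below does not use the real xterm.js API: `FakeAddonManager` and `disposeTerminal` are hypothetical and only mimic the behaviour the commit message describes, where disposing the terminal empties the addon list and a later per-addon dispose throws.

```typescript
// Illustrative stand-in for the addon bookkeeping described above; not xterm.js code.
class FakeAddonManager {
	private readonly loaded = new Set<object>();

	load(addon: object): void {
		this.loaded.add(addon);
	}

	// Mirrors the described failure: disposing an unknown addon throws.
	disposeAddon(addon: object): void {
		if (!this.loaded.delete(addon)) {
			throw new Error('Could not dispose an addon that has not been loaded');
		}
	}

	// Disposing the terminal removes every addon from the internal list.
	dispose(): void {
		this.loaded.clear();
	}
}

type Listener = () => void;

function disposeTerminal(
	xterm: FakeAddonManager,
	onDisposed: readonly Listener[],
	order: 'listenersFirst' | 'xtermFirst',
): void {
	const fire = () => onDisposed.forEach(listener => listener());
	if (order === 'listenersFirst') {
		// Fixed order: contributions clean up while the raw terminal is still alive.
		fire();
		xterm.dispose();
	} else {
		// Previous order: the addon list is already empty when contributions run.
		xterm.dispose();
		fire();
	}
}
```

With a listener that calls `disposeAddon` on its own addon, the `'xtermFirst'` order throws and the `'listenersFirst'` order completes cleanly.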
* Ship stable symbol tool descriptions

Make the cache-stable symbol tool behavior the default by always registering rename and usages tools with static descriptions. Remove the experimental setting now that the treatment is shipping.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Trigger CI rerun

Empty commit to rerun PR checks after an infrastructure setup failure.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The serve-web command's websocket proxy spawned the client-side hyper connection without `.with_upgrades()`, so `hyper::upgrade::on(&mut res)` rejected the upgrade with "upgrade expected but low level API in use" and the websocket failed to establish.

- Spawn `connection.with_upgrades()` in `forward_ws_req_to_server` to mirror the server side and the equivalent agent-host proxy.

Fixes #315448

(Commit message generated by Copilot)
* agents: add keyboard-interactive SSH auth fallback

When connecting to a configured SSH host in Agent mode, ssh2 only
attempted SSH agent + default identity files. For hosts that require
password / 2FA via keyboard-interactive (working fine via the OpenSSH
CLI), all attempts failed with 'All configured authentication methods
failed'.

This appends a 'keyboard-interactive' attempt as the final fallback in
Agent mode. When ssh2 picks it, the main process emits a request event
that the renderer bridges to IQuickInputService: one prompt per server
challenge, masked when echo is false. Cancellation (user dismissal or
underlying connect failure) sends empty responses so ssh2 surfaces a
proper auth failure instead of hanging.

(Written by Copilot)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* address review: cancel pending kbi prompt on connect abort

- Main side: cancelLiveKbiRequests now also invokes ssh2's stored
  finish callback with an empty array, so ssh2 stops waiting on this
  attempt instead of hanging until readyTimeout when a connect attempt
  is aborted mid-prompt.
- Renderer side: pass a CancellationToken into IQuickInputService.input
  so an in-flight prompt is dismissed immediately when the main side
  cancels, instead of relying on the user to interact with stale UI.

(Written by Copilot)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: split toAuthMethod and isMethodAllowedByServer out of makeAuthHandler

Extracts two pure helpers so the iteration loop reads top-to-bottom
without nested if/else branches around callback construction:

- isMethodAllowedByServer encapsulates the agent->publickey aliasing
- toAuthMethod maps SSHAuthAttempt to ssh2's payload, including the
  unavoidable kbi prompt-bridge closure (now isolated, not tangled
  with iteration state)

No behavior change.

(Written by Copilot)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add GPT-5.5 prompt experiment flags

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Strip Copilot Memory (CAPI) feature entirely

Removes the CAPI-backed Copilot Memory that synced repository-scoped facts
to GitHub. The local file-based MemoryTool with user/session/repo scopes
remains as the sole memory mechanism.

- Delete AgentMemoryService and its test.
- Remove the github.copilot.chat.copilotMemory.enabled setting and its NLS string.
- Remove ConfigKey.CopilotMemoryEnabled.
- Strip all CAPI gating in memoryTool.tsx, memoryContextPrompt.tsx, tools.ts.
- Drop _dispatchRepoCAPI / _repoCreate / _sendRepoTelemetry.
- /memories/repo/ now always routes to local storage.
- Update memoryTool.spec.tsx: remove mock CAPI services and CAPI-only tests.
- Update simulationExtHostToolsService.ts for the new ToolsContribution arity.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ion-suggestions

fix: background todo agent prompt correctness and noise reduction
…te (#315853)

* cli: enable upgrades on proxied websocket client connection

The serve-web command's websocket proxy spawned the client-side hyper connection without `.with_upgrades()`, so `hyper::upgrade::on(&mut res)` rejected the upgrade with "upgrade expected but low level API in use" and the websocket failed to establish.

- Spawn `connection.with_upgrades()` in `forward_ws_req_to_server` to mirror the server side and the equivalent agent-host proxy.

Fixes #315448

(Commit message generated by Copilot)

* agentHost: derive active-session count from authoritative session state

The CLI agent host wasn't auto-updating because `--enable-remote-auto-shutdown` never
fired: the active-session count in `AgentHostStateManager` could drift above zero and
stay stuck there, keeping `ServerAgentHostManager`'s lifetime token held forever.

The drift came from `_activeTurnToSession`, a parallel `Map<turnId, sessionUri>` that
ran alongside the reducer's authoritative `state.activeTurn`. The two could diverge:

- A `SessionTurnComplete` with a stale turn-id no-ops in `endTurn` but still
  decremented the map, under-counting active turns.
- A second `SessionTurnStarted` on a session whose previous turn never completed
  added a new map entry while the reducer just overwrote `activeTurn`, leaving the
  count permanently above the true number of active turns.
- `removeSession()` deleted from the map but never dispatched
  `RootActiveSessionsChanged`, so any eviction-with-active-turn path (e.g.
  `agentSideEffects.removeSubagentSessions`) silently stranded the count.

- Replace `_activeTurnToSession` with `_sessionsWithActiveTurn: Set<sessionUri>`,
  maintained by comparing `state.activeTurn` before/after the reducer runs. The
  count now only moves when the reducer actually transitions a session
  between "has active turn" and "no active turn", so mismatched-id and
  overwrite cases stay in sync with reality by construction.
- Have `removeSession` clean up the set and dispatch `RootActiveSessionsChanged`
  if it actually removed an entry, so eviction paths release the lifetime token.
- Regression tests for: stranded-on-eviction, stale `SessionTurnComplete`, and
  concurrent `SessionTurnStarted` on the same session.

Fixes #315587

(Commit message generated by Copilot)
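
A minimal sketch of the "derive the count from reducer transitions" idea follows. It is not the `AgentHostStateManager` implementation: `ActiveSessionTracker`, `startTurn`, `completeTurn`, and the `changeEvents` counter (a stand-in for dispatching `RootActiveSessionsChanged`) are hypothetical names used only for illustration.

```typescript
// Illustrative sketch only; shapes and names are assumptions.
interface SessionState {
	activeTurn: string | undefined;
}

type Reducer = (state: SessionState) => SessionState;

// A new turn overwrites whatever turn was active before.
const startTurn = (turnId: string): Reducer => state => ({ ...state, activeTurn: turnId });

// Completing a stale turn id is a no-op.
const completeTurn = (turnId: string): Reducer => state =>
	state.activeTurn === turnId ? { ...state, activeTurn: undefined } : state;

class ActiveSessionTracker {
	private readonly states = new Map<string, SessionState>();
	private readonly sessionsWithActiveTurn = new Set<string>();

	// Stand-in for dispatching RootActiveSessionsChanged.
	changeEvents = 0;

	get activeSessionCount(): number {
		return this.sessionsWithActiveTurn.size;
	}

	dispatch(sessionUri: string, reducer: Reducer): void {
		const before = this.states.get(sessionUri) ?? { activeTurn: undefined };
		const after = reducer(before);
		this.states.set(sessionUri, after);

		// Only a real transition between "has active turn" and "no active turn" moves the count.
		const wasActive = before.activeTurn !== undefined;
		const isActive = after.activeTurn !== undefined;
		if (wasActive === isActive) {
			return;
		}
		if (isActive) {
			this.sessionsWithActiveTurn.add(sessionUri);
		} else {
			this.sessionsWithActiveTurn.delete(sessionUri);
		}
		this.changeEvents++;
	}

	removeSession(sessionUri: string): void {
		this.states.delete(sessionUri);
		// Evicting a session with an active turn must also release the count.
		if (this.sessionsWithActiveTurn.delete(sessionUri)) {
			this.changeEvents++;
		}
	}
}
```

Because the set is keyed by session and updated only on observed transitions, the stale-completion, overwrite, and eviction cases listed above cannot leave the count stranded in this model.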
The Other Models section in the chat model picker grouped solely by
vendor, so BYOK setups that register multiple user-configured groups
under a single vendor (e.g. two customoai entries named 'OpenAI
Compatible' and 'AWS Bedrock') collapsed into one section under the
vendor's display name. This contradicted the model configuration view,
which keys buckets on (vendor, group.name) via getProviderGroupId.

Mirror that grouping in buildModelPickerItems by walking
ILanguageModelsService.getLanguageModelGroups for each vendor and
bucketing models on (vendor, groupName). The promoted-section badge
uses the same distinctness check, so the inline source label
disambiguates BYOK groups too. When no user-configured group is
registered (built-in vendors), fall back to the vendor display name so
existing single-section behavior is preserved.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
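
The bucketing change can be pictured with the sketch below. It is a simplified illustration, not `buildModelPickerItems`: `PickerModel`, `PickerSection`, and `bucketByVendorAndGroup` are hypothetical, and the sketch assumes the group name has already been resolved onto each model rather than looked up through `ILanguageModelsService.getLanguageModelGroups`.

```typescript
// Illustrative sketch only; types and function name are assumptions.
interface PickerModel {
	id: string;
	vendor: string;
	vendorDisplayName: string;
	// Present only for user-configured (BYOK) groups.
	groupName?: string;
}

interface PickerSection {
	label: string;
	modelIds: string[];
}

function bucketByVendorAndGroup(models: readonly PickerModel[]): PickerSection[] {
	const sections = new Map<string, PickerSection>();
	for (const model of models) {
		// Key on (vendor, groupName) instead of vendor alone.
		const key = JSON.stringify([model.vendor, model.groupName ?? null]);
		let section = sections.get(key);
		if (!section) {
			// Built-in vendors have no group, so fall back to the vendor display name.
			section = { label: model.groupName ?? model.vendorDisplayName, modelIds: [] };
			sections.set(key, section);
		}
		section.modelIds.push(model.id);
	}
	// Map iteration follows insertion order, so sections appear in first-seen order.
	return [...sections.values()];
}
```

Two `customoai` groups named 'OpenAI Compatible' and 'AWS Bedrock' therefore end up in separate sections, while a vendor without user-configured groups still yields a single section.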
Add three tests that exercise the new (vendor, groupName) bucketing in
buildModelPickerItems: a single vendor with multiple user-configured
groups should produce per-group sections, a single-group vendor should
not gain a header, and the promoted-section badge should carry the
group name when groups disambiguate a vendor.

Introduces a createLanguageModelsServiceStub helper so tests can
declare per-vendor groups without spinning up the real service, and
extends callBuild to accept a per-test languageModelsService override.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ndorGrouop

Fix BYOK model picker grouping
zhichli and others added 14 commits May 11, 2026 14:17
When CAPI ends a stream with `response.incomplete` or `response.failed`, the parser previously returned undefined and the downstream chatMLFetcher saw zero completions and fell to `ChatFetchResponseType.Unknown` ("Sorry, no response was returned.").

Map terminal events into ChatCompletions with the right FinishedCompletionReason so the existing chatMLFetcher switch handles them:

- response.incomplete + content_filter -> ContentFilter (+ FilterReason from CAPI content_filters labels, e.g. TextCopyright -> Copyright)

- response.incomplete + max_output_tokens -> Length

- response.failed -> ServerError
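
As a rough sketch of the mapping listed above: the event shapes, the `FinishReason` string union (standing in for `FinishedCompletionReason`), and `finishReasonFor` are assumptions made for this example, and returning `undefined` for any other incomplete reason is a guess, since the commit message does not cover that case.

```typescript
// Illustrative sketch only; the real parser builds full ChatCompletion objects.
type FinishReason = 'ContentFilter' | 'Length' | 'ServerError';

type TerminalEvent =
	| { type: 'response.incomplete'; reason: string }
	| { type: 'response.failed' };

function finishReasonFor(event: TerminalEvent): FinishReason | undefined {
	switch (event.type) {
		case 'response.failed':
			return 'ServerError';
		case 'response.incomplete':
			if (event.reason === 'content_filter') {
				return 'ContentFilter';
			}
			if (event.reason === 'max_output_tokens') {
				return 'Length';
			}
			// Assumption: reasons not named in the commit message stay unmapped.
			return undefined;
	}
}
```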
* Add agent host headless terminal mirror

* make things clearer

* test agent host DSR response loopback
Memory tool is now always enabled. Removes the preview gate, the config
key, the now-unused DI params on MemoryTool/MemoryContextPrompt/
MemoryInstructionsPrompt, and isAnthropicMemoryToolEnabled (replaced by
modelSupportsMemory at the BYOK call site).
…order-315722-832b3fbaaaeab7db

fix: reorder terminal disposal to prevent xterm addon error (fixes #315722)
* agent host: drive subagent cleanup from SDK completion events

- Background subagents (e.g. Copilot's `task` tool with `mode: background`)
  continue running after their parent tool call returns. Previously we
  tore down the subagent session as soon as the parent tool's
  `SessionToolCallComplete` was dispatched, which dropped all later
  subagent events on the floor. Any `tool.execution_start` the subagent
  emitted afterwards (e.g. a `problems` call needing confirmation) got
  buffered indefinitely and never reached AHP, leaving the UI hung on
  confirmation.
- Introduces a `subagent_completed` agent signal fired from the SDK's
  `subagent.completed` and `subagent.failed` events, and routes that
  signal to `completeSubagentSession`. The parent tool completion path
  now only drops the pending pre-start signal buffer, which keeps the
  "subagent never started" cleanup path intact without prematurely
  closing a still-running subagent.

Fixes #314827

(Commit message generated by Copilot)

* test: mock agent emits subagent_completed after inner tool

The integration mock's `subagent` prompt previously relied on the
parent `task` tool's `SessionToolCallComplete` to tear down the
child session. With the subagent lifecycle now driven by the SDK's
`subagent.completed` event, the mock must mirror that and emit
`subagent_completed` itself, otherwise the child turn never
finalizes and `turnExecution.integrationTest.ts` fails on
`child subagent session should have at least one turn`.
fix: replace typescript.tsdk.desc with new js/ts.tsdk.path
* agent host: handle elicitation requests from copilot SDK

- Implements the new `onElicitationRequest` hook from the Copilot SDK so that the copilot agent can surface elicitations (free-form, schema, and URL) as session input requests, dispatching the agent's reply back through the SDK once the user responds.
- Renders URL-mode elicitations in the chat UI as a proper `ChatElicitationRequestPart` that opens the URL via `IOpenerService` and dispatches Accept/Decline/Cancel, instead of falling through to the question-carousel fallback which was not designed for URL approvals.
- Adds unit tests covering the new agent-side elicitation flow as well as the workbench-side URL elicitation rendering, opener integration, decline/cancel paths, and external-completion echoes.

Fixes <no issue>

(Commit message generated by Copilot)

* address review feedback for url elicitation handling

- settle() is idempotent without short-circuiting on cancellation
- map server-side SessionInputCompleted to ElicitationState before hide()
- check IOpenerService.open() return value; treat false as Decline
- tighten boolean/number text coercion in elicitationAnswerToFieldValue
- handle free-form (no schema) accept by returning { answer: text }
- add tests covering server-side dismissal (Cancel) and opener=false
…back

Address PR feedback and round out the content-filter mapping:

- Map response.error (string-coded per OpenAI SDK) onto APIErrorResponse so
  it propagates through ChatCompletion.error -> ChatFetchResult.streamError
  and BYOK callers see the underlying server-side reason. The numeric code
  field can't hold OpenAI's string enum, so the string is stashed in
  metadata.code (BYOK JSON.stringify's the whole struct).
- Add protected_material_text / protected_material_code to the structured
  content_filter_results fallback. These are the Azure REST spec's copyright
  detectors; today the wire emits the legacy CAPI 'content_filter_raw' shape
  with action=BLOCK + label=TextCopyright, but the spec'd path is what Azure
  is rolling toward.
- Document the wire-vs-spec distinction on CapiResponseTerminalEvent.
- Type the test's completions array as ChatCompletion[] (avoid any[]).
- Add a regression test asserting response.failed.error propagates.
Plumbs an optional 'resource' field through IAuthenticationService get/createSession options and the authIssuers proposal so MCP authentication can request audience-restricted tokens. mainThreadMcp now forwards authDetails.resourceMetadata.resource into both calls.

In the microsoft-authentication extension, the resource is threaded into MSAL's acquireTokenInteractive, acquireTokenByDeviceCode, and acquireTokenSilent. Bumps @azure/msal-node and @azure/msal-node-extensions to ^5.1.5; adapts to ServerAuthorizationCodeResponse -> AuthorizeResponse and fromNativeBroker -> fromPlatformBroker renames.

Adds tests verifying that getSessions/createSession forward 'resource' to the provider, and that each MSAL flow (default, protocol handler, device code) forwards 'resource' to the underlying MSAL call.
#315886)

Tool search is now always enabled for gpt-5.4/gpt-5.5, matching the
messages API path. Aligns the responses API on the same
endpoint.supportsToolSearch capability flag.

Also registers ToolSearchTool for gpt-5.4/gpt-5.5 and the
claude-opus-4.7 variants so model-specific tool gating actually
matches the supported endpoints.
Responses API: translate terminal events into typed completions
@pull pull Bot locked and limited conversation to collaborators May 12, 2026
@pull pull Bot added the ⤵️ pull label May 12, 2026
@pull pull Bot merged commit e565d7d into KingDEV95:main May 12, 2026