From abe6f3130c6a8408a8207e7fa4578a06167ac089 Mon Sep 17 00:00:00 2001 From: "Christopher Krawczyk (SQL) (from Dev Box)" Date: Wed, 13 May 2026 09:10:40 -0700 Subject: [PATCH 1/7] Initialize PAW workflow for Fix Native TLS Reqwest Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../fix-native-tls-reqwest/WorkflowContext.md | 41 +++++++++++++++++++ 1 file changed, 41 insertions(+) create mode 100644 .paw/work/fix-native-tls-reqwest/WorkflowContext.md diff --git a/.paw/work/fix-native-tls-reqwest/WorkflowContext.md b/.paw/work/fix-native-tls-reqwest/WorkflowContext.md new file mode 100644 index 0000000..40ce46a --- /dev/null +++ b/.paw/work/fix-native-tls-reqwest/WorkflowContext.md @@ -0,0 +1,41 @@ +# WorkflowContext + +Work Title: Fix Native TLS Reqwest +Work ID: fix-native-tls-reqwest +Base Branch: main +Target Branch: fix/native-tls-reqwest-feature +Execution Mode: current-checkout +Repository Identity: github.com/microsoft/duroxide-pg@dbf922cdd64883871920019650e338a4f5af5ae1 +Execution Binding: none +Workflow Mode: minimal +Review Strategy: local +Review Policy: final-pr-only +Session Policy: continuous +Final Agent Review: disabled +Final Review Mode: multi-model +Final Review Interactive: smart +Final Review Models: latest GPT, latest Gemini, latest Claude Opus +Final Review Specialists: all +Final Review Interaction Mode: parallel +Final Review Specialist Models: none +Final Review Perspectives: auto +Final Review Perspective Cap: 2 +Implementation Model: none +Plan Generation Mode: single-model +Plan Generation Models: latest GPT, latest Gemini, latest Claude Opus +Planning Docs Review: disabled +Planning Review Mode: multi-model +Planning Review Interactive: smart +Planning Review Models: latest GPT, latest Gemini, latest Claude Opus +Planning Review Specialists: all +Planning Review Interaction Mode: parallel +Planning Review Specialist Models: none +Planning Review Perspectives: auto +Planning Review Perspective Cap: 2 +Custom Workflow Instructions: none +Initial Prompt: Add native-tls feature to reqwest dep so the Rust HTTP client has a TLS connector. Without this, every HTTPS call (incl. WorkloadIdentityCredential and AAD token endpoint) fails with hyper's 'invalid URL, scheme is not http' — making the Entra credential chain unusable on AKS pods. Reproduced E2E in PilotSwarm AKS deploy; verified fix preserves native-tls/openssl posture (no rustls/ring/aws-lc-rs added). +Issue URL: none +Remote: origin +Artifact Lifecycle: commit-and-clean +Artifact Paths: auto-derived +Additional Inputs: none From 18a18b88500f1f67e338fa4afc65d8a45ad28c09 Mon Sep 17 00:00:00 2001 From: "Christopher Krawczyk (SQL) (from Dev Box)" Date: Wed, 13 May 2026 09:12:38 -0700 Subject: [PATCH 2/7] paw: add CodeResearch.md for native-tls reqwest fix Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../fix-native-tls-reqwest/CodeResearch.md | 291 ++++++++++++++++++ 1 file changed, 291 insertions(+) create mode 100644 .paw/work/fix-native-tls-reqwest/CodeResearch.md diff --git a/.paw/work/fix-native-tls-reqwest/CodeResearch.md b/.paw/work/fix-native-tls-reqwest/CodeResearch.md new file mode 100644 index 0000000..a8d0f87 --- /dev/null +++ b/.paw/work/fix-native-tls-reqwest/CodeResearch.md @@ -0,0 +1,291 @@ +--- +date: 2026-05-13T07:50:00-07:00 +git_commit: b7a81df +branch: fix/native-tls-reqwest-feature +repository: microsoft/duroxide-pg +topic: "TLS connector missing from reqwest in the Entra credential chain" +tags: [research, codebase, entra, tls, cargo-features, reqwest, azure-identity] +status: complete +last_updated: 2026-05-13 +--- + +# Research: TLS connector missing from reqwest in the Entra credential chain + +## Research Question + +Document why HTTPS requests issued by the Entra credential chain in +`duroxide-pg` fail at the connector layer with hyper's +`"invalid URL, scheme is not http"` error inside an AKS pod, and capture the +exact dependency-graph state that produces that runtime behavior. + +## Summary + +`duroxide-pg`'s Entra credential chain +(`build_default_chained_credential` in `src/entra.rs:288-301`) delegates to +`azure_identity = 0.35`, whose `WorkloadIdentityCredential` and +`ManagedIdentityCredential` POST to AAD over HTTPS using a reqwest-backed +HTTP transport that `azure_core` constructs internally. In the crate's +current dependency configuration, the resolved `reqwest 0.13.3` build has +**no TLS backend feature enabled** — `default-features = false` flows in +from `typespec_client_core` and no other crate in the graph activates a TLS +feature on `reqwest`. As a result, the compiled reqwest client supports only +plain HTTP. Every HTTPS request the chain makes is rejected by the connector +layer before any network I/O occurs. + +The error surface that callers see is the `.context(...)` wrapper at +`src/provider.rs:246`: `"Entra credential resolution failed: could not +acquire an initial access token"`. The underlying chained-credential +aggregate message (assembled in `src/entra.rs:336-368`) and the +hyper/reqwest connector error are not exposed through `Display`, so the +true cause is invisible without `{:#}` formatting or a tracing subscriber. + +## Documentation System + +- **Framework**: Plain markdown +- **Docs Directory**: `docs/` (deep-dive notes), root for user-facing docs +- **Navigation Config**: N/A (no static-site generator) +- **Style Conventions**: README is the primary user surface; `docs/` holds + operational deep-dives (`docs/entra-ci.md`, `docs/duroxide-blockers.md`, + `docs/concurrent-instance-fetch-race.md`). Rustdoc on the public API + surface uses runnable `rust,no_run` examples under + `# async fn example()` to keep doctests typechecking. +- **Build Command**: `cargo doc --no-deps` for rustdoc +- **Standard Files**: `README.md` (root), `CHANGELOG.md` (root, semver-style + per-version sections, see existing entries through `0.1.32`) + +## Verification Commands + +- **Test Command**: `cargo nextest run` (preferred if installed) or + `cargo test` (fallback). Provider-validation suite: + `cargo nextest run --test postgres_provider_test`. Live-Entra suite is + opt-in via env var (see `tests/entra_live_test.rs:1-30`). +- **Lint Command**: `cargo clippy --all-targets` (per `.github/copilot- + instructions.md` style guidance; no explicit lint script is checked in) +- **Build Command**: `cargo build --release` +- **Type Check**: `cargo check --all-targets` (Rust does typecheck as part + of `cargo build`) +- **Dep-graph verification**: `cargo tree --target x86_64-unknown-linux-gnu + --edges normal -i ` (used below to verify TLS backend presence) + +## Detailed Findings + +### 1. Dependency declarations driving the resolved reqwest build + +`Cargo.toml:46-63` declares the Azure SDK and accompanying dependencies: + +- `Cargo.toml:46-59` — a multi-line comment block describing the intended + FIPS posture. The comment claims: + - That `default-features = false` on both Azure crates suppresses + `reqwest_rustls`. + - That the remaining features route HTTP transport through reqwest's own + `default-tls` feature, "keeping the build on native-tls across all + platforms." + - That `cargo tree --target x86_64-unknown-linux-gnu --all-features + --all-targets` was used to verify no rustls-based TLS crates appear. +- `Cargo.toml:60` — `azure_core = { version = "0.35", default-features = + false, features = ["reqwest", "reqwest_deflate", "reqwest_gzip", + "tokio"] }`. +- `Cargo.toml:61` — `azure_identity = { version = "0.35", default-features + = false, features = ["tokio"] }`. +- `Cargo.toml:62-63` — `futures-util` (unrelated to TLS). + +There is no explicit `reqwest` dependency in `[dependencies]`. + +### 2. Feature-graph trace from `azure_core/reqwest` to the resolved reqwest features + +Read from the registry sources resolved by `Cargo.lock`: + +- `~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/azure_core-0.35.0/Cargo.toml` + - `reqwest = ["typespec_client_core/reqwest"]` + - `reqwest_deflate = ["reqwest", "typespec_client_core/reqwest_deflate"]` + - `reqwest_gzip = ["reqwest", "typespec_client_core/reqwest_gzip"]` + - `reqwest_rustls = ["reqwest", "typespec_client_core/reqwest_rustls"]` + - `azure_core/reqwest` activates only the bare `dep:reqwest` path; no + TLS feature. +- `~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/typespec_client_core-0.14.0/Cargo.toml` + - `[features] reqwest = ["dep:reqwest"]` + - `[features] reqwest_rustls = ["reqwest", "reqwest/rustls"]` + - `[dependencies.reqwest] version = "0.13.2"`, `features = ["stream"]`, + `default-features = false`. + - The `typespec_client_core` crate is the only declared parent of the + optional `reqwest` dep; it pins `default-features = false`. +- `~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/reqwest-0.13.3/Cargo.toml` + - `default = ["default-tls"]` + - `default-tls = ["rustls"]` (in reqwest 0.13.3 the named "default-tls" + feature now resolves to rustls — it does **not** select native-tls). + - `native-tls = ["__native-tls", "__native-tls-alpn"]` + - `__native-tls = ["dep:hyper-tls", "dep:native-tls-crate", "__tls"]` + - `rustls = ["__rustls-aws-lc-rs", "dep:rustls-platform-verifier", + "__rustls"]` + - `__rustls = ["dep:hyper-rustls", "dep:tokio-rustls", "dep:rustls", ...]` + - The optional `hyper-tls` and `hyper-rustls` deps are gated under the + `__native-tls` and `__rustls` feature flags respectively. + +### 3. Resolved dependency tree on the current commit + +Run from a `linux/amd64` `rust:1.91-trixie` container against `b7a81df`: + +- `cargo tree --target x86_64-unknown-linux-gnu --edges normal -i reqwest` + shows `reqwest v0.13.3` activated by `typespec_client_core` only. +- `cargo tree -i rustls` → `error: package ID specification 'rustls' did + not match any packages`. +- `cargo tree -i ring` → `did not match any packages`. +- `cargo tree -i aws-lc-rs` → `did not match any packages`. +- `cargo tree -i hyper-tls` → not in the resolved graph (verified by + absence in the full `cargo tree` output). +- `cargo tree -i hyper-rustls` → not in the resolved graph. +- `native-tls v0.2.18` is present, pulled by sqlx-postgres via + `runtime-tokio-native-tls` (the sqlx path, independent of reqwest). +- `tokio-native-tls v0.3.1` and `hyper-tls v0.6.0` are absent. + +Net resolved reqwest build configuration: `default-features = false`, +features `["stream"]` (from `typespec_client_core`) plus `deflate` and +`gzip` (from `azure_core/reqwest_deflate` and `azure_core/reqwest_gzip` → +`reqwest/deflate` and `reqwest/gzip`). No TLS feature is activated on +reqwest. + +### 4. Entra credential chain construction and usage + +- `src/entra.rs:22` — `use azure_identity::{DeveloperToolsCredential, + ManagedIdentityCredential, WorkloadIdentityCredential};` +- `src/entra.rs:25-30` — the audience constant + `https://ossrdbms-aad.database.windows.net/.default` used for all + default-chain token requests. +- `src/entra.rs:273-301` — + `fn build_default_chained_credential() -> azure_core::Result>`. + - `src/entra.rs:295-297` — + `WorkloadIdentityCredential::new(None)` is pushed onto the chain when + construction succeeds (the constructor returns `Err` when the WI env + vars are absent, so on developer machines it falls through silently). + - `src/entra.rs:298` — + `ManagedIdentityCredential::new(None)?` is pushed unconditionally. + - `src/entra.rs:299` — + `DeveloperToolsCredential::new(None)?` is pushed unconditionally. +- `src/entra.rs:315-368` — `struct ChainedCredential` and its + `TokenCredential` impl: + - `src/entra.rs:343-356` — sequential `get_token` calls per source, + collecting `format!("{name}: {e}")` into `errors`. + - `src/entra.rs:358-366` — final + `azure_core::Error::with_message_fn(ErrorKind::Credential, ...)` + producing the message + `"All chained Entra credentials failed to acquire a token:\n - : + \n - ..."`. + +### 5. Public Entra constructors on `PostgresProvider` + +- `src/provider.rs:232-255` — + `pub(crate) async fn new_with_entra_with_token_source(host, port, + database, user, schema_name, options, token_source, ssl_mode)`: + - `src/provider.rs:243-246` — `token_source.fetch_token(&[audience]) + .await.context("Entra credential resolution failed: could not acquire + an initial access token")?`. The `.context(...)` call wraps the inner + azure_core::Error with an anyhow context layer; only the outer layer + is shown by `Display`, so the chain aggregation from step 4 above is + invisible without `{:#}` formatting. +- The public `connectWithEntra` / `connectWithSchemaAndEntra` constructors + (referenced from the duroxide-node bindings) flow through this internal + method with `EntraTokenSource::default_chain()` as the token source. + +### 6. Runtime behavior observed inside an AKS pod + +Direct execution of `WorkloadIdentityCredential::new(None)` followed by +`get_token(&["https://ossrdbms-aad.database.windows.net/.default"], None)` +inside a chkrawps7 AKS pod (Debian 13 / trixie-slim, glibc 2.41, +libssl3/libcrypto3 present) — using a standalone reproducer compiled +against the local `duroxide-pg` workspace — returns this error tree: + +``` +WorkloadIdentityCredential ERR (Display): + WorkloadIdentityCredential authentication failed. retry policy expired + and the request will no longer be retried + To troubleshoot, visit https://aka.ms/azsdk/rust/identity/troubleshoot#workload + +WorkloadIdentityCredential ERR (Debug): + Error { context: CustomMessage(Custom { kind: Credential, error: + Error { context: CustomMessage(Custom { kind: Connection, error: + Error { context: CustomMessage(Custom { kind: Connection, error: + reqwest::Error { kind: Request, + url: "https://login.microsoftonline.com//oauth2/v2.0/token", + source: hyper_util::client::legacy::Error(Connect, + ConnectError("invalid URL, scheme is not http")) + } + }, "failed to execute `reqwest` request") } + }, "retry policy expired and the request will no longer be retried") } + }, "WorkloadIdentityCredential authentication failed. ...") +``` + +`hyper_util::client::legacy::Error(Connect, ConnectError("invalid URL, +scheme is not http"))` is hyper's identifying error when its connector +has no TLS support compiled in. + +For the same pod and the same audience, `ManagedIdentityCredential` is +able to reach IMDS over plain HTTP (`http://169.254.169.254`) and receives +a structured HTTP 400 BadRequest body ("The requested identity has not +been assigned to this resource") — confirming the runtime can issue HTTP +requests; only HTTPS fails. `DeveloperToolsCredential` fails because the +container image has no `az` / `azd` on PATH (expected). + +The Node-side `@azure/identity` package, run in the same pod against the +same audience, returns a 1952-byte bearer token from `WorkloadIdentity +Credential`, `ManagedIdentityCredential`, and `DefaultAzureCredential` +without modification — establishing that pod-level Workload Identity +plumbing, network egress to `login.microsoftonline.com`, and the federated +token file are all functioning. + +### 7. Existing test coverage of the Entra path + +- `tests/entra_live_test.rs:1-110` — opt-in live test gated on + `DUROXIDE_PG_ENTRA_LIVE_TEST=1`. Exercises Entra against a real Azure + Postgres instance using the developer-tools branch of the chain (relies + on `az login` in the CI environment). The Rust reqwest client is only + reached on the path that succeeds; failure modes of HTTPS calls are not + asserted, and `WorkloadIdentityCredential` is not exercised by any + in-repo test. +- `.github/workflows/entra-live-test.yml:1-113` — the CI workflow for the + above test. Authenticates via `azure/login@v2` (which sets up the Azure + CLI on the runner), so the credential chain resolves through + `DeveloperToolsCredential` rather than `WorkloadIdentityCredential`. + +## Code References + +- `Cargo.toml:46-63` — Azure SDK feature configuration and FIPS-intent + comment block. +- `Cargo.toml:60` — `azure_core` features. +- `Cargo.toml:61` — `azure_identity` features. +- `src/entra.rs:22` — credential imports. +- `src/entra.rs:25-30` — audience constant for PG AAD scope. +- `src/entra.rs:273-301` — default chain construction. +- `src/entra.rs:315-368` — `ChainedCredential` impl and aggregate error + formatting. +- `src/provider.rs:232-255` — `new_with_entra_with_token_source`. +- `src/provider.rs:246` — anyhow `.context(...)` wrapper that hides the + inner chain failure from `Display`. +- `tests/entra_live_test.rs:1-110` — live test (developer-tools branch + only). +- `.github/workflows/entra-live-test.yml:1-113` — CI auth path. + +## Architecture Documentation + +- The crate's stated FIPS-compliance posture is "native-tls/openssl only, + no rustls / ring / aws-lc-rs in the resolved graph" + (`Cargo.toml:48-59`). +- The chained credential design (`src/entra.rs:273-368`) mirrors the + spirit of `DefaultAzureCredential` from azure_identity in higher-level + SDKs but is implemented locally because azure_identity 0.35 does not + ship a `DefaultAzureCredential` aggregator. Sources are stored alongside + a static class-name string, and the first successful credential is + logged once per `ChainedCredential` instance via a `OnceLock` gate + (`src/entra.rs:317`, `src/entra.rs:346-352`). +- The Entra refresh-task panic guard relies on `futures-util`'s + `FutureExt::catch_unwind` (`Cargo.toml:62-63`). +- Provider-side error surfacing flows everything through `anyhow::Error` + with `.context(...)` wrappers (`src/provider.rs:246`). This loses the + underlying source chain at `Display`; only `{:#}` formatting or + `Error::source()` traversal surfaces the inner error. + +## Open Questions + +None blocking. The dependency graph, runtime behavior, and code paths are +all observable from the local checkout and the registry sources resolved +by `Cargo.lock`. From 73f77fffba0a740bc5c28017cd4c75b0d99f056d Mon Sep 17 00:00:00 2001 From: "Christopher Krawczyk (SQL) (from Dev Box)" Date: Wed, 13 May 2026 09:13:58 -0700 Subject: [PATCH 3/7] paw: add ImplementationPlan.md for native-tls reqwest fix Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../ImplementationPlan.md | 263 ++++++++++++++++++ 1 file changed, 263 insertions(+) create mode 100644 .paw/work/fix-native-tls-reqwest/ImplementationPlan.md diff --git a/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md b/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md new file mode 100644 index 0000000..2aa3bc4 --- /dev/null +++ b/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md @@ -0,0 +1,263 @@ +# Fix Native TLS Reqwest Implementation Plan + +## Overview + +Restore HTTPS capability to the Entra credential chain in `duroxide-pg` by +activating a TLS backend on the resolved `reqwest 0.13` build. The current +dependency graph leaves reqwest with no TLS connector, which causes every +HTTPS request issued by `WorkloadIdentityCredential`, `ManagedIdentity +Credential` (when going through AAD over HTTPS), and the +`ClientAssertionCredential` token endpoint to fail with hyper's +`"invalid URL, scheme is not http"` error at the connector layer — before +any network I/O occurs (`CodeResearch.md` §1, §3, §6). + +The fix is a single-line `Cargo.toml` change that adds an explicit +`reqwest` dependency with `features = ["native-tls"]`. Cargo feature +unification activates the native-tls (openssl-backed) connector on the +resolved reqwest build without disturbing any source code. Verified via +`cargo tree` that no `rustls`, `ring`, or `aws-lc-rs` crates are introduced, +preserving the crate's stated FIPS-compliance posture (`CodeResearch.md` +§3). + +## Current State Analysis + +- `Cargo.toml:60-61` declares `azure_core` and `azure_identity` with + `default-features = false` and a hand-picked feature list intended to + preserve native-tls. The accompanying comment block at `Cargo.toml:46-59` + states that "this routes through reqwest's own `default-tls` feature". +- The intent is incorrect on two grounds (`CodeResearch.md` §2): + 1. `azure_core/reqwest = ["typespec_client_core/reqwest"]` and + `typespec_client_core/reqwest = ["dep:reqwest"]` activate only the + bare optional reqwest dep; neither flag enables a TLS feature. + `typespec_client_core` declares the reqwest dep with + `default-features = false`, so reqwest's own defaults are suppressed. + 2. In `reqwest 0.13.3`, the `default-tls` feature resolves to **rustls**, + not native-tls — the name no longer means what it did in earlier + reqwest releases. So even if defaults were active, they would + contradict the stated FIPS posture. +- The Entra credential chain (`src/entra.rs:288-301`) builds correctly, + but every HTTPS call inside the chain's `get_token` implementation + fails at the reqwest connector. The aggregate failure message is + obscured by the `.context(...)` wrapper at `src/provider.rs:246`, so + callers see only `"Entra credential resolution failed: could not + acquire an initial access token"` without the underlying cause. +- Existing test coverage (`tests/entra_live_test.rs`, `.github/workflows/ + entra-live-test.yml`) authenticates via the Azure CLI (`az login`) on + the runner, which exercises `DeveloperToolsCredential`. The reqwest + HTTP transport on the WI / MI branches is never traversed by CI, which + is why the missing TLS feature was not caught upstream. + +## Desired End State + +`Cargo.toml` declares an explicit reqwest dependency with +`default-features = false, features = ["native-tls"]`. The resolved +dependency graph activates `hyper-tls`, `tokio-native-tls`, and +`native-tls-crate` (openssl-backed) without introducing `rustls`, +`hyper-rustls`, `tokio-rustls`, `ring`, or `aws-lc-rs`. The accompanying +comment block in `Cargo.toml` is updated to describe the actual feature +graph rather than the previous mistaken claim. + +A regression test asserts that the reqwest-backed HTTP transport used by +`azure_core` accepts an `https://` URL at the connector layer (i.e. does +not return the `"invalid URL, scheme is not http"` connect error). The +test runs offline and does not depend on Azure credentials. + +The fix is verifiable by: +- `cargo build --release` succeeds. +- `cargo nextest run` (or `cargo test`) passes including the new + regression test. +- `cargo tree --target x86_64-unknown-linux-gnu --edges normal -i rustls` + / `-i ring` / `-i aws-lc-rs` each return "did not match any packages". +- `cargo tree -i hyper-tls` shows `hyper-tls v0.6.0` activated from + reqwest. +- Manual: a linux/amd64 binary built from `examples/entra_debug.rs` (the + reproducer used during diagnosis) executed inside the chkrawps7 AKS pod + shows `WorkloadIdentityCredential` returns a valid access token. + +The crate version is bumped to `0.1.33` and `CHANGELOG.md` documents the +bug fix. + +## What We're NOT Doing + +- Not modifying the Entra chain composition (`src/entra.rs:288-301`) — the + bug is purely a build-configuration issue, not a code defect. +- Not changing the `.context(...)` error-wrapping behavior at + `src/provider.rs:246`, even though it hides the underlying chain + failure. Improving error surfacing is a separate, larger concern; see + also the upstream comment hidden behind the wrapper. Tracked as a + candidate for follow-up. +- Not adding a Workload-Identity live CI test. The reproducer + (`examples/entra_debug.rs`) is preserved for local diagnosis but a full + AKS-in-CI WI runner is out of scope for a bug-fix release. +- Not changing the sqlx TLS path (`runtime-tokio-native-tls`) — that path + is unaffected; only the reqwest path used by `azure_core` is broken. +- Not bumping `azure_core` / `azure_identity` versions. The fix is + agnostic to those crate versions and preserves the pinned `0.35` + versions. + +## Phase Status +- [ ] **Phase 1: Apply Cargo.toml fix, regression test, version bump, CHANGELOG** +- [ ] **Phase 2: Documentation** — `Docs.md` operator-facing note + +## Phase Candidates +- [ ] Improve `provider.rs:246` error surfacing so the chained credential + aggregate (`src/entra.rs:358-366`) is visible at `Display`, not only at + `{:#}` or via `Error::source()` traversal. Today the actual chain + failure is invisible to callers. +- [ ] Add a Workload-Identity smoke test to CI (requires deployed AKS test + resource and a federated identity credential). Closes the test-coverage + gap that allowed this bug to ship. + +--- + +## Phase 1: Apply Cargo.toml fix, regression test, version bump, CHANGELOG + +### Objective + +Activate the native-tls connector on the resolved reqwest build through a +single `Cargo.toml` change. Add an offline regression test that exercises +the reqwest-backed HTTP transport's HTTPS connector. Bump the crate version +and add a `CHANGELOG.md` entry. + +### Changes Required + +- **`Cargo.toml`**: + - Add a new line to `[dependencies]` (adjacent to the `azure_core` / + `azure_identity` block) declaring + `reqwest = { version = "0.13", default-features = false, features = ["native-tls"] }`. + Cargo feature unification will activate the native-tls path on the + reqwest already pulled in by `typespec_client_core`. + - Update the comment block at `Cargo.toml:46-59` to: + - Drop the incorrect claim that "this routes through reqwest's own + `default-tls` feature". + - Explain that `typespec_client_core` declares reqwest with + `default-features = false`, so an explicit reqwest dep is required + to opt into a TLS backend. + - Note that `reqwest 0.13.3`'s `default-tls` resolves to rustls, so + the explicit `native-tls` feature is the correct way to preserve + the openssl posture. + - Reference the verification: `cargo tree -i rustls / ring / aws-lc-rs` + should all report "did not match any packages". + - Bump `package.version` from `0.1.32` to `0.1.33`. + +- **`tests/native_tls_regression.rs`** (new): a single + `#[tokio::test(flavor = "current_thread")]` that: + - Builds a `reqwest::Client` via `azure_core::http::new_http_client(None)` + (or, if that constructor signature is unstable across the 0.35 minor + range, builds via `reqwest::ClientBuilder::new().build()` and reaches + the same connector code path). + - Issues a single request to `https://127.0.0.1:1/` (a port chosen so the + request fails fast at TCP-connect time). + - Asserts the returned `reqwest::Error` is a *connection* failure (e.g. + `ConnectionRefused`, `Os { code: ECONNREFUSED, .. }`, or a generic + connect error whose `Display` does NOT contain the substring + `"scheme is not http"`. The negative assertion is the regression + contract: presence of that substring indicates the connector was + rejected before any network attempt, which is the symptom this fix + addresses. + - Includes a documentation comment explaining the regression scenario + and linking to `CHANGELOG.md` for the 0.1.33 entry. + + The test is offline, deterministic, fast (single connect attempt), + and self-contained — it does not require Azure, federated tokens, or + network egress to login.microsoftonline.com. + +- **`CHANGELOG.md`**: Add a new entry at the top, above the existing + `## 0.1.32 ...` section, following the existing convention. Suggested + content (one-line summary + a paragraph describing the symptom, root + cause, and fix): + - **Title**: `## 0.1.33 — (bug fix: native-tls reqwest feature)` + - **Section**: `### Fixed` with a bullet covering the broken-HTTPS + symptom, the missing `native-tls` feature on the resolved reqwest + build, and a one-line note that `WorkloadIdentityCredential` / + `ManagedIdentityCredential` (HTTPS-bound) chains were unreachable + before this fix. + +- **Tests**: `tests/native_tls_regression.rs` (above). No changes to + `tests/entra_live_test.rs` — that test exercises the developer-tools + branch and is unaffected by this fix. + +### Success Criteria + +#### Automated Verification + +- [ ] `cargo build --release` succeeds. +- [ ] `cargo nextest run` (or `cargo test`) passes the full suite, + including the new `tests/native_tls_regression.rs`. +- [ ] `cargo clippy --all-targets -- -D warnings` is clean. +- [ ] `cargo tree --target x86_64-unknown-linux-gnu --edges normal -i + rustls` returns "did not match any packages". +- [ ] `cargo tree --target x86_64-unknown-linux-gnu --edges normal -i + ring` returns "did not match any packages". +- [ ] `cargo tree --target x86_64-unknown-linux-gnu --edges normal -i + aws-lc-rs` returns "did not match any packages". +- [ ] `cargo tree --target x86_64-unknown-linux-gnu --edges normal -i + hyper-tls` shows `hyper-tls v0.6.0` activated by reqwest. + +#### Manual Verification + +- [ ] Build the existing reproducer + (`examples/entra_debug.rs`, preserved in the user's local checkout + outside `.paw/`) for `x86_64-unknown-linux-gnu` against the patched + crate. The binary, when executed inside the chkrawps7 pod, prints + `"WorkloadIdentityCredential OK: token_len= expires_on=..."` — i.e. + a valid bearer token is acquired. Before this fix the same binary + prints `"invalid URL, scheme is not http"` for WorkloadIdentity + Credential. +- [ ] `Cargo.lock` updates reflect the dep-graph additions + (`hyper-tls`, `tokio-native-tls`, `rustls-pki-types`) and the absence + of `rustls`, `hyper-rustls`, `tokio-rustls`, `ring`, `aws-lc-rs`. + +--- + +## Phase 2: Documentation + +### Objective + +Capture the as-built record for this fix in `Docs.md` for operator-facing +reference. No user-facing API changes — `README.md` is unaffected. + +### Changes Required + +- **`.paw/work/fix-native-tls-reqwest/Docs.md`**: Operator-facing note + capturing: + - One-paragraph symptom description (HTTPS calls from the Entra chain + failing with `"invalid URL, scheme is not http"`). + - The dependency-graph trace showing why reqwest was built without a TLS + backend (`azure_core/reqwest` → `typespec_client_core/reqwest` → + `dep:reqwest` with `default-features = false`). + - The fix (explicit `reqwest` dep with `features = ["native-tls"]`) and + the rationale (preserves the openssl posture; `reqwest 0.13.3`'s + `default-tls` is rustls, so it cannot be used as the workaround). + - Verification commands (the `cargo tree -i ...` invocations from the + success criteria). + - Cross-references to `CHANGELOG.md` 0.1.33 and the upstream PR. + - Implementer should load `paw-docs-guidance` at the start of this phase + for the Docs.md template and any cross-cutting conventions. + +- **`README.md`**: No changes. The bug fix is invisible at the public API + surface; the Entra section already documents the intended posture and + remains accurate post-fix. + +### Success Criteria + +#### Automated Verification + +- [ ] No new doc-build infrastructure exists; running `cargo doc --no-deps` + succeeds without warnings introduced by this work. + +#### Manual Verification + +- [ ] `Docs.md` is internally consistent with `CHANGELOG.md` 0.1.33 and + `Cargo.toml`'s updated comment block. +- [ ] All file:line references in `Docs.md` resolve to the corresponding + source locations on the head of `fix/native-tls-reqwest-feature`. + +--- + +## References + +- Issue: none (filed as bug observed during PilotSwarm AKS deploy; see + `Initial Prompt` in `.paw/work/fix-native-tls-reqwest/WorkflowContext.md`) +- Spec: not produced (minimal workflow) +- Research: `.paw/work/fix-native-tls-reqwest/CodeResearch.md` From 70bd0b907115919541de649f7071c9e3a78cb8b9 Mon Sep 17 00:00:00 2001 From: "Christopher Krawczyk (SQL) (from Dev Box)" Date: Wed, 13 May 2026 09:37:46 -0700 Subject: [PATCH 4/7] Fix: add native-tls feature to reqwest dep for Entra HTTPS The Entra credential chain failed with 'invalid URL, scheme is not http' at hyper's connector layer because azure_core/reqwest activated the optional reqwest dep via typespec_client_core with default-features = false, leaving the resolved reqwest with no TLS backend. Adds an explicit reqwest dep with features = ["native-tls"] so Cargo feature unification activates the openssl-backed connector. Verified that the resolved dep graph contains no rustls / ring / aws-lc-rs. Bumps version to 0.1.33 and adds a regression test in tests/native_tls_regression.rs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../ImplementationPlan.md | 2 +- CHANGELOG.md | 29 +++++++++++ Cargo.toml | 28 ++++++++--- tests/native_tls_regression.rs | 50 +++++++++++++++++++ 4 files changed, 102 insertions(+), 7 deletions(-) create mode 100644 tests/native_tls_regression.rs diff --git a/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md b/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md index 2aa3bc4..0c3d9ad 100644 --- a/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md +++ b/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md @@ -96,7 +96,7 @@ bug fix. versions. ## Phase Status -- [ ] **Phase 1: Apply Cargo.toml fix, regression test, version bump, CHANGELOG** +- [x] **Phase 1: Apply Cargo.toml fix, regression test, version bump, CHANGELOG** - [ ] **Phase 2: Documentation** — `Docs.md` operator-facing note ## Phase Candidates diff --git a/CHANGELOG.md b/CHANGELOG.md index 466311d..0bade2b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,35 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [0.1.33] - 2026-05-13 + +### Fixed + +- **HTTPS connector missing from Entra credential chain.** Every HTTPS + request issued by the Microsoft Entra credential chain — including + `WorkloadIdentityCredential` (AKS / federated identity), + `ManagedIdentityCredential` (when going through AAD over HTTPS), and + the `ClientAssertionCredential` token endpoint — failed at hyper's + connector layer with `"invalid URL, scheme is not http"` before any + network I/O occurred. As a result, `PostgresProvider::new_with_entra` + and `new_with_schema_and_entra` were unreachable in production + topologies that rely on Workload Identity (the developer-tools + branch via `az login` was unaffected, which is why CI did not catch + the regression). Root cause: `azure_core`'s `reqwest` feature flows + through `typespec_client_core/reqwest`, which declares the optional + reqwest dep with `default-features = false`. With no TLS feature + enabled on the resolved reqwest build, hyper rejected every HTTPS URL + at the connector layer. Fixed by declaring an explicit + `reqwest = { version = "0.13", default-features = false, features = ["native-tls"] }` + dep in `Cargo.toml`. Cargo feature unification activates the + native-tls (openssl-backed) connector on the resolved reqwest build + without modifying any source code, preserving the crate's FIPS- + compliance posture (no rustls / ring / aws-lc-rs in the resolved + graph). Added a regression test + (`tests/native_tls_regression.rs`) that constructs the + `azure_core::http` client used by `azure_identity` and asserts the + TLS connector accepts `https://` URLs. + ## [0.1.32] - 2026-05-08 **Release:** diff --git a/Cargo.toml b/Cargo.toml index 9eff4f2..f5f6d05 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -3,7 +3,7 @@ members = [".", "pg-stress"] [package] name = "duroxide-pg" -version = "0.1.32" +version = "0.1.33" edition = "2021" authors = ["Affan Dar "] description = "A PostgreSQL-based provider implementation for Duroxide, a durable task orchestration framework" @@ -51,16 +51,32 @@ include_dir = "0.7" # their `reqwest_rustls` default feature, which would otherwise pull a # rustls-based TLS stack into the dependency graph. We enable only the # minimum needed for HTTP transport (`reqwest` + gzip/deflate) and `tokio` -# for the async runtime; this routes through reqwest's own `default-tls` -# feature, keeping the build on native-tls across all platforms. +# for the async runtime. # -# Verified with `cargo tree --target x86_64-unknown-linux-gnu --all-features -# --all-targets` and a fresh Cargo.lock: no rustls-based TLS crates appear in -# the resolved dependency graph. +# The explicit `reqwest` dep below is required: `azure_core/reqwest` activates +# `typespec_client_core/reqwest`, which activates the optional reqwest dep +# with `default-features = false`. Without this line, the resolved reqwest +# build has NO TLS backend compiled in — every HTTPS request fails at the +# hyper connector with `"invalid URL, scheme is not http"`, which breaks +# `WorkloadIdentityCredential`, `ManagedIdentityCredential` (HTTPS-bound) +# and the `ClientAssertionCredential` token endpoint. Note that in +# reqwest 0.13.3 the `default-tls` feature resolves to rustls, NOT +# native-tls, so we cannot rely on enabling reqwest defaults to preserve +# the openssl posture; we must select `native-tls` explicitly. Cargo +# feature unification merges this dep with `typespec_client_core`'s reqwest +# dep at the same resolved version, activating the openssl-backed +# connector throughout the graph. +# +# Verified with `cargo tree --target x86_64-unknown-linux-gnu --edges normal` +# and a fresh Cargo.lock: no rustls / ring / aws-lc-rs / hyper-rustls +# crates appear in the resolved dependency graph; `hyper-tls` + +# `tokio-native-tls` + `native-tls` are activated. azure_core = { version = "0.35", default-features = false, features = ["reqwest", "reqwest_deflate", "reqwest_gzip", "tokio"] } azure_identity = { version = "0.35", default-features = false, features = ["tokio"] } +reqwest = { version = "0.13", default-features = false, features = ["native-tls"] } # Used for FutureExt::catch_unwind in the Entra refresh task panic guard. futures-util = { version = "0.3", default-features = false, features = ["std"] } [dev-dependencies] duroxide-pg-stress = { path = "pg-stress" } +url = "2" diff --git a/tests/native_tls_regression.rs b/tests/native_tls_regression.rs new file mode 100644 index 0000000..6f23e39 --- /dev/null +++ b/tests/native_tls_regression.rs @@ -0,0 +1,50 @@ +// Regression test for the missing-TLS-backend bug in the resolved reqwest build. +// +// Background: `azure_core 0.35`'s `reqwest` feature activates +// `typespec_client_core/reqwest`, which activates the optional reqwest dep +// with `default-features = false`. Without an explicit reqwest feature +// declaration in this crate's Cargo.toml, the resolved reqwest is compiled +// with NO TLS backend, and every HTTPS request fails at hyper's connector +// with `"invalid URL, scheme is not http"` — before any network I/O. +// +// This breaks `WorkloadIdentityCredential`, `ManagedIdentityCredential` +// (HTTPS branch), and `ClientAssertionCredential` token endpoints — the +// entire Entra credential chain assembled in `src/entra.rs`. +// +// This test exercises the exact `azure_core::http::new_http_client()` +// constructor used by `azure_identity` against an `https://` URL. The URL +// targets a closed local port so the request must fail at TCP-connect, +// NOT at connector initialization. The regression contract: the error +// `Display` must NOT contain `"scheme is not http"` — that substring is +// hyper's signature for "no TLS connector compiled in". +// +// See `CHANGELOG.md` (0.1.33) and `Cargo.toml`'s comment block for the +// dep-graph trace and rationale. + +use azure_core::http::{new_http_client, Method, Request}; +use url::Url; + +#[tokio::test(flavor = "current_thread")] +async fn entra_https_connector_is_compiled_in() { + let client = new_http_client(None); + + // Closed local port — fail-fast at TCP connect, no DNS, no network egress. + let url = Url::parse("https://127.0.0.1:1/").expect("valid url"); + let req = Request::new(url, Method::Get); + + let err = client + .execute_request(&req) + .await + .expect_err("https://127.0.0.1:1/ must fail to connect"); + + // Format the full source chain — the bug's signature is buried under + // anyhow-style wrapping otherwise. + let chain = format!("{err:#}"); + assert!( + !chain.contains("scheme is not http"), + "Regression: reqwest has no TLS backend compiled in. \ + The dependency graph dropped its TLS connector — likely the explicit \ + `reqwest = {{ ..., features = [\"native-tls\"] }}` line was removed from \ + Cargo.toml. Full error chain:\n{chain}" + ); +} From 900a10f0603cb28656e0b3bb3521369fe05c8ad8 Mon Sep 17 00:00:00 2001 From: "Christopher Krawczyk (SQL) (from Dev Box)" Date: Wed, 13 May 2026 09:38:30 -0700 Subject: [PATCH 5/7] paw: docs phase - add Docs.md for native-tls fix Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .paw/work/fix-native-tls-reqwest/Docs.md | 187 ++++++++++++++++++ .../ImplementationPlan.md | 2 +- 2 files changed, 188 insertions(+), 1 deletion(-) create mode 100644 .paw/work/fix-native-tls-reqwest/Docs.md diff --git a/.paw/work/fix-native-tls-reqwest/Docs.md b/.paw/work/fix-native-tls-reqwest/Docs.md new file mode 100644 index 0000000..279bdc1 --- /dev/null +++ b/.paw/work/fix-native-tls-reqwest/Docs.md @@ -0,0 +1,187 @@ +# Docs: native-tls reqwest feature fix (duroxide-pg 0.1.33) + +## Symptom + +Every HTTPS request issued by the Microsoft Entra credential chain in +`duroxide-pg 0.1.31`–`0.1.32` failed at hyper's connector layer with: + +``` +reqwest::Error { + url: "https://login.microsoftonline.com/.../oauth2/v2.0/token", + source: hyper_util::client::legacy::Error( + Connect, + ConnectError("invalid URL, scheme is not http") + ) +} +``` + +The error originates at the connector layer — **before any DNS lookup or +TCP connect attempt**. The substring `"invalid URL, scheme is not http"` +is hyper's signature for "no TLS connector compiled into the HTTP client". + +Affected callers: `WorkloadIdentityCredential` (AKS Workload Identity), +`ManagedIdentityCredential` when the IMDS endpoint required HTTPS, and +the `ClientAssertionCredential` token endpoint used by federated +identity flows. `DeveloperToolsCredential` (the `az login` branch) was +unaffected because it shells out to the Azure CLI binary rather than +calling AAD over reqwest. + +Operational impact: `PostgresProvider::new_with_entra` and +`new_with_schema_and_entra` were unreachable in any production topology +that relied on Workload Identity — passwordless Entra was effectively +broken on AKS, Container Apps, and any other host that does not have the +Azure CLI installed locally. + +## Root cause + +The crate's `Cargo.toml` declared the Azure SDK dependencies as: + +```toml +azure_core = { version = "0.35", default-features = false, + features = ["reqwest", "reqwest_deflate", "reqwest_gzip", "tokio"] } +azure_identity = { version = "0.35", default-features = false, + features = ["tokio"] } +``` + +The intent — captured in the original comment block above those lines — +was to preserve a native-tls / openssl-only TLS posture for FIPS +compliance while still enabling reqwest as the HTTP transport. The +implementation was incorrect on two grounds: + +**1. `azure_core/reqwest` does not enable any reqwest defaults.** + +Tracing the Cargo feature graph for `azure_core 0.35`: + +- `azure_core/reqwest = ["typespec_client_core/reqwest"]` +- `typespec_client_core/reqwest = ["dep:reqwest"]` (activates the + optional dep only — no features) +- `typespec_client_core 0.14.0` declares + `reqwest = { version = "0.13", default-features = false, features = ["stream"] }` + +The resolved reqwest crate is built with `default-features = false` +plus `stream` only. No TLS backend is compiled in. The constructor at +`typespec_client_core::http::clients::reqwest::new_reqwest_client` +(used by `azure_core::http::new_http_client`) calls +`reqwest::ClientBuilder::new().build()` without configuring TLS +explicitly, so it inherits the crate's compile-time feature flags. With +no TLS feature active, the resulting hyper-util connector rejects every +HTTPS URL at request time. + +**2. `reqwest 0.13`'s `default-tls` is rustls, not native-tls.** + +The previous Cargo.toml comment claimed the configuration "routes through +reqwest's own `default-tls` feature, keeping the build on native-tls". +Even if `default-tls` had been active, it would NOT have routed through +native-tls — `default-tls` in reqwest 0.13.3 resolves to `rustls`. The +feature name was kept for compatibility but its meaning changed in the +reqwest 0.12 → 0.13 transition. + +## Why CI didn't catch it + +The only existing live test for the Entra path, +`tests/entra_live_test.rs`, authenticates via +`DeveloperToolsCredential` (`az login` on the CI runner). That +credential variant invokes the Azure CLI binary as a child process and +never traverses the reqwest-backed HTTP transport. The Workload Identity +branch of the credential chain has no live test in this repo, so the +missing TLS backend was invisible to CI. + +## Fix + +A single line added to `[dependencies]` in `Cargo.toml`: + +```toml +reqwest = { version = "0.13", default-features = false, features = ["native-tls"] } +``` + +Cargo's feature unification merges this declaration with +`typespec_client_core`'s reqwest dep (same major.minor version range, +same resolved 0.13.3) and the union of features — +`["stream", "native-tls"]` — is applied to the single resolved reqwest +crate. The native-tls feature activates `hyper-tls` and the +`native-tls-crate` (openssl-backed on Linux), giving hyper a working +TLS connector for HTTPS URLs. + +The accompanying comment block in `Cargo.toml:46-72` is updated to +describe the actual feature graph and the rationale for the explicit +dep. + +No source code changes are required. The fix is purely a build- +configuration adjustment. + +## Verification + +### Automated + +``` +cargo build --release +cargo test --release --test native_tls_regression +cargo clippy --all-targets +``` + +The new test at `tests/native_tls_regression.rs` constructs the +`azure_core::http` client via the same `new_http_client(None)` +constructor that `azure_identity` uses internally, then issues a GET +to `https://127.0.0.1:1/`. The request must fail (port 1 is closed) but +the failure must be a *network* error, not a *connector* error. The +regression contract asserts the error chain does NOT contain the +substring `"scheme is not http"` — that substring is the signature of +the bug being fixed. + +### Dependency graph + +``` +cargo tree --target x86_64-unknown-linux-gnu --edges normal -i rustls +cargo tree --target x86_64-unknown-linux-gnu --edges normal -i ring +cargo tree --target x86_64-unknown-linux-gnu --edges normal -i aws-lc-rs +cargo tree --target x86_64-unknown-linux-gnu --edges normal -i hyper-rustls +cargo tree --target x86_64-unknown-linux-gnu --edges normal -i tokio-rustls +``` + +Each command must return `"package ID specification ... did not match +any packages"` — the FIPS-compliance posture (no rustls-based TLS) is +preserved. + +``` +cargo tree --target x86_64-unknown-linux-gnu --edges normal -i hyper-tls +``` + +Must show `hyper-tls v0.6.0` activated by `reqwest`. + +### Live verification (manual) + +Build a Linux x86_64 binary of the diagnostic reproducer (kept locally +outside the repo at `examples/entra_debug.rs` during diagnosis) against +the patched crate, then execute it inside an AKS pod with Workload +Identity configured. The output must include +`"WorkloadIdentityCredential OK: token_len= expires_on="`. +Before the fix, the same binary printed +`"WorkloadIdentityCredential failed: ... invalid URL, scheme is not http"`. + +## Forward-compatibility note + +This fix depends on Cargo's per-crate-version feature unification +correctly merging the explicit `reqwest 0.13` declaration with +`typespec_client_core`'s declaration. If `typespec_client_core` is ever +bumped to declare reqwest at a different major (e.g. `0.14`), the two +declarations will resolve to *different* reqwest crates and the fix +will silently regress — the test in `tests/native_tls_regression.rs` is +the canary that catches that scenario. + +If a future maintainer upgrades `azure_core` past `0.35`, re-verify the +feature graph by running the `cargo tree -i` commands above and +re-running the regression test before publishing. + +## References + +- `CHANGELOG.md` 0.1.33 entry +- `Cargo.toml:46-72` updated comment block +- `tests/native_tls_regression.rs` regression test +- `.paw/work/fix-native-tls-reqwest/CodeResearch.md` §1–§3 for the + feature-graph trace +- `.paw/work/fix-native-tls-reqwest/ImplementationPlan.md` for the + phased work record +- reqwest 0.13 feature definitions: + https://docs.rs/crate/reqwest/0.13.3/source/Cargo.toml.orig +- typespec_client_core 0.14 reqwest dep: + https://docs.rs/crate/typespec_client_core/0.14.0/source/Cargo.toml.orig diff --git a/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md b/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md index 0c3d9ad..b39163f 100644 --- a/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md +++ b/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md @@ -97,7 +97,7 @@ bug fix. ## Phase Status - [x] **Phase 1: Apply Cargo.toml fix, regression test, version bump, CHANGELOG** -- [ ] **Phase 2: Documentation** — `Docs.md` operator-facing note +- [x] **Phase 2: Documentation** — `Docs.md` operator-facing note ## Phase Candidates - [ ] Improve `provider.rs:246` error surfacing so the chained credential From 73d369dcda714d4636444080eaca9aed913a2806 Mon Sep 17 00:00:00 2001 From: "Christopher Krawczyk (SQL) (from Dev Box)" Date: Wed, 13 May 2026 09:45:56 -0700 Subject: [PATCH 6/7] paw: resolve phase candidates - deferred as out of scope Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../ImplementationPlan.md | 26 ++++++++++++++----- 1 file changed, 19 insertions(+), 7 deletions(-) diff --git a/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md b/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md index b39163f..426b246 100644 --- a/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md +++ b/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md @@ -100,13 +100,25 @@ bug fix. - [x] **Phase 2: Documentation** — `Docs.md` operator-facing note ## Phase Candidates -- [ ] Improve `provider.rs:246` error surfacing so the chained credential - aggregate (`src/entra.rs:358-366`) is visible at `Display`, not only at - `{:#}` or via `Error::source()` traversal. Today the actual chain - failure is invisible to callers. -- [ ] Add a Workload-Identity smoke test to CI (requires deployed AKS test - resource and a federated identity credential). Closes the test-coverage - gap that allowed this bug to ship. + +Both items below were captured during planning as adjacent improvements +but are explicitly **deferred** as out of scope for this bug-fix release +(0.1.33). They are also called out in `What We're NOT Doing`. Resolution +recorded here; no `- [ ]` items remain. + +- [x] **Deferred:** Improve `provider.rs:246` error surfacing so the + chained credential aggregate (`src/entra.rs:358-366`) is visible at + `Display`, not only at `{:#}` or via `Error::source()` traversal. The + bug being fixed in 0.1.33 made this error chain matter operationally, + but improving error formatting is a separate (and breaking-shape) + change. Recommend filing as a follow-up issue against + microsoft/duroxide-pg. +- [x] **Deferred:** Add a Workload-Identity smoke test to CI. Requires + an AKS test resource and a federated identity credential — too much + scope for a bug-fix patch release. Recommend filing as a follow-up + issue and reusing the + `.paw/work/fix-native-tls-reqwest/Docs.md` § "Live verification" + procedure when that work is taken up. --- From e49ffdd213a0ff235b9e6be95563ef5815146efc Mon Sep 17 00:00:00 2001 From: "Christopher Krawczyk (SQL) (from Dev Box)" Date: Wed, 13 May 2026 09:46:41 -0700 Subject: [PATCH 7/7] Stop tracking PAW artifacts for fix-native-tls-reqwest Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../fix-native-tls-reqwest/CodeResearch.md | 291 ------------------ .paw/work/fix-native-tls-reqwest/Docs.md | 187 ----------- .../ImplementationPlan.md | 275 ----------------- .../fix-native-tls-reqwest/WorkflowContext.md | 41 --- 4 files changed, 794 deletions(-) delete mode 100644 .paw/work/fix-native-tls-reqwest/CodeResearch.md delete mode 100644 .paw/work/fix-native-tls-reqwest/Docs.md delete mode 100644 .paw/work/fix-native-tls-reqwest/ImplementationPlan.md delete mode 100644 .paw/work/fix-native-tls-reqwest/WorkflowContext.md diff --git a/.paw/work/fix-native-tls-reqwest/CodeResearch.md b/.paw/work/fix-native-tls-reqwest/CodeResearch.md deleted file mode 100644 index a8d0f87..0000000 --- a/.paw/work/fix-native-tls-reqwest/CodeResearch.md +++ /dev/null @@ -1,291 +0,0 @@ ---- -date: 2026-05-13T07:50:00-07:00 -git_commit: b7a81df -branch: fix/native-tls-reqwest-feature -repository: microsoft/duroxide-pg -topic: "TLS connector missing from reqwest in the Entra credential chain" -tags: [research, codebase, entra, tls, cargo-features, reqwest, azure-identity] -status: complete -last_updated: 2026-05-13 ---- - -# Research: TLS connector missing from reqwest in the Entra credential chain - -## Research Question - -Document why HTTPS requests issued by the Entra credential chain in -`duroxide-pg` fail at the connector layer with hyper's -`"invalid URL, scheme is not http"` error inside an AKS pod, and capture the -exact dependency-graph state that produces that runtime behavior. - -## Summary - -`duroxide-pg`'s Entra credential chain -(`build_default_chained_credential` in `src/entra.rs:288-301`) delegates to -`azure_identity = 0.35`, whose `WorkloadIdentityCredential` and -`ManagedIdentityCredential` POST to AAD over HTTPS using a reqwest-backed -HTTP transport that `azure_core` constructs internally. In the crate's -current dependency configuration, the resolved `reqwest 0.13.3` build has -**no TLS backend feature enabled** — `default-features = false` flows in -from `typespec_client_core` and no other crate in the graph activates a TLS -feature on `reqwest`. As a result, the compiled reqwest client supports only -plain HTTP. Every HTTPS request the chain makes is rejected by the connector -layer before any network I/O occurs. - -The error surface that callers see is the `.context(...)` wrapper at -`src/provider.rs:246`: `"Entra credential resolution failed: could not -acquire an initial access token"`. The underlying chained-credential -aggregate message (assembled in `src/entra.rs:336-368`) and the -hyper/reqwest connector error are not exposed through `Display`, so the -true cause is invisible without `{:#}` formatting or a tracing subscriber. - -## Documentation System - -- **Framework**: Plain markdown -- **Docs Directory**: `docs/` (deep-dive notes), root for user-facing docs -- **Navigation Config**: N/A (no static-site generator) -- **Style Conventions**: README is the primary user surface; `docs/` holds - operational deep-dives (`docs/entra-ci.md`, `docs/duroxide-blockers.md`, - `docs/concurrent-instance-fetch-race.md`). Rustdoc on the public API - surface uses runnable `rust,no_run` examples under - `# async fn example()` to keep doctests typechecking. -- **Build Command**: `cargo doc --no-deps` for rustdoc -- **Standard Files**: `README.md` (root), `CHANGELOG.md` (root, semver-style - per-version sections, see existing entries through `0.1.32`) - -## Verification Commands - -- **Test Command**: `cargo nextest run` (preferred if installed) or - `cargo test` (fallback). Provider-validation suite: - `cargo nextest run --test postgres_provider_test`. Live-Entra suite is - opt-in via env var (see `tests/entra_live_test.rs:1-30`). -- **Lint Command**: `cargo clippy --all-targets` (per `.github/copilot- - instructions.md` style guidance; no explicit lint script is checked in) -- **Build Command**: `cargo build --release` -- **Type Check**: `cargo check --all-targets` (Rust does typecheck as part - of `cargo build`) -- **Dep-graph verification**: `cargo tree --target x86_64-unknown-linux-gnu - --edges normal -i ` (used below to verify TLS backend presence) - -## Detailed Findings - -### 1. Dependency declarations driving the resolved reqwest build - -`Cargo.toml:46-63` declares the Azure SDK and accompanying dependencies: - -- `Cargo.toml:46-59` — a multi-line comment block describing the intended - FIPS posture. The comment claims: - - That `default-features = false` on both Azure crates suppresses - `reqwest_rustls`. - - That the remaining features route HTTP transport through reqwest's own - `default-tls` feature, "keeping the build on native-tls across all - platforms." - - That `cargo tree --target x86_64-unknown-linux-gnu --all-features - --all-targets` was used to verify no rustls-based TLS crates appear. -- `Cargo.toml:60` — `azure_core = { version = "0.35", default-features = - false, features = ["reqwest", "reqwest_deflate", "reqwest_gzip", - "tokio"] }`. -- `Cargo.toml:61` — `azure_identity = { version = "0.35", default-features - = false, features = ["tokio"] }`. -- `Cargo.toml:62-63` — `futures-util` (unrelated to TLS). - -There is no explicit `reqwest` dependency in `[dependencies]`. - -### 2. Feature-graph trace from `azure_core/reqwest` to the resolved reqwest features - -Read from the registry sources resolved by `Cargo.lock`: - -- `~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/azure_core-0.35.0/Cargo.toml` - - `reqwest = ["typespec_client_core/reqwest"]` - - `reqwest_deflate = ["reqwest", "typespec_client_core/reqwest_deflate"]` - - `reqwest_gzip = ["reqwest", "typespec_client_core/reqwest_gzip"]` - - `reqwest_rustls = ["reqwest", "typespec_client_core/reqwest_rustls"]` - - `azure_core/reqwest` activates only the bare `dep:reqwest` path; no - TLS feature. -- `~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/typespec_client_core-0.14.0/Cargo.toml` - - `[features] reqwest = ["dep:reqwest"]` - - `[features] reqwest_rustls = ["reqwest", "reqwest/rustls"]` - - `[dependencies.reqwest] version = "0.13.2"`, `features = ["stream"]`, - `default-features = false`. - - The `typespec_client_core` crate is the only declared parent of the - optional `reqwest` dep; it pins `default-features = false`. -- `~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/reqwest-0.13.3/Cargo.toml` - - `default = ["default-tls"]` - - `default-tls = ["rustls"]` (in reqwest 0.13.3 the named "default-tls" - feature now resolves to rustls — it does **not** select native-tls). - - `native-tls = ["__native-tls", "__native-tls-alpn"]` - - `__native-tls = ["dep:hyper-tls", "dep:native-tls-crate", "__tls"]` - - `rustls = ["__rustls-aws-lc-rs", "dep:rustls-platform-verifier", - "__rustls"]` - - `__rustls = ["dep:hyper-rustls", "dep:tokio-rustls", "dep:rustls", ...]` - - The optional `hyper-tls` and `hyper-rustls` deps are gated under the - `__native-tls` and `__rustls` feature flags respectively. - -### 3. Resolved dependency tree on the current commit - -Run from a `linux/amd64` `rust:1.91-trixie` container against `b7a81df`: - -- `cargo tree --target x86_64-unknown-linux-gnu --edges normal -i reqwest` - shows `reqwest v0.13.3` activated by `typespec_client_core` only. -- `cargo tree -i rustls` → `error: package ID specification 'rustls' did - not match any packages`. -- `cargo tree -i ring` → `did not match any packages`. -- `cargo tree -i aws-lc-rs` → `did not match any packages`. -- `cargo tree -i hyper-tls` → not in the resolved graph (verified by - absence in the full `cargo tree` output). -- `cargo tree -i hyper-rustls` → not in the resolved graph. -- `native-tls v0.2.18` is present, pulled by sqlx-postgres via - `runtime-tokio-native-tls` (the sqlx path, independent of reqwest). -- `tokio-native-tls v0.3.1` and `hyper-tls v0.6.0` are absent. - -Net resolved reqwest build configuration: `default-features = false`, -features `["stream"]` (from `typespec_client_core`) plus `deflate` and -`gzip` (from `azure_core/reqwest_deflate` and `azure_core/reqwest_gzip` → -`reqwest/deflate` and `reqwest/gzip`). No TLS feature is activated on -reqwest. - -### 4. Entra credential chain construction and usage - -- `src/entra.rs:22` — `use azure_identity::{DeveloperToolsCredential, - ManagedIdentityCredential, WorkloadIdentityCredential};` -- `src/entra.rs:25-30` — the audience constant - `https://ossrdbms-aad.database.windows.net/.default` used for all - default-chain token requests. -- `src/entra.rs:273-301` — - `fn build_default_chained_credential() -> azure_core::Result>`. - - `src/entra.rs:295-297` — - `WorkloadIdentityCredential::new(None)` is pushed onto the chain when - construction succeeds (the constructor returns `Err` when the WI env - vars are absent, so on developer machines it falls through silently). - - `src/entra.rs:298` — - `ManagedIdentityCredential::new(None)?` is pushed unconditionally. - - `src/entra.rs:299` — - `DeveloperToolsCredential::new(None)?` is pushed unconditionally. -- `src/entra.rs:315-368` — `struct ChainedCredential` and its - `TokenCredential` impl: - - `src/entra.rs:343-356` — sequential `get_token` calls per source, - collecting `format!("{name}: {e}")` into `errors`. - - `src/entra.rs:358-366` — final - `azure_core::Error::with_message_fn(ErrorKind::Credential, ...)` - producing the message - `"All chained Entra credentials failed to acquire a token:\n - : - \n - ..."`. - -### 5. Public Entra constructors on `PostgresProvider` - -- `src/provider.rs:232-255` — - `pub(crate) async fn new_with_entra_with_token_source(host, port, - database, user, schema_name, options, token_source, ssl_mode)`: - - `src/provider.rs:243-246` — `token_source.fetch_token(&[audience]) - .await.context("Entra credential resolution failed: could not acquire - an initial access token")?`. The `.context(...)` call wraps the inner - azure_core::Error with an anyhow context layer; only the outer layer - is shown by `Display`, so the chain aggregation from step 4 above is - invisible without `{:#}` formatting. -- The public `connectWithEntra` / `connectWithSchemaAndEntra` constructors - (referenced from the duroxide-node bindings) flow through this internal - method with `EntraTokenSource::default_chain()` as the token source. - -### 6. Runtime behavior observed inside an AKS pod - -Direct execution of `WorkloadIdentityCredential::new(None)` followed by -`get_token(&["https://ossrdbms-aad.database.windows.net/.default"], None)` -inside a chkrawps7 AKS pod (Debian 13 / trixie-slim, glibc 2.41, -libssl3/libcrypto3 present) — using a standalone reproducer compiled -against the local `duroxide-pg` workspace — returns this error tree: - -``` -WorkloadIdentityCredential ERR (Display): - WorkloadIdentityCredential authentication failed. retry policy expired - and the request will no longer be retried - To troubleshoot, visit https://aka.ms/azsdk/rust/identity/troubleshoot#workload - -WorkloadIdentityCredential ERR (Debug): - Error { context: CustomMessage(Custom { kind: Credential, error: - Error { context: CustomMessage(Custom { kind: Connection, error: - Error { context: CustomMessage(Custom { kind: Connection, error: - reqwest::Error { kind: Request, - url: "https://login.microsoftonline.com//oauth2/v2.0/token", - source: hyper_util::client::legacy::Error(Connect, - ConnectError("invalid URL, scheme is not http")) - } - }, "failed to execute `reqwest` request") } - }, "retry policy expired and the request will no longer be retried") } - }, "WorkloadIdentityCredential authentication failed. ...") -``` - -`hyper_util::client::legacy::Error(Connect, ConnectError("invalid URL, -scheme is not http"))` is hyper's identifying error when its connector -has no TLS support compiled in. - -For the same pod and the same audience, `ManagedIdentityCredential` is -able to reach IMDS over plain HTTP (`http://169.254.169.254`) and receives -a structured HTTP 400 BadRequest body ("The requested identity has not -been assigned to this resource") — confirming the runtime can issue HTTP -requests; only HTTPS fails. `DeveloperToolsCredential` fails because the -container image has no `az` / `azd` on PATH (expected). - -The Node-side `@azure/identity` package, run in the same pod against the -same audience, returns a 1952-byte bearer token from `WorkloadIdentity -Credential`, `ManagedIdentityCredential`, and `DefaultAzureCredential` -without modification — establishing that pod-level Workload Identity -plumbing, network egress to `login.microsoftonline.com`, and the federated -token file are all functioning. - -### 7. Existing test coverage of the Entra path - -- `tests/entra_live_test.rs:1-110` — opt-in live test gated on - `DUROXIDE_PG_ENTRA_LIVE_TEST=1`. Exercises Entra against a real Azure - Postgres instance using the developer-tools branch of the chain (relies - on `az login` in the CI environment). The Rust reqwest client is only - reached on the path that succeeds; failure modes of HTTPS calls are not - asserted, and `WorkloadIdentityCredential` is not exercised by any - in-repo test. -- `.github/workflows/entra-live-test.yml:1-113` — the CI workflow for the - above test. Authenticates via `azure/login@v2` (which sets up the Azure - CLI on the runner), so the credential chain resolves through - `DeveloperToolsCredential` rather than `WorkloadIdentityCredential`. - -## Code References - -- `Cargo.toml:46-63` — Azure SDK feature configuration and FIPS-intent - comment block. -- `Cargo.toml:60` — `azure_core` features. -- `Cargo.toml:61` — `azure_identity` features. -- `src/entra.rs:22` — credential imports. -- `src/entra.rs:25-30` — audience constant for PG AAD scope. -- `src/entra.rs:273-301` — default chain construction. -- `src/entra.rs:315-368` — `ChainedCredential` impl and aggregate error - formatting. -- `src/provider.rs:232-255` — `new_with_entra_with_token_source`. -- `src/provider.rs:246` — anyhow `.context(...)` wrapper that hides the - inner chain failure from `Display`. -- `tests/entra_live_test.rs:1-110` — live test (developer-tools branch - only). -- `.github/workflows/entra-live-test.yml:1-113` — CI auth path. - -## Architecture Documentation - -- The crate's stated FIPS-compliance posture is "native-tls/openssl only, - no rustls / ring / aws-lc-rs in the resolved graph" - (`Cargo.toml:48-59`). -- The chained credential design (`src/entra.rs:273-368`) mirrors the - spirit of `DefaultAzureCredential` from azure_identity in higher-level - SDKs but is implemented locally because azure_identity 0.35 does not - ship a `DefaultAzureCredential` aggregator. Sources are stored alongside - a static class-name string, and the first successful credential is - logged once per `ChainedCredential` instance via a `OnceLock` gate - (`src/entra.rs:317`, `src/entra.rs:346-352`). -- The Entra refresh-task panic guard relies on `futures-util`'s - `FutureExt::catch_unwind` (`Cargo.toml:62-63`). -- Provider-side error surfacing flows everything through `anyhow::Error` - with `.context(...)` wrappers (`src/provider.rs:246`). This loses the - underlying source chain at `Display`; only `{:#}` formatting or - `Error::source()` traversal surfaces the inner error. - -## Open Questions - -None blocking. The dependency graph, runtime behavior, and code paths are -all observable from the local checkout and the registry sources resolved -by `Cargo.lock`. diff --git a/.paw/work/fix-native-tls-reqwest/Docs.md b/.paw/work/fix-native-tls-reqwest/Docs.md deleted file mode 100644 index 279bdc1..0000000 --- a/.paw/work/fix-native-tls-reqwest/Docs.md +++ /dev/null @@ -1,187 +0,0 @@ -# Docs: native-tls reqwest feature fix (duroxide-pg 0.1.33) - -## Symptom - -Every HTTPS request issued by the Microsoft Entra credential chain in -`duroxide-pg 0.1.31`–`0.1.32` failed at hyper's connector layer with: - -``` -reqwest::Error { - url: "https://login.microsoftonline.com/.../oauth2/v2.0/token", - source: hyper_util::client::legacy::Error( - Connect, - ConnectError("invalid URL, scheme is not http") - ) -} -``` - -The error originates at the connector layer — **before any DNS lookup or -TCP connect attempt**. The substring `"invalid URL, scheme is not http"` -is hyper's signature for "no TLS connector compiled into the HTTP client". - -Affected callers: `WorkloadIdentityCredential` (AKS Workload Identity), -`ManagedIdentityCredential` when the IMDS endpoint required HTTPS, and -the `ClientAssertionCredential` token endpoint used by federated -identity flows. `DeveloperToolsCredential` (the `az login` branch) was -unaffected because it shells out to the Azure CLI binary rather than -calling AAD over reqwest. - -Operational impact: `PostgresProvider::new_with_entra` and -`new_with_schema_and_entra` were unreachable in any production topology -that relied on Workload Identity — passwordless Entra was effectively -broken on AKS, Container Apps, and any other host that does not have the -Azure CLI installed locally. - -## Root cause - -The crate's `Cargo.toml` declared the Azure SDK dependencies as: - -```toml -azure_core = { version = "0.35", default-features = false, - features = ["reqwest", "reqwest_deflate", "reqwest_gzip", "tokio"] } -azure_identity = { version = "0.35", default-features = false, - features = ["tokio"] } -``` - -The intent — captured in the original comment block above those lines — -was to preserve a native-tls / openssl-only TLS posture for FIPS -compliance while still enabling reqwest as the HTTP transport. The -implementation was incorrect on two grounds: - -**1. `azure_core/reqwest` does not enable any reqwest defaults.** - -Tracing the Cargo feature graph for `azure_core 0.35`: - -- `azure_core/reqwest = ["typespec_client_core/reqwest"]` -- `typespec_client_core/reqwest = ["dep:reqwest"]` (activates the - optional dep only — no features) -- `typespec_client_core 0.14.0` declares - `reqwest = { version = "0.13", default-features = false, features = ["stream"] }` - -The resolved reqwest crate is built with `default-features = false` -plus `stream` only. No TLS backend is compiled in. The constructor at -`typespec_client_core::http::clients::reqwest::new_reqwest_client` -(used by `azure_core::http::new_http_client`) calls -`reqwest::ClientBuilder::new().build()` without configuring TLS -explicitly, so it inherits the crate's compile-time feature flags. With -no TLS feature active, the resulting hyper-util connector rejects every -HTTPS URL at request time. - -**2. `reqwest 0.13`'s `default-tls` is rustls, not native-tls.** - -The previous Cargo.toml comment claimed the configuration "routes through -reqwest's own `default-tls` feature, keeping the build on native-tls". -Even if `default-tls` had been active, it would NOT have routed through -native-tls — `default-tls` in reqwest 0.13.3 resolves to `rustls`. The -feature name was kept for compatibility but its meaning changed in the -reqwest 0.12 → 0.13 transition. - -## Why CI didn't catch it - -The only existing live test for the Entra path, -`tests/entra_live_test.rs`, authenticates via -`DeveloperToolsCredential` (`az login` on the CI runner). That -credential variant invokes the Azure CLI binary as a child process and -never traverses the reqwest-backed HTTP transport. The Workload Identity -branch of the credential chain has no live test in this repo, so the -missing TLS backend was invisible to CI. - -## Fix - -A single line added to `[dependencies]` in `Cargo.toml`: - -```toml -reqwest = { version = "0.13", default-features = false, features = ["native-tls"] } -``` - -Cargo's feature unification merges this declaration with -`typespec_client_core`'s reqwest dep (same major.minor version range, -same resolved 0.13.3) and the union of features — -`["stream", "native-tls"]` — is applied to the single resolved reqwest -crate. The native-tls feature activates `hyper-tls` and the -`native-tls-crate` (openssl-backed on Linux), giving hyper a working -TLS connector for HTTPS URLs. - -The accompanying comment block in `Cargo.toml:46-72` is updated to -describe the actual feature graph and the rationale for the explicit -dep. - -No source code changes are required. The fix is purely a build- -configuration adjustment. - -## Verification - -### Automated - -``` -cargo build --release -cargo test --release --test native_tls_regression -cargo clippy --all-targets -``` - -The new test at `tests/native_tls_regression.rs` constructs the -`azure_core::http` client via the same `new_http_client(None)` -constructor that `azure_identity` uses internally, then issues a GET -to `https://127.0.0.1:1/`. The request must fail (port 1 is closed) but -the failure must be a *network* error, not a *connector* error. The -regression contract asserts the error chain does NOT contain the -substring `"scheme is not http"` — that substring is the signature of -the bug being fixed. - -### Dependency graph - -``` -cargo tree --target x86_64-unknown-linux-gnu --edges normal -i rustls -cargo tree --target x86_64-unknown-linux-gnu --edges normal -i ring -cargo tree --target x86_64-unknown-linux-gnu --edges normal -i aws-lc-rs -cargo tree --target x86_64-unknown-linux-gnu --edges normal -i hyper-rustls -cargo tree --target x86_64-unknown-linux-gnu --edges normal -i tokio-rustls -``` - -Each command must return `"package ID specification ... did not match -any packages"` — the FIPS-compliance posture (no rustls-based TLS) is -preserved. - -``` -cargo tree --target x86_64-unknown-linux-gnu --edges normal -i hyper-tls -``` - -Must show `hyper-tls v0.6.0` activated by `reqwest`. - -### Live verification (manual) - -Build a Linux x86_64 binary of the diagnostic reproducer (kept locally -outside the repo at `examples/entra_debug.rs` during diagnosis) against -the patched crate, then execute it inside an AKS pod with Workload -Identity configured. The output must include -`"WorkloadIdentityCredential OK: token_len= expires_on="`. -Before the fix, the same binary printed -`"WorkloadIdentityCredential failed: ... invalid URL, scheme is not http"`. - -## Forward-compatibility note - -This fix depends on Cargo's per-crate-version feature unification -correctly merging the explicit `reqwest 0.13` declaration with -`typespec_client_core`'s declaration. If `typespec_client_core` is ever -bumped to declare reqwest at a different major (e.g. `0.14`), the two -declarations will resolve to *different* reqwest crates and the fix -will silently regress — the test in `tests/native_tls_regression.rs` is -the canary that catches that scenario. - -If a future maintainer upgrades `azure_core` past `0.35`, re-verify the -feature graph by running the `cargo tree -i` commands above and -re-running the regression test before publishing. - -## References - -- `CHANGELOG.md` 0.1.33 entry -- `Cargo.toml:46-72` updated comment block -- `tests/native_tls_regression.rs` regression test -- `.paw/work/fix-native-tls-reqwest/CodeResearch.md` §1–§3 for the - feature-graph trace -- `.paw/work/fix-native-tls-reqwest/ImplementationPlan.md` for the - phased work record -- reqwest 0.13 feature definitions: - https://docs.rs/crate/reqwest/0.13.3/source/Cargo.toml.orig -- typespec_client_core 0.14 reqwest dep: - https://docs.rs/crate/typespec_client_core/0.14.0/source/Cargo.toml.orig diff --git a/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md b/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md deleted file mode 100644 index 426b246..0000000 --- a/.paw/work/fix-native-tls-reqwest/ImplementationPlan.md +++ /dev/null @@ -1,275 +0,0 @@ -# Fix Native TLS Reqwest Implementation Plan - -## Overview - -Restore HTTPS capability to the Entra credential chain in `duroxide-pg` by -activating a TLS backend on the resolved `reqwest 0.13` build. The current -dependency graph leaves reqwest with no TLS connector, which causes every -HTTPS request issued by `WorkloadIdentityCredential`, `ManagedIdentity -Credential` (when going through AAD over HTTPS), and the -`ClientAssertionCredential` token endpoint to fail with hyper's -`"invalid URL, scheme is not http"` error at the connector layer — before -any network I/O occurs (`CodeResearch.md` §1, §3, §6). - -The fix is a single-line `Cargo.toml` change that adds an explicit -`reqwest` dependency with `features = ["native-tls"]`. Cargo feature -unification activates the native-tls (openssl-backed) connector on the -resolved reqwest build without disturbing any source code. Verified via -`cargo tree` that no `rustls`, `ring`, or `aws-lc-rs` crates are introduced, -preserving the crate's stated FIPS-compliance posture (`CodeResearch.md` -§3). - -## Current State Analysis - -- `Cargo.toml:60-61` declares `azure_core` and `azure_identity` with - `default-features = false` and a hand-picked feature list intended to - preserve native-tls. The accompanying comment block at `Cargo.toml:46-59` - states that "this routes through reqwest's own `default-tls` feature". -- The intent is incorrect on two grounds (`CodeResearch.md` §2): - 1. `azure_core/reqwest = ["typespec_client_core/reqwest"]` and - `typespec_client_core/reqwest = ["dep:reqwest"]` activate only the - bare optional reqwest dep; neither flag enables a TLS feature. - `typespec_client_core` declares the reqwest dep with - `default-features = false`, so reqwest's own defaults are suppressed. - 2. In `reqwest 0.13.3`, the `default-tls` feature resolves to **rustls**, - not native-tls — the name no longer means what it did in earlier - reqwest releases. So even if defaults were active, they would - contradict the stated FIPS posture. -- The Entra credential chain (`src/entra.rs:288-301`) builds correctly, - but every HTTPS call inside the chain's `get_token` implementation - fails at the reqwest connector. The aggregate failure message is - obscured by the `.context(...)` wrapper at `src/provider.rs:246`, so - callers see only `"Entra credential resolution failed: could not - acquire an initial access token"` without the underlying cause. -- Existing test coverage (`tests/entra_live_test.rs`, `.github/workflows/ - entra-live-test.yml`) authenticates via the Azure CLI (`az login`) on - the runner, which exercises `DeveloperToolsCredential`. The reqwest - HTTP transport on the WI / MI branches is never traversed by CI, which - is why the missing TLS feature was not caught upstream. - -## Desired End State - -`Cargo.toml` declares an explicit reqwest dependency with -`default-features = false, features = ["native-tls"]`. The resolved -dependency graph activates `hyper-tls`, `tokio-native-tls`, and -`native-tls-crate` (openssl-backed) without introducing `rustls`, -`hyper-rustls`, `tokio-rustls`, `ring`, or `aws-lc-rs`. The accompanying -comment block in `Cargo.toml` is updated to describe the actual feature -graph rather than the previous mistaken claim. - -A regression test asserts that the reqwest-backed HTTP transport used by -`azure_core` accepts an `https://` URL at the connector layer (i.e. does -not return the `"invalid URL, scheme is not http"` connect error). The -test runs offline and does not depend on Azure credentials. - -The fix is verifiable by: -- `cargo build --release` succeeds. -- `cargo nextest run` (or `cargo test`) passes including the new - regression test. -- `cargo tree --target x86_64-unknown-linux-gnu --edges normal -i rustls` - / `-i ring` / `-i aws-lc-rs` each return "did not match any packages". -- `cargo tree -i hyper-tls` shows `hyper-tls v0.6.0` activated from - reqwest. -- Manual: a linux/amd64 binary built from `examples/entra_debug.rs` (the - reproducer used during diagnosis) executed inside the chkrawps7 AKS pod - shows `WorkloadIdentityCredential` returns a valid access token. - -The crate version is bumped to `0.1.33` and `CHANGELOG.md` documents the -bug fix. - -## What We're NOT Doing - -- Not modifying the Entra chain composition (`src/entra.rs:288-301`) — the - bug is purely a build-configuration issue, not a code defect. -- Not changing the `.context(...)` error-wrapping behavior at - `src/provider.rs:246`, even though it hides the underlying chain - failure. Improving error surfacing is a separate, larger concern; see - also the upstream comment hidden behind the wrapper. Tracked as a - candidate for follow-up. -- Not adding a Workload-Identity live CI test. The reproducer - (`examples/entra_debug.rs`) is preserved for local diagnosis but a full - AKS-in-CI WI runner is out of scope for a bug-fix release. -- Not changing the sqlx TLS path (`runtime-tokio-native-tls`) — that path - is unaffected; only the reqwest path used by `azure_core` is broken. -- Not bumping `azure_core` / `azure_identity` versions. The fix is - agnostic to those crate versions and preserves the pinned `0.35` - versions. - -## Phase Status -- [x] **Phase 1: Apply Cargo.toml fix, regression test, version bump, CHANGELOG** -- [x] **Phase 2: Documentation** — `Docs.md` operator-facing note - -## Phase Candidates - -Both items below were captured during planning as adjacent improvements -but are explicitly **deferred** as out of scope for this bug-fix release -(0.1.33). They are also called out in `What We're NOT Doing`. Resolution -recorded here; no `- [ ]` items remain. - -- [x] **Deferred:** Improve `provider.rs:246` error surfacing so the - chained credential aggregate (`src/entra.rs:358-366`) is visible at - `Display`, not only at `{:#}` or via `Error::source()` traversal. The - bug being fixed in 0.1.33 made this error chain matter operationally, - but improving error formatting is a separate (and breaking-shape) - change. Recommend filing as a follow-up issue against - microsoft/duroxide-pg. -- [x] **Deferred:** Add a Workload-Identity smoke test to CI. Requires - an AKS test resource and a federated identity credential — too much - scope for a bug-fix patch release. Recommend filing as a follow-up - issue and reusing the - `.paw/work/fix-native-tls-reqwest/Docs.md` § "Live verification" - procedure when that work is taken up. - ---- - -## Phase 1: Apply Cargo.toml fix, regression test, version bump, CHANGELOG - -### Objective - -Activate the native-tls connector on the resolved reqwest build through a -single `Cargo.toml` change. Add an offline regression test that exercises -the reqwest-backed HTTP transport's HTTPS connector. Bump the crate version -and add a `CHANGELOG.md` entry. - -### Changes Required - -- **`Cargo.toml`**: - - Add a new line to `[dependencies]` (adjacent to the `azure_core` / - `azure_identity` block) declaring - `reqwest = { version = "0.13", default-features = false, features = ["native-tls"] }`. - Cargo feature unification will activate the native-tls path on the - reqwest already pulled in by `typespec_client_core`. - - Update the comment block at `Cargo.toml:46-59` to: - - Drop the incorrect claim that "this routes through reqwest's own - `default-tls` feature". - - Explain that `typespec_client_core` declares reqwest with - `default-features = false`, so an explicit reqwest dep is required - to opt into a TLS backend. - - Note that `reqwest 0.13.3`'s `default-tls` resolves to rustls, so - the explicit `native-tls` feature is the correct way to preserve - the openssl posture. - - Reference the verification: `cargo tree -i rustls / ring / aws-lc-rs` - should all report "did not match any packages". - - Bump `package.version` from `0.1.32` to `0.1.33`. - -- **`tests/native_tls_regression.rs`** (new): a single - `#[tokio::test(flavor = "current_thread")]` that: - - Builds a `reqwest::Client` via `azure_core::http::new_http_client(None)` - (or, if that constructor signature is unstable across the 0.35 minor - range, builds via `reqwest::ClientBuilder::new().build()` and reaches - the same connector code path). - - Issues a single request to `https://127.0.0.1:1/` (a port chosen so the - request fails fast at TCP-connect time). - - Asserts the returned `reqwest::Error` is a *connection* failure (e.g. - `ConnectionRefused`, `Os { code: ECONNREFUSED, .. }`, or a generic - connect error whose `Display` does NOT contain the substring - `"scheme is not http"`. The negative assertion is the regression - contract: presence of that substring indicates the connector was - rejected before any network attempt, which is the symptom this fix - addresses. - - Includes a documentation comment explaining the regression scenario - and linking to `CHANGELOG.md` for the 0.1.33 entry. - - The test is offline, deterministic, fast (single connect attempt), - and self-contained — it does not require Azure, federated tokens, or - network egress to login.microsoftonline.com. - -- **`CHANGELOG.md`**: Add a new entry at the top, above the existing - `## 0.1.32 ...` section, following the existing convention. Suggested - content (one-line summary + a paragraph describing the symptom, root - cause, and fix): - - **Title**: `## 0.1.33 — (bug fix: native-tls reqwest feature)` - - **Section**: `### Fixed` with a bullet covering the broken-HTTPS - symptom, the missing `native-tls` feature on the resolved reqwest - build, and a one-line note that `WorkloadIdentityCredential` / - `ManagedIdentityCredential` (HTTPS-bound) chains were unreachable - before this fix. - -- **Tests**: `tests/native_tls_regression.rs` (above). No changes to - `tests/entra_live_test.rs` — that test exercises the developer-tools - branch and is unaffected by this fix. - -### Success Criteria - -#### Automated Verification - -- [ ] `cargo build --release` succeeds. -- [ ] `cargo nextest run` (or `cargo test`) passes the full suite, - including the new `tests/native_tls_regression.rs`. -- [ ] `cargo clippy --all-targets -- -D warnings` is clean. -- [ ] `cargo tree --target x86_64-unknown-linux-gnu --edges normal -i - rustls` returns "did not match any packages". -- [ ] `cargo tree --target x86_64-unknown-linux-gnu --edges normal -i - ring` returns "did not match any packages". -- [ ] `cargo tree --target x86_64-unknown-linux-gnu --edges normal -i - aws-lc-rs` returns "did not match any packages". -- [ ] `cargo tree --target x86_64-unknown-linux-gnu --edges normal -i - hyper-tls` shows `hyper-tls v0.6.0` activated by reqwest. - -#### Manual Verification - -- [ ] Build the existing reproducer - (`examples/entra_debug.rs`, preserved in the user's local checkout - outside `.paw/`) for `x86_64-unknown-linux-gnu` against the patched - crate. The binary, when executed inside the chkrawps7 pod, prints - `"WorkloadIdentityCredential OK: token_len= expires_on=..."` — i.e. - a valid bearer token is acquired. Before this fix the same binary - prints `"invalid URL, scheme is not http"` for WorkloadIdentity - Credential. -- [ ] `Cargo.lock` updates reflect the dep-graph additions - (`hyper-tls`, `tokio-native-tls`, `rustls-pki-types`) and the absence - of `rustls`, `hyper-rustls`, `tokio-rustls`, `ring`, `aws-lc-rs`. - ---- - -## Phase 2: Documentation - -### Objective - -Capture the as-built record for this fix in `Docs.md` for operator-facing -reference. No user-facing API changes — `README.md` is unaffected. - -### Changes Required - -- **`.paw/work/fix-native-tls-reqwest/Docs.md`**: Operator-facing note - capturing: - - One-paragraph symptom description (HTTPS calls from the Entra chain - failing with `"invalid URL, scheme is not http"`). - - The dependency-graph trace showing why reqwest was built without a TLS - backend (`azure_core/reqwest` → `typespec_client_core/reqwest` → - `dep:reqwest` with `default-features = false`). - - The fix (explicit `reqwest` dep with `features = ["native-tls"]`) and - the rationale (preserves the openssl posture; `reqwest 0.13.3`'s - `default-tls` is rustls, so it cannot be used as the workaround). - - Verification commands (the `cargo tree -i ...` invocations from the - success criteria). - - Cross-references to `CHANGELOG.md` 0.1.33 and the upstream PR. - - Implementer should load `paw-docs-guidance` at the start of this phase - for the Docs.md template and any cross-cutting conventions. - -- **`README.md`**: No changes. The bug fix is invisible at the public API - surface; the Entra section already documents the intended posture and - remains accurate post-fix. - -### Success Criteria - -#### Automated Verification - -- [ ] No new doc-build infrastructure exists; running `cargo doc --no-deps` - succeeds without warnings introduced by this work. - -#### Manual Verification - -- [ ] `Docs.md` is internally consistent with `CHANGELOG.md` 0.1.33 and - `Cargo.toml`'s updated comment block. -- [ ] All file:line references in `Docs.md` resolve to the corresponding - source locations on the head of `fix/native-tls-reqwest-feature`. - ---- - -## References - -- Issue: none (filed as bug observed during PilotSwarm AKS deploy; see - `Initial Prompt` in `.paw/work/fix-native-tls-reqwest/WorkflowContext.md`) -- Spec: not produced (minimal workflow) -- Research: `.paw/work/fix-native-tls-reqwest/CodeResearch.md` diff --git a/.paw/work/fix-native-tls-reqwest/WorkflowContext.md b/.paw/work/fix-native-tls-reqwest/WorkflowContext.md deleted file mode 100644 index 40ce46a..0000000 --- a/.paw/work/fix-native-tls-reqwest/WorkflowContext.md +++ /dev/null @@ -1,41 +0,0 @@ -# WorkflowContext - -Work Title: Fix Native TLS Reqwest -Work ID: fix-native-tls-reqwest -Base Branch: main -Target Branch: fix/native-tls-reqwest-feature -Execution Mode: current-checkout -Repository Identity: github.com/microsoft/duroxide-pg@dbf922cdd64883871920019650e338a4f5af5ae1 -Execution Binding: none -Workflow Mode: minimal -Review Strategy: local -Review Policy: final-pr-only -Session Policy: continuous -Final Agent Review: disabled -Final Review Mode: multi-model -Final Review Interactive: smart -Final Review Models: latest GPT, latest Gemini, latest Claude Opus -Final Review Specialists: all -Final Review Interaction Mode: parallel -Final Review Specialist Models: none -Final Review Perspectives: auto -Final Review Perspective Cap: 2 -Implementation Model: none -Plan Generation Mode: single-model -Plan Generation Models: latest GPT, latest Gemini, latest Claude Opus -Planning Docs Review: disabled -Planning Review Mode: multi-model -Planning Review Interactive: smart -Planning Review Models: latest GPT, latest Gemini, latest Claude Opus -Planning Review Specialists: all -Planning Review Interaction Mode: parallel -Planning Review Specialist Models: none -Planning Review Perspectives: auto -Planning Review Perspective Cap: 2 -Custom Workflow Instructions: none -Initial Prompt: Add native-tls feature to reqwest dep so the Rust HTTP client has a TLS connector. Without this, every HTTPS call (incl. WorkloadIdentityCredential and AAD token endpoint) fails with hyper's 'invalid URL, scheme is not http' — making the Entra credential chain unusable on AKS pods. Reproduced E2E in PilotSwarm AKS deploy; verified fix preserves native-tls/openssl posture (no rustls/ring/aws-lc-rs added). -Issue URL: none -Remote: origin -Artifact Lifecycle: commit-and-clean -Artifact Paths: auto-derived -Additional Inputs: none