Fix event order and callback calling during container creation#15503
Fix event order and callback calling during container creation#15503
Conversation
|
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 15503Or
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 15503" |
|
@afscrome please share your feedback too--I tried to add you as a reviewer, but this new repository has contributor caching issues that prevents me from doing that 😞 |
There was a problem hiding this comment.
Pull request overview
Fixes regressions in Aspire Hosting/DCP startup sequencing introduced by container tunnel changes, ensuring environment/args callbacks run once, run after BeforeResourceStarted, and are isolated on failure.
Changes:
- Adds callback result caching for env/args annotations and resets cache on resource restart.
- Refactors DCP container creation to coordinate tunnel setup with container creation and adjusts lifecycle event publishing (including new connection-string event).
- Updates/extends tests to validate callback invocation order/count and removes quarantine from the proxyless tunnel test via shorter test resource names.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/Aspire.Hosting.Tests/Dcp/DcpExecutorTests.cs | Adds coverage for callback invocation ordering/count, failure isolation, and mixed tunnel-dependent/independent container startup. |
| tests/Aspire.Hosting.Tests/ContainerTunnelTests.cs | Un-quarantines proxyless tunnel test and shortens test name to avoid invalid resource name length. |
| src/Aspire.Hosting/Orchestrator/ApplicationOrchestrator.cs | Subscribes to and publishes connection-string availability separately from resource-starting. |
| src/Aspire.Hosting/Dcp/Model/ContainerTunnel.cs | Adds UpdatedTunnels state to tunnel proxy resource (non-serialized). |
| src/Aspire.Hosting/Dcp/DcpNameGenerator.cs | Adds stable service-name caching to prevent duplicate service creation for the same endpoint/network. |
| src/Aspire.Hosting/Dcp/DcpExecutorEvents.cs | Introduces OnConnectionStringAvailableContext. |
| src/Aspire.Hosting/Dcp/DcpExecutor.cs | Major refactor: container creation orchestration, tunnel setup coordination, endpoint-advertisement de-dupe, callback cache reset on restart. |
| src/Aspire.Hosting/Dcp/ContainerCreationContext.cs | New coordination primitives for container creation vs tunnel creation. |
| src/Aspire.Hosting/ApplicationModel/ResourceExtensions.cs | Adds optional caching behavior when discovering dependencies by evaluating annotation callbacks once. |
| src/Aspire.Hosting/ApplicationModel/ResourceDependencyDiscoveryMode.cs | Converts discovery mode to [Flags] and adds CacheAnnotationCallbackResults. |
| src/Aspire.Hosting/ApplicationModel/ICallbackResourceAnnotation.cs | New internal interface for “evaluate once + forget cache” callback annotations. |
| src/Aspire.Hosting/ApplicationModel/EnvironmentVariablesConfigurationGatherer.cs | Switches to cached evaluation for environment callbacks. |
| src/Aspire.Hosting/ApplicationModel/EnvironmentCallbackAnnotation.cs | Implements cached evaluation for env callbacks. |
| src/Aspire.Hosting/ApplicationModel/EndpointAnnotation.cs | Adds an internal unique endpoint ID. |
| src/Aspire.Hosting/ApplicationModel/CommandLineArgsCallbackAnnotation.cs | Implements cached evaluation for args callbacks. |
| src/Aspire.Hosting/ApplicationModel/ArgumentsExecutionConfigurationGatherer.cs | Switches to cached evaluation for args callbacks. |
src/Aspire.Hosting/ApplicationModel/CommandLineArgsCallbackAnnotation.cs
Outdated
Show resolved
Hide resolved
JamesNK
left a comment
There was a problem hiding this comment.
The overall approach — caching callback results via EvaluateOnceAsync, coordinating container creation via ContainerCreationContext, and splitting the event ordering so BeforeResourceStarted fires before callbacks — looks sound and addresses the issues in #14954 and #15358. The new tests are well-structured.
However, I found a few problems that should be addressed:
-
Bug: Custom certificate bundles regression —
BuildContainerConfigurationdropped theCustomBundlesFactoriesiteration that existed in the oldCreateContainerAsync. Containers needing custom cert bundles (e.g., Java keystores) will silently break. -
Thread safety:
_appResourcesconcurrent mutation — The tunnel creation task mutates_appResources(viaAddRange/Add) concurrently with non-tunnel containers enumerating it.List<T>is not thread-safe for this. -
Faulted tasks cached permanently — If a callback throws during initial startup, the faulted
Taskis stored and never cleared (onlyForgetCachedResulton restart clears it), making the failure permanent. -
Public API concern —
CacheAnnotationCallbackResultsleaks an internal concern into the publicResourceDependencyDiscoveryModeenum.Recursive = 0in a[Flags]enum is also a footgun. -
Minor:
CountdownEventnot disposed,EndpointIDdead code.
| /// on subsequent evaluations of the same annotation, rather than re-evaluating the callback each time. | ||
| /// </summary> | ||
| CacheAnnotationCallbackResults = 2 | ||
| } |
There was a problem hiding this comment.
Recursive = 0 in a [Flags] enum is error-prone
With Recursive = 0, anyValue.HasFlag(Recursive) always returns true. The code currently avoids this by checking !mode.HasFlag(DirectOnly), but the zero-value flags member is a trap for future callers.
Also, CacheAnnotationCallbackResults mixes an internal implementation detail (callback memoization) into what is a public API about dependency discovery modes. This flag is only consumed internally by DcpExecutor.GetHostDependenciesAsync. Consider making it a separate internal parameter instead.
There was a problem hiding this comment.
Should CacheAnnotationCallbackResults be public? Will people other than us ever use it?
If not, you could remove it from the enum (and make it no longer flags) and have an extra bool on internal GetDependenciesAsync to control the behavior.
There was a problem hiding this comment.
Yes, you are right, I was too clever here.
I decided to keep CacheAnnotationCallbackResults public. I think it will be useful for Aspire users. Main reason, if somebody wants to call GetResourceDependenciesAsync() during application startup sequence, it will work and won't break "call the callbacks once" promise as long as this flag is set.
There was a problem hiding this comment.
I'm still not a fan of making the enum a flags enum. DirectOnly and Recursive are opposites of each other. What happens when both are set?
Should the extra flag be a new parameter instead of adding to the mode enum and making it a flags enum?
There was a problem hiding this comment.
@JamesNK You get an ArgumentException if you use DirectOnly and Recursive at the same time; I added that check after your initial review.
There are many examples of [Flags] enumerations in .NET standard library where certain combinations of flags are not allowed. For example, Globalization.NumberStyles, Globalization.DateTimeStyles, RegularExpressions.RegexOptions, Threading.Tasks.TaskContinuationOptions. Not a best practice by any means, but not super uncommon either. They all throw ArgumentException or ArgumentOutOfRangeException if invalid flag comibination is used.
In our case I think the benefits of having a single mode parameter outweight the drawback of having additional invocation parameter. Also, the current implementation allows us to easily accommodate a new primary mode that is not direct and not fully recursive, should a need for such thing arise in future.
|
@JamesNK thanks for the review! I belive I addressed all your comments. |
|
Re-running the failed jobs in the CI workflow for this pull request because 1 job was identified as retry-safe transient failures in the CI run attempt.
|
|
🎬 CLI E2E Test Recordings — 49 recordings uploaded (commit View recordings
📹 Recordings uploaded automatically from CI run #23512293604 |
|
Please do not merge this yet; I think there is one more issue here that I need to fix. UPDATE: false alarm--a script that I was working on "independently" messed up with the test machine. |
Description
This change fixes issues related to environment and invocation argument callback calling and application model events introduced by #14557. It ensures that
Fixes:
#14954
#15358
Checklist
<remarks />and<code />elements on your triple slash comments?aspire.devissue: