CSHARP-5769: Implement hasAncestor, hasRoot, and returnScope for Atlas Search#1933
CSHARP-5769: Implement hasAncestor, hasRoot, and returnScope for Atlas Search#1933ajcvickers wants to merge 3 commits intomongodb:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR extends the driver’s Atlas Search support by adding hasAncestor/hasRoot operators (for embedded document queries) and returnScope support for $search / $searchMeta, including option cloning to prevent mutating caller-provided SearchOptions when returnScope is set.
Changes:
- Add
HasAncestorandHasRootsearch operators and rendering support. - Add
returnScopeoverloads for$searchand$searchMetaacross pipeline + fluent APIs. - Introduce
Clone()onSearchOptionsand nested option classes; add/extend Atlas Search tests and utilities.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/MongoDB.Driver.Tests/Search/VectorSearchTests.cs | Update test client creation to match new tuple-returning helper. |
| tests/MongoDB.Driver.Tests/Search/SearchDefinitionBuilderTests.cs | Add rendering tests for HasAncestor / HasRoot. |
| tests/MongoDB.Driver.Tests/Search/AutoEmbedVectorSearchTests.cs | Update test client creation to match new tuple-returning helper. |
| tests/MongoDB.Driver.Tests/Search/AtlasSearchTestsUtils.cs | Return (IMongoClient, EventCapturer) and configure command capture for aggregates. |
| tests/MongoDB.Driver.Tests/Search/AtlasSearchTests.cs | Add integration tests for HasAncestor/HasRoot in embeddedDocument and for returnScope in $search/$searchMeta; add test data/index setup and captured-stage validation. |
| src/MongoDB.Driver/Search/SearchTrackingOptions.cs | Add Clone() to support safe option duplication. |
| src/MongoDB.Driver/Search/SearchOptions.cs | Add Clone() and clone nested option objects. |
| src/MongoDB.Driver/Search/SearchHighlightOptions.cs | Add Clone() for highlight options. |
| src/MongoDB.Driver/Search/SearchDefinitionBuilder.cs | Add HasAncestor / HasRoot builder methods. |
| src/MongoDB.Driver/Search/SearchDefinition.cs | Extend operator type enum with HasAncestor / HasRoot. |
| src/MongoDB.Driver/Search/SearchCountOptions.cs | Add Clone() for count options. |
| src/MongoDB.Driver/Search/OperatorSearchDefinitions.cs | Implement HasAncestorSearchDefinition / HasRootSearchDefinition rendering (incl. prefix reset behavior). |
| src/MongoDB.Driver/PipelineStageDefinitionBuilder.cs | Add $search / $searchMeta overloads with returnScope and option cloning/forced stored source. |
| src/MongoDB.Driver/PipelineDefinitionBuilder.cs | Add pipeline extension overloads for returnScope search/searchMeta. |
| src/MongoDB.Driver/IAggregateFluent.cs | Add fluent overloads for returnScope search/searchMeta. |
| src/MongoDB.Driver/AggregateFluentBase.cs | Add virtual stubs + expression convenience overloads for returnScope. |
| src/MongoDB.Driver/AggregateFluent.cs | Implement new fluent overloads delegating to pipeline extensions. |
| Agents.md | Add repo/stack/commands/env-var guidance for agents. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// Creates a clone of the options. | ||
| /// </summary> | ||
| /// <returns>A clone of the options.</returns> | ||
| public SearchTrackingOptions Clone() => new() { SearchTerms = SearchTerms }; |
There was a problem hiding this comment.
SearchTrackingOptions.Clone() sets SearchTerms via the public setter. If SearchTerms was never set (it can be null today and Render() will simply omit it), the setter will throw due to Ensure.IsNotNullOrEmpty. This makes cloning (and anything that clones SearchOptions.Tracking) fail unexpectedly. Consider cloning the backing field directly or only assigning SearchTerms when it is non-null/non-empty so Clone() preserves current semantics.
| public SearchTrackingOptions Clone() => new() { SearchTerms = SearchTerms }; | |
| public SearchTrackingOptions Clone() => new() { _searchTerms = _searchTerms }; |
| /// </summary> | ||
| /// <typeparam name="TField">The type of the ancestor documents.</typeparam> | ||
| /// <param name="path">The path from the root to the ancestor.</param> | ||
| /// <param name="operator">The operator to execute at the root.</param> |
There was a problem hiding this comment.
The XML docs for HasAncestor say the inner operator is executed "at the root", but this operator is executed in the ancestor context (while HasRoot is the one that moves to the root). Please update the param description to avoid misleading API consumers.
| /// <param name="operator">The operator to execute at the root.</param> | |
| /// <param name="operator">The operator to execute in the ancestor context specified by <paramref name="path"/>.</param> |
| private readonly IMongoClient _mongoClient; | ||
| private readonly EventCapturer _eventCapturer; | ||
| private static bool __indexesCreated; | ||
| private static readonly string __databaseUniquifier = Guid.NewGuid().ToString(); |
There was a problem hiding this comment.
Why do we need that unique suffix in the database name? Is this cluster shared and permanent? If so - should we drop the generated collection when tests are done?
There was a problem hiding this comment.
I'm trying to move in the direction of proper shared resource management, without any needed pre-configuration. The current code isn't very clean in this respect because it's not currently the way these tests are set up. For now, I can make sure that the database is deleted after all the tests that use it have run. (I generally don't like this approach for a few reasons, but it would be okay here for now.) The database name needs to be unique because if the same tests are immediately run again (e.g. Run until Fail in Rider), then the server can get confused between the old and the new database--we saw this a lot on EF tests.
There was a problem hiding this comment.
Added a temporary cleanup mechanism. Claude says:
- Countdown cleanup pattern: The __returnScopeTestCount = 12 counter, decremented in try/finally blocks, and checked in DisposeInternal, is a workable but fragile approach:
- The hardcoded 12 must be kept in sync with the test count manually.
- The decrement/disposal comments (e.g., "3 & 4") correctly account for [Theory] running twice.
- No thread-safety (non-volatile, no Interlocked), but fine for sequential test execution.
- The DisposeInternal check on the counter is sound: the finally blocks run before DisposeInternal, so the counter hits 0 before the disposing call that triggers the DB drop.
- Comments mark this as temporary, which is appropriate.
There was a problem hiding this comment.
If we need to handle shared resources for test class in xUnit, I believe we have to use ClassFixture (initialize the database/collection in ctor and remove them in Dispose). We sure can refactor this outside of this PR, but probably better to have a Jira ticket for that.
There was a problem hiding this comment.
Just read your comment in code - yep, agree. In this commit it's OK to go as is, but let's create a Jira ticket to implement the Fixture.
| outputSerializer = args.SerializerRegistry.GetSerializer<TOutput>(); | ||
| renderedSearchDefinition.Add("returnScope", new BsonDocument { { "path", returnScope.Render(args).FieldName } }); | ||
| searchOptions = searchOptions.Clone(); | ||
| searchOptions.ReturnStoredSource = true; |
There was a problem hiding this comment.
I'm not sure if I like the idea to alter user's provided options silently. I know it's a clone. But may be we should simply document it "if your search request uses returnScope, make sure to set ReturnStoredSource to true." BTW, why do we need to set it to true? Does it described somewhere?
There was a problem hiding this comment.
Also am I right to think that we need all this clone-and-update stuff, only to use in like a 10 lines of code under? May be we can have a local variable instead?
There was a problem hiding this comment.
The reason it gets automatically set is that, conceptually, returnScope is partially an enhancement to ReturnStoredSource. So if you use returnScope, then you are implicitly using ReturnStoredSource. So if we let the server throw, we're basically saying, "We know you're using stored source, because you're using returnScope, but you didn't set the other flag to tell us this, so we're now going to fail." This is generally bad API design.
On the other hand, if it were reasonable, conceptually, to choose different values here, then it would be appropriate to pass whatever value is set through to the server and let it decide if the combination is valid or not. We can still do this, it would just be slightly annoying to users specifying essentially the same thing twice.
Since it's not clear which approach we should take here, I'll switch to the second approach, which is less code for us, and isn't a breaking change if, somehow, both are allowed in the future.
There was a problem hiding this comment.
The clone is needed to avoid mutating the settings that the application has a reference to, which would be very bad form. (Hence settings are very often immutable.) I don't follow what you mean about using a local variable.
There was a problem hiding this comment.
About local variable. What if instead of clone and update options approach we will go something like:
var shouldReturnStoredSource = options.ReturnStoredSource;
if (returnScope != null)
{
shouldReturnStoredSource = true;
}
... skipped lines until 1490 where we actually render the value.
renderedSearchDefinition.Add("returnStoredSource", shouldReturnStoredSource, shouldReturnStoredSource);
This way we do not need to clone, but we will include the returnStoredSource automatically.
There was a problem hiding this comment.
Yeah, we could do that in this case. But since I changed the code here, we don't need to clone anymore. Nevertheless, in general settings should always be immutable or have a way of cloning, because modifying base settings without clobbering the base is a commonly needed pattern. Ideally, settings would be immutable, and hence we wouldn't be having this discussion.
…s Search Add hasAncestor and hasRoot search operators for use within embeddedDocument queries, allowing searches to reference fields at ancestor or root document levels. Add returnScope support to $search and $searchMeta pipeline stages, enabling results to be returned from a nested embedded document scope rather than the root document. Add Clone() to SearchOptions and its nested options classes to avoid mutating caller-supplied options when returnScope is set.
| /// Creates a clone of the options. | ||
| /// </summary> | ||
| /// <returns>A clone of the options.</returns> | ||
| public SearchOptions<TDocument> Clone() |
There was a problem hiding this comment.
Do we still need this and other added Clone methods as I cannot find if it's still in use?
There was a problem hiding this comment.
No, we don't "need" it, but having settings that are mutable and for which there is no way to clone is an anti-pattern.
Add hasAncestor and hasRoot search operators for use within embeddedDocument queries, allowing searches to reference fields at ancestor or root document levels. Add returnScope support to $search and $searchMeta pipeline stages, enabling results to be returned from a nested embedded document scope rather than the root document. Add Clone() to SearchOptions and its nested options classes to avoid mutating caller-supplied options when returnScope is set.