Skip to content

CSHARP-5769: Implement hasAncestor, hasRoot, and returnScope for Atlas Search#1933

Open
ajcvickers wants to merge 3 commits intomongodb:mainfrom
ajcvickers:CSHARP-5769
Open

CSHARP-5769: Implement hasAncestor, hasRoot, and returnScope for Atlas Search#1933
ajcvickers wants to merge 3 commits intomongodb:mainfrom
ajcvickers:CSHARP-5769

Conversation

@ajcvickers
Copy link
Copy Markdown
Contributor

Add hasAncestor and hasRoot search operators for use within embeddedDocument queries, allowing searches to reference fields at ancestor or root document levels. Add returnScope support to $search and $searchMeta pipeline stages, enabling results to be returned from a nested embedded document scope rather than the root document. Add Clone() to SearchOptions and its nested options classes to avoid mutating caller-supplied options when returnScope is set.

@ajcvickers ajcvickers requested a review from Copilot April 2, 2026 10:26
@ajcvickers ajcvickers added the feature Adds new user-facing functionality. label Apr 2, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the driver’s Atlas Search support by adding hasAncestor/hasRoot operators (for embedded document queries) and returnScope support for $search / $searchMeta, including option cloning to prevent mutating caller-provided SearchOptions when returnScope is set.

Changes:

  • Add HasAncestor and HasRoot search operators and rendering support.
  • Add returnScope overloads for $search and $searchMeta across pipeline + fluent APIs.
  • Introduce Clone() on SearchOptions and nested option classes; add/extend Atlas Search tests and utilities.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
tests/MongoDB.Driver.Tests/Search/VectorSearchTests.cs Update test client creation to match new tuple-returning helper.
tests/MongoDB.Driver.Tests/Search/SearchDefinitionBuilderTests.cs Add rendering tests for HasAncestor / HasRoot.
tests/MongoDB.Driver.Tests/Search/AutoEmbedVectorSearchTests.cs Update test client creation to match new tuple-returning helper.
tests/MongoDB.Driver.Tests/Search/AtlasSearchTestsUtils.cs Return (IMongoClient, EventCapturer) and configure command capture for aggregates.
tests/MongoDB.Driver.Tests/Search/AtlasSearchTests.cs Add integration tests for HasAncestor/HasRoot in embeddedDocument and for returnScope in $search/$searchMeta; add test data/index setup and captured-stage validation.
src/MongoDB.Driver/Search/SearchTrackingOptions.cs Add Clone() to support safe option duplication.
src/MongoDB.Driver/Search/SearchOptions.cs Add Clone() and clone nested option objects.
src/MongoDB.Driver/Search/SearchHighlightOptions.cs Add Clone() for highlight options.
src/MongoDB.Driver/Search/SearchDefinitionBuilder.cs Add HasAncestor / HasRoot builder methods.
src/MongoDB.Driver/Search/SearchDefinition.cs Extend operator type enum with HasAncestor / HasRoot.
src/MongoDB.Driver/Search/SearchCountOptions.cs Add Clone() for count options.
src/MongoDB.Driver/Search/OperatorSearchDefinitions.cs Implement HasAncestorSearchDefinition / HasRootSearchDefinition rendering (incl. prefix reset behavior).
src/MongoDB.Driver/PipelineStageDefinitionBuilder.cs Add $search / $searchMeta overloads with returnScope and option cloning/forced stored source.
src/MongoDB.Driver/PipelineDefinitionBuilder.cs Add pipeline extension overloads for returnScope search/searchMeta.
src/MongoDB.Driver/IAggregateFluent.cs Add fluent overloads for returnScope search/searchMeta.
src/MongoDB.Driver/AggregateFluentBase.cs Add virtual stubs + expression convenience overloads for returnScope.
src/MongoDB.Driver/AggregateFluent.cs Implement new fluent overloads delegating to pipeline extensions.
Agents.md Add repo/stack/commands/env-var guidance for agents.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/MongoDB.Driver/PipelineStageDefinitionBuilder.cs Outdated
Comment thread src/MongoDB.Driver/PipelineStageDefinitionBuilder.cs
Comment thread src/MongoDB.Driver/PipelineStageDefinitionBuilder.cs
Comment thread tests/MongoDB.Driver.Tests/Search/AtlasSearchTests.cs
Comment thread Agents.md Outdated
Comment thread Agents.md Outdated
Comment thread src/MongoDB.Driver/IAggregateFluent.cs Outdated
Comment thread src/MongoDB.Driver/IAggregateFluent.cs Outdated
@ajcvickers ajcvickers marked this pull request as ready for review April 13, 2026 16:57
@ajcvickers ajcvickers requested a review from a team as a code owner April 13, 2026 16:57
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

/// Creates a clone of the options.
/// </summary>
/// <returns>A clone of the options.</returns>
public SearchTrackingOptions Clone() => new() { SearchTerms = SearchTerms };
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SearchTrackingOptions.Clone() sets SearchTerms via the public setter. If SearchTerms was never set (it can be null today and Render() will simply omit it), the setter will throw due to Ensure.IsNotNullOrEmpty. This makes cloning (and anything that clones SearchOptions.Tracking) fail unexpectedly. Consider cloning the backing field directly or only assigning SearchTerms when it is non-null/non-empty so Clone() preserves current semantics.

Suggested change
public SearchTrackingOptions Clone() => new() { SearchTerms = SearchTerms };
public SearchTrackingOptions Clone() => new() { _searchTerms = _searchTerms };

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

/// </summary>
/// <typeparam name="TField">The type of the ancestor documents.</typeparam>
/// <param name="path">The path from the root to the ancestor.</param>
/// <param name="operator">The operator to execute at the root.</param>
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The XML docs for HasAncestor say the inner operator is executed "at the root", but this operator is executed in the ancestor context (while HasRoot is the one that moves to the root). Please update the param description to avoid misleading API consumers.

Suggested change
/// <param name="operator">The operator to execute at the root.</param>
/// <param name="operator">The operator to execute in the ancestor context specified by <paramref name="path"/>.</param>

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Comment thread Agents.md Outdated
Comment thread src/MongoDB.Driver/IAggregateFluent.cs Outdated
Comment thread tests/MongoDB.Driver.Tests/Search/AtlasSearchTests.cs Outdated
private readonly IMongoClient _mongoClient;
private readonly EventCapturer _eventCapturer;
private static bool __indexesCreated;
private static readonly string __databaseUniquifier = Guid.NewGuid().ToString();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need that unique suffix in the database name? Is this cluster shared and permanent? If so - should we drop the generated collection when tests are done?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to move in the direction of proper shared resource management, without any needed pre-configuration. The current code isn't very clean in this respect because it's not currently the way these tests are set up. For now, I can make sure that the database is deleted after all the tests that use it have run. (I generally don't like this approach for a few reasons, but it would be okay here for now.) The database name needs to be unique because if the same tests are immediately run again (e.g. Run until Fail in Rider), then the server can get confused between the old and the new database--we saw this a lot on EF tests.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a temporary cleanup mechanism. Claude says:

  1. Countdown cleanup pattern: The __returnScopeTestCount = 12 counter, decremented in try/finally blocks, and checked in DisposeInternal, is a workable but fragile approach:
    • The hardcoded 12 must be kept in sync with the test count manually.
    • The decrement/disposal comments (e.g., "3 & 4") correctly account for [Theory] running twice.
    • No thread-safety (non-volatile, no Interlocked), but fine for sequential test execution.
    • The DisposeInternal check on the counter is sound: the finally blocks run before DisposeInternal, so the counter hits 0 before the disposing call that triggers the DB drop.
    • Comments mark this as temporary, which is appropriate.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need to handle shared resources for test class in xUnit, I believe we have to use ClassFixture (initialize the database/collection in ctor and remove them in Dispose). We sure can refactor this outside of this PR, but probably better to have a Jira ticket for that.

Copy link
Copy Markdown
Member

@sanych-sun sanych-sun Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just read your comment in code - yep, agree. In this commit it's OK to go as is, but let's create a Jira ticket to implement the Fixture.

Comment thread src/MongoDB.Driver/PipelineStageDefinitionBuilder.cs Outdated
outputSerializer = args.SerializerRegistry.GetSerializer<TOutput>();
renderedSearchDefinition.Add("returnScope", new BsonDocument { { "path", returnScope.Render(args).FieldName } });
searchOptions = searchOptions.Clone();
searchOptions.ReturnStoredSource = true;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I like the idea to alter user's provided options silently. I know it's a clone. But may be we should simply document it "if your search request uses returnScope, make sure to set ReturnStoredSource to true." BTW, why do we need to set it to true? Does it described somewhere?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also am I right to think that we need all this clone-and-update stuff, only to use in like a 10 lines of code under? May be we can have a local variable instead?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason it gets automatically set is that, conceptually, returnScope is partially an enhancement to ReturnStoredSource. So if you use returnScope, then you are implicitly using ReturnStoredSource. So if we let the server throw, we're basically saying, "We know you're using stored source, because you're using returnScope, but you didn't set the other flag to tell us this, so we're now going to fail." This is generally bad API design.

On the other hand, if it were reasonable, conceptually, to choose different values here, then it would be appropriate to pass whatever value is set through to the server and let it decide if the combination is valid or not. We can still do this, it would just be slightly annoying to users specifying essentially the same thing twice.

Since it's not clear which approach we should take here, I'll switch to the second approach, which is less code for us, and isn't a breaking change if, somehow, both are allowed in the future.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The clone is needed to avoid mutating the settings that the application has a reference to, which would be very bad form. (Hence settings are very often immutable.) I don't follow what you mean about using a local variable.

Copy link
Copy Markdown
Member

@sanych-sun sanych-sun Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About local variable. What if instead of clone and update options approach we will go something like:

var shouldReturnStoredSource = options.ReturnStoredSource;
if (returnScope != null)
{
   shouldReturnStoredSource = true;
}

... skipped lines until 1490 where we actually render the value.
renderedSearchDefinition.Add("returnStoredSource", shouldReturnStoredSource, shouldReturnStoredSource);

This way we do not need to clone, but we will include the returnStoredSource automatically.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we could do that in this case. But since I changed the code here, we don't need to clone anymore. Nevertheless, in general settings should always be immutable or have a way of cloning, because modifying base settings without clobbering the base is a commonly needed pattern. Ideally, settings would be immutable, and hence we wouldn't be having this discussion.

…s Search

Add hasAncestor and hasRoot search operators for use within embeddedDocument
queries, allowing searches to reference fields at ancestor or root document
levels. Add returnScope support to $search and $searchMeta pipeline stages,
enabling results to be returned from a nested embedded document scope rather
than the root document. Add Clone() to SearchOptions and its nested options
classes to avoid mutating caller-supplied options when returnScope is set.
@ajcvickers ajcvickers requested review from sanych-sun and removed request for kyra-rk April 14, 2026 13:15
/// Creates a clone of the options.
/// </summary>
/// <returns>A clone of the options.</returns>
public SearchOptions<TDocument> Clone()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this and other added Clone methods as I cannot find if it's still in use?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we don't "need" it, but having settings that are mutable and for which there is no way to clone is an anti-pattern.

@ajcvickers ajcvickers requested a review from sanych-sun April 15, 2026 08:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature Adds new user-facing functionality.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants