From 3763d17f7a5766a0617f3adf030cb98dd7bba469 Mon Sep 17 00:00:00 2001 From: Jialiang Liang Date: Thu, 7 May 2026 14:04:09 -0700 Subject: [PATCH 1/3] Default plugins.ppl.rex.max_match.limit=10 on the unified query path MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The PPL `rex` command's AstBuilder reads `Settings.Key.PPL_REX_MAX_MATCH_LIMIT` unconditionally and unboxes the result to `int`: int maxMatchLimit = (settings != null) ? settings.getSettingValue(...) : 10; The `(settings != null)` guard only protects against the Settings instance being null — not against `getSettingValue` returning null for a key that the caller never registered. On the unified query path, `UnifiedQueryContext` builds its `Settings` map with only a small whitelist of keys (e.g. `QUERY_SIZE_LIMIT`, `CALCITE_ENGINE_ENABLED` per #5413). For any unregistered key, `getSettingValue` returns null, and the auto-unbox NPEs the planner before any operator-level capability check runs. Every `rex` query through `/_analytics/ppl` (the analytics-engine route's REST front-end) hits this NPE today. Default `PPL_REX_MAX_MATCH_LIMIT=10` in `buildSettings()` so unified-path behavior matches the cluster-side default registered by `OpenSearchSettings.PPL_REX_MAX_MATCH_LIMIT_SETTING` — making the v2 path and the analytics-engine path agree on the same fallback value when neither has an explicit cluster override. Mirrors the precedent Kai introduced for `CALCITE_ENGINE_ENABLED` in #5413. Companion to the OpenSearch core PR onboarding PPL `rex mode=sed` to the analytics-engine route via DataFusion (Part 1 — sed-mode bridge only; extract-mode Rust UDFs deferred to Part 2). Signed-off-by: Jialiang Liang --- .../org/opensearch/sql/api/UnifiedQueryContext.java | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/api/src/main/java/org/opensearch/sql/api/UnifiedQueryContext.java b/api/src/main/java/org/opensearch/sql/api/UnifiedQueryContext.java index 52b840388d..b35b674ac1 100644 --- a/api/src/main/java/org/opensearch/sql/api/UnifiedQueryContext.java +++ b/api/src/main/java/org/opensearch/sql/api/UnifiedQueryContext.java @@ -7,6 +7,7 @@ import static org.opensearch.sql.common.setting.Settings.Key.CALCITE_ENGINE_ENABLED; import static org.opensearch.sql.common.setting.Settings.Key.PPL_JOIN_SUBSEARCH_MAXOUT; +import static org.opensearch.sql.common.setting.Settings.Key.PPL_REX_MAX_MATCH_LIMIT; import static org.opensearch.sql.common.setting.Settings.Key.PPL_SUBSEARCH_MAXOUT; import static org.opensearch.sql.common.setting.Settings.Key.QUERY_SIZE_LIMIT; @@ -127,6 +128,13 @@ public static class Builder { * org.opensearch.sql.api.parser.PPLQueryParser} reuses the v2 {@code AstBuilder}, which gates * Calcite-only commands (e.g. {@code visitTableCommand}) on this setting; without the default, * those commands fail at parse time even when the cluster setting is true. + * + *

{@link Settings.Key#PPL_REX_MAX_MATCH_LIMIT} defaults to {@code 10} here because {@code + * AstBuilder.visitRexCommand} reads it unconditionally and unboxes to {@code int} — a {@code + * null} return from {@code getSettingValue} NPEs the planner before any operator- level + * capability check runs. The value mirrors the cluster-side default of {@code 10} registered by + * {@code OpenSearchSettings.PPL_REX_MAX_MATCH_LIMIT_SETTING}, so unified-path behavior matches + * v2-path behavior when neither has an explicit cluster override. */ private final Map settings = new HashMap( @@ -134,7 +142,8 @@ public static class Builder { QUERY_SIZE_LIMIT, SysLimit.DEFAULT.querySizeLimit(), PPL_SUBSEARCH_MAXOUT, SysLimit.DEFAULT.subsearchLimit(), PPL_JOIN_SUBSEARCH_MAXOUT, SysLimit.DEFAULT.joinSubsearchLimit(), - CALCITE_ENGINE_ENABLED, true)); + CALCITE_ENGINE_ENABLED, true, + PPL_REX_MAX_MATCH_LIMIT, 10)); /** * Sets the query language frontend to be used. From a74bbaf54f879381c9209a3687f75919d5625aae Mon Sep 17 00:00:00 2001 From: Jialiang Liang Date: Thu, 7 May 2026 22:23:02 -0700 Subject: [PATCH 2/3] Bridge cluster overrides for PPL_REX_MAX_MATCH_LIMIT via the existing setting() API MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The previous commit defaulted `PPL_REX_MAX_MATCH_LIMIT=10` in `UnifiedQueryContext.Builder.settings` to fix the NPE in `AstBuilder.visitRexCommand` on the unified path. The default is correct, but it doesn't respect mid-run cluster overrides — every key in the static map returns its hardcoded value regardless of `_cluster/settings` updates. This breaks `CalciteRexCommandIT.testRexMaxMatchConfigurableLimit`, which sets the cluster-side limit to 5 and asserts `max_match=0` caps at 5; on the unified path it stayed at 10. Rather than introducing a new `Settings`-typed Builder API, the REST handler reads the live cluster value itself and routes it through the existing `Builder.setting(String, Object)` method — keeping `UnifiedQueryContext` decoupled from any specific `Settings` implementation: RestUnifiedQueryAction.applyClusterOverrides(builder) └── pluginSettings.getSettingValue(PPL_REX_MAX_MATCH_LIMIT) └── builder.setting("plugins.ppl.rex.max_match.limit", value) `RestUnifiedQueryAction` gains a `pluginSettings` field (the same `OpenSearchSettings` instance bound in the Guice module). Both construction sites — `SQLPlugin.createSqlAnalyticsRouter` and `TransportPPLQueryAction.` — pass it through. `RestUnifiedQueryActionTest` updated to pass a `mock(Settings.class)` for the new constructor parameter. ## Why scoped to PPL_REX_MAX_MATCH_LIMIT only The same architectural gap exists for every key in the static defaults map (`QUERY_SIZE_LIMIT`, `PPL_SUBSEARCH_MAXOUT`, `PPL_JOIN_SUBSEARCH_MAXOUT`, `CALCITE_ENGINE_ENABLED`). For three of those, the static defaults are fine in practice (no test overrides them mid-run; `head N` covers `QUERY_SIZE_LIMIT` per-query). `CALCITE_ENGINE_ENABLED` is intentionally pinned to `true` for the unified path. So this PR widens only the one key that demonstrably needs it; widening the snapshot to the rest is a future scope decision tied to whichever new IT first depends on it. Signed-off-by: Jialiang Liang --- .../sql/api/UnifiedQueryContext.java | 9 ++-- .../org/opensearch/sql/plugin/SQLPlugin.java | 3 +- .../plugin/rest/RestUnifiedQueryAction.java | 42 +++++++++++++++---- .../transport/TransportPPLQueryAction.java | 10 +++-- .../rest/RestUnifiedQueryActionTest.java | 4 +- 5 files changed, 52 insertions(+), 16 deletions(-) diff --git a/api/src/main/java/org/opensearch/sql/api/UnifiedQueryContext.java b/api/src/main/java/org/opensearch/sql/api/UnifiedQueryContext.java index b35b674ac1..8df9c519f5 100644 --- a/api/src/main/java/org/opensearch/sql/api/UnifiedQueryContext.java +++ b/api/src/main/java/org/opensearch/sql/api/UnifiedQueryContext.java @@ -131,10 +131,12 @@ public static class Builder { * *

{@link Settings.Key#PPL_REX_MAX_MATCH_LIMIT} defaults to {@code 10} here because {@code * AstBuilder.visitRexCommand} reads it unconditionally and unboxes to {@code int} — a {@code - * null} return from {@code getSettingValue} NPEs the planner before any operator- level + * null} return from {@code getSettingValue} NPEs the planner before any operator-level * capability check runs. The value mirrors the cluster-side default of {@code 10} registered by - * {@code OpenSearchSettings.PPL_REX_MAX_MATCH_LIMIT_SETTING}, so unified-path behavior matches - * v2-path behavior when neither has an explicit cluster override. + * {@code OpenSearchSettings.PPL_REX_MAX_MATCH_LIMIT_SETTING}. Cluster-side overrides reach this + * map via {@link #setting(String, Object)} — the REST handler reads the live value from {@code + * OpenSearchSettings} and routes it through that existing API, keeping {@link + * UnifiedQueryContext} decoupled from any specific {@link Settings} implementation. */ private final Map settings = new HashMap( @@ -232,6 +234,7 @@ public UnifiedQueryContext build() { case SQL -> UnifiedSqlSpec.extended(); case PPL -> UnifiedPplSpec.create(); }; + Settings settings = buildSettings(); CalcitePlanContext planContext = CalcitePlanContext.create( diff --git a/plugin/src/main/java/org/opensearch/sql/plugin/SQLPlugin.java b/plugin/src/main/java/org/opensearch/sql/plugin/SQLPlugin.java index 41a6d8c486..e7dd3dbc77 100644 --- a/plugin/src/main/java/org/opensearch/sql/plugin/SQLPlugin.java +++ b/plugin/src/main/java/org/opensearch/sql/plugin/SQLPlugin.java @@ -208,7 +208,8 @@ private BiFunction createSqlAnalyticsRout if (executor == null) { return null; } - cached[0] = new RestUnifiedQueryAction(client, clusterService, executor); + cached[0] = + new RestUnifiedQueryAction(client, clusterService, executor, pluginSettings); } return cached[0]; }; diff --git a/plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java b/plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java index acd50ac8b1..0531cbde51 100644 --- a/plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java +++ b/plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java @@ -59,14 +59,17 @@ public class RestUnifiedQueryAction { private final AnalyticsExecutionEngine analyticsEngine; private final NodeClient client; private final ClusterService clusterService; + private final org.opensearch.sql.common.setting.Settings pluginSettings; public RestUnifiedQueryAction( NodeClient client, ClusterService clusterService, - QueryPlanExecutor> planExecutor) { + QueryPlanExecutor> planExecutor, + org.opensearch.sql.common.setting.Settings pluginSettings) { this.client = client; this.clusterService = clusterService; this.analyticsEngine = new AnalyticsExecutionEngine(planExecutor); + this.pluginSettings = pluginSettings; } /** @@ -154,19 +157,42 @@ public void explain( * Build a lightweight context for parsing only (index name extraction). Does not require cluster * state or catalog schema. */ - private static UnifiedQueryContext buildParsingContext(QueryType queryType) { - return UnifiedQueryContext.builder().language(queryType).build(); + private UnifiedQueryContext buildParsingContext(QueryType queryType) { + return applyClusterOverrides(UnifiedQueryContext.builder().language(queryType)).build(); } private UnifiedQueryContext buildContext(QueryType queryType, boolean profiling) { - return UnifiedQueryContext.builder() - .language(queryType) - .catalog(SCHEMA_NAME, OpenSearchSchemaBuilder.buildSchema(clusterService.state())) - .defaultNamespace(SCHEMA_NAME) - .profiling(profiling) + return applyClusterOverrides( + UnifiedQueryContext.builder() + .language(queryType) + .catalog(SCHEMA_NAME, OpenSearchSchemaBuilder.buildSchema(clusterService.state())) + .defaultNamespace(SCHEMA_NAME) + .profiling(profiling)) .build(); } + /** + * Routes operator-configured cluster overrides into the builder via the existing {@code + * setting(String, Object)} API, keeping {@link UnifiedQueryContext} decoupled from any specific + * {@link org.opensearch.sql.common.setting.Settings} implementation. + * + *

Currently scoped to {@code plugins.ppl.rex.max_match.limit} — required so the unified path + * honors {@code _cluster/settings} updates for {@code rex max_match} (CalciteRexCommandIT's + * testRexMaxMatchConfigurableLimit). Add keys here if a future PR / IT depends on cluster-side + * fidelity for one of the other planning settings. + */ + private UnifiedQueryContext.Builder applyClusterOverrides(UnifiedQueryContext.Builder builder) { + Object rexLimit = + pluginSettings.getSettingValue( + org.opensearch.sql.common.setting.Settings.Key.PPL_REX_MAX_MATCH_LIMIT); + if (rexLimit != null) { + builder.setting( + org.opensearch.sql.common.setting.Settings.Key.PPL_REX_MAX_MATCH_LIMIT.getKeyValue(), + rexLimit); + } + return builder; + } + /** * Extract the source index name by parsing the query and visiting the AST to find the Relation * node. Uses the context's parser which supports both PPL and SQL. diff --git a/plugin/src/main/java/org/opensearch/sql/plugin/transport/TransportPPLQueryAction.java b/plugin/src/main/java/org/opensearch/sql/plugin/transport/TransportPPLQueryAction.java index c148e836d8..365f1b2681 100644 --- a/plugin/src/main/java/org/opensearch/sql/plugin/transport/TransportPPLQueryAction.java +++ b/plugin/src/main/java/org/opensearch/sql/plugin/transport/TransportPPLQueryAction.java @@ -67,6 +67,7 @@ public class TransportPPLQueryAction private final NodeClient clientRef; private final ClusterService clusterServiceRef; + private final org.opensearch.sql.common.setting.Settings pluginSettingsRef; @Inject public TransportPPLQueryAction( @@ -82,11 +83,13 @@ public TransportPPLQueryAction( ModulesBuilder modules = new ModulesBuilder(); modules.add(new OpenSearchPluginModule()); + org.opensearch.sql.common.setting.Settings pluginSettings = + new OpenSearchSettings(clusterService.getClusterSettings()); + this.pluginSettingsRef = pluginSettings; modules.add( b -> { b.bind(NodeClient.class).toInstance(client); - b.bind(org.opensearch.sql.common.setting.Settings.class) - .toInstance(new OpenSearchSettings(clusterService.getClusterSettings())); + b.bind(org.opensearch.sql.common.setting.Settings.class).toInstance(pluginSettings); b.bind(DataSourceService.class).toInstance(dataSourceService); }); this.injector = Guice.createInjector(modules); @@ -105,7 +108,8 @@ public void setQueryPlanExecutor( QueryPlanExecutor> queryPlanExecutor) { AnalyticsExecutorHolder.set(queryPlanExecutor); this.unifiedQueryHandler = - new RestUnifiedQueryAction(clientRef, clusterServiceRef, queryPlanExecutor); + new RestUnifiedQueryAction( + clientRef, clusterServiceRef, queryPlanExecutor, pluginSettingsRef); } /** diff --git a/plugin/src/test/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryActionTest.java b/plugin/src/test/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryActionTest.java index c05a34128d..60e0b0bf76 100644 --- a/plugin/src/test/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryActionTest.java +++ b/plugin/src/test/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryActionTest.java @@ -14,6 +14,7 @@ import org.junit.Test; import org.opensearch.analytics.exec.QueryPlanExecutor; import org.opensearch.cluster.service.ClusterService; +import org.opensearch.sql.common.setting.Settings; import org.opensearch.sql.executor.QueryType; import org.opensearch.transport.client.node.NodeClient; @@ -30,7 +31,8 @@ public void setUp() { @SuppressWarnings("unchecked") QueryPlanExecutor> executor = mock(QueryPlanExecutor.class); action = - new RestUnifiedQueryAction(mock(NodeClient.class), mock(ClusterService.class), executor); + new RestUnifiedQueryAction( + mock(NodeClient.class), mock(ClusterService.class), executor, mock(Settings.class)); } @Test From aa0ae8252fe14bce8a7960a618b17a2dcf585908 Mon Sep 17 00:00:00 2001 From: Jialiang Liang Date: Thu, 7 May 2026 22:23:02 -0700 Subject: [PATCH 3/3] Cover PPL_REX_MAX_MATCH_LIMIT in UnifiedQueryContextTest MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Pins two behaviors the previous commits introduced: * `testContextCreationWithDefaults` now asserts that `PPL_REX_MAX_MATCH_LIMIT` resolves to its static default of 10 — the fallback value `AstBuilder.visitRexCommand` reads when no cluster-side override is present. * `testContextCreationWithCustomConfig` now asserts that `setting("plugins.ppl.rex.max_match.limit", 5)` reaches `getSettingValue(PPL_REX_MAX_MATCH_LIMIT)` — the path the REST handler uses to forward an operator-configured cluster value into the unified-path settings map. Folds the two assertions into the existing default / custom-config tests rather than adding new test methods, per review feedback. Signed-off-by: Jialiang Liang --- .../org/opensearch/sql/api/UnifiedQueryContextTest.java | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/api/src/test/java/org/opensearch/sql/api/UnifiedQueryContextTest.java b/api/src/test/java/org/opensearch/sql/api/UnifiedQueryContextTest.java index ad2eba0fea..f0111d0636 100644 --- a/api/src/test/java/org/opensearch/sql/api/UnifiedQueryContextTest.java +++ b/api/src/test/java/org/opensearch/sql/api/UnifiedQueryContextTest.java @@ -33,6 +33,10 @@ public void testContextCreationWithDefaults() { "Settings should have default system limits", SysLimit.DEFAULT, SysLimit.fromSettings(context.getSettings())); + assertEquals( + "PPL_REX_MAX_MATCH_LIMIT default should be 10", + Integer.valueOf(10), + context.getSettings().getSettingValue(PPL_REX_MAX_MATCH_LIMIT)); } @Test @@ -43,10 +47,15 @@ public void testContextCreationWithCustomConfig() { .catalog("opensearch", testSchema) .cacheMetadata(true) .setting("plugins.query.size_limit", 200) + .setting("plugins.ppl.rex.max_match.limit", 5) .build(); Integer querySizeLimit = context.getSettings().getSettingValue(QUERY_SIZE_LIMIT); assertEquals("Custom setting should be applied", Integer.valueOf(200), querySizeLimit); + assertEquals( + "Cluster-side override for PPL_REX_MAX_MATCH_LIMIT should reach the unified path", + Integer.valueOf(5), + context.getSettings().getSettingValue(PPL_REX_MAX_MATCH_LIMIT)); } @Test(expected = IllegalArgumentException.class)