
Introduce DataSourceResolver for multi-datasource support in JDBC per… #3960

Open
Subham-KRLX wants to merge 11 commits into apache:main from Subham-KRLX:feat/multi-datasource-support

Conversation

@Subham-KRLX (Contributor)

Introduces a DataSourceResolver interface to decouple DataSource selection from JdbcMetaStoreManagerFactory, enabling per-realm and per-store-type routing (main, metrics, events).

Changes:

  • New DataSourceResolver interface with resolve(realmId, storeType)
  • DefaultDataSourceResolver — routes everything to the default DS (no behavior change)
  • JdbcMetaStoreManagerFactory now uses DataSourceResolver instead of direct Instance<DataSource>

This is the foundation for #3890. Named DS resolution and config properties will follow.

Closes #3890
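The shape described above can be sketched in plain Java (a minimal sketch based on the PR description, not the actual patch; the reflection-proxy DataSource stub exists only to make the example self-contained):

```java
import java.lang.reflect.Proxy;
import javax.sql.DataSource;

/** Sketch of the proposed SPI: resolve a DataSource per realm and store type. */
interface DataSourceResolver {
  String STORE_TYPE_MAIN = "main";

  DataSource resolve(String realmId, String storeType);
}

/** Routes every request to the single default DataSource (no behavior change). */
class DefaultDataSourceResolver implements DataSourceResolver {
  private final DataSource defaultDataSource;

  DefaultDataSourceResolver(DataSource defaultDataSource) {
    this.defaultDataSource = defaultDataSource;
  }

  @Override
  public DataSource resolve(String realmId, String storeType) {
    return defaultDataSource;
  }
}

public class ResolverSketch {
  /** Reflection proxy standing in for a real pooled DataSource (illustration only). */
  static DataSource stubDataSource() {
    return (DataSource)
        Proxy.newProxyInstance(
            DataSource.class.getClassLoader(),
            new Class<?>[] {DataSource.class},
            (proxy, method, args) -> null);
  }

  public static void main(String[] args) {
    DataSource ds = stubDataSource();
    DataSourceResolver resolver = new DefaultDataSourceResolver(ds);
    // Every realm and store type resolves to the same default DataSource.
    System.out.println(resolver.resolve("realm-a", DataSourceResolver.STORE_TYPE_MAIN) == ds);
  }
}
```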


@github-project-automation github-project-automation bot moved this to PRs In Progress in Basic Kanban Board Mar 9, 2026
@Subham-KRLX Subham-KRLX marked this pull request as draft March 9, 2026 18:31
@dimas-b (Contributor)

dimas-b commented Mar 9, 2026

@Subham-KRLX : thanks for the proposal... I'll review presently, but please fix conflicts to allow CI to run.

…sistence

This commit introduces the DataSourceResolver interface and a
DefaultDataSourceResolver implementation to enable flexible routing
of DataSources based on realm and store type (e.g., main, metrics,
events).

Key changes:
- New DataSourceResolver interface with resolve(realmId, storeType)
  and getAllUniqueDataSources() methods
- DefaultDataSourceResolver that routes all requests to the default
  DataSource (backward-compatible no-op)
- Refactored JdbcMetaStoreManagerFactory to use DataSourceResolver
  instead of direct DataSource injection

This lays the groundwork for isolating different persistence workloads
(entity metadata vs metrics vs events) into separate connection pools
or databases to mitigate noisy neighbor effects.

Closes apache#3890
@Subham-KRLX Subham-KRLX force-pushed the feat/multi-datasource-support branch from c082f09 to 136e381 on March 9, 2026 18:38
@dimas-b (Contributor) left a comment:

Very nice idea, @Subham-KRLX ! +1 to this approach in general.

Posting quite a few comments, but they are just to "streamline" the code (I hope).

*/
public interface DataSourceResolver {

String STORE_TYPE_MAIN = "main";
Contributor:

Maybe an enum?

Contributor Author:

I have refactored the store types into the StoreType enum as suggested. This makes the resolution logic much cleaner and more type-safe.

*
* @return a set of all DataSources
*/
Set<DataSource> getAllUniqueDataSources();
Contributor:

It's fine to have the "list all" method, but I tend to think it needs a separate interface because it's a different use case.

Specifically, providing a DataSource per realm does not have to be co-located with the code that lists all data sources. The former is a server runtime concern; the latter is more of an Admin tool concern.
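The split suggested here could look like the following sketch (the DataSourceResolver method names are taken from the diff; the DataSourceLister name is hypothetical):

```java
import java.util.Set;
import javax.sql.DataSource;

/** Server runtime concern: per-realm/per-store resolution only. */
interface DataSourceResolver {
  DataSource resolve(String realmId, String storeType);
}

/** Admin tool concern, kept in its own interface (hypothetical name). */
interface DataSourceLister {
  Set<DataSource> getAllUniqueDataSources();
}

public class SplitInterfacesSketch {
  public static void main(String[] args) {
    // Both contracts can be implemented independently by different modules.
    DataSourceLister empty = () -> Set.of();
    System.out.println(empty.getAllUniqueDataSources().size());
  }
}
```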


private static final Logger LOGGER = LoggerFactory.getLogger(DefaultDataSourceResolver.class);

private final Instance<DataSource> defaultDataSource;
Contributor:

Does this have to be an Instance? Why not use the plain DataSource as type here?

- Refactored StoreType to an enum in DataSourceResolver
- Removed getAllUniqueDataSources() from DataSourceResolver (Admin use case)
- Moved DefaultDataSourceResolver to runtime/common/config/jdbc
- Inject plain DataSource instead of Instance<DataSource> in DefaultDataSourceResolver
- Updated JdbcMetaStoreManagerFactory to use the new StoreType enum
@Subham-KRLX (Contributor Author)

@dimas-b thanks for the review. I have streamlined the implementation based on your feedback by refactoring StoreType into an enum within DataSourceResolver and removing the getAllUniqueDataSources() method to keep the interface focused strictly on runtime resolution. Additionally, I moved DefaultDataSourceResolver to the runtime-common module to better decouple the persistence layer from resolution logic, and simplified the implementation by injecting a plain DataSource instead of an Instance<DataSource>.

@dimas-b (Contributor) left a comment:

A couple more comments, but LGTM overall 👍

@Subham-KRLX : Do you want to remove the "draft" status and open this for general review?

I'd suggest to also send an email about this to dev for visibility.


/** The type of store representing the workload pattern. */
enum StoreType {
MAIN,
Contributor:

nit: MAIN -> METASTORE?


protected JdbcMetaStoreManagerFactory() {}
@Inject
Clock clock;
Contributor:

nit: I wonder if the Spotless is going to complain about this 😅

* @param storeType the type of store (e.g., main, metrics, events)
* @return the resolved DataSource
*/
DataSource resolve(String realmId, StoreType storeType);
Contributor:

I'm thinking that we may want to use RealmContext instead of String realm ID here... just to continue the pattern of using RealmContext in pluggable components... e.g. RealmConfigurationSource , TokenBucketFactory, etc. 🤔

Contributor Author:

I have applied the final round of nits: renamed MAIN to METASTORE, switched to RealmContext in resolve()
for consistency, and fixed the @Inject field formatting. I'm also sending an email to dev@polaris.apache.org
now to give this more visibility.
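Under these changes, the contract and a hypothetical per-realm implementation might look like this (a sketch only; RealmContext is stubbed as a one-method interface here, and MapBackedResolver is illustrative, not part of the PR):

```java
import java.lang.reflect.Proxy;
import java.util.Map;
import javax.sql.DataSource;

/** RealmContext stubbed for this sketch; the real type lives in polaris-core. */
interface RealmContext {
  String getRealmIdentifier();
}

interface DataSourceResolver {
  /** The type of store representing the workload pattern. */
  enum StoreType { METASTORE, METRICS, EVENTS }

  DataSource resolve(RealmContext realmContext, StoreType storeType);
}

/** Hypothetical per-realm resolver showing what named DS resolution could build on. */
class MapBackedResolver implements DataSourceResolver {
  private final Map<String, DataSource> byRealm;
  private final DataSource fallback;

  MapBackedResolver(Map<String, DataSource> byRealm, DataSource fallback) {
    this.byRealm = byRealm;
    this.fallback = fallback;
  }

  @Override
  public DataSource resolve(RealmContext realmContext, StoreType storeType) {
    return byRealm.getOrDefault(realmContext.getRealmIdentifier(), fallback);
  }
}

public class RealmRoutingSketch {
  /** Reflection proxy standing in for a real pooled DataSource. */
  static DataSource stub() {
    return (DataSource)
        Proxy.newProxyInstance(
            DataSource.class.getClassLoader(),
            new Class<?>[] {DataSource.class},
            (p, m, a) -> null);
  }

  public static void main(String[] args) {
    DataSource main = stub();
    DataSource tenantA = stub();
    DataSourceResolver resolver = new MapBackedResolver(Map.of("tenant-a", tenantA), main);
    // Known realm routes to its dedicated DataSource; unknown realms fall back.
    System.out.println(
        resolver.resolve(() -> "tenant-a", DataSourceResolver.StoreType.METASTORE) == tenantA);
    System.out.println(
        resolver.resolve(() -> "other", DataSourceResolver.StoreType.METASTORE) == main);
  }
}
```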

@Subham-KRLX Subham-KRLX marked this pull request as ready for review March 9, 2026 19:07
- Renamed StoreType.MAIN to METASTORE
- Use RealmContext instead of String realmId in DataSourceResolver.resolve
- Added missing polaris-core dependency to runtime-common
- Fixed formatting of @Inject fields for Spotless
* @return the resolved DataSource
*/
DataSource resolve(
org.apache.polaris.core.context.RealmContext realmContext, StoreType storeType);
Contributor:

nit: import

Contributor Author:

@dimas-b I have pushed an update to resolve the CI failures. It turned out to be a CDI injection issue with the @Identifier annotation that I have now resolved.

dimas-b previously approved these changes Mar 9, 2026
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Mar 9, 2026
@Subham-KRLX (Contributor Author)

Thanks for the approval @dimas-b. I have fixed that last import nit and have sent the proposal email to the dev list [1] for broader visibility as you suggested. Appreciate all the guidance on streamlining this.

* @param storeType the type of store (e.g., main, metrics, events)
* @return the resolved DataSource
*/
DataSource resolve(RealmContext realmContext, StoreType storeType);
Contributor:

I'm supportive on realm, but we will need to think about the storage type.

Contributor Author:

@flyrain Could you clarify your concerns regarding StoreType? Are you suggesting a more extensible approach (e.g., String identifiers) or perhaps routing at a different layer? Happy to refine this to better fit the project's direction.

@dimas-b (Contributor)

dimas-b commented Mar 10, 2026

@Subham-KRLX : the PR LGTM, but please check CI failures out 😅

@Subham-KRLX (Contributor Author)

@dimas-b @flyrain I’ve pushed an update to fix the CI failures in the non-JDBC test suites. The root cause was DefaultDataSourceResolver unconditionally injecting a JDBC DataSource bean, which caused an io.quarkus.arc.InactiveBeanException when booting profiles where JDBC is disabled (like NoSQL/Mongo). I’ve updated the implementation to use Instance for lazy resolution, allowing the service to boot safely across all persistence modes. All CI checks should now pass cleanly.

dimas-b previously approved these changes Mar 11, 2026
@dimas-b (Contributor)

dimas-b commented Mar 11, 2026

[...] The root cause was DefaultDataSourceResolver unconditionally injecting a JDBC DataSource bean, which caused an io.quarkus.arc.InactiveBeanException when booting profiles where JDBC is disabled (like NoSQL/Mongo).

Current code in this PR LGTM, however, I still wonder why would DefaultDataSourceResolver be "engaged" in non-JDBC contexts (and subsequently get the InactiveBeanException) 🤔

@Subham-KRLX : Could you explore this a bit more?

Perhaps DefaultDataSourceResolver may need to be a @Dependent bean rather than @ApplicationScoped 🤔

@flyrain (Contributor)

flyrain commented Mar 11, 2026

As I said in the mailing thread https://lists.apache.org/thread/nokyythvrdzsfwz26hx0w5rpxrw5wjtd, I think this effort deserves more thought before we start coding. cc @singhpk234

* single default {@link DataSource}. This serves as both the production default and the base for
* multi-datasource extensions.
*/
@ApplicationScoped
Contributor:

IIUC, the current extension story is still a bit fragile here. This default resolver is just an unqualified @ApplicationScoped bean, while the factory later resolves DataSourceResolver via Instance<DataSourceResolver>.get(). Would a custom unqualified resolver now turn this into ambiguous CDI resolution rather than a clean override? If this is meant to be a public SPI, I wonder if we should make the replacement model explicit up front.

Contributor:

Whoever introduces an alternative implementation for DataSourceResolver should provide a custom producer method or annotate the preferred bean with a higher @Priority than this default.

This is a common pattern in Polaris in a few other extension points.

Contributor:

I'd rather use @Identifier.

Contributor:

OK, let's use @Identifier. In this case, I suppose @ApplicationScoped will move to the producer method, which will resolve the identifier based on runtime config.

@Inject PolarisStorageIntegrationProvider storageIntegrationProvider;
@Inject Instance<DataSource> dataSource;

@Inject Instance<DataSourceResolver> dataSourceResolver;
Contributor:

Why is this injected as Instance<DataSourceResolver> instead of a plain DataSourceResolver? In the current code we unconditionally call .get(), so it seems we are paying the runtime ambiguity cost without really using dynamic selection. Unless we expect multiple qualified resolvers here, would direct injection give us a cleaner and safer contract?

* physical databases or connection pools.
*/
public interface DataSourceResolver {

Contributor:

I wonder if this contract is getting a bit ahead of the implementation. This PR only wires METASTORE, but the SPI already publishes METRICS and EVENTS as if those routing modes were part of the current supported surface. Would it make sense to keep the first step narrower and add new StoreType values only once the corresponding paths are actually routed through this resolver?

import org.apache.polaris.core.context.RealmContext;

/**
* Service to resolve the correct {@link DataSource} for a given realm and store type. This enables
Contributor:

IIUC, this Javadoc is describing the longer-term shape more than what the PR actually does today. Right now this is a metastore-routing foundation, not a system-wide metadata/metrics/events routing layer. I wonder if tightening the wording to match the current behavior would make the contract less confusing for future implementors/readers.

Contributor Author:

Thanks for the feedback @flyingImer I completely agree. I have pushed an update that narrows the scope of this first step to strictly cover METASTORE routing, completely removing METRICS and EVENTS from StoreType and updating the Javadoc to reflect this tighter scope. I also removed the unnecessary runtime ambiguity by injecting DataSourceResolver directly instead of Instance. Finally, I added @DefaultBean to DefaultDataSourceResolver so downstream users can cleanly override the routing logic without ambiguous CDI resolution errors.

@Subham-KRLX (Contributor Author)

[...] The root cause was DefaultDataSourceResolver unconditionally injecting a JDBC DataSource bean, which caused an io.quarkus.arc.InactiveBeanException when booting profiles where JDBC is disabled (like NoSQL/Mongo).

Current code in this PR LGTM, however, I still wonder why would DefaultDataSourceResolver be "engaged" in non-JDBC contexts (and subsequently get the InactiveBeanException) 🤔

@Subham-KRLX : Could you explore this a bit more?

Perhaps DefaultDataSourceResolver may need to be a @Dependent bean rather than @ApplicationScoped 🤔

Hi @dimas-b, the root issue is Quarkus ArC's eager boot-time validation. Since DefaultDataSourceResolver is in runtime-common, it’s always discovered during boot (even in NoSQL profiles). The Quarkus JDBC extension disables the DataSource bean when JDBC is off.
A direct @Inject DataSource causes Quarkus to throw an InactiveBeanException immediately at boot because the required injection point cannot be satisfied. By using @Inject Instance<DataSource>, we make the injection point dynamic/lazy, bypassing the strict upfront validation. (Even if the bean were @Dependent, Quarkus ArC still eagerly validates all discovered injection points by default.)
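The boot-time behavior described here can be mimicked in plain Java with a Supplier standing in for CDI's Instance (an analogy only — Quarkus ArC's validation is more involved): eager resolution fails up front, while a deferred handle fails only on first use.

```java
import java.util.function.Supplier;
import javax.sql.DataSource;

/** Analogy: a supplier that throws, like an inactive JDBC DataSource bean. */
public class LazyInjectionSketch {
  static final Supplier<DataSource> INACTIVE_BEAN =
      () -> { throw new IllegalStateException("DataSource bean is inactive (JDBC disabled)"); };

  /** Eager style: resolves immediately, like a direct @Inject DataSource at boot. */
  static DataSource eager(Supplier<DataSource> container) {
    return container.get(); // throws right away if the bean is inactive
  }

  /** Lazy style: holds the handle, like @Inject Instance<DataSource>. */
  static Supplier<DataSource> lazy(Supplier<DataSource> container) {
    return container; // resolution deferred until .get() is called
  }

  public static void main(String[] args) {
    // Lazy wiring succeeds even though the underlying bean is inactive...
    Supplier<DataSource> handle = lazy(INACTIVE_BEAN);
    System.out.println("booted");
    // ...and only fails if/when the DataSource is actually requested.
    try {
      handle.get();
    } catch (IllegalStateException e) {
      System.out.println("failed on use: " + e.getMessage());
    }
  }
}
```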

@dimas-b (Contributor)

dimas-b commented Mar 12, 2026

@Subham-KRLX : thanks for your investigation into CDI startup behaviour (above) 👍

* provide their own {@link DataSourceResolver} bean to implement custom routing logic.
*/
@ApplicationScoped
@DefaultBean
Contributor:

Good idea 👍

Contributor:

Hmm I'm torn on @DefaultBean. It's not a bad idea per se, but 1) it's a Quarkus-specific, non-standard CDI extension and 2) in Polaris so far we've been using @Identifier instead, and at runtime we would select the bean via a producer method + a configuration option.

Contributor:

@Identifier works too. Yet, in this PR we do not have any alternatives, so using no qualifiers should work too, AFAIK.

Custom implementations can use @Alternative + @Priority to override.


/** The type of store representing the workload pattern. */
enum StoreType {
METASTORE
Contributor:

Sorry, I might have missed this before, but this enum appears to be rather confusing now because JdbcBasePersistenceImpl implements both the MetaStore interfaces and MetricsPersistence.

Given the current state of the JDBC-related code I'd prefer to remove this parameter completely.

We can add it later if/when MetricsPersistence is isolated from MetaStore Persistence.

Contributor Author:

Agreed, StoreType has been removed entirely in the latest push. As you noted, it doesn't add value while JdbcBasePersistenceImpl handles both MetaStore and MetricsPersistence. We can reintroduce it once the workloads are actually isolated.
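After this change, the runtime contract reduces to realm-only routing (a sketch; RealmContext is again stubbed as a one-method interface for self-containment):

```java
import javax.sql.DataSource;

/** Stub of org.apache.polaris.core.context.RealmContext for this sketch. */
interface RealmContext {
  String getRealmIdentifier();
}

/** Final shape after review: realm-only routing, no StoreType parameter. */
interface DataSourceResolver {
  DataSource resolve(RealmContext realmContext);
}

public class FinalContractSketch {
  public static void main(String[] args) {
    DataSourceResolver resolver = ctx -> null; // trivial impl for demonstration
    System.out.println(resolver.resolve(() -> "realm-x") == null);
  }
}
```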



private final Instance<DataSource> defaultDataSource;

@Inject
Contributor:

You are probably missing the @Any annotation here.

But more broadly: if this impl can only handle one datasource, the default one, why not just inject it directly?

Contributor:

IIRC, there were some issues with the bean not being available in non-JDBC situations (e.g. tests)... that's how I interpret @Subham-KRLX's message: #3960 (comment)

Contributor:

OK makes sense 👍

*/
@ApplicationScoped
@DefaultBean
public class DefaultDataSourceResolver implements DataSourceResolver {
Contributor:

I'm not sure I understand why this class was placed in runtime/common. This creates a strange inheritance. This class could live in polaris-relational-jdbc alongside its interface. That module already has application-scoped beans. (The only thing it doesn't have is the @DefaultBean annotation – see my other comment.)

Contributor:

Actually the question of where to place implementations of this interface is a tricky one.

When designing a true multi-datasource implementation, it will need the Quarkus Agroal extension, and therefore, would likely have to live in runtime/service.

It would look like this:

@ApplicationScoped
@Identifier("per-realm")
public class PerRealmDataSourceResolver implements DataSourceResolver {

  private static final Logger LOGGER = LoggerFactory.getLogger(PerRealmDataSourceResolver.class);

  @Override
  public DataSource resolve(RealmContext realmContext, StoreType storeType) {
    String dataSourceName = findDataSourceName(realmContext, storeType);
    LOGGER.debug(
        "Using DataSource '{}' for realm '{}' and store '{}'",
        dataSourceName,
        realmContext.getRealmIdentifier(),
        storeType);
    return AgroalDataSourceUtil.dataSourceIfActive(dataSourceName)
        .orElseThrow(
            () ->
                new IllegalStateException(
                    "DataSource '" + dataSourceName + "' is not active or does not exist"));
  }

  private String findDataSourceName(RealmContext realmContext, StoreType storeType) {
    ...
  }
}

But the problem is that polaris-relational-jdbc is not in the compile classpath of runtime/service, only in its runtime classpath... Another option is to add the quarkus-agroal extension to polaris-relational-jdbc and make it a quarkus module.

I think this needs some more thinking.

Contributor:

How about we add a new module like polaris-relational-jdbc-runtime with Quarkus + Agroal dependencies?

The idea is for polaris-relational-jdbc to remain free from Quarkus-specific dependencies (Jakarta annotations are ok). IIRC, we tried doing something like that with JDBC persistence initially, but it probably did not materialize completely 😅

Contributor:

If we use @Identifier for selecting a specific DataSourceResolver impl., I suppose the producer code will go into polaris-relational-jdbc-runtime as well.

Downstream alternatives will have the option of reusing polaris-relational-jdbc-runtime and configuring a custom identifier value, or not including polaris-relational-jdbc-runtime and making a custom producer.

Contributor:

Do we have a compelling reason not to turn polaris-relational-jdbc into a Quarkus module? But if we do, I'm OK with polaris-relational-jdbc-runtime.

Contributor:

Only "general principles" kind of reason 🙂

I tend to think it is valuable to isolate runtime/CDI concerns from the basic code/logic of a particular component like JDBC persistence. In that regard, ideally, polaris-relational-jdbc could be reused in any runtime env. (Quarkus or something else). polaris-relational-jdbc-runtime would have the code specific to the way Polaris uses Quarkus.

I believe the NoSQL Persistence impl and the OPA Authorizer follow similar principles (perhaps varying in degree, but similar in essence). I also proposed a similar refactoring for the Admin tool in #3947.

I do not really know whether this matters for downstream Polaris users right now 🤔 🤷 In my personal experience, I find that it is much easier to include another module into a build than struggle with excluding intruding dependencies 😅

I know some concerns were raised about module proliferation in Polaris, but from my POV, Gradle is able to deal with a large number of modules very well, so I personally do not see it as an issue.

Contributor:

Looking at the evolution of this PR (and this class specifically), I think it's probably fine to move CDI/quarkus-related code into the polaris-relational-jdbc module.


…pattern

- Removed StoreType enum from DataSourceResolver interface
- Simplified resolve() to accept only RealmContext
- Moved DefaultDataSourceResolver back to polaris-relational-jdbc
- Replaced @DefaultBean with @Identifier("polaris") per project convention
- Added @Any to Instance<DataSource> injection
- Added DataSourceResolver producer in ServiceProducers
- Added dataSourceResolver config property to PersistenceConfiguration
- Updated all getDatasourceOperations() call sites
@Subham-KRLX (Contributor Author)

@adutra @dimas-b @flyingImer Thanks for the detailed feedback! I've pushed an update that addresses all comments. StoreType has been removed entirely since JdbcBasePersistenceImpl currently handles both MetaStore and MetricsPersistence — we can reintroduce it once workloads are actually isolated. @DefaultBean has been replaced with @Identifier("polaris") and a producer method in ServiceProducers, following the standard Polaris pattern for pluggable components. DefaultDataSourceResolver now lives in polaris-relational-jdbc alongside its interface, and JdbcMetaStoreManagerFactory injects DataSourceResolver directly. I kept Instance (with @Any) in the default resolver to avoid InactiveBeanException in non-JDBC profiles.

*/
@ApplicationScoped
@DefaultBean
@Identifier("polaris")
Contributor:

Let's use @Identifier("default") instead.

* use. Only applicable when using JDBC persistence.
*/
@WithDefault("polaris")
String dataSourceResolver();
Contributor:

You don't need this new method, just use the existing type() method above.

Contributor:

Oh sorry - I read too quickly 😅

But to align with other similar cases, it would be good to turn this into polaris.persistence.datasource-resolver.type:

  @WithDefault("default")
  @WithName("polaris.persistence.datasource-resolver.type")
  String dataSourceResolver();

Contributor Author:

@adutra @dimas-b I will rename the identifier to "default" and update the config property to polaris.persistence.datasource-resolver.type as suggested. Regarding the broader discussion on module placement — I agree a dedicated polaris-relational-jdbc-runtime module sounds like a clean approach for keeping Quarkus/Agroal dependencies separate. Happy to tackle that as a follow-up or incorporate it here if you'd prefer. Will push the fixes along with CI fixes shortly.

Contributor:

Imho a follow-up is fine, because DefaultDataSourceResolver doesn't need Agroal, so no need for a Quarkus module.

Contributor:

BTW I think my example above is wrong, should be:

  @WithDefault("default")
  @WithName("datasource-resolver.type")
  String dataSourceResolver();

@dimas-b (Contributor) commented Mar 13, 2026:

Would it be possible to isolate JDBC-specific config from the general Persistence config?

I believe NoSQL persistence has some local config classes already.

dataSourceResolver() is only relevant to JDBC, so any Polaris deployments that do not use JDBC, should not be burdened with this config, IMHO.

Upd: I believe it is best to resolve this in current PR. I know it's extra work, but why introduce a config skew only to have to fix it in a follow-up PR right after?

Contributor Author:

@adutra Got it, will use @Identifier("default") and @WithName("datasource-resolver.type") with @WithDefault("default"). Pushing the fix along with CI fixes shortly.

Contributor:

+1 to @WithName("datasource-resolver.type")

Contributor Author:

@adutra @dimas-b Pushed the final updates:

  1. Renamed to @Identifier("default").
  2. Isolated dataSourceResolverType config into RelationalJdbcConfiguration so non-JDBC isn't burdened.

Comment on lines +40 to +42
default Optional<String> dataSourceResolverType() {
return Optional.of("default");
}
Contributor:

nit: if there is a known default, using Optional is generally superfluous.

Also, it would be better to leave this unimplemented here:

String dataSourceResolverType();

But override it in QuarkusRelationalJdbcConfiguration as follows:

@ConfigMapping(prefix = "polaris.persistence.relational.jdbc")
public interface QuarkusRelationalJdbcConfiguration extends RelationalJdbcConfiguration {

  @WithDefault("default")
  @Override
  String dataSourceResolverType();
}

Contributor Author:

I have applied the final dataSourceResolverType / QuarkusRelationalJdbcConfiguration / ServiceProducers changes per review.

@dimas-b (Contributor) left a comment:

Main point from my side in this review round: Moving all CDI code for JDBC to the polaris-relational-jdbc module... WDYT?

implementation(project(":polaris-api-catalog-service"))

runtimeOnly(project(":polaris-relational-jdbc"))
implementation(project(":polaris-relational-jdbc"))
Contributor:

I'd prefer to avoid making this dependency more prominent.

Ideally, only runtime/server should depend on polaris-relational-jdbc, but keeping the old runtime-only dep for now is fine too.

Contributor Author:

I completely agree. I have moved the CDI logic into a new
JdbcCdiProducers class directly within polaris-relational-jdbc. Because ServiceProducers no longer imports from the JDBC module, I’ve reverted the dependency in runtime/service/build.gradle.kts back to runtimeOnly so it doesn't bleed into the compile classpath.

Optional<String> databaseType();

/** The type of the {@link DataSourceResolver} to use. */
String dataSourceResolverType();
Contributor:

Does this method have to be here? The JDBC persistence code (configured via this interface) does not actually call it (and does not need to call).

Contributor Author

I had to keep it here because of the producer move. Since JdbcCdiProducers is now bundled inside this module, our own internal wiring logic relies on dataSourceResolverType() to determine which bean to retrieve. The low-level connection logic ignores it, but the CDI injection needs it. I've updated the Javadoc to clarify this.

}

@Produces
public DataSourceResolver dataSourceResolver(
Contributor

Should it be @ApplicationScoped? I do not believe it can change at runtime.

Contributor Author

I have marked the relocated producer method as @ApplicationScoped in the new JdbcCdiProducers class.

RelationalJdbcConfiguration jdbcConfig,
@Any Instance<DataSourceResolver> dataSourceResolvers) {
String type = jdbcConfig.dataSourceResolverType();
return dataSourceResolvers.select(Identifier.Literal.of(type)).get();
Contributor

It looks like we're leaning towards moving CDI/Quarkus-related code into the polaris-relational-jdbc module.

Let's move this producer there too. This way the more general-purpose runtime/service code can remain free from JDBC-specific logic and dependencies.

*/
@ApplicationScoped
@DefaultBean
public class DefaultDataSourceResolver implements DataSourceResolver {
Contributor

Looking at the evolution of this PR (and this class specifically), I think it's probably fine to move CDI/quarkus-related code into the polaris-relational-jdbc module.


@ConfigMapping(prefix = "polaris.persistence.relational.jdbc")
-public interface QuarkusRelationalJdbcConfiguration extends RelationalJdbcConfiguration {}
+public interface QuarkusRelationalJdbcConfiguration extends RelationalJdbcConfiguration {
Contributor

As noted in other comment threads, this class should probably go to polaris-relational-jdbc, where it would be co-located with the CDI producers for JDBC stuff.

Contributor Author

I left QuarkusRelationalJdbcConfiguration in runtime/common for now. Moving it into polaris-relational-jdbc would require introducing smallrye-config (and Quarkus-specific annotations) as a dependency of the generic JDBC module, which goes against keeping the core persistence layer free of Quarkus dependencies.

@flyingImer (Contributor) left a comment

Will go another round since last review

@Subham-KRLX (Contributor Author)

Will go another round since last review

Thanks @flyingImer! Heads up before your next pass: based on @dimas-b's latest feedback, all CDI producer wiring for DataSourceResolver has now been moved out of runtime/service and properly encapsulated directly inside the polaris-relational-jdbc module.

@singhpk234 (Contributor) left a comment

I am not sure I understand the motivation for having a realm-specific DataSource.

The simplest use case I can think of is having dedicated datasources for the entity table, events table, and idempotency table respectively, so each of them has its own connection pool. Entity-table connections should keep working, and features like the events table and idempotency should not affect the core functionality of Polaris. I recommend we start from there.

If people want to split these tables onto individual nodes, they can use tools like Citus (https://docs.citusdata.com/en/stable/get_started/what_is_citus.html), or CockroachDB, which we now support and which auto-shards tables across nodes.
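The per-table-group split described above can be sketched in plain Java, without the CDI wiring. This is a hypothetical illustration only; TableGroup, WorkloadRoutingResolver, and stub() are made-up names, not code from this PR:

```java
import javax.sql.DataSource;
import java.lang.reflect.Proxy;
import java.util.Map;

// Hypothetical sketch: each table group gets its own connection pool,
// with unmapped groups falling back to the main pool. Names are illustrative.
enum TableGroup { ENTITY, EVENTS, IDEMPOTENCY }

final class WorkloadRoutingResolver {
    private final Map<TableGroup, DataSource> dedicated;
    private final DataSource fallback;

    WorkloadRoutingResolver(Map<TableGroup, DataSource> dedicated, DataSource fallback) {
        this.dedicated = Map.copyOf(dedicated);
        this.fallback = fallback;
    }

    // Exhausting a dedicated pool (e.g. events) cannot starve the fallback
    // pool that serves the entity tables.
    DataSource resolve(TableGroup group) {
        return dedicated.getOrDefault(group, fallback);
    }

    // Test stub: a no-op DataSource, never used for real connections.
    static DataSource stub() {
        return (DataSource) Proxy.newProxyInstance(
            DataSource.class.getClassLoader(),
            new Class<?>[] {DataSource.class},
            (proxy, method, args) -> null);
    }
}
```

With this shape, a stalled events workload cannot exhaust the connections used by the entity tables, which is the isolation property argued for above.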

@Subham-KRLX (Contributor Author)

I am not sure I understand the motivation for having a realm-specific DataSource.

The simplest use case I can think of is having dedicated datasources for the entity table, events table, and idempotency table respectively, so each of them has its own connection pool. Entity-table connections should keep working, and features like the events table and idempotency should not affect the core functionality of Polaris. I recommend we start from there.

If people want to split these tables onto individual nodes, they can use tools like Citus (https://docs.citusdata.com/en/stable/get_started/what_is_citus.html), or CockroachDB, which we now support and which auto-shards tables across nodes.

@singhpk234 I completely agree that separating connection pools by workload (entities vs events vs idempotency) is the end goal! In fact, my previous iteration of this PR included a StoreType parameter exactly for that. However, @dimas-b and @flyingImer correctly pointed out that since JdbcBasePersistenceImpl currently processes all of these tables together, adding the workload routing logic now is getting ahead of the actual implementation.
To respect that feedback, we narrowed the scope purely to establishing the CDI wiring foundation via RealmContext. Once the actual work to isolate the events and idempotency workloads begins, this SPI is built to be easily expanded (like re-adding StoreType) to support your exact recommendation.

@singhpk234 (Contributor) commented Mar 18, 2026

Once the actual work to isolate the events and idempotency workloads begins,

I believe we should start from there then. TBH I don't see much value in having a realm-specific datasource; if someone wants that, they can just run 2 different JVMs of Polaris, one hosting only realm1 and another hosting only realm2, and route between the JVMs in an outer layer. And nevertheless, given that we eventually want table-level connection pools, we should get the isolation in place.

@Subham-KRLX (Contributor Author)

Once the actual work to isolate the events and idempotency workloads begins,

I believe we should start from there then. TBH I don't see much value in having a realm-specific datasource; if someone wants that, they can just run 2 different JVMs of Polaris, one hosting only realm1 and another hosting only realm2, and route between the JVMs in an outer layer. And nevertheless, given that we eventually want table-level connection pools, we should get the isolation in place.

@singhpk234 The primary motivation for realm-specific routing is to support scalable multi-tenancy. For a SaaS deployment hosting thousands of tenants (realms), spinning up thousands of separate JVMs isn't feasible.
This DataSourceResolver allows operators to dynamically route different realms to different physical databases (e.g. enterprise vs. free-tier) from a single deployment. This PR establishes that critical foundational hook now, while keeping the door open to add StoreType for workload isolation (entities vs events) as soon as the JDBC layer is updated to support it.
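The tier-based routing described here could look like the following plain-Java sketch (hypothetical: the RealmRoutingResolver name, realm IDs, and stub() helper are illustrative and not part of this PR):

```java
import javax.sql.DataSource;
import java.lang.reflect.Proxy;
import java.util.Map;

// Hypothetical sketch: realm-to-DataSource routing with a shared default pool.
// Class and realm names are illustrative only.
final class RealmRoutingResolver {
    private final Map<String, DataSource> byRealm; // e.g. "acme-enterprise" -> dedicated DB
    private final DataSource defaultPool;          // everyone else (e.g. free tier)

    RealmRoutingResolver(Map<String, DataSource> byRealm, DataSource defaultPool) {
        this.byRealm = Map.copyOf(byRealm);
        this.defaultPool = defaultPool;
    }

    DataSource resolve(String realmId) {
        return byRealm.getOrDefault(realmId, defaultPool);
    }

    // Test stub: a no-op DataSource, never used for real connections.
    static DataSource stub() {
        return (DataSource) Proxy.newProxyInstance(
            DataSource.class.getClassLoader(),
            new Class<?>[] {DataSource.class},
            (proxy, method, args) -> null);
    }
}
```

Which realms map to which physical database would come from deployment configuration; nothing in this shape requires one JVM per realm.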

@singhpk234 (Contributor)

For a SaaS deployment hosting thousands of tenants (realms), spinning up thousands of separate JVMs isn't feasible.

Why would you spin up 1000s of separate JVMs? Just club the realms you want into 1 JVM: say you have an enterprise tier (realm1, ..., realmN) that all goes to your enterprise-tier JVM, and the rest all go to the free tier; you can route your requests to a Polaris instance based on your tier/realm. You don't have to serve 1 realm from 1 JVM.

Again, I am not sure what foundation is being established here; this is just adding an interface which is very soon going to change when StoreType comes. So just to reiterate: I think we should focus on having dedicated connection pools and achieving that isolation.

@Subham-KRLX (Contributor Author) commented Mar 18, 2026

For a SaaS deployment hosting thousands of tenants (realms), spinning up thousands of separate JVMs isn't feasible.

Why would you spin up 1000s of separate JVMs? Just club the realms you want into 1 JVM: say you have an enterprise tier (realm1, ..., realmN) that all goes to your enterprise-tier JVM, and the rest all go to the free tier; you can route your requests to a Polaris instance based on your tier/realm. You don't have to serve 1 realm from 1 JVM.

Again, I am not sure what foundation is being established here; this is just adding an interface which is very soon going to change when StoreType comes. So just to reiterate: I think we should focus on having dedicated connection pools and achieving that isolation.

@singhpk234 Routing at the outer layer tightly couples Polaris JVMs to specific database instances. A DataSourceResolver solves this by keeping compute stateless, allowing operators to migrate high-growth realms to dedicated databases seamlessly in the background. Regarding StoreType for workload isolation, I completely agree; I actually had it in my previous commits. I removed it because @dimas-b and @flyingImer felt that updating JdbcBasePersistenceImpl to actually utilize it was too much scope creep for this specific PR. I am more than happy to add StoreType back in if we agree it belongs in this first iteration.

@singhpk234 (Contributor)

allowing operators to migrate high-growth realms to dedicated databases seamlessly in the background

I don't think a migration can happen in this scenario without an application restart. Also, I don't think your example of enterprise tier vs free tier fits this case: at one point you are advocating that the enterprise tier needs a dedicated DS, while making the enterprise and free tiers share the same JVM. This seems fundamentally wrong to me, as it is not true isolation. Also, migrating a user from one DS to another is not something we can do at the moment; one needs to copy and keep the sources in sync, and we don't have tools for that.

@Subham-KRLX (Contributor Author)

@singhpk234 That's a fair point: true isolation requires physically separate JVM clusters to prevent noisy neighbors, and the migration tooling isn't there yet. The core intent of this PR is just to introduce the extensibility hook (SPI), so Polaris isn't locked to a single static DataSource connection pool like it is today. That being said, I agree that workload isolation (entities vs events) via dedicated connection pools is a critical use case. To fully support your vision, would you be comfortable moving forward if I simply re-add StoreType to the resolve() signature so it supports BOTH (RealmContext, StoreType) right from the start?

@dimas-b (Contributor) commented Mar 19, 2026

@Subham-KRLX @singhpk234 @flyingImer :

Regarding StoreType for workload isolation [...]

If you prefer to resolve the workload isolation problem in code first, I'd support that. We could put this PR on hold, isolate MetaStore, Metrics, Idempotency, then resume work on the DataSource resolver.

Going the other way is fine too 🙂 The end result is going to be the same (at least that's how I see it).

@dimas-b (Contributor) commented Mar 19, 2026

Re: DataSource / realm mapping at deployment time - I think this is a concern of the Polaris "owner". I believe Polaris as an OSS project should empower users to make customizations that fit their specific deployment choices. I believe Polaris should remain neutral / flexible and not prescribe any specific deployment strategies.

@Subham-KRLX 's comments related to this resonate with my personal view on this matter.

@flyrain (Contributor) commented Mar 19, 2026

As I said in the dev mailing list [1], introducing multiple data sources requires a proper design before jumping to an implementation. There are multiple factors to consider, and we may gather more feedback given that JDBC is widely used by the community. Blindly introducing a data source resolver doesn't make sense to me. I'd suggest working on a design doc first.

  1. https://lists.apache.org/thread/dvr22fqw45zd49sdw4027xrz4gq2xpbc

@Subham-KRLX (Contributor Author) commented Mar 19, 2026

@flyrain, @singhpk234, and @dimas-b: I agree that a formal design document is appropriate and will prepare one that covers both the multi-tenancy requirements and the immediate need for workload-based isolation. To ensure a proper foundation, I am re-adding StoreType (Metastore, Metrics, Events) to the DataSourceResolver interface now. This will allow the proposal to explicitly address the concerns raised on the dev list regarding transactional consistency and the limitations of cross-table joins when splitting data sources. I will post the design doc for review shortly.

@adnanhemani (Contributor)

Agreed here with some of the comments asking what value realm-based datasource resolution brings to the codebase, especially with a shared JVM controlling the request flow. I am afraid that adding this type of functionality without any metrics, or without considering alternatives, may bring unnecessary complication to the codebase. If a design doc is being written to support this use case, it would be good to include any data that may exist to prove the need for this feature.

I do think we are all on the same page that workload-based isolation (different datasources for events, metrics, metadata, etc.) makes sense, and we need this feature sooner rather than later. I would also suggest starting the implementation from that angle instead.

@Subham-KRLX (Contributor Author) commented Mar 20, 2026

Thanks @adnanhemani, @flyrain, and @singhpk234. I completely agree that workload-based isolation (Metrics vs Events vs Metastore) is the highest priority. I've updated the PR to re-instate StoreType (METASTORE, METRICS, EVENTS) so the immediate focus is on establishing the foundation for separating connection pools.
I have also published a formal design proposal for review here: #4032
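The re-instated two-dimensional SPI can be sketched as follows. This is a hedged sketch: the enum values (METASTORE, METRICS, EVENTS) follow the thread above, but the exact API in the PR may differ, e.g. resolve() may take a RealmContext rather than a realm id string, and DefaultResolver/stub() are illustrative names:

```java
import javax.sql.DataSource;
import java.lang.reflect.Proxy;

// Sketch of the two-dimensional SPI discussed in this thread.
enum StoreType { METASTORE, METRICS, EVENTS }

interface DataSourceResolver {
    DataSource resolve(String realmId, StoreType storeType);
}

// Backward-compatible default: every realm and every store type shares one
// pool, mirroring the no-op routing behavior described for this PR.
final class DefaultResolver implements DataSourceResolver {
    private final DataSource defaultPool;

    DefaultResolver(DataSource defaultPool) {
        this.defaultPool = defaultPool;
    }

    @Override
    public DataSource resolve(String realmId, StoreType storeType) {
        return defaultPool;
    }

    // Test stub: a no-op DataSource, never used for real connections.
    static DataSource stub() {
        return (DataSource) Proxy.newProxyInstance(
            DataSource.class.getClassLoader(),
            new Class<?>[] {DataSource.class},
            (proxy, method, args) -> null);
    }
}
```

If the final signature takes a RealmContext instead of a raw realm id, only the first parameter type changes; the routing contract stays the same.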



Development

Successfully merging this pull request may close these issues.

Support multiple data sources in JDBC persistence

7 participants