
Cluster redirect #35

Open
rajesh-meesho wants to merge 24 commits into master from cluster-redirect
Conversation

@rajesh-meesho

@rajesh-meesho rajesh-meesho commented Feb 17, 2026

What is this PR for?

A few sentences describing the overall goals of the pull request's commits.
First time? Check out the contributing guide - https://zeppelin.apache.org/contribution/contributions.html

What type of PR is it?

Bug Fix
Improvement
Feature
Documentation
Hot Fix
Refactoring
Please keep only the type that applies to your PR

Todos

  • Task

What is the Jira issue?

How should this be tested?

  • Strongly recommended: add automated unit tests for any new or changed behavior
  • Outline any manual steps to test the PR here.

Screenshots (if appropriate)

Questions:

  • Do the license files need to be updated?
  • Are there breaking changes for older versions?
  • Does this need documentation?

Summary by CodeRabbit

  • New Features

    • Per-statement JDBC URL override at execution time; queries can target an alternate DB URL.
    • Per-call URL override on connection requests.
  • Improvements

    • Validation now includes raw SQL and raw JDBC URL and can return an updated JDBC URL and/or SQL.
    • Connections refresh automatically when the resolved URL changes; override usage is logged.
  • Cleanup

    • Connection pools are tracked per-URL to avoid cross-URL leakage and improve pool management.
  • Bug Fixes

    • WebSocket reconnects only on abnormal closures (non-normal close codes).

@coderabbitai

coderabbitai bot commented Feb 17, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Implements per-statement JDBC URL validation and optional per-call URL override; introduces SHA-256 URL-hashed pool names and per-URL pool management; adds a new getConnection overload; switches ValidationRequest to Gson serialization; extends ValidationResponse with URL fields; tracks registered pools in JDBCUserConfigurations.
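The SHA-256 URL-hashed pool naming mentioned above can be illustrated with a minimal sketch. This is a hypothetical helper, not the PR's actual code: the class name, the `user + "_" + hex` naming scheme, and the sanitizing fallback are assumptions, but the hashing itself uses only standard `MessageDigest` APIs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class PoolNameHasher {
  // Derive a stable, URL-specific pool name by hashing the JDBC URL with
  // SHA-256 and hex-encoding the digest. Two different URLs for the same
  // user yield two different pool names, so pools never cross URLs.
  public static String poolName(String user, String jdbcUrl) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-256");
      byte[] digest = md.digest(jdbcUrl.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : digest) {
        hex.append(String.format("%02x", b));
      }
      return user + "_" + hex;
    } catch (NoSuchAlgorithmException e) {
      // Fallback (assumed behavior): sanitize the raw URL if SHA-256 is
      // somehow unavailable on this JVM.
      return user + "_" + jdbcUrl.replaceAll("[^a-zA-Z0-9]", "_");
    }
  }

  public static void main(String[] args) {
    System.out.println(poolName("alice", "jdbc:mysql://cluster-a:3306/app"));
  }
}
```

The hash keeps the pool name fixed-length and free of characters that are illegal in pool identifiers, regardless of how long or oddly-shaped the JDBC URL is.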

Changes

Cohort / File(s) — Summary

  • JDBC Interpreter (validation, URL override, pool naming)
    jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java
    Adds getConnection(InterpreterContext, String) and delegates the old overload; implements the per-statement ValidationRequest/ValidationResponse flow, per-call URL override, mid-execution URL switches (close/reacquire), and URL-hashed pool naming with updated pool create/close logic.

  • Validation Request (Gson serialization)
    jdbc/src/main/java/org/apache/zeppelin/jdbc/ValidationRequest.java
    Adds rawQueryText and rawJdbcUrl fields with @SerializedName; constructor updated; toJson() replaced by Gson-based serialization.

  • Validation Response (URL fields)
    jdbc/src/main/java/org/apache/zeppelin/jdbc/ValidationResponse.java
    Adds rawJdbcUrl and newJdbcUrl fields with getters/setters; updates fromJson to populate the new fields with appropriate defaults.

  • JDBC User Configurations (pool tracking & ops)
    jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java
    Introduces a registeredPools Set; adds saveDBDriverPool(PoolingDriver, String), getPoolingDriver(), removePoolName(String), and isConnectionInDBDriverPool(String); updates init/remove behavior to manage registered pools.

  • Websocket reconnect logic
    zeppelin-web-angular/projects/zeppelin-sdk/src/message.ts
    Adds a check for the normal close code (1000) and triggers reconnect only on abnormal WebSocket closure.

  • Cross-cutting imports / utilities
    jdbc/src/main/java/org/apache/zeppelin/jdbc/...
    Adds SHA-256 and charset imports and related logging to support URL-hashed pool name generation and fallback handling.

Sequence Diagram(s)

Mermaid:
sequenceDiagram
participant Client
participant JDBCInterpreter
participant Validator
participant PoolManager as ConnectionPool/Driver
participant Context as InterpreterContext

Client->>JDBCInterpreter: executeSql(statement, context)
Note right of JDBCInterpreter: build ValidationRequest(queryText, rawQueryText, user, interpreterName, rawJdbcUrl)
JDBCInterpreter->>Validator: validate(ValidationRequest)
alt Validator returns newJdbcUrl or newQueryText
    Validator-->>JDBCInterpreter: ValidationResponse(newJdbcUrl?, newQueryText?)
    Note right of JDBCInterpreter: update targetJdbcUrl and/or SQL
else no change or error
    Validator-->>JDBCInterpreter: no-op or error
end
JDBCInterpreter->>PoolManager: getConnection(context, overrideUrl?)
PoolManager-->>JDBCInterpreter: Connection
JDBCInterpreter->>PoolManager: execute SQL on Connection
PoolManager-->>JDBCInterpreter: results / errors
JDBCInterpreter->>Context: write results / errors
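The "update targetJdbcUrl and/or SQL" step in the diagram above can be sketched as a small resolver. This is an illustrative sketch, not the PR's actual API: the inner `ValidationResponse` class and the `resolve` helper are assumptions standing in for the real response handling, which must also tolerate a null response when the validation call fails.

```java
public class ValidationFlow {
  // Minimal stand-in for the validator's response (assumed field names).
  static class ValidationResponse {
    String newJdbcUrl;
    String newQueryText;
  }

  // Apply the validator's overrides, falling back to the originals when the
  // response is null (validation failed) or a field is absent/empty.
  static String[] resolve(String defaultUrl, String sql, ValidationResponse r) {
    String url = (r != null && r.newJdbcUrl != null && !r.newJdbcUrl.isEmpty())
        ? r.newJdbcUrl : defaultUrl;
    String query = (r != null && r.newQueryText != null && !r.newQueryText.isEmpty())
        ? r.newQueryText : sql;
    return new String[] {url, query};
  }

  public static void main(String[] args) {
    ValidationResponse r = new ValidationResponse();
    r.newJdbcUrl = "jdbc:mysql://cluster-b/db";
    String[] out = resolve("jdbc:mysql://cluster-a/db", "SELECT 1", r);
    // Override URL applied; SQL left as the original.
    System.out.println(out[0] + " | " + out[1]);
  }
}
```

Resolving both values once, before any statement splitting, avoids the per-statement rewrite bug flagged later in this review.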
🚥 Pre-merge checks | ✅ 1 | ❌ 1

❌ Failed checks (1 warning)

  • Krd Checker — ⚠️ Warning. Explanation: the PR description lacks a KRD link or exemption details; it only contains the template without KRD-related information. Resolution: add a KRD link or complete the KRD exemption (Pod Type and a 20+ character justification) in the PR description.

✅ Passed checks (1 passed)

  • Dynamic Configuration Validation — ✅ Passed. Explanation: no changes to application-dyn-*.yml files detected; all modified files are source code (Java and TypeScript), not dynamic configuration files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java (1)

898-905: ⚠️ Potential issue | 🟠 Major

Fix String comparison to avoid wrong version branch.

Line 902 uses == for String comparison, which checks reference equality and can route to the wrong branch. Use .equals() instead.

🐛 Suggested fix
-              if(response.getVersion() == "v1") {
+              if ("v1".equals(response.getVersion())) {

Additionally, lines 929-932 write internal debug information (context state, configuration maps, properties) directly to user output, which may inadvertently expose sensitive configuration or contextual details.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java` around
lines 898 - 905, The version check in the ValidationResponse handling uses
reference equality (response.getVersion() == "v1") which can misroute logic;
change it to use .equals("v1") (or "v1".equals(response.getVersion()) to guard
NPE) inside the code that constructs the response handling after
sendValidationRequest(ValidationRequest). Also remove or sanitize the debug
output that writes internal state/config/properties to user-facing output (the
block that appends context state, configuration maps, and properties to the
finalOutput/StringBuilder) so sensitive configuration is not exposed—either omit
those details from user output or log them at a debug level to an internal
logger instead.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java`:
- Around line 953-956: Remove the user-facing writes that dump internal
configuration maps in JDBCInterpreter (the context.out.write calls that print
context, basePropertiesMap, jdbcUserConfigurationsMap, and
getJDBCConfiguration(user).getProperty()), and either delete these lines or
replace them with a minimal, non-sensitive status message; if you must log
configuration metadata, log only redacted keys/counts to a secure logger (not
context.out) and ensure getJDBCConfiguration(user) is never printed verbatim.

---

Outside diff comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java`:
- Around line 898-905: The version check in the ValidationResponse handling uses
reference equality (response.getVersion() == "v1") which can misroute logic;
change it to use .equals("v1") (or "v1".equals(response.getVersion()) to guard
NPE) inside the code that constructs the response handling after
sendValidationRequest(ValidationRequest). Also remove or sanitize the debug
output that writes internal state/config/properties to user-facing output (the
block that appends context state, configuration maps, and properties to the
finalOutput/StringBuilder) so sensitive configuration is not exposed—either omit
those details from user output or log them at a debug level to an internal
logger instead.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java (2)

591-659: ⚠️ Potential issue | 🟠 Major

Override URL can reuse a pool for the wrong cluster.
Pools are keyed only by user; when overrideUrl differs, getConnectionFromPool can still reuse an existing pool created for the default URL, defeating cluster redirect and risking wrong‑DB reads/writes. Please key pools by URL (or recreate/bypass the pool when the override differs).
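The fix suggested here — segregating pools by user plus effective URL — can be sketched with a concurrent map. This is a hypothetical illustration of the keying scheme only; the class name and the `Object` pool placeholder are assumptions, not the interpreter's real pool type.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class UrlKeyedPools {
  private final Map<String, Object> pools = new ConcurrentHashMap<>();

  // Compose the key from both the user and the effective (post-override) URL
  // so a pool created for the default URL is never reused for an override.
  private static String key(String user, String effectiveUrl) {
    return user + "|" + effectiveUrl;
  }

  public Object getOrCreate(String user, String effectiveUrl) {
    // computeIfAbsent creates at most one pool per distinct user+URL pair.
    return pools.computeIfAbsent(key(user, effectiveUrl), k -> new Object());
  }

  public int size() {
    return pools.size();
  }

  public static void main(String[] args) {
    UrlKeyedPools p = new UrlKeyedPools();
    Object a = p.getOrCreate("alice", "jdbc:mysql://cluster-a/db");
    Object b = p.getOrCreate("alice", "jdbc:mysql://cluster-b/db"); // override URL
    System.out.println(a == b);   // false: distinct pools per URL
    System.out.println(p.size()); // 2
  }
}
```

With a user-only key, the second `getOrCreate` would silently return cluster-a's pool — exactly the wrong-DB read/write risk the comment describes.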

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java` around
lines 591 - 659, getConnection currently pools connections only by user so an
overrideUrl can reuse a pool for a different cluster; compute the effective URL
early (use the URL after
appendProxyUserToURL/appendTagsToURL/validateConnectionUrl), and change
getConnectionFromPool usage to include that effective URL in the pool key (or,
if simpler, bypass pooling when overrideUrl is provided) so pools are segregated
by URL+user; update any pool lookup/creation logic referenced in
getConnectionFromPool to accept and use the connectionUrl (or add an overloaded
getConnectionFromPool(connectionUrl, user, properties)) to ensure overrideUrl
does not reuse a pool for the wrong cluster.

887-997: ⚠️ Potential issue | 🔴 Critical

Add null guards for response and apply newQueryText before splitting.

The validation call (line ~845) is wrapped in a try-catch that logs exceptions without rethrowing, leaving response potentially null. Multiple subsequent dereferences at line 935 (response.isPreSubmitFail()) and throughout the block lack null checks, causing NullPointerException if validation fails.

Additionally, newQueryText is applied inside the per-statement loop (line 997), causing multi-statement inputs to repeatedly execute the same rewritten query instead of the original split statements. This should be applied once before splitting at line 888.

Suggested fix (null-safe + correct split)
-      List<String>  sqlArray = sqlSplitter.splitSql(sql);
+      String effectiveSql = (response != null && StringUtils.isNotBlank(response.getNewQueryText()))
+          ? response.getNewQueryText()
+          : sql;
+      List<String> sqlArray = sqlSplitter.splitSql(effectiveSql);
@@
-            if (response.isPreSubmitFail()) {
+            if (response != null && response.isPreSubmitFail()) {
@@
-            } else {
+            } else if (response != null) {
@@
-              sqlToExecute = response.getNewQueryText() != null ? response.getNewQueryText() : sqlToExecute;
+              // sqlToExecute already derived from effectiveSql
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java` around
lines 887 - 997, The response from the validation call can be null and
newQueryText must be applied before splitting; update the code so that
immediately after the validation call you check if response != null and then set
sql = response.getNewQueryText() (or leave sql unchanged if response is null),
then call sqlSplitter.splitSql(sql) (move the newQueryText logic out of the
per-statement loop). Also add null guards before every use of response (e.g.,
response.isPreSubmitFail(), response.getMessage(), response.getVersion(),
response.isFailFast(), response.isFailedByDeprecatedTable()) to avoid NPEs—treat
a null response as "no pre-submit failure" and continue execution or handle it
consistently with cancel(context) as currently used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java`:
- Around line 611-613: The current LOGGER.info call in JDBCInterpreter that logs
the raw overrideUrl can leak credentials; replace it so the logged value is
redacted or omitted. In the JDBCInterpreter code path where overrideUrl is
checked (the block containing LOGGER.info("Using override URL: {}",
overrideUrl)), either log a sanitized version (mask userinfo between "://" and
"@" and/or mask password query params) or simply log "Using override URL:
[REDACTED]"; implement the redaction helper near this block and use it when
calling LOGGER.info to avoid printing secrets.
- Around line 857-860: The code currently sets targetJdbcUrl from
response.getRawJdbcUrl(); update the override logic to prefer
response.getNewJdbcUrl() and fall back to response.getRawJdbcUrl() if newJdbcUrl
is null/empty: replace the block using response.getRawJdbcUrl() with a check for
response.getNewJdbcUrl() first (e.g., if (response.getNewJdbcUrl() != null &&
!response.getNewJdbcUrl().isEmpty()) { targetJdbcUrl = response.getNewJdbcUrl();
} else if (response.getRawJdbcUrl() != null &&
!response.getRawJdbcUrl().isEmpty()) { targetJdbcUrl = response.getRawJdbcUrl();
}) to ensure validation-service overrides are applied for targetJdbcUrl in
JDBCInterpreter.
- Around line 850-852: The ValidationRequest is being constructed with an
undefined variable userName; replace that argument with the existing variable
user (obtained via getUser(context)) so the call to the ValidationRequest
constructor uses user, sqlToValidate, interpreterName, sql, targetJdbcUrl (i.e.,
update the ValidationRequest(...) invocation to pass user instead of userName).

---

Outside diff comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java`:
- Around line 591-659: getConnection currently pools connections only by user so
an overrideUrl can reuse a pool for a different cluster; compute the effective
URL early (use the URL after
appendProxyUserToURL/appendTagsToURL/validateConnectionUrl), and change
getConnectionFromPool usage to include that effective URL in the pool key (or,
if simpler, bypass pooling when overrideUrl is provided) so pools are segregated
by URL+user; update any pool lookup/creation logic referenced in
getConnectionFromPool to accept and use the connectionUrl (or add an overloaded
getConnectionFromPool(connectionUrl, user, properties)) to ensure overrideUrl
does not reuse a pool for the wrong cluster.
- Around line 887-997: The response from the validation call can be null and
newQueryText must be applied before splitting; update the code so that
immediately after the validation call you check if response != null and then set
sql = response.getNewQueryText() (or leave sql unchanged if response is null),
then call sqlSplitter.splitSql(sql) (move the newQueryText logic out of the
per-statement loop). Also add null guards before every use of response (e.g.,
response.isPreSubmitFail(), response.getMessage(), response.getVersion(),
response.isFailFast(), response.isFailedByDeprecatedTable()) to avoid NPEs—treat
a null response as "no pre-submit failure" and continue execution or handle it
consistently with cancel(context) as currently used.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java (3)

936-936: ⚠️ Potential issue | 🔴 Critical

String comparison using == instead of .equals().

Line 936 compares strings using ==, which checks reference equality rather than value equality. This will likely always evaluate to false unless the strings are interned, causing the v1 error handling logic to be skipped.

🐛 Proposed fix
-              if(response.getVersion() == "v1") {
+              if("v1".equals(response.getVersion())) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java` at line
936, The string comparison in the JDBCInterpreter branch uses reference equality
(response.getVersion() == "v1") so the v1 branch will be skipped; change the
check to a value-safe comparison such as "v1".equals(response.getVersion()) or
Objects.equals(response.getVersion(), "v1") inside the same conditional to
correctly detect the v1 response version (look for the if that calls
response.getVersion()).

376-383: ⚠️ Potential issue | 🟠 Major

Missing connection and read timeouts on validation API call.

The HttpURLConnection is created without setting setConnectTimeout() or setReadTimeout(). If the validation service at FAIL_FAST_VALIDATE_URL is slow or unresponsive, this could cause query execution to block indefinitely.

🛡️ Proposed fix: Add reasonable timeouts
   private static HttpURLConnection createConnection() throws Exception {
     URL url = new URL(FAIL_FAST_VALIDATE_URL);
     HttpURLConnection connection = (HttpURLConnection) url.openConnection();
+    connection.setConnectTimeout(5000);  // 5 seconds
+    connection.setReadTimeout(10000);    // 10 seconds
     connection.setRequestMethod("POST");
     connection.setRequestProperty("Content-Type", "application/json");
     connection.setDoOutput(true); // Enable sending request body
     return connection;
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java` around
lines 376 - 383, The createConnection() method currently builds an
HttpURLConnection to FAIL_FAST_VALIDATE_URL without timeouts; update
createConnection() to call connection.setConnectTimeout(...) and
connection.setReadTimeout(...) with reasonable values (e.g., a few seconds)
before returning to avoid blocking indefinitely; ensure these calls are made on
the HttpURLConnection instance created in createConnection() so the validation
API call has both connect and read timeouts.

154-154: ⚠️ Potential issue | 🟠 Major

Hardcoded validation URL limits environment flexibility and uses insecure HTTP.

The validation API URL is hardcoded to a production endpoint using HTTP (not HTTPS). This creates two concerns:

  1. Cannot configure different endpoints for dev/staging/prod environments
  2. SQL queries sent for validation are transmitted unencrypted

The codebase already uses property-based configuration for similar settings (e.g., MAX_ROWS_KEY), making this hardcoded URL inconsistent with established patterns. Consider making this configurable via properties and using HTTPS.

🔧 Suggested approach
-  private static final String FAIL_FAST_VALIDATE_URL = "http://spark-event-listener.prd.meesho.int/api/validate";
+  private static final String FAIL_FAST_VALIDATE_URL_KEY = "zeppelin.jdbc.validation.url";
+  private static final String FAIL_FAST_VALIDATE_URL_DEFAULT = "";

   // Then in createConnection() or sendValidationRequest():
+  String validationUrl = getProperty(FAIL_FAST_VALIDATE_URL_KEY, FAIL_FAST_VALIDATE_URL_DEFAULT);
+  if (StringUtils.isBlank(validationUrl)) {
+    return null; // Skip validation if not configured
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java` at line
154, FAIL_FAST_VALIDATE_URL is hardcoded to an insecure production HTTP
endpoint; change it to a configurable property (e.g., read from interpreter or
system properties in JDBCInterpreter) and default to a secure HTTPS URL or an
empty/disabled value; update any code that references FAIL_FAST_VALIDATE_URL to
use the new property getter (preserve the symbol name or introduce a new
constant like FAIL_FAST_VALIDATE_URL_KEY) and ensure validation calls use HTTPS
by default and respect environment overrides.
🧹 Nitpick comments (1)
jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java (1)

861-863: Improve exception logging to include stack trace.

The current format LOGGER.warn("...: {}", e) only logs the exception's toString(). To capture the full stack trace, pass the exception as the last argument without a placeholder.

♻️ Proposed fix
     } catch (Exception e) {
-      LOGGER.warn("Failed to call validation API: {}", e);
+      LOGGER.warn("Failed to call validation API", e);
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java` around
lines 861 - 863, The catch block in JDBCInterpreter that currently does
LOGGER.warn("Failed to call validation API: {}", e) should be changed to pass
the exception as the throwable argument so the full stack trace is logged (e.g.,
LOGGER.warn("Failed to call validation API", e)); update the catch for Exception
e around the validation API call so LOGGER is invoked with the message and e as
the last parameter to include the stacktrace.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java`:
- Around line 854-863: The code calls sendValidationRequest(...) and catches
exceptions but leaves response null, leading to a possible NPE later when
calling response.isPreSubmitFail(); update the handling so that after the
try/catch you either initialize response to a safe default (e.g., a
ValidationResponse with safe defaults) or add null checks before any use
(notably before response.isPreSubmitFail() in the statement execution loop);
ensure changes reference the sendValidationRequest call site, the response
variable, targetJdbcUrl assignment, and the later response.isPreSubmitFail()
usage so null is handled consistently.

---

Outside diff comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java`:
- Line 936: The string comparison in the JDBCInterpreter branch uses reference
equality (response.getVersion() == "v1") so the v1 branch will be skipped;
change the check to a value-safe comparison such as
"v1".equals(response.getVersion()) or Objects.equals(response.getVersion(),
"v1") inside the same conditional to correctly detect the v1 response version
(look for the if that calls response.getVersion()).
- Around line 376-383: The createConnection() method currently builds an
HttpURLConnection to FAIL_FAST_VALIDATE_URL without timeouts; update
createConnection() to call connection.setConnectTimeout(...) and
connection.setReadTimeout(...) with reasonable values (e.g., a few seconds)
before returning to avoid blocking indefinitely; ensure these calls are made on
the HttpURLConnection instance created in createConnection() so the validation
API call has both connect and read timeouts.
- Line 154: FAIL_FAST_VALIDATE_URL is hardcoded to an insecure production HTTP
endpoint; change it to a configurable property (e.g., read from interpreter or
system properties in JDBCInterpreter) and default to a secure HTTPS URL or an
empty/disabled value; update any code that references FAIL_FAST_VALIDATE_URL to
use the new property getter (preserve the symbol name or introduce a new
constant like FAIL_FAST_VALIDATE_URL_KEY) and ensure validation calls use HTTPS
by default and respect environment overrides.

---

Nitpick comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java`:
- Around line 861-863: The catch block in JDBCInterpreter that currently does
LOGGER.warn("Failed to call validation API: {}", e) should be changed to pass
the exception as the throwable argument so the full stack trace is logged (e.g.,
LOGGER.warn("Failed to call validation API", e)); update the catch for Exception
e around the validation API call so LOGGER is invoked with the message and e as
the last parameter to include the stacktrace.


@coderabbitai coderabbitai bot left a comment


🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java`:
- Around line 842-847: Remove user-visible dumps that write SQL, JDBC URLs, and
validation payloads to context.out in JDBCInterpreter (replace the
context.out.write(...) calls that print getInterpreterGroup().getId(), user,
sql, getJDBCConfiguration(user).getProperty().getProperty(URL_KEY), and
ValidationRequest(...).toJson()). Instead, delete those writes or change them to
use LOGGER.debug with redacted or minimal metadata (e.g., interpreter id and a
masked URL or a boolean indicating presence of credentials) so sensitive values
are never written to context.out; apply the same change pattern for similar
calls around lines handling validation/response (also referenced by
ValidationRequest and getJDBCConfiguration).
- Around line 869-884: The try/catch calling sendValidationRequest can leave
response null and later dereferenced; update the logic around
sendValidationRequest/response handling (the response variable of type
ValidationResponse and the code that calls response.toString(),
response.getNewJdbcUrl(), and any later uses such as in the block guarded at or
near where targetJdbcUrl is assigned) to null-check response (or initialize a
default ValidationResponse) before any access; if response is null, write an
appropriate message to context.out and skip/short-circuit the subsequent
response.get... calls (and apply the same null-safety fix to the other
occurrence that reads ValidationResponse later in the class).

@rajesh-meesho rajesh-meesho reopened this Feb 18, 2026
@rajesh-meesho rajesh-meesho marked this pull request as draft February 18, 2026 18:55
@rajesh-meesho rajesh-meesho marked this pull request as ready for review February 18, 2026 18:56
@rajesh-meesho
Author

@coderabbitai can you review this PR

@coderabbitai

coderabbitai bot commented Feb 18, 2026

@rajesh-meesho: I'll review the changes in this PR for you.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai bot added the bugfix label Feb 18, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java (1)

22-27: Remove or delegate the legacy saveDBDriverPool(PoolingDriver) overload to prevent future inconsistencies.

The legacy saveDBDriverPool(PoolingDriver) overload does not register a pool name in registeredPools, while the new overload does. Although all current call sites use the new overload with a pool name, the legacy overload remains in the codebase without deprecation or delegation. This creates a maintenance risk: future code could inadvertently call the legacy overload, leaving registeredPools stale while isConnectionInDBDriverPool(String poolName) would fail. Either deprecate the legacy overload or delegate it to the new one (with null or empty poolName handling).
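The delegation pattern recommended here can be sketched as follows. This is a hypothetical mock mirroring the review's suggestion: the `Object` driver placeholder stands in for `PoolingDriver`, and the null-tolerant registration behavior is the assumed contract, not the class's current code.

```java
import java.util.HashSet;
import java.util.Set;

public class PoolRegistry {
  private final Set<String> registeredPools = new HashSet<>();
  private Object poolingDriver;

  // New overload: stores the driver and tracks the pool name when one is
  // provided; null/empty names are tolerated without throwing.
  public void saveDBDriverPool(Object driver, String poolName) {
    this.poolingDriver = driver;
    if (poolName != null && !poolName.isEmpty()) {
      registeredPools.add(poolName);
    }
  }

  /** @deprecated use {@link #saveDBDriverPool(Object, String)} instead. */
  @Deprecated
  public void saveDBDriverPool(Object driver) {
    // Delegate so registeredPools can never drift out of sync; there is
    // simply no pool name to register on this legacy path.
    saveDBDriverPool(driver, null);
  }

  public boolean isConnectionInDBDriverPool(String poolName) {
    return registeredPools.contains(poolName);
  }
}
```

Delegating through a single write path means any future bookkeeping (logging, metrics, eviction) added to the new overload automatically covers legacy callers too.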

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java`
around lines 22 - 27, The legacy overload saveDBDriverPool(PoolingDriver) must
be removed or delegated to the new saveDBDriverPool(PoolingDriver, String) to
avoid leaving registeredPools stale; implement delegation by having
saveDBDriverPool(PoolingDriver) call saveDBDriverPool(poolingDriver, null) (or
an empty string) and ensure the new saveDBDriverPool handles null/empty poolName
consistently by not throwing and by updating registeredPools when a
non-null/non-empty name is provided (and documenting behavior), referencing
saveDBDriverPool(PoolingDriver) and saveDBDriverPool(PoolingDriver, String),
registeredPools, and isConnectionInDBDriverPool(String).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java`:
- Around line 439-448: getJDBCDriverName currently builds the pool key using the
full (tagged) URL, causing per-paragraph pools because appendTagsToURL injects
paragraph-specific tags; change getJDBCDriverName to derive the pool key from
the base URL (the original JDBC URL before appendTagsToURL/proxy-user tagging)
by sanitizing that base URL for uniqueness, while continuing to use the full
tagged URL for creating actual connections; locate getJDBCDriverName and the
code path that calls appendTagsToURL (referenced around lines where
appendTagsToURL is used) and replace use of the tagged URL in the pool key with
the pre-tag/base URL (or strip the paragraph/proxy tags) so pools are shared
across paragraphs for the same base connection.
- Around line 918-929: The code compares connection.getMetaData().getURL()
(currentUrl) to the raw targetJdbcUrl, but currentUrl may include appended
proxy-user/tags causing false mismatches; update JDBCInterpreter's connection
reuse logic to compare the effective URL produced by the same append logic used
when creating connections (or cache the last-used normalized URL) instead of raw
targetJdbcUrl so you only close when the logical target truly changed; use the
same normalization/append routine used by getConnection(context, targetJdbcUrl)
(or store lastEffectiveUrl alongside connection) and compare
lastEffectiveUrl/currentUrl to decide whether to close and recreate the
connection.
- Around line 487-507: The closeDBPool method currently calls
getJDBCConfiguration(user).removeDBDriverPool() unconditionally which clears all
registered pools and nukes the driver; change it so removeDBDriverPool() is only
invoked when url is null/empty (i.e., when closing all pools for the user). For
the url-specific branch, obtain the poolingDriver without removing it from
JDBCConfiguration and compute the poolName (DEFAULT_KEY + user + "_" +
url.replaceAll(...)), call poolingDriver.closePool(poolName), and remove only
that pool entry from the JDBCConfiguration's registeredPools (introduce or use a
per-pool unregister method on JDBCConfiguration such as unregisterPool(poolName)
or removeRegisteredPool(poolName)) so other pools remain tracked and reusable;
keep logging (LOGGER.info) as before.
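A dependency-free simulation of the suggested closeDBPool split (pools are modeled as map entries and the sanitization pattern is a stand-in; this is not Zeppelin's actual code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simulation of the suggested semantics: a null/empty url closes all pools
// for the user, while a specific url closes and unregisters only that pool,
// leaving the others tracked and reusable.
public class PoolCloser {
  static final Map<String, Boolean> pools = new ConcurrentHashMap<>();

  static String poolName(String user, String url) {
    return user + "_" + url.replaceAll("[^a-zA-Z0-9]", "_"); // stand-in pattern
  }

  static void closeDBPool(String user, String url) {
    if (url == null || url.isEmpty()) {
      pools.clear();                       // close-all path: remove every pool
      return;
    }
    pools.remove(poolName(user, url));     // url-specific path: remove one entry
  }

  public static void main(String[] args) {
    pools.put(poolName("u", "jdbc:a"), true);
    pools.put(poolName("u", "jdbc:b"), true);
    closeDBPool("u", "jdbc:a");
    System.out.println(pools.size()); // the other pool remains tracked
  }
}
```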

---

Duplicate comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java`:
- Around line 874-905: The ValidationResponse response returned from
sendValidationRequest(sqlToValidate, user, interpreterName, sqlToExecute,
targetJdbcUrl) can be null on failure; guard uses of response (e.g.,
response.isPreSubmitFail(), response.getErrorMessage()) by checking for null
immediately after the call and treating a null response as a validation failure
(log the error via LOGGER, set a safe default ValidationResponse or
short-circuit/skip execution of that sqlToExecute), so update the code around
the sendValidationRequest(...) call in JDBCInterpreter (variable response) to
null-check response before any method calls and handle the null path
consistently (log, mark as pre-submit fail or continue to next statement).

---

Nitpick comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java`:
- Around line 22-27: The legacy overload saveDBDriverPool(PoolingDriver) must be
removed or delegated to the new saveDBDriverPool(PoolingDriver, String) to avoid
leaving registeredPools stale; implement delegation by having
saveDBDriverPool(PoolingDriver) call saveDBDriverPool(poolingDriver, null) (or
an empty string) and ensure the new saveDBDriverPool handles null/empty poolName
consistently by not throwing and by updating registeredPools when a
non-null/non-empty name is provided (and documenting behavior), referencing
saveDBDriverPool(PoolingDriver) and saveDBDriverPool(PoolingDriver, String),
registeredPools, and isConnectionInDBDriverPool(String).
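A minimal sketch of the delegation this nitpick asks for. `Object` stands in for `org.apache.commons.dbcp2.PoolingDriver` to keep the sketch self-contained, and `DEFAULT_POOL_NAME` is a hypothetical sentinel:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: the legacy single-arg overload delegates to the two-arg one so
// registeredPools stays consistent for both call paths.
public class ConfigSketch {
  static final String DEFAULT_POOL_NAME = "default"; // stand-in sentinel name
  private Object poolingDriver;
  private final Set<String> registeredPools = ConcurrentHashMap.newKeySet();

  public void saveDBDriverPool(Object driver) {      // legacy overload
    saveDBDriverPool(driver, DEFAULT_POOL_NAME);     // delegate, don't bypass
  }

  public void saveDBDriverPool(Object driver, String poolName) {
    this.poolingDriver = driver;
    if (poolName != null && !poolName.isEmpty()) {   // tolerate null/empty names
      registeredPools.add(poolName);
    }
  }

  public boolean isConnectionInDBDriverPool(String poolName) {
    return poolingDriver != null && poolName != null && registeredPools.contains(poolName);
  }
}
```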

Comment on lines 918 to 929
      if (connection != null && !connection.isClosed()) {
        String currentUrl = connection.getMetaData().getURL();
        if (targetJdbcUrl != null && !currentUrl.equals(targetJdbcUrl)) {
          LOGGER.info("URL changed, closing old connection");
          connection.close();
          connection = null;
        }
      }

      if (connection == null || connection.isClosed()) {
        connection = getConnection(context, targetJdbcUrl);
      }

⚠️ Potential issue | 🟠 Major

URL change detection compares raw URL vs effective URL.

currentUrl (from metadata) includes the proxy user and tags, while targetJdbcUrl is the raw URL. This makes the comparison fail even when the logical target is unchanged, forcing a close/reopen on each statement (Line 918-921). Compare against the effective URL built with the same append logic (or cache the last used connection URL).

✅ Suggested fix
- if (targetJdbcUrl != null && !currentUrl.equals(targetJdbcUrl)) {
+ String effectiveUrl = appendTagsToURL(appendProxyUserToURL(targetJdbcUrl, user), context);
+ if (effectiveUrl != null && !currentUrl.equals(effectiveUrl)) {
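The intent can be illustrated standalone; `appendTags` below is a hypothetical stand-in for the real `appendProxyUserToURL`/`appendTagsToURL` chain:

```java
// Standalone illustration: reconnect only when the *effective* URL (the one
// actually used to open connections) changes, not when the raw target merely
// differs from the tagged metadata URL.
public class UrlCompare {
  static String appendTags(String url, String user) {
    return url + ";proxyUser=" + user; // assumed tagging format
  }

  static boolean needsReconnect(String currentUrl, String targetUrl, String user) {
    if (targetUrl == null) {
      return false; // no override requested; keep the existing connection
    }
    return !currentUrl.equals(appendTags(targetUrl, user));
  }

  public static void main(String[] args) {
    String target = "jdbc:mysql://host/db";
    String current = appendTags(target, "alice"); // what metadata would report
    System.out.println(needsReconnect(current, target, "alice"));
    System.out.println(!current.equals(target)); // a raw compare would mismatch
  }
}
```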
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java` around
lines 918 - 929, The code compares connection.getMetaData().getURL()
(currentUrl) to the raw targetJdbcUrl, but currentUrl may include appended
proxy-user/tags causing false mismatches; update JDBCInterpreter's connection
reuse logic to compare the effective URL produced by the same append logic used
when creating connections (or cache the last-used normalized URL) instead of raw
targetJdbcUrl so you only close when the logical target truly changed; use the
same normalization/append routine used by getConnection(context, targetJdbcUrl)
(or store lastEffectiveUrl alongside connection) and compare
lastEffectiveUrl/currentUrl to decide whether to close and recreate the
connection.

@rajesh-meesho
Author

@coderabbitai can you review this PR

@coderabbitai

coderabbitai bot commented Feb 18, 2026

@rajesh-meesho: I'll review the changes in this PR for you.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@semgrep-sast-meesho

Semgrep found 1 formatted-sql-string finding:

  • jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java

Detected a formatted string in a SQL statement. This could lead to SQL injection if variables in the SQL statement are not properly sanitized. Use a prepared statement (java.sql.PreparedStatement) instead. You can obtain a PreparedStatement using 'connection.prepareStatement'.

View Dataflow Graph
flowchart LR
    classDef invis fill:white, stroke: none
    classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none

    subgraph File0["<b>jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java</b>"]
        direction LR
        %% Source

        subgraph Source
            direction LR

            v0["<a href=https://github.com/Meesho/zeppelin/blob/ab430188e8d1a545d152877470129bb2b6ed71ae/jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java#L451 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 451] url</a>"]
        end
        %% Intermediate

        subgraph Traces0[Traces]
            direction TB

            v2["<a href=https://github.com/Meesho/zeppelin/blob/ab430188e8d1a545d152877470129bb2b6ed71ae/jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java#L456 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 456] urlHash</a>"]

            v3["<a href=https://github.com/Meesho/zeppelin/blob/ab430188e8d1a545d152877470129bb2b6ed71ae/jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java#L454 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 454] hex</a>"]

            v4["<a href=https://github.com/Meesho/zeppelin/blob/ab430188e8d1a545d152877470129bb2b6ed71ae/jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java#L451 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 451] hash</a>"]

            v5["<a href=https://github.com/Meesho/zeppelin/blob/ab430188e8d1a545d152877470129bb2b6ed71ae/jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java#L466 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 466] buildPoolName</a>"]

            v6["<a href=https://github.com/Meesho/zeppelin/blob/ab430188e8d1a545d152877470129bb2b6ed71ae/jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java#L645 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 645] getJDBCDriverName</a>"]

            v7["<a href=https://github.com/Meesho/zeppelin/blob/ab430188e8d1a545d152877470129bb2b6ed71ae/jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java#L645 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 645] jdbcDriver</a>"]

            v8["<a href=https://github.com/Meesho/zeppelin/blob/ab430188e8d1a545d152877470129bb2b6ed71ae/jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java#L707 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 707] getConnectionFromPool</a>"]

            v9["<a href=https://github.com/Meesho/zeppelin/blob/ab430188e8d1a545d152877470129bb2b6ed71ae/jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java#L707 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 707] connection</a>"]

            v10["<a href=https://github.com/Meesho/zeppelin/blob/ab430188e8d1a545d152877470129bb2b6ed71ae/jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java#L982 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 982] getConnection</a>"]

            v11["<a href=https://github.com/Meesho/zeppelin/blob/ab430188e8d1a545d152877470129bb2b6ed71ae/jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java#L982 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 982] connection</a>"]

            v12["<a href=https://github.com/Meesho/zeppelin/blob/ab430188e8d1a545d152877470129bb2b6ed71ae/jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java#L1005 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 1005] statement</a>"]
        end
            v2 --> v3
            v3 --> v4
            v4 --> v5
            v5 --> v6
            v6 --> v7
            v7 --> v8
            v8 --> v9
            v9 --> v10
            v10 --> v11
            v11 --> v12
        %% Sink

        subgraph Sink
            direction LR

            v1["<a href=https://github.com/Meesho/zeppelin/blob/ab430188e8d1a545d152877470129bb2b6ed71ae/jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java#L1026 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 1026] statement.execute(statementPrecode)</a>"]
        end
    end
    %% Class Assignment
    Source:::invis
    Sink:::invis

    Traces0:::invis
    File0:::invis

    %% Connections

    Source --> Traces0
    Traces0 --> Sink


@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java (1)

905-1102: ⚠️ Potential issue | 🔴 Critical

Fix potential NPE when validation API fails.

If the validation call throws, response remains null, but it’s later dereferenced (e.g., response.isPreSubmitFail()), which will throw and mask the original failure.

🐛 Suggested guard for null response
-        ValidationResponse response = null;
+        ValidationResponse response = null;
@@
-          try {
-            if (response.isPreSubmitFail()) {
+          try {
+            if (response != null && response.isPreSubmitFail()) {
               if(response.getVersion() == "v1") {
@@
-            } else {
+            } else if (response != null) {
               // pre_submit_fail is false - show message as suggestion if present, but continue query execution
               String message = response.getMessage();
               if (message != null && !message.isEmpty()) {
                 context.out.write("%text " + message + "\n\n");
                 context.out.flush();
               }
               sqlToExecute = response.getNewQueryText() != null ? response.getNewQueryText() : sqlToExecute;
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java` around
lines 905 - 1102, executeSql currently dereferences response
(ValidationResponse) after sendValidationRequest, which can be null if the
validation call threw; add a null-check after the try/catch that calls
sendValidationRequest and handle the null case explicitly (e.g., log a warn and
treat as "no validation response" so execution continues) before any uses of
response (references include response.getNewJdbcUrl(),
response.isPreSubmitFail(), response.getMessage(), response.getVersion(),
response.isFailFast(), response.isFailedByDeprecatedTable(),
response.getErrorHeader(), response.getNewQueryText()); ensure any logic that
depends on response short-circuits or uses safe defaults when response == null
so no NPE occurs.
jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java (1)

91-127: ⚠️ Potential issue | 🟠 Major

Replace HashSet with ConcurrentHashMap.newKeySet() for thread-safe pool tracking.

registeredPools is a plain HashSet that is mutated and read without synchronization. With concurrent execution enabled, multiple threads can access the same JDBCUserConfigurations instance for a given user, causing data races and corrupted set state.

🔧 Recommended fix
-import java.util.HashSet;
 import java.util.Map;
 import java.util.Properties;
 import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
@@
-    registeredPools = new HashSet<>();
+    registeredPools = ConcurrentHashMap.newKeySet();
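Why the fix works can be shown in isolation: `ConcurrentHashMap.newKeySet()` returns a `Set` view that is safe for concurrent add/remove/contains. One caveat, also raised later in this review, is that the backing map rejects null elements, so null pool names must be guarded before touching the set:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of a thread-safe pool-name registry with null guards.
public class RegisteredPools {
  private final Set<String> registeredPools = ConcurrentHashMap.newKeySet();

  public void register(String poolName) {
    if (poolName != null) {
      registeredPools.add(poolName); // thread-safe; no external locking needed
    }
  }

  public void unregister(String poolName) {
    if (poolName != null) {
      registeredPools.remove(poolName);
    }
  }

  public boolean contains(String poolName) {
    return poolName != null && registeredPools.contains(poolName);
  }
}
```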
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java`
around lines 91 - 127, Change the registeredPools field from a plain HashSet to
a concurrent set created via ConcurrentHashMap.newKeySet() and initialize it
accordingly so all accesses from methods like saveDBDriverPool, removePoolName,
removeDBDriverPool, and isConnectionInDBDriverPool(String) are thread-safe;
update the field declaration (registeredPools) to use the concurrent set type,
keep existing method logic (add/remove/contains/clear) but remove any
assumptions about external synchronization, and add the necessary import for
java.util.concurrent.ConcurrentHashMap.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java`:
- Around line 905-1102: executeSql currently dereferences response
(ValidationResponse) after sendValidationRequest, which can be null if the
validation call threw; add a null-check after the try/catch that calls
sendValidationRequest and handle the null case explicitly (e.g., log a warn and
treat as "no validation response" so execution continues) before any uses of
response (references include response.getNewJdbcUrl(),
response.isPreSubmitFail(), response.getMessage(), response.getVersion(),
response.isFailFast(), response.isFailedByDeprecatedTable(),
response.getErrorHeader(), response.getNewQueryText()); ensure any logic that
depends on response short-circuits or uses safe defaults when response == null
so no NPE occurs.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java`:
- Around line 91-127: Change the registeredPools field from a plain HashSet to a
concurrent set created via ConcurrentHashMap.newKeySet() and initialize it
accordingly so all accesses from methods like saveDBDriverPool, removePoolName,
removeDBDriverPool, and isConnectionInDBDriverPool(String) are thread-safe;
update the field declaration (registeredPools) to use the concurrent set type,
keep existing method logic (add/remove/contains/clear) but remove any
assumptions about external synchronization, and add the necessary import for
java.util.concurrent.ConcurrentHashMap.

---

Duplicate comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCInterpreter.java`:
- Around line 629-648: getConnectionFromPool/buildPoolName currently derive the
pool key from a tagged URL (appendTagsToURL), causing per‑paragraph pool
explosion; change the pool key derivation to use the original untagged JDBC URL
(or strip paragraph tags before hashing) so pools are reused across paragraphs.
Locate usages in getConnectionFromPool and createConnectionPool (and any call
sites that pass appendTagsToURL into buildPoolName) and replace the argument
with the base URL or implement a helper to remove tags before calling
buildPoolName; keep all other behavior (registerPool, saveDBDriverPool)
unchanged.

@rajesh-meesho
Author

@coderabbitai can you review this PR

@coderabbitai

coderabbitai bot commented Feb 24, 2026

@rajesh-meesho, I'll review the changes in this PR for you.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (2)
jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java (2)

33-34: poolingDriver is unsynchronized while registeredPools is a concurrent collection — mixed concurrency guarantees

registeredPools uses ConcurrentHashMap.newKeySet() (thread-safe), but poolingDriver and isSuccessful are plain, non-volatile fields with no synchronization. The compound check in isConnectionInDBDriverPool(String) (line 126) — poolingDriver != null && registeredPools.contains(poolName) — is subject to a TOCTOU hazard: poolingDriver could be nulled by another thread between the two reads. Using volatile on poolingDriver (and isSuccessful) would at minimum guarantee visibility, though atomic compound operations would still require external synchronization.

Also applies to: 40-40, 91-127

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java`
around lines 33 - 34, The field visibility/synchronization is inconsistent: mark
the JDBCUserConfigurations.poolingDriver and JDBCUserConfigurations.isSuccessful
fields as volatile to ensure visibility across threads, and protect the compound
check in isConnectionInDBDriverPool(String) (which reads poolingDriver != null
&& registeredPools.contains(poolName)) by either synchronizing that method/block
or using a local volatile read (e.g., capture poolingDriver into a local
variable) before checking registeredPools so the two reads are consistent;
update any other accesses that rely on atomic visibility to use the volatile
fields or explicit synchronization to avoid TOCTOU races.

34-34: Inconsistent API contract: old single-arg saveDBDriverPool bypasses registeredPools

The original saveDBDriverPool(PoolingDriver driver) (lines 86–89) does not add anything to registeredPools. Any caller that still uses the old overload will find that isConnectionInDBDriverPool(String poolName) (line 125) always returns false, even when the driver is active — silently breaking the pool-name lookup. Consider either deprecating the old overload or having it delegate to the new one with a sentinel/default pool name to keep the two APIs consistent.

Also applies to: 40-40, 86-95, 121-127

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java` at
line 34, The old overload saveDBDriverPool(PoolingDriver driver) currently
bypasses registeredPools causing isConnectionInDBDriverPool(String poolName) to
always return false for drivers added via that method; update the single-arg
saveDBDriverPool to either be deprecated and call the two-arg
saveDBDriverPool(String poolName, PoolingDriver driver) with a clear
sentinel/default pool name or to extract the pool name and add it to
registeredPools before registering the driver, and ensure registeredPools
contains the poolName used by isConnectionInDBDriverPool; update the
implementation of saveDBDriverPool(PoolingDriver driver) and/or registeredPools
manipulation so both overloads maintain the same contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java`:
- Around line 125-127: The method isConnectionInDBDriverPool may throw NPE
because registeredPools.contains(poolName) fails for null keys; update
isConnectionInDBDriverPool to first null-guard the poolName (return false if
poolName == null) before referencing registeredPools or poolingDriver,
preserving the existing poolingDriver != null check and then call
registeredPools.contains(poolName).
- Around line 109-111: The removePoolName method can pass a null key to the
ConcurrentHashMap registeredPools which throws a NullPointerException; update
the JDBCUserConfigurations.removePoolName method to guard against null by
checking if poolName is null and returning early (no-op) before calling
this.registeredPools.remove(poolName), ensuring no change to concurrency
semantics.
- Around line 91-95: saveDBDriverPool currently adds poolName into
registeredPools without checking for null and unconditionally overwrites
poolingDriver; first add a null-guard for the poolName before calling
this.registeredPools.add(poolName) (e.g. skip/add a defensive check or throw
IllegalArgumentException) to avoid NPE on ConcurrentHashMap-backed set, and
second, prevent silent replacement of poolingDriver by only assigning
this.poolingDriver = driver when this.poolingDriver is null (or verify driver ==
this.poolingDriver and otherwise throw/log) so existing registrations remain
bound to the original driver; leave isSuccessful handling unchanged.

In `@zeppelin-web-angular/projects/zeppelin-sdk/src/message.ts`:
- Around line 65-67: The close$ subscriber and the retryWhen in connect() create
two competing reconnect paths (close$ calling this.connect() immediately and
retryWhen retrying the old WebSocketSubject after 4000ms), causing duplicate
sockets and no backoff; to fix, remove the immediate reconnect from the close$
handler and rely on the existing retryWhen logic (or alternatively remove
retryWhen and implement a delayed reconnect inside close$), e.g., keep
connect(), retryWhen, normalCloseCode, and this.ws intact and delete the
console.log/this.connect() call inside the close$ subscriber so only the
retryWhen-based backoff triggers reconnects.
- Around line 60-68: The subscriber on this.close$ accesses event.code without
guarding for undefined (this.close$.next() is called with no arg), causing a
crash; update the this.close$.subscribe handler to first check that event is
defined and has a numeric code before reading event.code (e.g., if (!event ||
typeof event.code !== 'number') { return /* or route to standard reconnect flow
*/ }). Also remove the immediate this.connect() call in that subscriber to avoid
a dual reconnect path and zero-delay storms; instead route non-normal closes
into the existing retry/backoff mechanism used by the retryWhen logic
(referencing this.close$, normalCloseCode, retryWhen) so reconnections respect
the existing 4000ms/backoff behavior.

---

Nitpick comments:
In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java`:
- Around line 33-34: The field visibility/synchronization is inconsistent: mark
the JDBCUserConfigurations.poolingDriver and JDBCUserConfigurations.isSuccessful
fields as volatile to ensure visibility across threads, and protect the compound
check in isConnectionInDBDriverPool(String) (which reads poolingDriver != null
&& registeredPools.contains(poolName)) by either synchronizing that method/block
or using a local volatile read (e.g., capture poolingDriver into a local
variable) before checking registeredPools so the two reads are consistent;
update any other accesses that rely on atomic visibility to use the volatile
fields or explicit synchronization to avoid TOCTOU races.
- Line 34: The old overload saveDBDriverPool(PoolingDriver driver) currently
bypasses registeredPools causing isConnectionInDBDriverPool(String poolName) to
always return false for drivers added via that method; update the single-arg
saveDBDriverPool to either be deprecated and call the two-arg
saveDBDriverPool(String poolName, PoolingDriver driver) with a clear
sentinel/default pool name or to extract the pool name and add it to
registeredPools before registering the driver, and ensure registeredPools
contains the poolName used by isConnectionInDBDriverPool; update the
implementation of saveDBDriverPool(PoolingDriver driver) and/or registeredPools
manipulation so both overloads maintain the same contract.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ab43018 and 9ffc83b.

📒 Files selected for processing (2)
  • jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java
  • zeppelin-web-angular/projects/zeppelin-sdk/src/message.ts

Comment on lines +91 to +95
  public void saveDBDriverPool(PoolingDriver driver, String poolName) throws SQLException {
    this.poolingDriver = driver;
    this.registeredPools.add(poolName);
    this.isSuccessful = false;
  }

⚠️ Potential issue | 🔴 Critical

NullPointerException if poolName is null; plus silent PoolingDriver overwrite on repeated calls

Two issues here:

  1. NPE on null poolName: ConcurrentHashMap.newKeySet() is backed by a ConcurrentHashMap, which does not allow null to be used as a key. Calling this.registeredPools.add(null) will throw NullPointerException at runtime. Add a null-guard before the add call.

  2. Silent PoolingDriver overwrite: Each call unconditionally replaces this.poolingDriver. If this overload is ever invoked with a different PoolingDriver instance than the one already stored, all previously registered pool names now point to the wrong (stale) driver. In DBCP2, PoolingDriver is typically a singleton, but there is nothing in this API preventing misuse. Consider guarding with a check like if (this.poolingDriver == null) or at minimum document the assumption that all calls must use the same driver instance.

🛡️ Proposed defensive fix
  public void saveDBDriverPool(PoolingDriver driver, String poolName) throws SQLException {
+   if (poolName == null) {
+     throw new IllegalArgumentException("poolName must not be null");
+   }
+   if (this.poolingDriver != null && this.poolingDriver != driver) {
+     throw new IllegalStateException("Cannot register pool '" + poolName + "' against a different PoolingDriver instance");
+   }
    this.poolingDriver = driver;
    this.registeredPools.add(poolName);
    this.isSuccessful = false;
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java`
around lines 91 - 95, saveDBDriverPool currently adds poolName into
registeredPools without checking for null and unconditionally overwrites
poolingDriver; first add a null-guard for the poolName before calling
this.registeredPools.add(poolName) (e.g. skip/add a defensive check or throw
IllegalArgumentException) to avoid NPE on ConcurrentHashMap-backed set, and
second, prevent silent replacement of poolingDriver by only assigning
this.poolingDriver = driver when this.poolingDriver is null (or verify driver ==
this.poolingDriver and otherwise throw/log) so existing registrations remain
bound to the original driver; leave isSuccessful handling unchanged.

Comment on lines +109 to +111
  public void removePoolName(String poolName) {
    this.registeredPools.remove(poolName);
  }

⚠️ Potential issue | 🔴 Critical

NullPointerException if poolName is null

ConcurrentHashMap does not allow null keys, so this.registeredPools.remove(null) will throw NullPointerException. Add a null-guard.

🛡️ Proposed fix
  public void removePoolName(String poolName) {
+   if (poolName == null) return;
    this.registeredPools.remove(poolName);
  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- public void removePoolName(String poolName) {
-   this.registeredPools.remove(poolName);
- }
+ public void removePoolName(String poolName) {
+   if (poolName == null) return;
+   this.registeredPools.remove(poolName);
+ }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java`
around lines 109 - 111, The removePoolName method can pass a null key to the
ConcurrentHashMap registeredPools which throws a NullPointerException; update
the JDBCUserConfigurations.removePoolName method to guard against null by
checking if poolName is null and returning early (no-op) before calling
this.registeredPools.remove(poolName), ensuring no change to concurrency
semantics.

Comment on lines +125 to +127
  public boolean isConnectionInDBDriverPool(String poolName) {
    return this.poolingDriver != null && this.registeredPools.contains(poolName);
  }

⚠️ Potential issue | 🔴 Critical

NullPointerException if poolName is null

ConcurrentHashMap does not allow null to be used as a key. this.registeredPools.contains(null) will throw NullPointerException. Add a null-guard.

🛡️ Proposed fix
  public boolean isConnectionInDBDriverPool(String poolName) {
+   if (poolName == null) return false;
    return this.poolingDriver != null && this.registeredPools.contains(poolName);
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@jdbc/src/main/java/org/apache/zeppelin/jdbc/JDBCUserConfigurations.java`
around lines 125 - 127, The method isConnectionInDBDriverPool may throw NPE
because registeredPools.contains(poolName) fails for null keys; update
isConnectionInDBDriverPool to first null-guard the poolName (return false if
poolName == null) before referencing registeredPools or poolingDriver,
preserving the existing poolingDriver != null check and then call
registeredPools.contains(poolName).

Comment on lines +60 to +68
    this.close$.subscribe(event => {
      this.connectedStatus = false;
      this.connectedStatus$.next(this.connectedStatus);
      this.pingIntervalSubscription.unsubscribe();

      if (event.code !== this.normalCloseCode) {
        console.log('WebSocket closed unexpectedly. Reconnecting...');
        this.connect();
      }

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, find the message.ts file
find . -name "message.ts" -path "*zeppelin-sdk*" | head -5

Repository: Meesho/zeppelin

Length of output: 118


🏁 Script executed:

#!/bin/bash
# Check RxJS version in dependencies
find . -name "package.json" | xargs grep -l "rxjs" | head -3

Repository: Meesho/zeppelin

Length of output: 155


🏁 Script executed:

#!/bin/bash
# Look for the message.ts file and examine the close() method and close$ subject
fd "message.ts" -x cat -n {} \;

Repository: Meesho/zeppelin

Length of output: 20015


🏁 Script executed:

#!/bin/bash
# Check RxJS version in package.json
cat zeppelin-web-angular/package.json | grep -A 2 -B 2 "rxjs"

Repository: Meesho/zeppelin

Length of output: 190


🌐 Web query:

RxJS 6.5.3 Subject.next() optional parameter undefined behavior

💡 Result:

In RxJS 6.5.3, Subject.next is typed to allow being called with no argument (next(value?: T)), so subject.next() is valid TypeScript even for Subject<number>, etc. (stackoverflow.com)

At runtime, though, there’s no special “empty next” behavior: calling a JS function with no arguments means the parameter value is undefined, and Subject.next simply forwards that value to observers—so your subscribers will receive undefined as an emission. (You can see next(value) just iterates observers and calls observer.next(value).) (app.unpkg.com)

Practical implications / what to do:

  • If you want a “signal only” subject, use new Subject<void>() and call next() (the undefined emission is fine/expected). (stackoverflow.com)
  • If you don’t want undefined to ever be emitted, don’t call next() without a value (or filter it out downstream).
  • This mismatch was tightened in RxJS 7: Subject<T>.next effectively requires an argument unless T is void, so upgrades often surface “Expected 1 arguments, but got 0”. (stackoverflow.com)


Guard the event parameter against undefined — calling close() without arguments will crash.

The close() method at line 128 calls this.close$.next() with no argument. In RxJS 6.5.3, Subject<T>.next(value?: T) accepts an optional parameter, so this compiles but emits undefined at runtime. The subscriber immediately accesses event.code on line 65, causing a TypeError: Cannot read properties of undefined (reading 'code') whenever close() is invoked manually (e.g., from destroy() or external callers).

Proposed fix — guard against undefined event
     this.close$.subscribe(event => {
       this.connectedStatus = false;
       this.connectedStatus$.next(this.connectedStatus);
       this.pingIntervalSubscription.unsubscribe();
 
-      if (event.code !== this.normalCloseCode) {
+      if (event && event.code !== this.normalCloseCode) {
         console.log('WebSocket closed unexpectedly. Reconnecting...');
         this.connect();
       }
     });

Additionally, there are two architectural issues created by this change:

  1. Dual reconnect paths: The existing retryWhen (lines 106–115) waits 4000ms before retrying, but the new close$ subscriber immediately calls connect() for non-1000 codes, creating two independent reconnect loops.
  2. No backoff: The new path has zero delay; if the server is down, each failed connection triggers an immediate reconnect attempt, risking a connection storm.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zeppelin-web-angular/projects/zeppelin-sdk/src/message.ts` around lines 60 -
68, The subscriber on this.close$ accesses event.code without guarding for
undefined (this.close$.next() is called with no arg), causing a crash; update
the this.close$.subscribe handler to first check that event is defined and has a
numeric code before reading event.code (e.g., if (!event || typeof event.code
!== 'number') { return /* or route to standard reconnect flow */ }). Also remove
the immediate this.connect() call in that subscriber to avoid a dual reconnect
path and zero-delay storms; instead route non-normal closes into the existing
retry/backoff mechanism used by the retryWhen logic (referencing this.close$,
normalCloseCode, retryWhen) so reconnections respect the existing 4000ms/backoff
behavior.

Comment on lines +65 to +67
if (event.code !== this.normalCloseCode) {
console.log('WebSocket closed unexpectedly. Reconnecting...');
this.connect();

⚠️ Potential issue | 🟠 Major

Dual reconnect paths conflict and the new path has no backoff.

The existing connect() (lines 103–120) already sets up a retryWhen that handles abnormal closes: it waits for close$ plus a 4 000 ms delay before re-subscribing to this.ws. The new close$ subscriber introduces a second, independent reconnect path that calls connect() immediately, replacing this.ws with a brand-new WebSocketSubject. For every abnormal close, both paths now fire simultaneously:

  1. The new handler replaces this.ws immediately with no delay.
  2. The old retryWhen on the now-orphaned WebSocketSubject still listens on the same close$ and re-subscribes to the stale reference after 4 s.

This creates two competing WebSocket instances. Additionally, because there is no delay before calling connect(), repeated connection failures produce a tight reconnect loop with no back-off — the original retryWhen mechanism provided 4 000 ms between attempts.

Consider one of:

  • Remove retryWhen and rely solely on the close$ handler, adding an explicit delay (e.g. setTimeout(() => this.connect(), 4000)).
  • Remove the close$ handler's connect() call and let the retryWhen handle abnormal closes, adjusting it to not depend on another close$ emission if needed.
💡 Suggested approach: remove retryWhen, keep close$ handler with delay
     this.ws
       .pipe(
-        retryWhen(errors =>
-          errors.pipe(
-            mergeMap(() =>
-              this.close$.pipe(
-                take(1),
-                delay(4000)
-              )
-            )
-          )
-        )
+        // Reconnection is handled by the close$ subscriber below
       )
       .subscribe((e: WebSocketMessage<keyof MessageReceiveDataTypeMap>) => {
         console.log('Receive:', e);
         this.received$.next(this.interceptReceived(e));
       });
-      if (event.code !== this.normalCloseCode) {
-        console.log('WebSocket closed unexpectedly. Reconnecting...');
-        this.connect();
-      }
+      if (event && event.code !== this.normalCloseCode) {
+        console.log('WebSocket closed unexpectedly. Reconnecting in 4s...');
+        setTimeout(() => this.connect(), 4000);
+      }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zeppelin-web-angular/projects/zeppelin-sdk/src/message.ts` around lines 65 -
67, The close$ subscriber and the retryWhen in connect() create two competing
reconnect paths (close$ calling this.connect() immediately and retryWhen
retrying the old WebSocketSubject after 4000ms), causing duplicate sockets and
no backoff; to fix, remove the immediate reconnect from the close$ handler and
rely on the existing retryWhen logic (or alternatively remove retryWhen and
implement a delayed reconnect inside close$), e.g., keep connect(), retryWhen,
normalCloseCode, and this.ws intact and delete the console.log/this.connect()
call inside the close$ subscriber so only the retryWhen-based backoff triggers
reconnects.

