Query Databricks SQL Warehouses directly from the Polarity overlay. Executes an admin-configured SQL statement with {{entity}} substitution for each searched indicator and surfaces results as structured key-value rows with Fields, Table, and JSON tab views.
- Universal entity support — IPv4, IPv6, MD5, SHA1, SHA256, email, domain, CVE, URL, MAC, and admin-defined custom types
- Three tab views per result row — Fields (humanized key/value), Table (horizontal scroll), JSON (syntax-highlighted)
- Summary bar tags — configurable column values surfaced in the collapsed Polarity bar, with optional field-name prefix
- Bottleneck rate limiter — configurable
maxConcurrentandminTimeto stay within Databricks per-warehouse limits - Timeout safety — server-side
wait_timeouton every statement; timed-out queries show a user-visible warning instead of a hard error - Polarity-assistant AI reducer —
reducers/details.jsonpipeline prunes and forwards row data to the AI assistant
| Requirement | Notes |
|---|---|
| Polarity server | v5.x or later |
| Node.js | ≥ 18.0.0 (OpenSSL 3.x) |
| Databricks | AWS, Azure, or GCP workspace |
| SQL Warehouse | Classic or Serverless warehouse — must be running |
| PAT token | User-level or service-principal Personal Access Token |
Upload the .tgz package via the Polarity admin UI under Integrations → Upload Integration.
cd ~/Documents/Polarity/databricks
npm installAll options are admin-only unless noted.
| Option | Type | Default | Notes |
|---|---|---|---|
| Databricks Workspace Host | text | — | Full workspace URL, e.g. https://adb-1234567890123456.7.azuredatabricks.net. No trailing slash. |
| Personal Access Token (PAT) | password | — | Created in Databricks UI → User Settings → Developer → Access tokens. |
| SQL Warehouse ID | text | — | Found in SQL Warehouse settings → Connection Details. |
| SQL Query | text | — | Must contain {{entity}}. Example: SELECT * FROM logs.events WHERE ip = '{{entity}}' LIMIT 25 |
| Query Wait Timeout | number | 50 |
Seconds (5–50). Queries exceeding this show a timeout warning. |
| Summary Fields | text | — | Comma-separated column names to surface as overlay summary tags, e.g. user_name,severity. Leave blank to show a Results: N count badge. |
| Include Field Name in Summary | boolean | true |
When enabled, tags display as field: value; when disabled, only the value is shown. (Users can view, admin sets) |
| Maximum Summary Tags | number | 4 |
Max tags before showing +N more. Set to 0 to show all. |
| Max Concurrent Queries | number | 5 |
Requires integration restart after change. |
| Min Time Between Queries (ms) | number | 200 |
Rate-limit throttle. Requires integration restart after change. |
Use {{entity}} (case-insensitive) as the parameter placeholder. The integration will replace it with the SQL-escaped indicator value before execution.
-- Good: single entity lookup with LIMIT
SELECT event_time, src_ip, user, action
FROM catalog.schema.access_logs
WHERE src_ip = '{{entity}}'
ORDER BY event_time DESC
LIMIT 50Configure Summary Fields to the column names you want surfaced in the overlay bar:
user_name,action,severity
The Databricks Statement Execution API used by this integration does not support parameterized bindings for all literal types in all SQL dialects. The {{entity}} value is SQL-escaped (single quotes are doubled) before interpolation. For maximum safety, wrap entity values in quotes in your SQL template:
WHERE ip_address = '{{entity}}'| API State | Integration behaviour |
|---|---|
SUCCEEDED |
Rows surfaced in overlay with Fields / Table / JSON tabs |
CANCELED |
Timeout warning shown; no hard error |
FAILED |
Logged at WARN level; overlay suppressed (no result) |
PENDING / RUNNING |
Logged at WARN; overlay suppressed |
| HTTP 429 | Logged + error returned; Bottleneck limiter protects against bursts |
| HTTP 401/403 | Auth error surfaced in validateOptions |
Run the included package:local script to produce a .tgz free of symlinks (avoids the Polarity extractor unsafe_symlink error):
cd ~/Documents/Polarity/databricks
npm run package:local
# Output: ../databricks-1.0.0.tgzVerify no symlinks remain:
tar -tvzf ../databricks-1.0.0.tgz | grep "^l" && echo "❌ HAS SYMLINKS" || echo "✅ clean"| Change | Version bump |
|---|---|
| Bug fix / CSS / data mapping | 1.0.x patch |
| New entity type / new API endpoint | 1.x.0 minor |
| Breaking config or API change | x.0.0 major |
Update the version in package.json in the same commit as the fix.
Overlay shows "No results found" but I expect data
- Confirm the SQL query contains
{{entity}}and returns rows when run manually in the Databricks SQL editor - Check that the warehouse is running (IDLE/RUNNING state in Databricks UI)
- Review integration logs in Polarity for
FAILEDstate messages
Overlay shows "Query timed out"
- Increase the Query Wait Timeout option (up to 50 s)
- Or optimize the query with appropriate indexes/partitions
validateOptions shows "Authentication failed (HTTP 401)"
- The PAT may have expired; generate a new one in Databricks → User Settings → Access tokens
Overlay is blank / component fails to render
- If upgrading from an older version, restart the Polarity platform to clear Ember's component cache
INT-1750 — New Integration: Databricks SQL — Admin-configurable SIEM-style query engine
Branch: danbi/int-1750-new-integration-databricks-sql-admin-configurable-siem-style
UNLICENSED — Polarity internal integration. Do not distribute.