Skip to content

Feat: Add starrocks catalog support#8856

Open
chris-celerdata wants to merge 7 commits intomarimo-team:mainfrom
chris-celerdata:add-starrocks
Open

Feat: Add starrocks catalog support#8856
chris-celerdata wants to merge 7 commits intomarimo-team:mainfrom
chris-celerdata:add-starrocks

Conversation

@chris-celerdata
Copy link
Copy Markdown

📝 Summary

Implements #8765.

🔍 Description of Changes

This PR introduces a new engine which extends SQLAlchemyEngine. When possible (e.g. working with default_catalog), the inherited methods are used; otherwise special "external" methods are used.

Primary overridden functions:

  • get_databases
  • get_tables_in_schema
  • get_table_details

Before
image

After
image

Notes

Integration testing against a real StarRocks instance is not present. There is no "in-memory" equivalent like what DuckDB and SQLite have.

image

Arrays of variable types default greedily to their inner types. There is currently no support for showing complex types with different icons from a standard string (i.e. ARRAY<INT> as not simply an integer). I'm not sure if this is relevant, but I thought I would mention it anyway.

📋 Checklist

  • I have read the contributor guidelines.
  • For large changes, or changes that affect the public API: this change was discussed or approved through an issue, on Discord, or the community discussions (Please provide a link if applicable).
  • Tests have been added for the changes made.
  • Documentation has been updated where applicable, including docstrings for API changes.
  • Pull request title is a good summary of the changes - it will be used in the release notes.

@vercel
Copy link
Copy Markdown

vercel bot commented Mar 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
marimo-docs Ready Ready Preview, Comment Mar 24, 2026 9:18pm

Request Review

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 24, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@chris-celerdata
Copy link
Copy Markdown
Author

I have read the CLA Document and I hereby sign the CLA

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds first-class StarRocks support to marimo’s SQL datasource discovery by introducing a StarRocksEngine that can enumerate catalogs and introspect external-catalog objects via SHOW .../DESC SQL when SQLAlchemy’s inspector can’t.

Changes:

  • Implement StarRocksEngine(SQLAlchemyEngine) with multi-catalog discovery and external-catalog fallbacks.
  • Extend SQL identifier quoting to support the starrocks dialect (backtick style) and add related tests.
  • Register the new engine in engine detection and add StarRocks optional test dependency / dependency manager entry.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
marimo/_sql/engines/starrocks.py New engine implementing StarRocks catalog/database/table discovery with SQL fallbacks for external catalogs.
marimo/_sql/get_engines.py Adds StarRocksEngine to SUPPORTED_ENGINES so it’s selected before generic SQLAlchemyEngine.
marimo/_sql/sql_quoting.py Adds starrocks to backtick-quoting dialects.
marimo/_sql/sql.py Updates unsupported-engine error message to mention StarRocks.
marimo/_dependencies/dependencies.py Adds DependencyManager.starrocks.
pyproject.toml Adds starrocks to test-optional and bans module-level starrocks imports.
tests/_sql/test_starrocks.py New unit tests for StarRocks engine behavior using mocks.
tests/_sql/test_sql_quoting.py Adds StarRocks quoting/roundtrip cases and qualified-name quoting case.
tests/_sql/test_get_engines.py Adds StarRocks engine selection test and datasource connection mapping assertion.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 24, 2026

Bundle Report

Bundle size has no change ✅

@dmadisetti dmadisetti added the enhancement New feature or request label Mar 24, 2026
@dmadisetti dmadisetti linked an issue Mar 25, 2026 that may be closed by this pull request
1 task
@Light2Dark
Copy link
Copy Markdown
Collaborator

Light2Dark commented Mar 25, 2026

hi @chris-celerdata, this sounds similar to an issue with other multi-db engines like Snowflake. There is a start of the fix here #8824

Would you mind having a look and seeing if that direction would fix this problem universally, rather than custom support for Starrocks? If not, we can also have a look (wish there was Starrocks cloud haha)

@chris-celerdata
Copy link
Copy Markdown
Author

chris-celerdata commented Mar 25, 2026

There is a StarRocks cloud. It can also be deployed on Docker. If you want to work with external catalogs, the documentation can be found here. However, the default_catalog being separated in the sidebar should be sufficient for testing.

I took a look at the other PR and it does not fix the problem; SQLAlchemyEngine is essentially untouched and the primary problem is that get_schemas and get_databases expect a singular database. The current implementation of get_databases in sqlalchemy.pywill only return one or zero databases. There is not an easy fix because SQLAlchemy does not have a standardized method for multi-database introspection (i.e. get_database_names or get_catalog_names). Any ideas for a more general approach would be appreciated!

Note: I considered doing something similar to get_default_database. In this function, there is a block that defines what SQL to call.

dialect_queries = {
    "postgresql": "SELECT current_database()",
    "mssql": "SELECT DB_NAME()",
    "timeplus": "SELECT current_database()",
}

This approach could be duplicated into get_databases for the relevant dialects. Would this be a better approach than a new engine?

@Light2Dark
Copy link
Copy Markdown
Collaborator

I think that's the eventual direction. That PR decouples get_schema, so eventually we can call get_databases and get all databases connected.

I'm okay to have this too, but there'll be some code drift once we merge that. In the meantime, would you want to create the frontend UI changes for adding a starrocks connection?

@chris-celerdata
Copy link
Copy Markdown
Author

I'd love to add StarRocks to the UI. I was referencing #8169, which talked about keeping the UI uncluttered. Has that changed?
Also, to be clear, you are talking about adding StarRocks to this dialog?
image

@Light2Dark
Copy link
Copy Markdown
Collaborator

Light2Dark commented Mar 27, 2026

@chris-celerdata , yeah. But I'm checking with the team on if we're okay to add to the UI.

The PR referenced #8824 has been merged, does this change this PR? Sorry for the back and forth on this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

New Database: StarRocks

4 participants