
feat(sql): cache SQLAlchemy engine per database URL to eliminate redundant TCP handshake #669

Open
michael-johnston wants to merge 1 commit into main from maj_cache_sql_alchemy_engine

Conversation

@michael-johnston
Member

Without caching, every component that calls engine_for_sql_store for the same database URL (primarily the metastore SQLResourceStore and the samplestore SQLSampleStore) ends up calling sqlalchemy.create_engine itself. Each call creates a separate connection pool and therefore opens a separate connection, with its own TCP, TLS, and MySQL authentication handshake.

Over a high-latency link the full handshake can cost 1-3 s. With two independent engines, the metastore and the samplestore each paid this cost, adding an additional 1-3 s to every CLI command.

_engine_cache (a process-level dict keyed by connection URL) ensures that the second call for the same URL returns the already-initialised Engine and reuses its warm connection pool, reducing the connection cost from 2x to 1x handshake per process lifetime.

Locally measured saving on `ado show requests/results/entities`: ~2,360 ms.
