feat(sql): cache SQLAlchemy engine per database URL to eliminate redundant TCP handshake #669
Open
michael-johnston wants to merge 1 commit into main
Without caching, each component that calls engine_for_sql_store for the same database URL (primarily the metastore SQLResourceStore and the samplestore SQLSampleStore) calls sqlalchemy.create_engine, which creates a separate connection pool and therefore opens a separate TCP+TLS+MySQL auth connection.
Over a high-latency link the full handshake can cost 1-3 s. With two independent engines, the metastore and the samplestore each paid this cost, adding an extra 1-3 s to every CLI command.
_engine_cache (a process-level dict keyed by connection URL) ensures that the second call for the same URL returns the already-initialised Engine and reuses its warm connection pool, reducing the connection cost from 2x to 1x handshake per process lifetime.
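For reviewers, the caching pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `get_cached_engine` and its `factory` parameter are hypothetical names, and the lock is an added assumption (the description only mentions a process-level dict keyed by URL).

```python
import threading
from typing import Any, Callable, Dict

# Process-level cache: one engine (and therefore one connection pool) per URL.
_engine_cache: Dict[str, Any] = {}

# Assumption: a lock guards concurrent first calls for the same URL.
# The PR description only mentions the dict, not any locking.
_engine_cache_lock = threading.Lock()


def get_cached_engine(url: str, factory: Callable[[str], Any]) -> Any:
    """Return the engine cached for `url`, building it with `factory` on first use.

    `factory` is a hypothetical injection point; in the setting described
    here it would be `sqlalchemy.create_engine`.
    """
    with _engine_cache_lock:
        engine = _engine_cache.get(url)
        if engine is None:
            # First request for this URL: build the engine once and remember it.
            engine = factory(url)
            _engine_cache[url] = engine
        return engine
```

With SQLAlchemy installed, a caller would pass `sqlalchemy.create_engine` as `factory`, so two stores asking for the same URL share one `Engine` and one pool. Note that the cache key is the full connection URL, so URLs that differ only in query parameters would still produce separate engines under this sketch.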
Locally measured saving on `ado show requests/results/entities`: ~2,360 ms.