Decrease initial load on python package #513

Draft
Edwardvaneechoud wants to merge 1 commit into main from improvement/flowfile-smaller-to-import

@Edwardvaneechoud

Owner

This pull request refactors how and when the database is initialized throughout the codebase, making database setup explicit and lazy, rather than relying on import-time side effects. It introduces an ensure_db_initialized() function that safely and idempotently initializes the database before any actual DB access, and updates all relevant code paths to use this function. Additionally, it delays heavy imports (such as deltalake and FastAPI-related modules) until they are actually needed, reducing import overhead for users of the dataframe API.

Database Initialization Refactor:

  • Added ensure_db_initialized() in flowfile_core.database.connection, which lazily and safely runs Alembic migrations and seeds default rows before database access. This replaces import-time initialization and is now called in all DB accessors and main entry points. [1] [2] [3]
  • Updated all places that interact with the database (including audit, metrics, catalog migration, and user utilities) to call ensure_db_initialized() before accessing the DB. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13]
  • Removed import-time DB initialization from flowfile_core.__init__ and init_db.py, so the DB is not created or migrated just by importing the core library. [1] [2] [3]

Lazy Import Improvements:

  • Deferred importing the FastAPI server stack and requests in flowfile.__init__ by using a module-level __getattr__ hook, so importing flowfile for dataframe functionality remains lightweight. [1] [2]
  • Moved all deltalake and pyarrow.dataset imports inside functions that actually use them, reducing import overhead and avoiding unnecessary dependencies for unrelated workflows. [1] [2] [3] [4] [5] [6] [7] [8]
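Both techniques in this list can be illustrated with a small sketch. It assumes nothing about the real flowfile package layout: `make_lazy_module`, `lazy_pkg`, and `parse_json` are hypothetical names, and standard-library modules (`fractions`, `json`) stand in for heavy dependencies such as the FastAPI stack or deltalake. In a real package the `__getattr__` function would simply be defined at the top level of `__init__.py`; the module is built by hand here only to keep the example self-contained.

```python
import importlib
import types


def make_lazy_module(name: str, lazy_attrs: dict) -> types.ModuleType:
    """Build a module whose listed attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr: str):
        # Called only when normal attribute lookup on the module fails.
        try:
            target_module, target_attr = lazy_attrs[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(target_module), target_attr)
        setattr(module, attr, value)  # cache so the hook is not hit again
        return value

    module.__getattr__ = __getattr__
    return module


# "Fraction" is resolved from the fractions module only when first used.
lazy_pkg = make_lazy_module("lazy_pkg", {"Fraction": ("fractions", "Fraction")})


def parse_json(text: str):
    """Function-local import: json is imported when the function is called."""
    import json

    return json.loads(text)
```

The trade-off is that import errors for the deferred dependency surface at first use rather than at package import time.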

Exception Handling Consistency:

  • Standardized exception handling in AI diff and connection handler modules to use FlowfileHTTPException instead of FastAPI's HTTPException, improving modularity and testability. [1] [2] [3]
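One way a framework-independent exception like this could look is sketched below. The fields of the real FlowfileHTTPException are not shown in this description, so `status_code` and `detail` are assumptions modelled on common HTTP error shapes, and `get_connection` and `to_response` are hypothetical helpers. The point is only that core modules can raise the error without importing FastAPI, leaving translation to the server layer.

```python
class FlowfileHTTPException(Exception):
    """Assumed shape: an HTTP-style error with no web framework dependency."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def get_connection(name: str, connections: dict) -> object:
    """Hypothetical core-side lookup that raises the framework-free error."""
    try:
        return connections[name]
    except KeyError:
        raise FlowfileHTTPException(404, f"Connection {name!r} not found") from None


def to_response(exc: FlowfileHTTPException) -> dict:
    """Hypothetical server-side translation into a response payload."""
    return {"status_code": exc.status_code, "body": {"detail": exc.detail}}
```

Because the core code never touches the web framework, it can be unit tested by catching the exception directly.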

These changes make the codebase more modular, improve startup performance for non-server workflows, and ensure database migrations and seeding happen exactly when needed.

@netlify

netlify Bot commented Jun 13, 2026

Deploy Preview for flowfile-wasm canceled.

Name Link
🔨 Latest commit f0cdc15
🔍 Latest deploy log https://app.netlify.com/projects/flowfile-wasm/deploys/6a2d44696e5829000831a5ea
