
Feature/64 create framework for benchmarking #66

Merged
jathavaan merged 12 commits into main from feature/64-create-framework-for-benchmarking
Feb 18, 2026

@jathavaan
Collaborator

This pull request introduces a CPU and RAM monitoring system for the main pipeline execution, refactors DataFrame-to-bytes conversion to use Parquet format, and updates dependency injection and configuration to support these features. The monitoring system logs resource usage during pipeline runs and uploads the results to blob storage for benchmarking and analysis.

Resource monitoring and logging:

  • Added a monitor_cpu_and_ram decorator in src/application/common/monitor.py that samples the process's CPU and RAM usage during pipeline execution, writes the samples to Parquet and CSV files, and uploads the results to blob storage in the new benchmarks container. The decorator is applied to the main function in main.py, with a unique run_id generated for each run.
  • Updated configuration to specify a MONITOR_LOG_DIRECTORY for storing local logs.
  • Modified dependency injection setup to wire the new monitor module.
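
A minimal sketch of how a sampling decorator of this kind could work. It is not the code from src/application/common/monitor.py. The parameters log_dir and interval, the CSV column names, and the use of the standard-library time.process_time and tracemalloc as stand-ins for process CPU and RAM readings are all assumptions made so the example runs without third-party packages. The Parquet output and the blob storage upload are left out.

```python
import csv
import functools
import threading
import time
import tracemalloc
import uuid
from pathlib import Path


def monitor_cpu_and_ram(run_id: str, log_dir: Path, interval: float = 0.05):
    """Sample CPU time and traced Python memory while the wrapped function runs.

    Hypothetical signature: only the decorator name and run_id come from the PR.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            samples = []
            stop = threading.Event()

            def take_sample():
                current_bytes, _peak = tracemalloc.get_traced_memory()
                samples.append((time.time(), time.process_time(), current_bytes))

            def sample_loop():
                # Event.wait doubles as an interruptible sleep between samples.
                while not stop.wait(interval):
                    take_sample()

            started_tracing = not tracemalloc.is_tracing()
            if started_tracing:
                tracemalloc.start()
            take_sample()  # baseline before the work starts
            sampler = threading.Thread(target=sample_loop, daemon=True)
            sampler.start()
            try:
                return func(*args, **kwargs)
            finally:
                stop.set()
                sampler.join()
                take_sample()  # final reading after the work ends
                if started_tracing:
                    tracemalloc.stop()
                # Persist the samples even if the wrapped function raised.
                log_dir.mkdir(parents=True, exist_ok=True)
                with open(log_dir / f"{run_id}.csv", "w", newline="") as fh:
                    writer = csv.writer(fh)
                    writer.writerow(["timestamp", "cpu_seconds", "python_heap_bytes"])
                    writer.writerows(samples)

        return wrapper

    return decorator


# Applied to the entry point with a fresh run_id per process (assumed usage).
@monitor_cpu_and_ram(run_id=uuid.uuid4().hex, log_dir=Path("monitor_logs"))
def main() -> None:
    ...
```

Sampling in a background thread keeps the measurement independent of the pipeline's own control flow, and writing the log in the finally block means a failed run still leaves a trace to inspect.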

DataFrame serialization improvements:

  • Changed the interface and implementation for converting DataFrames to bytes: replaced convert_df_to_bytes with convert_df_to_parquet_bytes, standardizing on Parquet format for serialization.
  • Updated release creation logic to use the new Parquet-based conversion method when uploading release metadata.

Minor cleanup:

  • Removed unused imports in release_pipeline.py.

@jathavaan jathavaan self-assigned this Feb 18, 2026
Copilot AI review requested due to automatic review settings February 18, 2026 10:16
@jathavaan jathavaan linked an issue Feb 18, 2026 that may be closed by this pull request
Contributor

Copilot AI left a comment

Pull request overview

This pull request introduces a CPU and RAM monitoring system for benchmarking the main pipeline execution. The monitoring decorator samples resource usage during runs, logs the data locally, and uploads it to Azure blob storage. The pull request also refactors DataFrame-to-bytes conversion to use Parquet format consistently across the codebase.

Changes:

  • Added a monitoring decorator (monitor_cpu_and_ram) that samples CPU and RAM usage during pipeline execution, with results saved to local files and uploaded to blob storage
  • Refactored DataFrame serialization by replacing convert_df_to_bytes with convert_df_to_parquet_bytes for consistent Parquet-based serialization
  • Updated configuration and dependency injection to support the new monitoring module

Reviewed changes

Copilot reviewed 10 out of 13 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| src/application/common/monitor.py | New monitoring module with CPU/RAM sampling decorator and helper functions for benchmarking |
| src/application/common/__init__.py | Exports the new monitor_cpu_and_ram decorator |
| main.py | Applies the monitoring decorator to the main function with a unique run ID |
| src/application/contracts/bytes_service_interface.py | Renames method from convert_df_to_bytes to convert_df_to_parquet_bytes |
| src/infra/infrastructure/services/bytes_service.py | Implements convert_df_to_parquet_bytes using BytesIO and Parquet serialization |
| src/infra/infrastructure/services/release_service.py | Updates to use the new convert_df_to_parquet_bytes method |
| src/presentation/configuration/app_config.py | Wires the monitor module for dependency injection |
| src/presentation/entrypoints/release_pipeline.py | Removes unused Dict import |
| src/config.py | Adds MONITOR_LOG_DIRECTORY configuration |
| src/domain/enums/storage_container.py | Adds BENCHMARKS container enum value |
| requirements.txt | Adds development dependencies (objprint, pandas-stubs, types-pytz, viztracer) and removes platform markers from pywin32/pywinpty |
| .gitignore | Excludes generated monitoring logs (CSV and Parquet files) |
| monitor_logs/.gitkeep | Placeholder for monitoring logs directory |
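
As a rough illustration of how the configuration and enum entries above might fit together, the sketch below derives the local log path and the blob target for one run. The class name StorageContainer, the value "benchmarks", the Path("monitor_logs") default, and the benchmark_targets helper are all assumptions inferred from the file names. None of them is confirmed by the diff.

```python
from enum import Enum
from pathlib import Path
from typing import Tuple


class StorageContainer(Enum):
    # Assumed value: the PR only states that a BENCHMARKS member was added.
    BENCHMARKS = "benchmarks"


# Assumed default, based on the monitor_logs/.gitkeep placeholder.
MONITOR_LOG_DIRECTORY = Path("monitor_logs")


def benchmark_targets(run_id: str) -> Tuple[Path, str, str]:
    """Return (local log path, container name, blob name) for one run.

    Hypothetical helper: the real naming scheme is not shown in the PR.
    """
    blob_name = f"{run_id}.parquet"
    local_path = MONITOR_LOG_DIRECTORY / blob_name
    return local_path, StorageContainer.BENCHMARKS.value, blob_name
```

Keying both the local file and the blob on the same run_id would make it straightforward to match an uploaded benchmark with the local log it came from.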

@jathavaan jathavaan merged commit 2503db2 into main Feb 18, 2026
@jathavaan jathavaan deleted the feature/64-create-framework-for-benchmarking branch February 18, 2026 10:45
Development

Successfully merging this pull request may close these issues.

Create framework for benchmarking
