perf: implement multithreaded I/O optimization for high-core-count CPUs #124
SpaceMarty wants to merge 1 commit into nikopueringer:main
Refactors the optimization logic into a reusable RuntimeThreadPool context manager in device_utils.py. The refactor is applied to clip_manager.py and backend/service.py to improve inference throughput on high-core-count systems such as Threadripper by overlapping I/O with GPU inference.
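For readers who want a picture of the idea, here is a minimal sketch of overlapping I/O with inference through a thread-pool context manager. It is not the code from this PR: `io_thread_pool`, `run_pipelined`, the worker-count heuristic, and the `load`/`infer`/`save` callables are all hypothetical names chosen for illustration, and the real RuntimeThreadPool in device_utils.py may behave differently.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


@contextmanager
def io_thread_pool(max_workers=None):
    """Hypothetical helper: yield a thread pool and shut it down on exit."""
    if max_workers is None:
        # Assumed heuristic (not from the PR): leave one core free, cap at 32.
        max_workers = max(1, min(32, (os.cpu_count() or 1) - 1))
    pool = ThreadPoolExecutor(max_workers=max_workers, thread_name_prefix="io")
    try:
        yield pool
    finally:
        # Wait for outstanding reads and writes before leaving the block.
        pool.shutdown(wait=True)


def run_pipelined(paths, load, infer, save, pool):
    """Prefetch the next input and write outputs in the background
    while `infer` runs on the calling thread."""
    paths = list(paths)
    results = []
    pending_saves = []
    if not paths:
        return results

    next_load = pool.submit(load, paths[0])
    for i, path in enumerate(paths):
        frame = next_load.result()  # blocks only if the prefetch is not done
        if i + 1 < len(paths):
            # Start reading the next input before inference begins.
            next_load = pool.submit(load, paths[i + 1])
        out = infer(frame)  # stays on the calling thread
        pending_saves.append(pool.submit(save, path, out))
        results.append(out)

    # Surface any exception raised by a background write.
    for future in pending_saves:
        future.result()
    return results
```

A caller would wrap the per-clip loop in `with io_thread_pool() as pool:` and pass `pool` to `run_pipelined`. Keeping `infer` on the calling thread means only file reads and writes move to worker threads, which is the part that tends to benefit from extra cores.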
|
This seems like it would be super useful. It looks like some tests failed. If the tests themselves have issues, feel free to let me know. |
|
@SpaceMarty, I tried running the CLI for this PR locally and ran into an error. Can you please look into it? |
Code Review — Community Contribution
Hey @SpaceMarty, thanks for tackling I/O parallelisation — this is a real bottleneck on high-core-count systems. The 1.