Fix thread oversubscription of broadcast matmul (BLAS devices) #85

Merged
ajz34 merged 1 commit into RESTGroup:master from ajz34:fix-broadcast-matmul-threading
Jun 25, 2026
Conversation

@ajz34 ajz34 commented Jun 25, 2026

Member

This is my silly mistake.

For BLAS devices, the following pattern should never happen:

with_num_threads(1, || { // thread count is constrained only on the main thread
    iterator.into_par_iter().for_each(|x| {
        // inside a spawned worker thread, the constraint is forgotten
        call_blas_function_without_thread_number_guard();
    });
});

Swapping the order, so that with_num_threads is called inside the into_par_iter closure, should be correct; otherwise, the BLAS function must itself be called with a thread-number guard.
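
To illustrate the difference between the two orderings, here is a minimal sketch using only the standard library. It assumes the BLAS thread-count setting behaves like a per-thread value, which is modeled here with a thread_local; the names NUM_THREADS and fake_blas_call, the default of 8, and this simplified with_num_threads are hypothetical stand-ins rather than the crate's actual implementation, and std::thread::scope takes the place of the parallel iterator.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Hypothetical stand-in for a per-thread BLAS thread-count setting.
    // Every new thread starts with the (assumed) default of 8.
    static NUM_THREADS: Cell<usize> = Cell::new(8);
}

/// Hypothetical guard: sets the limit for the *calling* thread only,
/// runs `f`, then restores the previous value.
fn with_num_threads<T>(n: usize, f: impl FnOnce() -> T) -> T {
    let old = NUM_THREADS.with(|c| c.replace(n));
    let out = f();
    NUM_THREADS.with(|c| c.set(old));
    out
}

/// Stand-in for a BLAS call: reports how many threads it would use.
fn fake_blas_call() -> usize {
    NUM_THREADS.with(|c| c.get())
}

/// Guard outside the parallel region: workers do not see the constraint.
fn guard_outside(tasks: usize) -> Vec<usize> {
    with_num_threads(1, || {
        thread::scope(|s| {
            let handles: Vec<_> = (0..tasks).map(|_| s.spawn(fake_blas_call)).collect();
            handles.into_iter().map(|h| h.join().unwrap()).collect()
        })
    })
}

/// Guard inside each worker: every BLAS call runs under the constraint.
fn guard_inside(tasks: usize) -> Vec<usize> {
    thread::scope(|s| {
        let handles: Vec<_> = (0..tasks)
            .map(|_| s.spawn(|| with_num_threads(1, fake_blas_call)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Under these assumptions, guard_outside(4) returns [8, 8, 8, 8], since each worker falls back to the default and the machine would be oversubscribed, while guard_inside(4) returns [1, 1, 1, 1].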

@ajz34 ajz34 merged commit f4d13f4 into RESTGroup:master Jun 25, 2026
8 checks passed