@@ -127,45 +127,106 @@ Combines ``target``, ``teams``, ``distribute``, and ``parallel for`` directives.
 
 OpenMP runtime functions
 -------------------------
-**Thread and team information:**
 
-* ``omp_get_thread_num()`` - Returns the unique identifier of the calling thread
-* ``omp_get_num_threads()`` - Returns the total number of threads in the current parallel region
-* ``omp_set_num_threads(n)`` - Sets the number of threads for subsequent parallel regions
-* ``omp_get_max_threads()`` - Returns the maximum number of threads available
-* ``omp_get_num_procs()`` - Returns the number of processors in the system
-* ``omp_get_thread_limit()`` - Returns the thread limit for the parallel region
-* ``omp_in_parallel()`` - Returns 1 if called within a parallel region, 0 otherwise
-* ``omp_get_team_num()`` - Returns the team number in a target region
-* ``omp_get_num_teams()`` - Returns the number of teams in a target region
+Thread and team information
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. list-table::
+   :widths: 35 65
+
+   * - **omp_get_thread_num()**
+     - Returns the unique identifier of the calling thread
+   * - **omp_get_num_threads()**
+     - Returns the total number of threads in the current parallel region
+   * - **omp_set_num_threads(n)**
+     - Sets the number of threads for subsequent parallel regions
+   * - **omp_get_max_threads()**
+     - Returns the maximum number of threads available
+   * - **omp_get_num_procs()**
+     - Returns the number of processors in the system
+   * - **omp_get_thread_limit()**
+     - Returns the thread limit for the parallel region
+   * - **omp_in_parallel()**
+     - Returns 1 if called within a parallel region, 0 otherwise
+   * - **omp_get_team_num()**
+     - Returns the team number in a target region
+   * - **omp_get_num_teams()**
+     - Returns the number of teams in a target region
+
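+As an illustrative sketch (assuming PyOMP's ``numba.openmp`` package is
+installed and exports these runtime functions), a parallel region in which
+each thread reports its own ID and the team size:
+
+.. code-block:: python
+
+   from numba.openmp import (
+       njit,
+       openmp_context as openmp,
+       omp_get_thread_num,
+       omp_get_num_threads,
+   )
+
+   @njit
+   def hello():
+       with openmp("parallel"):
+           # Each thread prints its own ID and the total thread count
+           tid = omp_get_thread_num()
+           nthreads = omp_get_num_threads()
+           print("thread", tid, "of", nthreads)
+
+   hello()
+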
+Timing
+~~~~~~
 
-**Timing:**
+.. list-table::
+   :widths: 35 65
 
-* ``omp_get_wtime()`` - Returns elapsed wall-clock time (useful for performance profiling)
+   * - **omp_get_wtime()**
+     - Returns elapsed wall-clock time (useful for performance profiling)
 
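+As a sketch (assuming ``numba.openmp`` is installed), ``omp_get_wtime()`` can
+bracket a parallel loop to measure its elapsed wall-clock time:
+
+.. code-block:: python
+
+   import numpy as np
+   from numba.openmp import njit, openmp_context as openmp, omp_get_wtime
+
+   @njit
+   def timed_sum(a):
+       t0 = omp_get_wtime()
+       s = 0.0
+       with openmp("parallel for reduction(+:s)"):
+           for i in range(len(a)):
+               s += a[i]
+       # Elapsed wall-clock time for the parallel loop
+       return s, omp_get_wtime() - t0
+
+   total, elapsed = timed_sum(np.ones(1_000_000))
+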
-**Nested and hierarchical parallelism:**
+Nested and hierarchical parallelism
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-* ``omp_set_nested(flag)`` - Enables or disables nested parallelism
-* ``omp_set_dynamic(flag)`` - Enables or disables dynamic thread adjustment
-* ``omp_set_max_active_levels(n)`` - Sets the maximum number of nested parallel levels
-* ``omp_get_max_active_levels()`` - Returns the maximum number of nested parallel levels
-* ``omp_get_level()`` - Returns the current nesting level
-* ``omp_get_active_level()`` - Returns the current active nesting level
-* ``omp_get_ancestor_thread_num(level)`` - Returns the thread number at a given nesting level
-* ``omp_get_team_size(level)`` - Returns the team size at a given nesting level
-* ``omp_get_supported_active_levels()`` - Returns the supported number of nested active levels
+.. list-table::
+   :widths: 35 65
+
+   * - **omp_set_nested(flag)**
+     - Enables or disables nested parallelism
+   * - **omp_set_dynamic(flag)**
+     - Enables or disables dynamic thread adjustment
+   * - **omp_set_max_active_levels(n)**
+     - Sets the maximum number of nested parallel levels
+   * - **omp_get_max_active_levels()**
+     - Returns the maximum number of nested parallel levels
+   * - **omp_get_level()**
+     - Returns the current nesting level
+   * - **omp_get_active_level()**
+     - Returns the current active nesting level
+   * - **omp_get_ancestor_thread_num(level)**
+     - Returns the thread number at a given nesting level
+   * - **omp_get_team_size(level)**
+     - Returns the team size at a given nesting level
+   * - **omp_get_supported_active_levels()**
+     - Returns the supported number of nested active levels
+
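+As an illustrative sketch (assuming ``numba.openmp`` is installed), the level
+queries distinguish code outside and inside a parallel region:
+
+.. code-block:: python
+
+   from numba.openmp import njit, openmp_context as openmp, omp_get_level
+
+   @njit
+   def levels():
+       print("outside, level =", omp_get_level())  # 0 outside any parallel region
+       with openmp("parallel num_threads(2)"):
+           # Each of the two threads reports nesting level 1
+           print("inside, level =", omp_get_level())
+
+   levels()
+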
+Advanced features
+~~~~~~~~~~~~~~~~~
 
-**Advanced features:**
+.. list-table::
+   :widths: 35 65
+
+   * - **omp_get_proc_bind()**
+     - Returns the processor binding policy
+   * - **omp_get_num_places()**
+     - Returns the number of available places
+   * - **omp_get_place_num_procs(place)**
+     - Returns the number of processors in a place
+   * - **omp_get_place_num()**
+     - Returns the current place number
+   * - **omp_in_final()**
+     - Returns 1 if called in a final task, 0 otherwise
+
+Device and target offloading
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-* ``omp_get_proc_bind()`` - Returns the processor binding policy
-* ``omp_get_num_places()`` - Returns the number of available places
-* ``omp_get_place_num_procs(place)`` - Returns the number of processors in a place
-* ``omp_get_place_num()`` - Returns the current place number
+.. list-table::
+   :widths: 35 65
+
+   * - **omp_get_num_devices()**
+     - Returns the number of available target devices
+   * - **omp_get_device_num()**
+     - Returns the device number of the current target device
+   * - **omp_set_default_device(device_id)**
+     - Sets the default device for subsequent target regions
+   * - **omp_get_default_device()**
+     - Returns the default device ID for target regions
+   * - **omp_is_initial_device()**
+     - Returns 1 if executing on the initial device (host), 0 otherwise
+   * - **omp_get_initial_device()**
+     - Returns the device ID of the initial device (host)
 
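+As a brief host-side sketch (assuming ``numba.openmp`` exports these runtime
+functions), the device queries can check whether offloading targets are
+available before launching target regions:
+
+.. code-block:: python
+
+   from numba.openmp import njit, omp_get_num_devices, omp_get_initial_device
+
+   @njit
+   def device_summary():
+       # Number of offload targets and the host's device ID
+       return omp_get_num_devices(), omp_get_initial_device()
+
+   ndev, host_id = device_summary()
+   print("target devices:", ndev, "host device id:", host_id)
+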
 Supported features and platforms
 ---------------------------------
 
-OpenMP and GPU Offloading Support
+OpenMP and GPU offloading support
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 PyOMP builds on `Numba <https://numba.pydata.org/>`_ Just-In-Time (JIT)
@@ -179,6 +240,111 @@ PyOMP also supports GPU offloading for NVIDIA GPUs. The supported GPU
 architectures depend on the LLVM version and its OpenMP runtime. Consult the
 LLVM OpenMP documentation for details on your specific version.
 
+Device selection and querying
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+PyOMP provides utilities in the ``offloading`` module to query available OpenMP target
+devices and select specific devices for offloading based on device type, vendor, and
+architecture. This enables fine-grained control over where target regions execute.
+
+Discovering available devices
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To see all available devices and their properties, use ``print_offloading_info()``:
+
+.. code-block:: python
+
+   from numba.openmp.offloading import print_offloading_info
+
+   print_offloading_info()
+
+This prints information about all devices, including device counts and default device settings.
+
+Finding devices by criteria
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To programmatically find device IDs matching specific criteria, use ``find_device_ids()``:
+
+.. code-block:: python
+
+   from numba.openmp.offloading import find_device_ids
+
+   # Find all GPU devices
+   gpu_devices = find_device_ids(type="gpu")
+
+   # Find all NVIDIA GPUs
+   nvidia_gpus = find_device_ids(vendor="nvidia")
+
+   # Find NVIDIA GPUs with specific architecture (e.g., sm_80)
+   sm80_gpus = find_device_ids(vendor="nvidia", arch="sm_80")
+
+   # Find all AMD GPUs
+   amd_gpus = find_device_ids(vendor="amd")
+
+   # Find host/CPU device
+   host_devices = find_device_ids(type="host")
+
+The function returns a list of device IDs (integers) matching the criteria. Any parameter
+can be ``None`` to act as a wildcard and match all values.
+
+Querying device properties
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To determine the type, vendor, or architecture of a specific device ID, use the property
+getter functions:
+
+.. code-block:: python
+
+   from numba.openmp.offloading import (
+       get_device_type,
+       get_device_vendor,
+       get_device_arch,
+   )
+
+   # Check device type
+   dev_type = get_device_type(device_id)  # Returns "gpu", "host", or None
+
+   # Check vendor
+   vendor = get_device_vendor(device_id)  # Returns "nvidia", "amd", "host", or None
+
+   # Check architecture
+   arch = get_device_arch(device_id)  # Returns architecture string or None
+
+Using device IDs in target regions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Once you have identified a device ID, you can use it in OpenMP target directives via the
+``device`` clause:
+
+.. code-block:: python
+
+   from numba.openmp import njit, openmp_context as openmp
+   from numba.openmp.offloading import find_device_ids
+   import numpy as np
+
+   # Find first available NVIDIA GPU
+   nvidia_devices = find_device_ids(vendor="nvidia")
+   if nvidia_devices:
+       device_id = nvidia_devices[0]
+   else:
+       # Fall back to host if no NVIDIA GPU found
+       device_id = find_device_ids(type="host")[0]
+
+
+   @njit
+   def inc(x):
+       with openmp(f"target loop device({device_id}) map(tofrom: x)"):
+           # Computation runs on specified device
+           for i in range(len(x)):
+               x[i] = x[i] + 1
+
+       return x
+
+
+   x = inc(np.ones(10))
+   print(f"Result on device {device_id}: {x}")
+
+
 Version and platform support
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -195,7 +361,19 @@ The following table shows tested combinations of PyOMP, Numba, Python, LLVM, and
 0.3.x                 0.57.x - 0.60.x      3.9 - 3.12           14.x         linux-64, osx-arm64, linux-arm64
 ===================== ==================== ==================== ============ ================================
 
-Platform Details
+OpenMP parallelism support by platform
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+=========== ================ ================= ===================
+Platform    CPU              NVIDIA GPU        AMD GPU
+=========== ================ ================= ===================
+linux-64    ✅ Supported      ✅ Supported       🔶 Work in progress
+linux-arm64 ✅ Supported      ✅ Supported       🔶 Work in progress
+osx-arm64   ✅ Supported      ❌ Unsupported     ❌ Unsupported
+=========== ================ ================= ===================
+
+
+Platform details
 ^^^^^^^^^^^^^^^^
 
 * **linux-64**: Linux x86_64 architecture