Channel: Intel® Software - OpenCL*

How to know the number of compute units used when using a CPU as an OpenCL device


I am running a program using Intel OpenCL 1.2. My OpenCL device is a CPU:

[lvs@eredmithrim CapsBasic]$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 94
Model name:            Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz
Stepping:              3
CPU MHz:               3501.000
BogoMIPS:              7007.99
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              6144K
NUMA node0 CPU(s):     0-3

The available OpenCL runtime reports the following:

[lvs@eredmithrim CapsBasic]$ ./CapsBasic
Number of available platforms: 1
Platform names:
    [0] Intel(R) OpenCL [Selected]
Number of devices available for each type:
    CL_DEVICE_TYPE_CPU: 1
    CL_DEVICE_TYPE_GPU: 0
    CL_DEVICE_TYPE_ACCELERATOR: 0

*** Detailed information for each device ***

CL_DEVICE_TYPE_CPU[0]
    CL_DEVICE_NAME: Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz
    CL_DEVICE_AVAILABLE: 1
    CL_DEVICE_VENDOR: Intel(R) Corporation
    CL_DEVICE_PROFILE: FULL_PROFILE
    CL_DEVICE_VERSION: OpenCL 1.2 (Build 57)
    CL_DRIVER_VERSION: 1.2.0.57
    CL_DEVICE_OPENCL_C_VERSION: OpenCL C 1.2
    CL_DEVICE_MAX_COMPUTE_UNITS: 4
    CL_DEVICE_MAX_CLOCK_FREQUENCY: 3500
    CL_DEVICE_MAX_WORK_GROUP_SIZE: 8192
    CL_DEVICE_ADDRESS_BITS: 64
    CL_DEVICE_MEM_BASE_ADDR_ALIGN: 1024
    CL_DEVICE_MAX_MEM_ALLOC_SIZE: 4125402112
    CL_DEVICE_GLOBAL_MEM_SIZE: 16501608448
    CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 131072
    CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: 262144
    CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: 64
    CL_DEVICE_LOCAL_MEM_SIZE: 32768
    CL_DEVICE_PROFILING_TIMER_RESOLUTION: 1
    CL_DEVICE_IMAGE_SUPPORT: 1
    CL_DEVICE_ERROR_CORRECTION_SUPPORT: 0
    CL_DEVICE_HOST_UNIFIED_MEMORY: 1
    CL_DEVICE_EXTENSIONS: cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_3d_image_writes cl_intel_exec_by_local_thread cl_khr_spir cl_khr_fp64
    CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT: 1
    CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG: 1
    CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT: 1
    CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE: 1
    CL_DEVICE_NATIVE_VECTOR_WIDTH_INT: 8
    CL_DEVICE_NATIVE_VECTOR_WIDTH_LONG: 4
    CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT: 8
    CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE: 4
[lvs@eredmithrim CapsBasic]$

 

My application has four kernels, and each of them runs with several work-groups.

I would like to know how many compute units this program is actually using. The only information I can see above is the maximum number, but I think CL_DEVICE_MAX_COMPUTE_UNITS is only a reference value and the actual number of compute units used may be different.

I wonder if there is a way to control the number of compute units or if this is a runtime-based decision. Any comments on this?

Any info or pointers are appreciated.

Leonardo

 

