Quantcast
Channel: Intel® Software - OpenCL*
Viewing all articles
Browse latest Browse all 1182

Broadwell IGP needs more sub_group functions

$
0
0

OpenCL 2.0 has no support for a "ballot" style sub-group function.  A ballot returns bitmask containing the conditional flag for each "lane" in the sub-group.  As long as the sub-group (SIMD) size is 32 or less then this fits in a cl_uint.

Presumably sub-group any() and all() are implemented on Broadwell IGP by returning an ARF flag register?

It would be great if Broadwell IGP unofficially implemented sub_group_any() by returning the actual flag bitmask so that developers could apply popcount() and other operations to the mask.

For those not aware, a classic use case for a ballot mask is packing data in a sub-group into a local memory array without having to use a full exclusive add scan.  It's very efficient.

You can implement a ballot() with an inclusive scan but that's going to be ~8x as many ops for SIMD16.

 


Viewing all articles
Browse latest Browse all 1182

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>