Hi there,
I've attached a test case (a gutted version of a segmented scan) that gets miscompiled by intel_sdk_for_ocl_applications_2013_xe_runtime_3.0.67279_x64, at least when running on my i7-2620M.
Specifically, when running this self-contained test code using PyOpenCL, I get the line
gid:0 fsii:0
printed 16 times for each of the two work groups. If you look at the kernel, that means that the printf() in the trailing snippet:
if (get_local_id(0) == 0)
{
    printf("gid:%d fsii:%d\n", psc_GID_0, psc_first_segment_start_in_interval);
}
got executed 16 times for group id 0. In my book, it should be executed exactly once per work group, since the guard only lets through the work-item whose local id is 0. (I confirmed this by running against other implementations. Intel OpenCL 2012 also gets this right.)
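To spell out the count I'd expect, here is a minimal plain-Python model of the guard. It is only a sketch: it does not use PyOpenCL and is not the attached kernel. The helper name count_guard_hits is hypothetical, the launch is assumed to be one-dimensional, and the work-group size of 16 is an assumption inferred from the 16 printed lines.

```python
def count_guard_hits(num_groups, local_size):
    """Model a 1-D launch and count, per work group, how many
    work-items would pass the `get_local_id(0) == 0` guard."""
    hits = {}
    for group_id in range(num_groups):
        # Each work-item in the group has a distinct local id in [0, local_size).
        hits[group_id] = sum(
            1 for local_id in range(local_size) if local_id == 0
        )
    return hits


# Two work groups as in the report; a local size of 16 is assumed.
expected = count_guard_hits(num_groups=2, local_size=16)
print(expected)
```

In this model every work group passes the guard exactly once, which is the behavior I see on the other implementations.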
Thanks,
Andreas