Hello everyone!
I've run into an issue with the OpenCL implementation on the Intel HD 4000 GPU (i5-3370U). When I build a processing chain and enqueue compute kernels to a cl_command_queue, the runtime can freeze on one of the clEnqueueNDRangeKernel() calls. The bug does not occur on any CPU device or on the HD 4600 GPU (i7-4770K), and all AMD platforms are fine as well. I've provided a sample that demonstrates the issue. I suspect some kind of deadlock in the OpenCL runtime threads. The test passes if the number of iterations or steps in the main processing loop is reduced. It also passes if you add clWaitForEvents() calls while the processing chain is being built (commented out in the provided source). For reference, the system is Windows 8.1 x64 with the latest drivers (3345) and the 2013 R3 SDK.
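For anyone who wants to try the clWaitForEvents() workaround without blocking after every single kernel, here is a minimal sketch of one way to throttle the chain. It is not the code from the attached sample. The helper name should_sync, the sync_interval parameter, and the variables in the usage comment (queue, kernel, global, SYNC_INTERVAL) are all hypothetical names chosen for illustration. Whether waiting only every few steps is enough to avoid the freeze on HD 4000 is an assumption that would need to be checked.

```c
#include <stddef.h>

/*
 * Decide whether the host should block after enqueuing step `step`
 * (0-based) of a chain of `total_steps` kernels.
 *
 * Hypothetical helper: waits after every `sync_interval`-th step and
 * after the final step. A sync_interval of 0 disables waiting.
 */
int should_sync(size_t step, size_t total_steps, size_t sync_interval)
{
    if (sync_interval == 0)
        return 0;                        /* throttling disabled */
    if (step + 1 == total_steps)
        return 1;                        /* drain the chain at the end */
    return (step + 1) % sync_interval == 0;
}

/*
 * Intended use inside the chain-building loop (error checks omitted;
 * queue, kernel, global and SYNC_INTERVAL are assumed to exist):
 *
 *   for (size_t step = 0; step < total_steps; ++step) {
 *       cl_event ev;
 *       clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global, NULL,
 *                              0, NULL, &ev);
 *       if (should_sync(step, total_steps, SYNC_INTERVAL))
 *           clWaitForEvents(1, &ev);    // host blocks until this kernel is done
 *       clReleaseEvent(ev);
 *   }
 */
```

With a small interval this keeps only a few kernels in flight at a time, which is roughly what reducing the number of steps in the main loop does.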