I'm observing a ~25% performance drop in a kernel that is ~4.5 ms of pure compute with one final write to an OpenGL renderbuffer.
The measurement is via a cl_event.
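For reference, here is a minimal sketch of how a cl_event timing is usually turned into milliseconds and a relative slowdown. It assumes the command queue was created with CL_QUEUE_PROFILING_ENABLE and that the two timestamps were read with clGetEventProfilingInfo using CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END, which report device time in nanoseconds. The helper names elapsed_ms and slowdown_percent are illustrative only, and the OpenCL calls themselves are left out so the arithmetic stands on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a pair of device timestamps (nanoseconds) into milliseconds.
 * Assumption: start_ns and end_ns hold the values that
 * clGetEventProfilingInfo returns for CL_PROFILING_COMMAND_START and
 * CL_PROFILING_COMMAND_END on the kernel's event. */
static double elapsed_ms(uint64_t start_ns, uint64_t end_ns)
{
    assert(end_ns >= start_ns); /* timestamps must be ordered */
    return (double)(end_ns - start_ns) / 1.0e6;
}

/* Relative slowdown of current_ms against baseline_ms, in percent.
 * A positive result means the current run took longer. */
static double slowdown_percent(double baseline_ms, double current_ms)
{
    assert(baseline_ms > 0.0);
    return (current_ms - baseline_ms) / baseline_ms * 100.0;
}
```

With made-up timestamps, elapsed_ms(1000000, 5500000) gives 4.5, and slowdown_percent(4.0, 5.0) gives 25.0.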
It makes no difference whether the kernels are loaded as binaries or built from source.
Did something change in the OpenCL driver with the Vulkan integration?
I'm wondering how performance can vary so much when the kernel hasn't changed and its only interaction with "external" resources is a single store to a 1-megapixel renderbuffer.