I want to deeply profile and analyze my Intel Graphics OpenCL kernels, but I'm not entirely clear on which tools are available or what their respective advantages are.
I assume VTune is the most capable option? What specifically does it offer that's useful for OpenCL developers?
What about Intel GPA (Graphics Performance Analyzers)? Last I checked, it didn't provide much detail.
Is GT-Pin publicly documented, or is it just the engine that actually powers the kernel analysis in Code Builder and Visual Studio?
Specifically, I'd like to understand:
- achieved EU occupancy
- where any local memory bank conflicts might be occurring
- instruction hot spots
I may be missing some features that already exist in Code Builder for Visual Studio 2013, because its Kernel Analysis screen crashes when analyzing my application. The application uses pre-compiled binary kernels (built with "-cl-kernel-arg-info" enabled).
Also, I'd like to be able to dump the GEN assembly from an existing .IR binary. Is there a way to do this?