I have a basic question about the number of work-groups that can run in parallel. According to the definition of a compute unit, each compute unit can execute only one work-group at a time, so the number of work-groups that can run concurrently should depend "only" on the number of compute units (CUs) present. But the "OpenCL* Applications - Optimization Guide" states that the number of work-groups depends on the number of work-items in a group.
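To make the two statements concrete, here is a minimal sketch of the arithmetic as I understand it. It is not taken from the guide. The function names and the `groups_per_cu` parameter are hypothetical, and the default of one resident work-group per CU is just my reading of the definition; an implementation may be able to keep more than one group resident on a CU. It also assumes a 1-D NDRange whose global size is an exact multiple of the work-group size.

```python
import math


def num_work_groups(global_size: int, local_size: int) -> int:
    """Total work-groups in a 1-D NDRange.

    Assumes global_size is an exact multiple of local_size.
    """
    if local_size <= 0 or global_size % local_size != 0:
        raise ValueError("global_size must be a positive multiple of local_size")
    return global_size // local_size


def waves_needed(total_groups: int, compute_units: int, groups_per_cu: int = 1) -> int:
    """Rounds of execution needed if at most compute_units * groups_per_cu
    work-groups can be resident at once.

    groups_per_cu is a hypothetical knob; 1 models the
    "one work-group per compute unit" reading.
    """
    concurrent = compute_units * groups_per_cu
    return math.ceil(total_groups / concurrent)


# Same global size, two different work-group sizes, on a 4-CU device.
small_groups = num_work_groups(1024, 64)    # more, smaller groups
large_groups = num_work_groups(1024, 256)   # fewer, larger groups
print(small_groups, waves_needed(small_groups, 4))
print(large_groups, waves_needed(large_groups, 4))
```

In this model the work-group size changes how many work-groups exist in total, while the number that run at the same time is capped by the CU count. That may be the distinction the guide is drawing, but I am not sure.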