OpenCL Concepts on FPGA (NDRange vs Single-task kernel)

Hello,

I have some theoretical questions that I hope will give me a better understanding of the Intel FPGA OpenCL compiler. First of all, I still don't know when to prefer an NDRange kernel over a single-task kernel. As I understand it, a single-task kernel can be data-parallelized more flexibly with the loop unroll pragma, since the pragma lets us mark exactly the loops we want vectorized. An NDRange kernel, on the other hand, offers the SIMD attribute (num_simd_work_items), which is restricted to powers of 2 (why?) and requires the programmer to fix the work-group size. The NDRange concept fits other OpenCL platforms well because their fixed hardware consists of multiple compute units, but I cannot grasp why it is needed on an FPGA.
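To make the comparison concrete, here is a rough plain-C sketch of the two styles for a vector add. It is only an emulation so that it compiles with an ordinary C compiler: the function names are made up for illustration, the work-item index is passed in as a parameter instead of coming from get_global_id(0), and the OpenCL-specific parts (the __kernel qualifier, the unroll pragma, the SIMD attribute) appear only in comments.

```c
#include <assert.h>
#include <stddef.h>

/* NDRange style: the body handles ONE work-item. In OpenCL C this would be a
   __kernel function that reads its index from get_global_id(0); here the
   index is a parameter (an assumption made so the sketch is plain C). */
static void vadd_work_item(size_t gid, const float *a, const float *b, float *c)
{
    c[gid] = a[gid] + b[gid];
}

/* Host-side stand-in for the runtime launching global_size work-items. */
static void vadd_ndrange(size_t global_size, const float *a, const float *b, float *c)
{
    assert(a && b && c);
    for (size_t gid = 0; gid < global_size; ++gid)
        vadd_work_item(gid, a, b, c);
}

/* Single-task style: one invocation owns the whole loop. In OpenCL C a
   "#pragma unroll 4" placed directly before this loop would request four
   copies of the loop body. */
static void vadd_single_task(size_t n, const float *a, const float *b, float *c)
{
    assert(a && b && c);
    for (size_t i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}
```

Both functions produce the same output; the difference in question is only how the parallelism is expressed to the compiler, per work-item with a SIMD width or per loop with an unroll factor.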

Secondly, I would like to know when to prefer multiple compute units over SIMD. According to the Best Practices Guide, finding the best combination of compute units and SIMD width takes some experimentation. But I cannot think of a scenario in which n compute units, which get no memory coalescing, would outperform a SIMD width of n. It seems to me that it is always better to reduce the number of compute units by a factor of n and increase the SIMD width by the same factor (as long as there are enough resources). If that is the case, what justifies the existence of the multiple compute units attribute?

Lastly, after optimizing the number of compute units and the SIMD width, what procedure should we follow to find the best work-group size? The Best Practices Guide states that each work-group can execute on only one compute unit. So the number of work-groups should be a multiple of the number of compute units for best performance (or not?). I have always aimed for the smallest possible value of the reqd_work_group_size attribute so that choosing the global work size becomes easier (it has to be a multiple of the work-group size on my device). Is there a more elegant way of choosing the work-group size?
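The arithmetic behind this question can be sketched in plain C. The helper names below are hypothetical and not part of any OpenCL API; they only show how a problem size is padded to a multiple of the work-group size and how to check whether the resulting work-groups divide evenly across the compute units.

```c
#include <assert.h>
#include <stddef.h>

/* Round the problem size n up to the next multiple of the work-group size. */
static size_t round_up_global_size(size_t n, size_t wg_size)
{
    assert(wg_size > 0);
    return ((n + wg_size - 1) / wg_size) * wg_size;
}

/* Number of work-groups for a global size that is already a multiple
   of the work-group size. */
static size_t num_work_groups(size_t global_size, size_t wg_size)
{
    assert(wg_size > 0 && global_size % wg_size == 0);
    return global_size / wg_size;
}

/* Returns 1 if the work-groups divide evenly across the compute units,
   0 otherwise. */
static int groups_balance(size_t groups, size_t compute_units)
{
    assert(compute_units > 0);
    return groups % compute_units == 0;
}
```

For example, 1000 elements with a work-group size of 64 pad to a global size of 1024, which gives 16 work-groups and balances across 4 compute units. With padding like this, the kernel would need to skip work-items whose index is at or beyond the real problem size.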

Regards,

Gorkem
