This post is my notepad while figuring out how OpenCL handles assigning work item ids.
- A kernel is invoked once for each work item. Each work item has its own private memory.
- Work items are grouped into work groups. The work items of a work group share local memory.
- The total number of work items is specified by the global work size. Global and constant memory is shared across all work items of all work groups.
OpenCL works with the notion of dimensions: you declare the number of work items by giving a size for each dimensional index. In the above example, the size of a work group is Sx=4 by Sy=4, so each group holds 16 work items. How many dimensions you use is up to you; however, there is a hardware limit on the total number of work items per group as well as globally.
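To make the per-group limit concrete, here is a minimal sketch in plain Python (not OpenCL code) that multiplies the local sizes and compares the result against a device limit. The limit of 256 is a made-up value for illustration; on a real device you would query it with clGetDeviceInfo and CL_DEVICE_MAX_WORK_GROUP_SIZE. The helper names are my own.

```python
from math import prod

def work_items_per_group(local_size):
    """Total work items in one work group: the product of the local sizes."""
    return prod(local_size)

def fits_device(local_size, max_work_group_size):
    """True if a work group of this shape stays within the device limit."""
    return work_items_per_group(local_size) <= max_work_group_size

# Hypothetical limit; real devices report their own value.
HYPOTHETICAL_MAX_WORK_GROUP_SIZE = 256

local_size = (4, 4)  # Sx = 4, Sy = 4 as in the example above
print(work_items_per_group(local_size))                           # 16
print(fits_device(local_size, HYPOTHETICAL_MAX_WORK_GROUP_SIZE))  # True
print(fits_device((32, 32), HYPOTHETICAL_MAX_WORK_GROUP_SIZE))    # False
```

Note that the limit applies to the product of all dimensions, so a 32 by 32 group (1024 work items) exceeds the assumed limit even though each side on its own looks small.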
Inside a kernel, you can query the position of the work item that this kernel instance is executing, either relative to its work group or globally.
Querying the global position is done using get_global_id(dim), where dim is the dimension index (0 for the first dimension, 1 for the second, etc.). As long as no global work offset is set, this call is equivalent to get_local_size(dim)*get_group_id(dim) + get_local_id(dim). Here get_local_size(dim) is the group size in dim, get_group_id(dim) is the position of the group in dim relative to all other groups (globally), and get_local_id(dim) is the position of the work item relative to its group. You can see this in the following annotated figure:
Since the OpenCL APIs only require you to specify the global size (total number of work items in a dimension) and the local size (number of work items per group in that dimension), the number of groups is inferred from that data: it is the global size divided by the local size in each dimension.
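The index arithmetic above can be checked with a small simulation. This is a sketch in plain Python, not OpenCL code: it assumes a zero global offset and a global size that is a multiple of the local size in every dimension (which OpenCL 1.x requires). The 8 by 8 global size and the function names are my own choices for illustration.

```python
from itertools import product

def num_groups(global_size, local_size):
    """Work groups per dimension, assuming global size is a multiple of local size."""
    groups = []
    for g, l in zip(global_size, local_size):
        if g % l != 0:
            raise ValueError(f"global size {g} is not a multiple of local size {l}")
        groups.append(g // l)
    return tuple(groups)

def global_id(group_id, local_id, local_size):
    """Per-dimension global id: local_size * group_id + local_id (zero offset assumed)."""
    return tuple(s * g + l for g, l, s in zip(group_id, local_id, local_size))

global_size = (8, 8)  # hypothetical 2D range
local_size = (4, 4)   # Sx = 4, Sy = 4

groups = num_groups(global_size, local_size)
print(groups)  # (2, 2)

# Enumerate every (group, local) pair and collect the resulting global ids.
ids = [
    global_id(grp, loc, local_size)
    for grp in product(*(range(n) for n in groups))
    for loc in product(*(range(s) for s in local_size))
]

# Every position in the 8x8 range is produced exactly once.
print(len(ids), len(set(ids)))  # 64 64
```

For example, the work item at local position (2, 3) in group (1, 0) gets the global id (4*1 + 2, 4*0 + 3) = (6, 3).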