The Complete Glossary to Heterogeneous Compute

[Figure: GPU compute memory hierarchy in OpenCL]

This article was originally published at Imagination Technologies' website, where it is one of a series of articles. It is reprinted here with the permission of Imagination Technologies.

For the last decade, Imagination has been at the forefront of heterogeneous compute, becoming a founding member of the HSA Foundation and a contributor to many open heterogeneous computing standards available today, including OpenCL, OpenGL ES and Vulkan.

Our MIPS processors, PowerVR multimedia and Ensigma connectivity technologies have been integrated in many mobile and embedded computing platforms; each silicon IP family has been optimized to be a class leader in terms of performance while saving power and area.

In a series of upcoming articles on our blog, my colleagues from the GPU compute group will look at how SoC designers and software developers can take advantage of the synergies that exist today in silicon and implement heterogeneous algorithms that deploy across multiple engines in a chip. To help you navigate the jargon of heterogeneous compute, I thought it would be useful to provide a short guide to the technical vocabulary that we are going to use.

Most of the terminology mentioned in the table below refers to our PowerVR Rogue GPU, OpenCL or GPU compute concepts in general:

| Term | Description |
| --- | --- |
| Arithmetic Intensity | The ratio of the number of arithmetic operations to memory operations performed. |
| Barrier | In OpenCL, a function used to synchronize work-items in a workgroup. |
| Coarse Grain Scheduler | A Rogue hardware block that distributes work-items to the available multiprocessors. (The work-items are first grouped into warps.) |
| Common Store | A Rogue hardware block comprising a register bank, shared between all resident work-items. All registers are visible to all work-items residing on the multiprocessor. |
| Kernel | In OpenCL, the source code that is executed by each work-item in an NDRange. |
| Memory Fence | In OpenCL, a location in the code where all pending loads and stores are guaranteed to have completed before any subsequent loads and stores begin. |
| Multiprocessor | A Rogue hardware block that manages the concurrent execution of multiple warps. |
| NDRange | In OpenCL, an N-dimensional virtual grid of workgroups, where N can equal 1, 2 or 3. All work-items in the NDRange are executed concurrently. |
| Texture Processing Unit | A Rogue hardware block that speeds up accesses to OpenCL images. |
| Unified Store | A Rogue hardware block comprising a register bank, shared between all resident work-items. |
| Warp | A grouping of up to 32 work-items. |
| Work-item | In OpenCL, one instance of an enqueued kernel. |
| Workgroup | In OpenCL, a grouping of work-items that can synchronize and share data between one another. |
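
To make a few of these terms concrete, the sketch below models in plain Python how an NDRange breaks down into workgroups, work-items and warps, and how arithmetic intensity is computed. It is not OpenCL code and does not describe the actual Rogue scheduler: the function names `decompose_ndrange` and `arithmetic_intensity` are hypothetical, and the sketch assumes that warps are formed within a single workgroup and that each global size is a whole multiple of the matching workgroup size.

```python
import math

WARP_SIZE = 32  # from the glossary: a warp groups up to 32 work-items


def decompose_ndrange(global_size, local_size, warp_size=WARP_SIZE):
    """Count workgroups, work-items and warps for an NDRange (illustrative model only)."""
    if len(global_size) != len(local_size) or not 1 <= len(global_size) <= 3:
        raise ValueError("global_size and local_size must both have 1, 2 or 3 dimensions")
    for g, l in zip(global_size, local_size):
        if l <= 0 or g % l != 0:
            raise ValueError("this sketch assumes each global size is a multiple of the local size")

    work_items = math.prod(global_size)       # one kernel instance per grid point
    items_per_group = math.prod(local_size)   # work-items that can synchronize with a barrier
    workgroups = work_items // items_per_group
    # Assumption: warps do not span workgroups, so the last warp may be partly filled.
    warps_per_group = math.ceil(items_per_group / warp_size)
    return {
        "work_items": work_items,
        "items_per_group": items_per_group,
        "workgroups": workgroups,
        "warps_per_group": warps_per_group,
    }


def arithmetic_intensity(arithmetic_ops, memory_ops):
    """Ratio of arithmetic operations to memory operations, as defined in the glossary."""
    return arithmetic_ops / memory_ops


# A 1024 x 1024 NDRange split into 8 x 8 workgroups.
layout = decompose_ndrange((1024, 1024), (8, 8))

# A kernel computing c[i] = a[i] * b[i] + c[i], counting the multiply and the add
# separately: 2 arithmetic operations against 3 loads and 1 store.
intensity = arithmetic_intensity(2, 4)
```

Under these assumptions the 1024 x 1024 example contains 1,048,576 work-items in 16,384 workgroups of 64 work-items each, or two warps per workgroup, and the example kernel has an arithmetic intensity of 0.5.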

If you come across terms in other articles in this series that are unfamiliar or not explained here, please leave us a comment below and I will add them to the list.

By Alexandru Voica
Senior Technology Marketing Specialist, Imagination Technologies
