It can be challenging for developers to match computations to accelerators, choose programming models for targeting those accelerators, and then coordinate the use of those accelerators within their larger applications. This tutorial starts with a survey of heterogeneous architectures and programming models, and discusses how to determine whether a computation is suitable for a particular accelerator. The Intel Threading Building Blocks (TBB) library provides generic parallel algorithms, concurrent containers, a work-stealing task scheduler, a data flow programming abstraction, low-level synchronization primitives, thread-local storage, and a scalable memory allocator. The generic algorithms in TBB capture many of the common design patterns used in parallel programming. Although TBB was first introduced as a shared-memory parallel programming library, it has recently been extended to support heterogeneous programming. This tutorial will introduce students to the TBB library and provide a hands-on opportunity to use some of its features for shared-memory programming.
You might want to go get yourself some coffee, because this is a rather lengthy step.

Loop parallelization is one of the easiest ways to get parallelism out of single-threaded code. It is generally most useful for embarrassingly data-parallel applications, but can be used elsewhere with some programmer effort.
Summing Two Arrays with TBB

The age-old problem: given two arrays, add each element of the first array to its counterpart in the second. Admittedly, the problem is not horribly interesting, but it can still benefit from parallelism, provided the arrays are reasonably large. Follow along with main.

Strategy: Write some serial code to sum the arrays into a third result array. Then let TBB parallelize the loop.

Writing Serial Code

To start off, after we initialize all the memory, parse arguments, etc.
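The tutorial's own source file is not reproduced here, so the following is only a minimal sketch of what the serial step could look like. The function name sum_arrays_serial and the use of std::vector<float> are assumptions made for illustration, not the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial baseline: result[i] = a[i] + b[i] for every index i.
// sum_arrays_serial is a hypothetical name; the element type is assumed.
std::vector<float> sum_arrays_serial(const std::vector<float>& a,
                                     const std::vector<float>& b) {
    assert(a.size() == b.size());  // both inputs must have the same length
    std::vector<float> result(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        result[i] = a[i] + b[i];
    }
    return result;
}
```

Every iteration touches only its own index, which is what makes this loop a good candidate for the parallel version.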
The single-thread summing occurs at Lines of main. C and at Lines of main.

TBB implements parallel loops by encapsulating them inside the operator functions of specialized classes. This allows the TBB library headers to handle the parallelism without requiring any modifications to the compiler. In the example, class ArraySummer is actually an elaborate function definition. The empty constructor just initializes the "function parameters" (aka the class data members), and the operator function actually runs the loop.
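The original ArraySummer is not shown, so the class below is a hedged reconstruction of the pattern just described, not the author's code. The member names, the USE_TBB macro, the WholeRange stand-in, and the sum_arrays_with_body wrapper are all assumptions. When compiled with -DUSE_TBB and linked against TBB, the body is handed to tbb::parallel_for over a tbb::blocked_range; otherwise it is simply invoked once over the whole index range, so the sketch still builds without the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#ifdef USE_TBB  // hypothetical macro: define it when building against TBB
#include <tbb/blocked_range.h>
#include <tbb/parallel_for.h>
#endif

// Body object: the data members act as the "function parameters",
// and operator() runs the loop over whatever sub-range it receives.
class ArraySummer {
    const float* a_;
    const float* b_;
    float* sum_;
public:
    // Constructor with an empty body: it only stores the parameters.
    ArraySummer(const float* a, const float* b, float* sum)
        : a_(a), b_(b), sum_(sum) {}

    // Range only needs begin() and end();
    // tbb::blocked_range<std::size_t> satisfies this.
    template <typename Range>
    void operator()(const Range& r) const {
        for (std::size_t i = r.begin(); i != r.end(); ++i) {
            sum_[i] = a_[i] + b_[i];
        }
    }
};

// Hypothetical stand-in range, used only when TBB is not compiled in.
struct WholeRange {
    std::size_t first;
    std::size_t last;
    std::size_t begin() const { return first; }
    std::size_t end() const { return last; }
};

// Hypothetical wrapper: sums a and b into sum using the body object.
void sum_arrays_with_body(const std::vector<float>& a,
                          const std::vector<float>& b,
                          std::vector<float>& sum) {
    assert(a.size() == b.size() && a.size() == sum.size());
    ArraySummer body(a.data(), b.data(), sum.data());
#ifdef USE_TBB
    // TBB splits the index range into chunks and calls body on each chunk.
    tbb::parallel_for(tbb::blocked_range<std::size_t>(0, a.size()), body);
#else
    // Serial fallback: one call covering the entire range.
    body(WholeRange{0, a.size()});
#endif
}
```

Because operator() is const and writes only to the indices in its own sub-range, TBB can copy the body and run the copies on different chunks without extra synchronization.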
Setting up the Runtime

Unpack the example tarball wherever you like. To actually compile with TBB, we have to set some environment variables. A handy shell script for setting up the environment is sitting in your TBB install directory.
You must source this script before building the example or any TBB-enabled application!