StarPU

Task Graph Market

This page gathers a series of task graphs that can be given as input to starpu_replay to replay real-world applications.

To get starpu_replay, one needs a version of StarPU configured with --enable-simgrid. One can then run the different task graph cases; see more details below.

Dense Linear Algebra

Cholesky

Cholesky factorization from the StarPU source code, benchmarked on research platforms.

Chameleon

Dense linear algebra from the Chameleon project, benchmarked on research platforms.

How to run this

How to generate static scheduling

The examples above use the StarPU dynamic schedulers. One can inject a static schedule by adding a sched.rec file to the replay.

The tasks.rec file follows the recutils format: paragraphs are separated by an empty line. Each paragraph represents a task to be executed and carries a number of fields, some of which come from the native execution that was performed when recording the trace.
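As an illustration of the paragraph structure described here, the following minimal sketch splits recutils-style text into one dictionary per paragraph. It is a simplification, not a full recutils parser: it ignores continuation lines, and the function name parse_rec and the sample values are hypothetical. Only the SubmitOrder and JobId field names are taken from this page.

```python
def parse_rec(text):
    """Split recutils-style text into a list of {field: value} dicts.

    Simplified sketch: one "Name: value" pair per line, paragraphs
    separated by empty lines, lines starting with '#' skipped.
    """
    paragraphs, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # An empty line closes the current paragraph, if any.
            if current:
                paragraphs.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue  # comment line
        name, _, value = line.partition(":")
        current[name.strip()] = value.strip()
    if current:
        paragraphs.append(current)
    return paragraphs


# Illustrative input only; real tasks.rec paragraphs carry more fields.
sample = """\
SubmitOrder: 0
JobId: 12

SubmitOrder: 1
JobId: 15
"""
tasks = parse_rec(sample)
print(tasks)
```

Values are kept as strings, so a caller that needs numeric submission identifiers would convert them explicitly.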

The performance of tasks on the different execution units can be obtained by running starpu_perfmodel_recdump:

$ STARPU_HOSTNAME=mirage STARPU_PERF_MODEL_DIR=$PWD/sampling starpu_perfmodel_recdump
which first emits, in a %rec: timing section, a series of paragraphs, one per set of measurements made for the same kind of task on the same data size.

Then the %rec: worker_count section describes the target platform, with one paragraph per kind of execution unit.

Then the %rec: memory_workers section describes the memory layout of the target platform, with one paragraph per memory node.

Worker IDs are numbered starting from 0, following the order of the paragraphs in the %rec: worker_count section.
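The numbering rule can be sketched as follows. The sketch assumes that each worker_count paragraph accounts for some number of workers of one kind, and that IDs are handed out contiguously in paragraph order. The (kind, count) pairs, the kind names, and the function name worker_id_ranges are hypothetical and only serve the illustration.

```python
def worker_id_ranges(worker_count):
    """Map each kind of execution unit to its range of worker IDs.

    worker_count: list of (kind, count) pairs, in the order in which
    the paragraphs appear in the worker_count section (assumed layout).
    """
    ranges, next_id = {}, 0
    for kind, count in worker_count:
        # IDs start at 0 and grow in paragraph order.
        ranges[kind] = range(next_id, next_id + count)
        next_id += count
    return ranges


# Hypothetical platform: 4 workers of a first kind, then 2 of a second.
platform = [("cpu", 4), ("cuda", 2)]
ranges = worker_id_ranges(platform)
for kind, ids in ranges.items():
    print(kind, list(ids))
```

Under these assumptions, the first kind listed receives the lowest IDs, which is what a SpecificWorker value would then refer to.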

A static schedule can then be expressed by producing a sched.rec file containing one paragraph per task. Each paragraph must contain a SubmitOrder field holding the submission identifier (as referenced in the SubmitOrder field of tasks.rec). The JobId is not used because StarPU may generate internal tasks, which change job IDs. The SubmitOrder, on the contrary, depends only on the application submission loop and is thus completely stable, which even makes it possible to inject the static schedule into a native execution of the real application. The paragraph can then optionally contain several kinds of scheduling directives, either to force task placement, for instance, or to guide the StarPU dynamic scheduler.

A completely static schedule can be expressed by setting, for each task, both the SpecificWorker and the Workerorder fields, which respectively specify on which worker the task shall run and its ordering on that worker. For instance:
SubmitOrder: 0
SpecificWorker: 0
Workerorder: 2

SubmitOrder: 1
SpecificWorker: 1
Workerorder: 0

SubmitOrder: 2
SpecificWorker: 0
Workerorder: 1
will force tasks 0 and 2 to be executed on worker 0 while task 1 is executed on worker 1, and task 2 will be executed before task 0.
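Such a file can be generated rather than written by hand. The sketch below renders sched.rec text from a mapping of worker IDs to the ordered list of submission identifiers each worker should run. The function name static_schedule is hypothetical, and numbering the per-worker positions from 1 is an assumption of this sketch (the example above uses 1 and 2 on worker 0, and 0 for the single task of worker 1).

```python
def static_schedule(per_worker):
    """Render sched.rec text from {worker_id: [submit_order, ...]}.

    Each list gives the tasks of one worker in execution order.
    Positions are numbered from 1 (an assumption of this sketch).
    """
    entries = []
    for worker, tasks in per_worker.items():
        for position, submit_order in enumerate(tasks, start=1):
            entries.append((submit_order, worker, position))
    # One paragraph per task, sorted by submission identifier and
    # separated by an empty line, as the recutils format requires.
    paragraphs = [
        f"SubmitOrder: {s}\nSpecificWorker: {w}\nWorkerorder: {p}\n"
        for s, w, p in sorted(entries)
    ]
    return "\n".join(paragraphs)


# Worker 0 runs task 2 then task 0; worker 1 runs task 1.
text = static_schedule({0: [2, 0], 1: [1]})
print(text)
```

Describing the schedule per worker keeps the ordering explicit in the input, so the Workerorder values never have to be computed by hand.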

When the SpecificWorker field is set for a task, or its Workers field corresponds to only one memory node, StarPU will automatically prefetch the data during execution. One can however also set prefetches by hand in sched.rec by using a paragraph containing:

This makes it possible, for instance, to provide only data prefetch hints instead of precise task scheduling hints; such prefetch hints will probably guide the scheduler toward a given data placement.

Last updated on 2019/09/06.