StarPU Publications
Table of Contents
- General Presentations
- On Composability
- On Parallel Tasks
- On Hierarchical Tasks
- On Scheduling
- On The C Extensions
- On OpenMP Support on top of StarPU
- On MPI Support
- On Memory Control
- On Performance Model Tuning
- On The Simulation Support through SimGrid
- On The Cell Support
- Papers related to StarPU
- On Applications
All StarPU-related publications are also listed here with the corresponding BibTeX entries.
A good overview is available in the research report 'StarPU: a Runtime System for Scheduling Tasks over Accelerator-Based Multicore Machines'.
If you need to cite StarPU, please reference 'StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures' for a general presentation. The sub-sections below give references for more specific aspects of StarPU.
General Presentations
- Samuel Thibault
On Runtime Systems for Task-based Programming on Heterogeneous Platforms
Habilitation à diriger des recherches, Université de Bordeaux, December 2018
[WWW] [PDF] - Cédric Augonnet
Scheduling Tasks over Multicore machines enhanced with Accelerators: a Runtime System's Perspective
PhD thesis, Université de Bordeaux, December 2011
[WWW] [PDF] - Cédric Augonnet, Samuel Thibault, Raymond Namyst, and Pierre-André Wacrenier
StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures
CCPE - Concurrency and Computation: Practice and Experience, Special Issue: Euro-Par 2009, 23:187-198, February 2011
[WWW] [PDF] [doi:10.1002/cpe.1631] - Cédric Augonnet, Samuel Thibault, and Raymond Namyst
StarPU: a Runtime System for Scheduling Tasks over Accelerator-Based Multicore Machines
Research Report RR-7240, INRIA, March 2010
[WWW] [PDF] - Cédric Augonnet
StarPU: un support exécutif unifié pour les architectures multicoeurs hétérogènes
In 19èmes Rencontres Francophones du Parallélisme, Toulouse, France, September 2009
Note: Best Paper Award
[WWW] [PDF] - Cédric Augonnet, Samuel Thibault, Raymond Namyst, and Pierre-André Wacrenier
StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures
In Euro-Par - 15th International Conference on Parallel Processing, volume 5704 of LNCS, Delft, The Netherlands, pages 863-874, August 2009
Springer
[WWW] [PDF] [doi:10.1007/978-3-642-03869-3_80] - Cédric Augonnet
Vers des supports d'exécution capables d'exploiter les machines multicoeurs hétérogènes
Master's thesis, Université de Bordeaux, June 2008
[WWW] [PDF] - Cédric Augonnet and Raymond Namyst
A unified runtime system for heterogeneous multicore architectures
In Proceedings of the International Euro-Par Workshops 2008, HPPC'08, volume 5415 of LNCS, Las Palmas de Gran Canaria, Spain, pages 174-183, August 2008
Springer
ISBN: 978-3-642-00954-9
[WWW] [PDF] [doi:10.1007/978-3-642-00955-6_22]
On Composability
- Andra-Ecaterina Hugo
Composability of parallel codes on heterogeneous architectures
PhD thesis, Université de Bordeaux, December 2014
[WWW] [PDF] - Andra Hugo
Le problème de la composition parallèle : une approche supervisée
In 21èmes Rencontres Francophones du Parallélisme (RenPar'21), Grenoble, France, January 2013
[WWW] [PDF] - Andra Hugo, Abdou Guermouche, Raymond Namyst, and Pierre-André Wacrenier
Composing multiple StarPU applications over heterogeneous machines: a supervised approach
In Third International Workshop on Accelerators and Hybrid Exascale Systems, Boston, USA, May 2013
[WWW] [PDF] - Andra Hugo
Composabilité de codes parallèles sur architectures hétérogènes
Master's thesis, Université de Bordeaux, June 2011
[WWW] [PDF]
On Parallel Tasks
- Terry Cojean
Programming heterogeneous architectures using moldable tasks
PhD thesis, Université de Bordeaux, March 2018
[WWW] [PDF] - Olivier Beaumont, Terry Cojean, Lionel Eyraud-Dubois, Abdou Guermouche, and Suraj Kumar
Scheduling of Linear Algebra Kernels on Multiple Heterogeneous Resources
In International Conference on High Performance Computing, Data, and Analytics (HiPC), Hyderabad, India, December 2016
[WWW] [PDF] - Terry Cojean, Abdou Guermouche, Andra Hugo, Raymond Namyst, and Pierre-André Wacrenier
Resource aggregation for task-based Cholesky Factorization on top of heterogeneous machines
In HeteroPar'2016 workshop of Euro-Par, Grenoble, France, August 2016
[WWW] [PDF] - Terry Cojean, Abdou Guermouche, Andra Hugo, Raymond Namyst, and Pierre-André Wacrenier
Resource aggregation for task-based Cholesky Factorization on top of modern architectures
Note: This paper is submitted for review to the Parallel Computing special issue for HCW and HeteroPar 16 workshops, November 2016
[WWW] [PDF]
On Hierarchical Tasks
- Mathieu Faverge, Nathalie Furmento, Abdou Guermouche, Gwenolé Lucas, Raymond Namyst, Samuel Thibault, and Pierre-André Wacrenier
Programming Heterogeneous Architectures Using Hierarchical Tasks
Concurrency and Computation: Practice and Experience, 2023
[WWW] [PDF] [doi:10.1002/cpe.7811] - Mathieu Faverge, Nathalie Furmento, Abdou Guermouche, Gwenolé Lucas, Raymond Namyst, Samuel Thibault, and Pierre-André Wacrenier
Programming Heterogeneous Architectures Using Hierarchical Tasks
In HeteroPar 2022, Glasgow, United Kingdom, pages 12, August 2022
[WWW] [PDF] - Mathieu Faverge, Nathalie Furmento, Abdou Guermouche, Gwenolé Lucas, Samuel Thibault, and Pierre-André Wacrenier
Programmation des architectures hétérogènes à l'aide de tâches hiérarchiques
In COMPAS 2022 - Conférence francophone d'informatique en Parallélisme, Architecture et Système, Amiens, France, July 2022
[WWW] [PDF] - Mathieu Faverge, Nathalie Furmento, Gwenolé Lucas, Abdou Guermouche, Raymond Namyst, Samuel Thibault, and Pierre-André Wacrenier
Programming Heterogeneous Architectures Using Hierarchical Tasks
Research Report RR-9466, Inria Bordeaux Sud-Ouest, March 2022
[WWW] [PDF] - Arthur Chevalier
Critical resources management and scheduling under StarPU
Master's thesis, Université de Bordeaux, September 2017
[WWW] [PDF]
On Scheduling
- Maxime Gonthier, Loris Marchal, and Samuel Thibault
Taming data locality for task scheduling under memory constraint in runtime systems
Future Generation Computer Systems, 2023
[WWW] [PDF] [doi:10.1016/j.future.2023.01.024] - Maxime Gonthier, Samuel Thibault, and Loris Marchal
Memory-Aware Scheduling of Tasks Sharing Data on Multiple GPUs with Dynamic Runtime Systems
In IPDPS 2022 - 36th IEEE International Parallel & Distributed Processing Symposium, Lyon, France, May 2022
IEEE
[WWW] [PDF] [doi:10.1109/IPDPS53621.2022.00073] - Maxime Gonthier, Loris Marchal, and Samuel Thibault
Locality-Aware Scheduling of Independent Tasks for Runtime Systems
In COLOC: 5th workshop on data locality - 7th International European Conference on Parallel and Distributed Computing Workshops, Lisbon, Portugal, August 2021
[WWW] [PDF] [doi:10.1007/978-3-031-06156-1_1] - Vinicius Garcia Pinto, Lucas Leandro Nesi, Marcelo Cogo Miletto, and Lucas Mello Schnorr
Providing In-depth Performance Analysis for Heterogeneous Task-based Applications with StarVZ
In 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2021
[WWW] - Maxime Gonthier, Loris Marchal, and Samuel Thibault
Locality-Aware Scheduling of Independent Tasks for Runtime Systems
Research Report RR-9394, Inria, 2021
[WWW] [PDF] - Bérenger Bramas
Impact study of data locality on task-based applications through the Heteroprio scheduler
PeerJ Computer Science, May 2019
[WWW] [PDF] [doi:10.7717/peerj-cs.190] - Lucas Leandro Nesi, Samuel Thibault, Luka Stanisic, and Lucas Mello Schnorr
Visual Performance Analysis of Memory Behavior in a Task-Based Runtime on Hybrid Platforms
In 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), Larnaca, Cyprus, pages 142-151, May 2019
IEEE
[WWW] [PDF] [doi:10.1109/CCGRID.2019.00025] - Christophe Alias, Samuel Thibault, and Laure Gonnord
A Compiler Algorithm to Guide Runtime Scheduling
Research Report RR-9315, INRIA Grenoble ; INRIA Bordeaux, December 2019
[WWW] [PDF] - Vinicius Garcia Pinto, Lucas Mello Schnorr, Luka Stanisic, Arnaud Legrand, Samuel Thibault, and Vincent Danjean
A Visual Performance Analysis Framework for Task-based Parallel Applications running on Hybrid Clusters
CCPE - Concurrency and Computation: Practice and Experience, 30, April 2018
[WWW] [PDF] [doi:10.1002/cpe.4472] - Vinicius Garcia Pinto, Lucas Mello Schnorr, Arnaud Legrand, Samuel Thibault, Luka Stanisic, and Vincent Danjean
Detecção de Anomalias de Desempenho em Aplicações de Alto Desempenho baseadas em Tarefas em Clusters Híbridos
In WPerformance - 17o Workshop em Desempenho de Sistemas Computacionais e de Comunicação, Natal, Brazil, July 2018
[WWW] [PDF] - Suraj Kumar
Scheduling of Dense Linear Algebra Kernels on Heterogeneous Resources
PhD thesis, Université de Bordeaux, April 2017
[WWW] [PDF] - O. Beaumont, L. Eyraud-Dubois, and S. Kumar
Approximation Proofs of a Fast and Efficient List Scheduling Algorithm for Task-Based Runtime Systems on Multicores and GPUs
In 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 768-777, May 2017
[WWW] [PDF] [doi:10.1109/IPDPS.2017.71] - Emmanuel Agullo, Olivier Beaumont, Lionel Eyraud-Dubois, and Suraj Kumar
Are Static Schedules so Bad? A Case Study on Cholesky Factorization
In Proceedings of the 30th IEEE International Parallel & Distributed Processing Symposium, IPDPS'16, Chicago, IL, USA, May 2016
IEEE
[WWW] [PDF] - Vinicius Garcia Pinto, Luka Stanisic, Arnaud Legrand, Lucas Mello Schnorr, Samuel Thibault, and Vincent Danjean
Analyzing Dynamic Task-Based Applications on Hybrid Platforms: An Agile Scripting Approach
In VPA - 3rd Workshop on Visual Performance Analysis, Salt Lake City, USA, November 2016
Note: Held in conjunction with SC16
[WWW] [PDF] [doi:10.1109/VPA.2016.008] - Johan Janzén, David Black-Schaffer, and Andra Hugo
Partitioning GPUs for Improved Scalability
In IEEE 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), October 2016
[WWW] [doi:10.1109/SBAC-PAD.2016.14] - Emmanuel Agullo, Olivier Beaumont, Lionel Eyraud-Dubois, Julien Herrmann, Suraj Kumar, Loris Marchal, and Samuel Thibault
Bridging the Gap between Performance and Bounds of Cholesky Factorization on Heterogeneous Platforms
In HCW'2015 - Heterogeneity in Computing Workshop of IPDPS, Hyderabad, India, May 2015
[WWW] [PDF] [doi:10.1109/IPDPSW.2015.35] - Marc Sergent and Simon Archipoff
Modulariser les ordonnanceurs de tâches : une approche structurelle
In Compas'2014, Neuchâtel, Switzerland, April 2014
[WWW] [PDF] - Cédric Augonnet, Jérôme Clet-Ortega, Samuel Thibault, and Raymond Namyst
Data-Aware Task Scheduling on Multi-Accelerator based Platforms
In The 16th International Conference on Parallel and Distributed Systems (ICPADS), Shanghai, China, December 2010
[WWW] [PDF] [doi:10.1109/ICPADS.2010.129]
On The C Extensions
On OpenMP Support on top of StarPU
- Emmanuel Agullo, Olivier Aumage, Bérenger Bramas, Olivier Coulaud, and Samuel Pitoiset
Bridging the gap between OpenMP and task-based runtime systems for the fast multipole method
IEEE Transactions on Parallel and Distributed Systems, April 2017
[WWW] [PDF] [doi:10.1109/TPDS.2017.2697857] - Emmanuel Agullo, Olivier Aumage, Bérenger Bramas, Olivier Coulaud, and Samuel Pitoiset
Bridging the gap between OpenMP 4.0 and native runtime systems for the fast multipole method
Research Report RR-8953, Inria, March 2016
[WWW] [PDF] - Philippe Virouleau, Pierrick Brunet, François Broquedis, Nathalie Furmento, Samuel Thibault, Olivier Aumage, and Thierry Gautier
Evaluation of OpenMP Dependent Tasks with the KASTORS Benchmark Suite
In IWOMP2014 - 10th International Workshop on OpenMP, Salvador, Brazil, pages 16-29, September 2014
Springer
[WWW] [PDF] [doi:10.1007/978-3-319-11454-5_2]
On MPI Support
- Emmanuel Agullo, Alfredo Buttari, Abdou Guermouche, Julien Herrmann, and Antoine Jego
Task-based parallel programming for scalable matrix product algorithms
ACM Transactions on Mathematical Software, 2023
[WWW] [PDF] [doi:10.1145/3583560] - Philippe Swartvagher
On the Interactions between HPC Task-based Runtime Systems and Communication Libraries
PhD thesis, Université de Bordeaux, November 2022
[WWW] [PDF] - Emmanuel Agullo, Mirco Altenbernd, Hartwig Anzt, Leonardo Bautista-Gomez, Tommaso Benacchio, Luca Bonaventura, Hans-Joachim Bungartz, Sanjay Chatterjee, Florina M Ciorba, Nathan Debardeleben, Daniel Drzisga, Sebastian Eibl, Christian Engelmann, Wilfried N Gansterer, Luc Giraud, Dominik Göddeke, Marco Heisig, Fabienne Jézéquel, Nils Kohl, Sherry Xiaoye, Romain Lion, Miriam Mehl, Paul Mycek, Michael Obersteiner, Enrique S Quintana-Ortí, Francesco Rizzi, Ulrich Rüde, Martin Schulz, Fred Fung, Robert Speck, Linda Stals, Keita Teranishi, Samuel Thibault, Dominik Thönnes, Andreas Wagner, and Barbara Wohlmuth
Resiliency in numerical algorithm design for extreme scale simulations
International Journal of High Performance Computing Applications, September 2021
[WWW] [PDF] - Alexandre Denis, Emmanuel Jeannot, Philippe Swartvagher, and Samuel Thibault
Using Dynamic Broadcasts to improve Task-Based Runtime Performances
In Euro-Par - 26th International European Conference on Parallel and Distributed Computing, Warsaw, Poland, August 2020
Rzadca and Malawski, Springer
[WWW] [PDF] [doi:10.1007/978-3-030-57675-2_28] - Romain Lion and Samuel Thibault
From tasks graphs to asynchronous distributed checkpointing with local restart
In 2020 IEEE/ACM 10th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS), Atlanta, USA, November 2020
[WWW] [PDF] [doi:10.1109/FTXS51974.2020.00009] - Romain Lion
Tolérance aux pannes dans l'exécution distribuée de graphes de tâches
In Conférence d'informatique en Parallélisme, Architecture et Système, Anglet, France, June 2019
[WWW] [PDF] - Emmanuel Agullo, Olivier Aumage, Mathieu Faverge, Nathalie Furmento, Florent Pruvost, Marc Sergent, and Samuel Thibault
Achieving High Performance on Supercomputers with a Sequential Task-based Programming Model
TPDS - IEEE Transactions on Parallel and Distributed Systems, December 2017
[WWW] [PDF] [doi:10.1109/TPDS.2017.2766064] - Marc Sergent
Scalability of a task-based runtime system for dense linear algebra applications
PhD thesis, Université de Bordeaux, December 2016
[WWW] [PDF] - Emmanuel Agullo, Olivier Aumage, Mathieu Faverge, Nathalie Furmento, Florent Pruvost, Marc Sergent, and Samuel Thibault
Harnessing clusters of hybrid nodes with a sequential task-based programming model
In 8th International Workshop on Parallel Matrix Algorithms and Applications, July 2014
[WWW] [PDF] - Cédric Augonnet, Olivier Aumage, Nathalie Furmento, Samuel Thibault, and Raymond Namyst
StarPU-MPI: Task Programming over Clusters of Machines Enhanced with Accelerators
Research Report RR-8538, INRIA, May 2014
[WWW] [PDF] - Cédric Augonnet, Olivier Aumage, Nathalie Furmento, Raymond Namyst, and Samuel Thibault
StarPU-MPI: Task Programming over Clusters of Machines Enhanced with Accelerators
In Siegfried Benkner, Jesper Larsson Träff, and Jack Dongarra, editors, EuroMPI 2012, volume 7490 of LNCS, September 2012
Springer
Note: Poster Session
[WWW] [PDF]
On Memory Control
- Arthur Chevalier
Critical resources management and scheduling under StarPU
Master's thesis, Université de Bordeaux, September 2017
[WWW] [PDF] - Marc Sergent, David Goudin, Samuel Thibault, and Olivier Aumage
Controlling the Memory Subscription of Distributed Applications with a Task-Based Runtime System
In HIPS - 21st International Workshop on High-Level Parallel Programming Models and Supportive Environments, Chicago, USA, May 2016
[WWW] [PDF] [doi:10.1109/IPDPSW.2016.105]
On Performance Model Tuning
- Emmanuel Agullo, Bérenger Bramas, Olivier Coulaud, Luka Stanisic, and Samuel Thibault
Modeling Irregular Kernels of Task-based codes: Illustration with the Fast Multipole Method
Research Report RR-9036, INRIA Bordeaux, February 2017
[WWW] [PDF] - Cédric Augonnet, Samuel Thibault, and Raymond Namyst
Automatic Calibration of Performance Models on Heterogeneous Multicore Architectures
In HPPC - Proceedings of the International Euro-Par Workshops, Highly Parallel Processing on a Chip, volume 6043 of LNCS, Delft, The Netherlands, pages 56-65, August 2009
Springer
[WWW] [PDF] [doi:10.1007/978-3-642-14122-5_9]
On The Simulation Support through SimGrid
- Idriss Daoudi, Philippe Virouleau, Thierry Gautier, Samuel Thibault, and Olivier Aumage
sOMP: Simulating OpenMP Task-Based Applications with NUMA Effects
In IWOMP 2020 - 16th International Workshop on OpenMP, volume 12295 of LNCS, Austin, USA, September 2020
Springer
[WWW] [PDF] [doi:10.1007/978-3-030-58144-2_13] - Samuel Thibault, Luka Stanisic, and Arnaud Legrand
Faithful Performance Prediction of a Dynamic Task-based Runtime System, an Opportunity for Task Graph Scheduling
In SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP 2020), Seattle, USA, February 2020
[WWW] [PDF] - Luka Stanisic, Samuel Thibault, Arnaud Legrand, Brice Videau, and Jean-François Méhaut
Faithful Performance Prediction of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures
CCPE - Concurrency and Computation: Practice and Experience, pp 16, May 2015
[WWW] [PDF] [doi:10.1002/cpe.3555] - Luka Stanisic, Emmanuel Agullo, Alfredo Buttari, Abdou Guermouche, Arnaud Legrand, Florent Lopez, and Brice Videau
Fast and Accurate Simulation of Multithreaded Sparse Linear Algebra Solvers
In The 21st IEEE International Conference on Parallel and Distributed Systems, Melbourne, Australia, December 2015
[WWW] [PDF] [doi:10.1109/ICPADS.2015.67] - Luka Stanisic, Samuel Thibault, Arnaud Legrand, Brice Videau, and Jean-François Méhaut
Modeling and Simulation of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures
In Euro-Par - 20th International Conference on Parallel Processing, Porto, Portugal, August 2014
Springer-Verlag
[WWW] [PDF] [doi:10.1007/978-3-319-09873-9_5]
On The Cell Support
- Cédric Augonnet, Samuel Thibault, Raymond Namyst, and Maik Nijhuis
Exploiting the Cell/BE architecture with the StarPU unified runtime system
In SAMOS Workshop - International Workshop on Systems, Architectures, Modeling, and Simulation, volume 5657 of LNCS, Samos, Greece, July 2009
[WWW] [PDF] [doi:10.1007/978-3-642-03138-0_36]
On Applications
- Emmanuel Agullo, Olivier Coulaud, Alexandre Denis, Mathieu Faverge, Alain A. Franc, Jean-Marc Frigerio, Nathalie Furmento, Samuel Thibault, Adrien Guilbaud, Emmanuel Jeannot, Romain Peressoni, and Florent Pruvost
Task-based randomized singular value decomposition and multidimensional scaling
Research Report RR-9482, Inria Bordeaux - Sud Ouest ; Inrae - BioGeCo, September 2022
[WWW] [PDF] - Lazaros Papadopoulos, Dimitrios Soudris, Christoph Kessler, August Ernstsson, Johan Ahlqvist, Nikos Vasilas, Athanasios I Papadopoulos, Panos Seferlis, Charles Prouveur, Matthieu Haefele, Samuel Thibault, Athanasios Salamanis, Theodoros Ioakimidis, and Dionysios Kehagias
EXA2PRO: A Framework for High Development Productivity on Heterogeneous Computing Systems
IEEE Transactions on Parallel and Distributed Systems, August 2021
[WWW] [PDF] [doi:10.1109/TPDS.2021.3104257] - Rafael Alvares da Silva Lopes, Samuel Thibault, and Alba Cristina Magalhães Alves de Melo
MASA-StarPU: Parallel Sequence Comparison with Multiple Scheduling Policies and Pruning
In SBAC-PAD 2020 - IEEE 32nd International Symposium on Computer Architecture and High Performance Computing, Porto, Portugal, September 2020
[WWW] [PDF] [doi:10.1109/SBAC-PAD49847.2020.00039] - Georgios Tzanos, Vineet Soni, Charles Prouveur, Matthieu Haefele, Stavroula Zouzoula, Lazaros Papadopoulos, Samuel Thibault, Nicolas Vandenbergen, Dirk Pleiter, and Dimitrios Soudris
Applying StarPU runtime system to scientific applications: Experiences and lessons learned
In Parallel Optimization using/for Multi and Many-core High Performance Computing (POMCO), Barcelona, Spain, December 2020
[WWW] [PDF] - A. AlOnazi, H. Ltaief, D. Keyes, I. Said, and Samuel Thibault
Asynchronous Task-Based Execution of the Reverse Time Migration for the Oil and Gas Industry
In 2019 IEEE International Conference on Cluster Computing (CLUSTER), Albuquerque, USA, pages 1-11, September 2019
IEEE
[WWW] [PDF] [doi:10.1109/CLUSTER.2019.8891054] - Mohamed Essadki, Jonathan Jung, Adam Larat, Milan Pelletier, and Vincent Perrier
A Task-Driven Implementation of a Simple Numerical Solver for Hyperbolic Conservation Laws
ESAIM: ProcS, 63:228-247, 2018
[WWW] [PDF] [doi:10.1051/proc/201863228] - Dimitrios Soudris, Lazaros Papadopoulos, Christoph W Kessler, Dionysios D Kehagias, Athanasios Papadopoulos, Panos Seferlis, Alexander Chatzigeorgiou, Apostolos Ampatzoglou, Samuel Thibault, Raymond Namyst, Dirk Pleiter, Georgi Gaydadjiev, Tobias Becker, and Matthieu Haefele
EXA2PRO programming environment
In SAMOS XVIII: Architectures, Modeling, and Simulation, Pythagorion, Greece, pages 202-209, July 2018
ACM
[WWW] [PDF] [doi:10.1145/3229631.3239369] - Jean Marie Couteyen Carpaye, Jean Roman, and Pierre Brenner
Design and Analysis of a Task-based Parallelization over a Runtime System of an Explicit Finite-Volume CFD Code with Adaptive Time Stepping
International Journal of Computational Science and Engineering, pp 1-22, 2017
[WWW] [PDF] [doi:10.1016/j.jocs.2017.03.008] - Olivier Aumage, Julien Bigot, Hélène Coullon, Christian Pérez, and Jérôme Richard
Combining Both a Component Model and a Task-Based Model for HPC Applications: A Feasibility Study on GYSELA
In 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), pages 635-644, 2017
[doi:10.1109/CCGRID.2017.88] - Emmanuel Agullo, Alfredo Buttari, Mikko Byckling, Abdou Guermouche, and Ian Masliah
Achieving high-performance with a sparse direct solver on Intel KNL
Research Report RR-9035, Inria Bordeaux Sud-Ouest ; CNRS-IRIT ; Intel corporation ; Université Bordeaux, February 2017
[WWW] [PDF] - Nolwenn Balin, Guillaume Sylvand, and Jérôme Robert
Fast methods applied to BEM solvers for acoustic propagation problems
In 22nd AIAA/CEAS Aeroacoustics Conference, pages 2712, 2016 - Emmanuel Agullo, Bérenger Bramas, Olivier Coulaud, Martin Khannouz, and Luka Stanisic
Task-based fast multipole method for clusters of multicore processors
Research Report RR-8970, Inria Bordeaux Sud-Ouest, October 2016
[WWW] [PDF] - E Agullo, L Giraud, A Guermouche, S Nakov, and Jean Roman
Task-based Conjugate Gradient: from multi-GPU towards heterogeneous architectures
Research Report RR-8912, Inria Bordeaux Sud-Ouest, May 2016
[WWW] [PDF] - Corentin Rossignon
A fine grain model programming for parallelization of sparse linear solver
PhD thesis, Université de Bordeaux, July 2015
[WWW] [PDF] - Víctor Martínez, David Michéa, Fabrice Dupros, Olivier Aumage, Samuel Thibault, Hideo Aochi, and Philippe Olivier Alexandre Navaux
Towards seismic wave modeling on heterogeneous many-core architectures using task-based runtime system
In SBAC-PAD - 27th International Symposium on Computer Architecture and High Performance Computing, Florianopolis, Brazil, October 2015
[WWW] [PDF] [doi:10.1109/SBAC-PAD.2015.33] - Emmanuel Agullo, Bérenger Bramas, Olivier Coulaud, Eric Darve, Matthias Messner, and Toru Takahashi
Task-Based FMM for Multicore Architectures
SIAM Journal on Scientific Computing, 36(1):66-93, 2014
[WWW] [PDF] [doi:10.1137/130915662] - Sylvain Henry, Alexandre Denis, Denis Barthou, Marie-Christine Counilh, and Raymond Namyst
Toward OpenCL Automatic Multi-Device Support
In Fernando Silva, Ines Dutra, and Vitor Santos Costa, editors, Euro-Par 2014, Porto, Portugal, August 2014
Springer
[WWW] [PDF] - Xavier Lacoste, Mathieu Faverge, Pierre Ramet, Samuel Thibault, and George Bosilca
Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes
In HCW'2014 - Heterogeneity in Computing Workshop of IPDPS, Phoenix, USA, May 2014
IEEE
Note: RR-8446
[WWW] [PDF] [doi:10.1109/IPDPSW.2014.9] - Emmanuel Agullo, Bérenger Bramas, Olivier Coulaud, Eric Darve, Matthias Messner, and Toru Takahashi
Task-based FMM for heterogeneous architectures
Research Report RR-8513, Inria Bordeaux - Sud-Ouest, April 2014
[WWW] [PDF] - Xavier Lacoste, Mathieu Faverge, Pierre Ramet, Samuel Thibault, and George Bosilca
Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes
Research Report RR-8446, INRIA, January 2014
[WWW] [PDF] - Emmanuel Agullo, Olivier Aumage, Mathieu Faverge, Nathalie Furmento, Florent Pruvost, Marc Sergent, and Samuel Thibault
Overview of Distributed Linear Algebra on Hybrid Nodes over the StarPU Runtime
SIAM Conference on Parallel Processing for Scientific Computing, February 2014
[WWW] [PDF] - Cyril Bordage
Ordonnancement dynamique, adapté aux architectures hétérogènes, de la méthode multipôle pour les équations de Maxwell, en électromagnétisme
PhD thesis, Université de Bordeaux, December 2013
[WWW] [PDF] - Sylvain Henry
Modèles de programmation et supports exécutifs pour architectures hétérogènes
PhD thesis, Université de Bordeaux, November 2013
[WWW] [PDF] - Sylvain Henry
ViperVM: a Runtime System for Parallel Functional High-Performance Computing on Heterogeneous Architectures
In 2nd Workshop on Functional High-Performance Computing (FHPC'13), Boston, USA, September 2013
[WWW] [PDF] - Tetsuya Odajima, Taisuke Boku, Mitsuhisa Sato, Toshihiro Hanawa, Yuetsu Kodama, Raymond Namyst, Samuel Thibault, and Olivier Aumage
Adaptive Task Size Control on High Level Programming for GPU/CPU Work Sharing
In ICA3PP-2013 - The 13th International Conference on Algorithms and Architectures for Parallel Processing, Vietri sul Mare, Italy, December 2013
[WWW] [PDF] [doi:10.1007/978-3-319-03889-6_7] - Satoshi Ohshima, Satoshi Katagiri, Kengo Nakajima, Samuel Thibault, and Raymond Namyst
Implementation of FEM Application on GPU with StarPU
In SIAM CSE13 - SIAM Conference on Computational Science and Engineering 2013, Boston, USA, February 2013
SIAM
[WWW] - Corentin Rossignon
Optimisation du produit matrice-vecteur creux sur architecture GPU pour un simulateur de reservoir
In 21èmes Rencontres Francophones du Parallélisme (RenPar'21), Grenoble, France, January 2013
[WWW] [PDF] - Corentin Rossignon, Pascal Hénon, Olivier Aumage, and Samuel Thibault
A NUMA-aware fine grain parallelization framework for multi-core architecture
In PDSEC - 14th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing - 2013, Boston, USA, May 2013
[WWW] [PDF] - Sylvain Henry, Alexandre Denis, and Denis Barthou
Programmation unifiée multi-accélérateur OpenCL
Technique et Science Informatiques, (8-9-10):1233-1249, 2012
[WWW] [PDF] - Sidi Ahmed Mahmoudi, Pierre Manneback, Cédric Augonnet, and Samuel Thibault
Traitements d'Images sur Architectures Parallèles et Hétérogènes
Technique et Science Informatiques, 31(8-10):1183-1203, 2012
[WWW] [PDF] [doi:10.3166/tsi.31.1183-1203] - Siegfried Benkner, Enes Bajrovic, Erich Marth, Martin Sandrieser, Raymond Namyst, and Samuel Thibault
High-Level Support for Pipeline Parallelism on Many-Core Architectures
In Euro-Par - 18th International Conference on Parallel Processing, Rhodes Island, Greece, August 2012
[WWW] [PDF] [doi:10.1007/978-3-642-32820-6_61] - Christoph Kessler, Usman Dastgeer, Samuel Thibault, Raymond Namyst, Andrew Richards, Uwe Dolinsky, Siegfried Benkner, Jesper Larsson Träff, and Sabri Pllana
Programmability and Performance Portability Aspects of Heterogeneous Multi-/Manycore Systems
In DATE - Design, Automation and Test in Europe, Dresden, Germany, March 2012
ISBN: 978-3-9810801-8-6
[WWW] [PDF] [doi:10.1109/DATE.2012.6176582] - Siegfried Benkner, Sabri Pllana, Jesper Larsson Träff, Philippas Tsigas, Uwe Dolinsky, Cédric Augonnet, Beverly Bachmayer, Christoph Kessler, David Moloney, and Vitaly Osipov
PEPPHER: Efficient and Productive Usage of Hybrid Computing Systems
IEEE Micro, 31(5):28-41, September 2011
ISSN: 0272-1732
[WWW] [PDF] [doi:10.1109/MM.2011.67] - Emmanuel Agullo, Cédric Augonnet, Jack Dongarra, Mathieu Faverge, Julien Langou, Hatem Ltaief, and Stanimire Tomov
LU factorization for accelerator-based systems
In 9th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA 11), Sharm El-Sheikh, Egypt, June 2011
[WWW] [PDF] - Emmanuel Agullo, Cédric Augonnet, Jack Dongarra, Mathieu Faverge, Hatem Ltaief, Samuel Thibault, and Stanimire Tomov
QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators
In 25th IEEE International Parallel & Distributed Processing Symposium (IEEE IPDPS 2011), Anchorage, Alaska, USA, May 2011
[WWW] [PDF] [doi:10.1109/IPDPS.2011.90] - Usman Dastgeer, Christoph Kessler, and Samuel Thibault
Flexible runtime support for efficient skeleton programming on hybrid systems
In ParCo - Proceedings of the International Conference on Parallel Computing, volume 22 of Advances of Parallel Computing, Gent, Belgium, pages 159-166, August 2011
[WWW] [PDF] [doi:10.3233/978-1-61499-041-3-159] - Sylvain Henry
Programmation multi-accélérateurs unifiée en OpenCL
In 20èmes Rencontres Francophones du Parallélisme (RenPar'20), Saint-Malo, France, May 2011
[WWW] [PDF] - Sidi Ahmed Mahmoudi, Pierre Manneback, Cédric Augonnet, and Samuel Thibault
Détection optimale des coins et contours dans des bases d'images volumineuses sur architectures multicoeurs hétérogènes
In RenPar'20 - 20èmes Rencontres Francophones du Parallélisme, Saint-Malo, France, May 2011
[WWW] [PDF] - Emmanuel Agullo, Cédric Augonnet, Jack Dongarra, Hatem Ltaief, Raymond Namyst, Samuel Thibault, and Stanimire Tomov
A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs
In Wen-mei W. Hwu, editor, GPU Computing Gems, volume 2
Morgan Kaufmann, September 2010
[WWW] [PDF] [doi:10.1016/B978-0-12-385963-1.00034-4] - Emmanuel Agullo, Cédric Augonnet, Jack Dongarra, Hatem Ltaief, Raymond Namyst, Jean Roman, Samuel Thibault, and Stanimire Tomov
Dynamically scheduled Cholesky factorization on multicore architectures with GPU accelerators
In SAAHPC - Symposium on Application Accelerators in High Performance Computing, Knoxville, USA, July 2010
[WWW] [PDF]