StarPU Publications

All StarPU-related publications are also listed here with the corresponding BibTeX entries.

A good overview is available in the Research Report 'StarPU: a Runtime System for Scheduling Tasks over Accelerator-Based Multicore Machines'.

If you need to cite StarPU, please reference 'StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures' for a general presentation. The sub-sections below give references for more specific aspects of StarPU.

General Presentations

  • Samuel Thibault
    On Runtime Systems for Task-based Programming on Heterogeneous Platforms
    Habilitation à diriger des recherches, Université de Bordeaux, December 2018
    [WWW] [PDF]
  • Cédric Augonnet
    Scheduling Tasks over Multicore machines enhanced with Accelerators: a Runtime System's Perspective
    PhD thesis, Université de Bordeaux, December 2011
    [WWW] [PDF]
  • Cédric Augonnet, Samuel Thibault, Raymond Namyst, and Pierre-André Wacrenier
    StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures
    CCPE - Concurrency and Computation: Practice and Experience, Special Issue: Euro-Par 2009, 23:187-198, February 2011
    [WWW] [PDF] [doi:10.1002/cpe.1631]
  • Cédric Augonnet, Samuel Thibault, and Raymond Namyst
    StarPU: a Runtime System for Scheduling Tasks over Accelerator-Based Multicore Machines
    Research Report RR-7240, INRIA, March 2010
    [WWW] [PDF]
  • Cédric Augonnet
    StarPU: un support exécutif unifié pour les architectures multicoeurs hétérogènes
    In 19èmes Rencontres Francophones du Parallélisme, Toulouse, France, September 2009
    Note: Best Paper Award
    [WWW] [PDF]
  • Cédric Augonnet, Samuel Thibault, Raymond Namyst, and Pierre-André Wacrenier
    StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures
    In Euro-Par - 15th International Conference on Parallel Processing, volume 5704 of LNCS, Delft, The Netherlands, pages 863-874, August 2009
    Springer
    [WWW] [PDF] [doi:10.1007/978-3-642-03869-3_80]
  • Cédric Augonnet
    Vers des supports d'exécution capables d'exploiter les machines multicoeurs hétérogènes
    Master Thesis, Université de Bordeaux, June 2008
    [WWW] [PDF]
  • Cédric Augonnet and Raymond Namyst
    A unified runtime system for heterogeneous multicore architectures
    In Proceedings of the International Euro-Par Workshops 2008, HPPC'08, volume 5415 of LNCS, Las Palmas de Gran Canaria, Spain, pages 174-183, August 2008
    Springer
    ISBN: 978-3-642-00954-9
    [WWW] [PDF] [doi:10.1007/978-3-642-00955-6_22]

On Composability

  • Andra-Ecaterina Hugo
    Composability of parallel codes on heterogeneous architectures
    PhD thesis, Université de Bordeaux, December 2014
    [WWW] [PDF]
  • Andra Hugo
    Le problème de la composition parallèle : une approche supervisée
    In 21èmes Rencontres Francophones du Parallélisme (RenPar'21), Grenoble, France, January 2013
    [WWW] [PDF]
  • Andra Hugo, Abdou Guermouche, Raymond Namyst, and Pierre-André Wacrenier
    Composing multiple StarPU applications over heterogeneous machines: a supervised approach
    In Third International Workshop on Accelerators and Hybrid Exascale Systems, Boston, USA, May 2013
    [WWW] [PDF]
  • Andra Hugo
    Composabilité de codes parallèles sur architectures hétérogènes
    Master Thesis, Université de Bordeaux, June 2011
    [WWW] [PDF]

On Parallel Tasks

  • Terry Cojean
    Programmation of heterogeneous architectures using moldable tasks
    PhD thesis, Université de Bordeaux, March 2018
    [WWW] [PDF]
  • Olivier Beaumont, Terry Cojean, Lionel Eyraud-Dubois, Abdou Guermouche, and Suraj Kumar
    Scheduling of Linear Algebra Kernels on Multiple Heterogeneous Resources
    In International Conference on High Performance Computing, Data, and Analytics (HiPC), Hyderabad, India, December 2016
    [WWW] [PDF]
  • Terry Cojean, Abdou Guermouche, Andra Hugo, Raymond Namyst, and Pierre-André Wacrenier
    Resource aggregation for task-based Cholesky Factorization on top of heterogeneous machines
    In HeteroPar'2016 workshop of Euro-Par, Grenoble, France, August 2016
    [WWW] [PDF]
  • Terry Cojean, Abdou Guermouche, Andra Hugo, Raymond Namyst, and Pierre-André Wacrenier
    Resource aggregation for task-based Cholesky Factorization on top of modern architectures
    Note: This paper is submitted for review to the Parallel Computing special issue for HCW and HeteroPar 16 workshops, November 2016
    [WWW] [PDF]

On Hierarchical Tasks

  • Mathieu Faverge, Nathalie Furmento, Abdou Guermouche, Gwenolé Lucas, Raymond Namyst, Samuel Thibault, and Pierre-André Wacrenier
    Programming Heterogeneous Architectures Using Hierarchical Tasks
    Concurrency and Computation: Practice and Experience, 2023
    [WWW] [PDF] [doi:10.1002/cpe.7811]
  • Mathieu Faverge, Nathalie Furmento, Abdou Guermouche, Gwenolé Lucas, Raymond Namyst, Samuel Thibault, and Pierre-André Wacrenier
    Programming Heterogeneous Architectures Using Hierarchical Tasks
    In HeteroPar 2022, Glasgow, United Kingdom, pages 12, August 2022
    [WWW] [PDF]
  • Mathieu Faverge, Nathalie Furmento, Abdou Guermouche, Gwenolé Lucas, Samuel Thibault, and Pierre-André Wacrenier
    Programmation des architectures hétérogènes à l'aide de tâches hiérarchiques
    In COMPAS 2022 - Conférence francophone d'informatique en Parallélisme, Architecture et Système, Amiens, France, July 2022
    [WWW] [PDF]
  • Mathieu Faverge, Nathalie Furmento, Gwenolé Lucas, Abdou Guermouche, Raymond Namyst, Samuel Thibault, and Pierre-André Wacrenier
    Programming Heterogeneous Architectures Using Hierarchical Tasks
    Research Report RR-9466, Inria Bordeaux Sud-Ouest, March 2022
    [WWW] [PDF]
  • Arthur Chevalier
    Critical resources management and scheduling under StarPU
    Master Thesis, Université de Bordeaux, September 2017
    [WWW] [PDF]

On Scheduling

  • Maxime Gonthier, Loris Marchal, and Samuel Thibault
    Taming data locality for task scheduling under memory constraint in runtime systems
    Future Generation Computer Systems, 2023
    [WWW] [PDF] [doi:10.1016/j.future.2023.01.024]
  • Maxime Gonthier, Samuel Thibault, and Loris Marchal
    Memory-Aware Scheduling of Tasks Sharing Data on Multiple GPUs with Dynamic Runtime Systems
    In IPDPS 2022 - 36th IEEE International Parallel & Distributed Processing Symposium, Lyon, France, May 2022
    IEEE
    [WWW] [PDF] [doi:10.1109/IPDPS53621.2022.00073]
  • Maxime Gonthier, Loris Marchal, and Samuel Thibault
    Locality-Aware Scheduling of Independent Tasks for Runtime Systems
    In COLOC: 5th workshop on data locality - 7th International European Conference on Parallel and Distributed Computing Workshops, Lisbon, Portugal, August 2021
    [WWW] [PDF] [doi:10.1007/978-3-031-06156-1_1]
  • Vinicius Garcia Pinto, Lucas Leandro Nesi, Marcelo Cogo Miletto, and Lucas Mello Schnorr
    Providing In-depth Performance Analysis for Heterogeneous Task-based Applications with StarVZ
    In 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2021
    [WWW]
  • Maxime Gonthier, Loris Marchal, and Samuel Thibault
    Locality-Aware Scheduling of Independant Tasks for Runtime Systems
    Research Report RR-9394, Inria, 2021
    [WWW] [PDF]
  • Bérenger Bramas
    Impact study of data locality on task-based applications through the Heteroprio scheduler
    PeerJ Computer Science, May 2019
    [WWW] [PDF] [doi:10.7717/peerj-cs.190]
  • Lucas Leandro Nesi, Samuel Thibault, Luka Stanisic, and Lucas Mello Schnorr
    Visual Performance Analysis of Memory Behavior in a Task-Based Runtime on Hybrid Platforms
    In 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), Larnaca, Cyprus, pages 142-151, May 2019
    IEEE
    [WWW] [PDF] [doi:10.1109/CCGRID.2019.00025]
  • Christophe Alias, Samuel Thibault, and Laure Gonnord
    A Compiler Algorithm to Guide Runtime Scheduling
    Research Report RR-9315, INRIA Grenoble ; INRIA Bordeaux, December 2019
    [WWW] [PDF]
  • Vinicius Garcia Pinto, Lucas Mello Schnorr, Luka Stanisic, Arnaud Legrand, Samuel Thibault, and Vincent Danjean
    A Visual Performance Analysis Framework for Task-based Parallel Applications running on Hybrid Clusters
    CCPE - Concurrency and Computation: Practice and Experience, 30, April 2018
    [WWW] [PDF] [doi:10.1002/cpe.4472]
  • Vinicius Garcia Pinto, Lucas Mello Schnorr, Arnaud Legrand, Samuel Thibault, Luka Stanisic, and Vincent Danjean
    Detecção de Anomalias de Desempenho em Aplicações de Alto Desempenho baseadas em Tarefas em Clusters Híbridos
    In WPerformance - 17o Workshop em Desempenho de Sistemas Computacionais e de Comunicação, Natal, Brazil, July 2018
    [WWW] [PDF]
  • Suraj Kumar
    Scheduling of Dense Linear Algebra Kernels on Heterogeneous Resources
    PhD thesis, Université de Bordeaux, April 2017
    [WWW] [PDF]
  • O. Beaumont, L. Eyraud-Dubois, and S. Kumar
    Approximation Proofs of a Fast and Efficient List Scheduling Algorithm for Task-Based Runtime Systems on Multicores and GPUs
    In 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 768-777, May 2017
    [WWW] [PDF] [doi:10.1109/IPDPS.2017.71]
  • Emmanuel Agullo, Olivier Beaumont, Lionel Eyraud-Dubois, and Suraj Kumar
    Are Static Schedules so Bad? A Case Study on Cholesky Factorization
    In Proceedings of the 30th IEEE International Parallel & Distributed Processing Symposium, IPDPS'16, Chicago, IL, USA, May 2016
    IEEE
    [WWW] [PDF]
  • Vinicius Garcia Pinto, Luka Stanisic, Arnaud Legrand, Lucas Mello Schnorr, Samuel Thibault, and Vincent Danjean
    Analyzing Dynamic Task-Based Applications on Hybrid Platforms: An Agile Scripting Approach
    In VPA - 3rd Workshop on Visual Performance Analysis, Salt Lake City, USA, November 2016
    Note: Held in conjunction with SC16
    [WWW] [PDF] [doi:10.1109/VPA.2016.008]
  • Johan Janzén, David Black-Schaffer, and Andra Hugo
    Partitioning GPUs for Improved Scalability
    In IEEE 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), October 2016
    [WWW] [doi:10.1109/SBAC-PAD.2016.14]
  • Emmanuel Agullo, Olivier Beaumont, Lionel Eyraud-Dubois, Julien Herrmann, Suraj Kumar, Loris Marchal, and Samuel Thibault
    Bridging the Gap between Performance and Bounds of Cholesky Factorization on Heterogeneous Platforms
    In HCW'2015 - Heterogeneity in Computing Workshop of IPDPS, Hyderabad, India, May 2015
    [WWW] [PDF] [doi:10.1109/IPDPSW.2015.35]
  • Marc Sergent and Simon Archipoff
    Modulariser les ordonnanceurs de tâches : une approche structurelle
    In Compas'2014, Neuchâtel, Switzerland, April 2014
    [WWW] [PDF]
  • Cédric Augonnet, Jérôme Clet-Ortega, Samuel Thibault, and Raymond Namyst
    Data-Aware Task Scheduling on Multi-Accelerator based Platforms
    In The 16th International Conference on Parallel and Distributed Systems (ICPADS), Shanghai, China, December 2010
    [WWW] [PDF] [doi:10.1109/ICPADS.2010.129]

On The C Extensions

  • Ludovic Courtès
    C Language Extensions for Hybrid CPU/GPU Programming with StarPU
    Research Report RR-8278, INRIA, April 2013
    [WWW] [PDF]

On OpenMP Support on top of StarPU

  • Emmanuel Agullo, Olivier Aumage, Berenger Bramas, Olivier Coulaud, and Samuel Pitoiset
    Bridging the gap between OpenMP and task-based runtime systems for the fast multipole method
    IEEE Transactions on Parallel and Distributed Systems, April 2017
    [WWW] [PDF] [doi:10.1109/TPDS.2017.2697857]
  • Emmanuel Agullo, Olivier Aumage, Berenger Bramas, Olivier Coulaud, and Samuel Pitoiset
    Bridging the gap between OpenMP 4.0 and native runtime systems for the fast multipole method
    Research Report RR-8953, Inria, March 2016
    [WWW] [PDF]
  • Philippe Virouleau, Pierrick Brunet, François Broquedis, Nathalie Furmento, Samuel Thibault, Olivier Aumage, and Thierry Gautier
    Evaluation of OpenMP Dependent Tasks with the KASTORS Benchmark Suite
    In IWOMP2014 - 10th International Workshop on OpenMP, Salvador, Brazil, pages 16-29, September 2014
    Springer
    [WWW] [PDF] [doi:10.1007/978-3-319-11454-5_2]

On MPI Support

  • Emmanuel Agullo, Alfredo Buttari, Abdou Guermouche, Julien Herrmann, and Antoine Jego
    Task-based parallel programming for scalable matrix product algorithms
    ACM Transactions on Mathematical Software, 2023
    [WWW] [PDF] [doi:10.1145/3583560]
  • Philippe Swartvagher
    On the Interactions between HPC Task-based Runtime Systems and Communication Libraries
    PhD thesis, Université de Bordeaux, November 2022
    [WWW] [PDF]
  • Emmanuel Agullo, Mirco Altenbernd, Hartwig Anzt, Leonardo Bautista-Gomez, Tommaso Benacchio, Luca Bonaventura, Hans-Joachim Bungartz, Sanjay Chatterjee, Florina M Ciorba, Nathan Debardeleben, Daniel Drzisga, Sebastian Eibl, Christian Engelmann, Wilfried N Gansterer, Luc Giraud, Dominik Göddeke, Marco Heisig, Fabienne Jézéquel, Nils Kohl, Sherry Xiaoye, Romain Lion, Miriam Mehl, Paul Mycek, Michael Obersteiner, Enrique S Quintana-Ortí, Francesco Rizzi, Ulrich Rüde, Martin Schulz, Fred Fung, Robert Speck, Linda Stals, Keita Teranishi, Samuel Thibault, Dominik Thönnes, Andreas Wagner, and Barbara Wohlmuth
    Resiliency in numerical algorithm design for extreme scale simulations
    International Journal of High Performance Computing Applications, September 2021
    [WWW] [PDF]
  • Alexandre Denis, Emmanuel Jeannot, Philippe Swartvagher, and Samuel Thibault
    Using Dynamic Broadcasts to improve Task-Based Runtime Performances
    In Euro-Par - 26th International European Conference on Parallel and Distributed Computing, Warsaw, Poland, August 2020
    Rzadca and Malawski, Springer
    [WWW] [PDF] [doi:10.1007/978-3-030-57675-2_28]
  • Romain Lion and Samuel Thibault
    From tasks graphs to asynchronous distributed checkpointing with local restart
    In 2020 IEEE/ACM 10th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS), Atlanta, USA, November 2020
    [WWW] [PDF] [doi:10.1109/FTXS51974.2020.00009]
  • Romain Lion
    Tolérance aux pannes dans l'exécution distribuée de graphes de tâches
    In Conférence d'informatique en Parallélisme, Architecture et Système, Anglet, France, June 2019
    [WWW] [PDF]
  • Emmanuel Agullo, Olivier Aumage, Mathieu Faverge, Nathalie Furmento, Florent Pruvost, Marc Sergent, and Samuel Thibault
    Achieving High Performance on Supercomputers with a Sequential Task-based Programming Model
    TPDS - IEEE Transactions on Parallel and Distributed Systems, December 2017
    [WWW] [PDF] [doi:10.1109/TPDS.2017.2766064]
  • Marc Sergent
    Scalability of a task-based runtime system for dense linear algebra applications
    PhD thesis, Université de Bordeaux, December 2016
    [WWW] [PDF]
  • Emmanuel Agullo, Olivier Aumage, Mathieu Faverge, Nathalie Furmento, Florent Pruvost, Marc Sergent, and Samuel Thibault
    Harnessing clusters of hybrid nodes with a sequential task-based programming model
    In 8th International Workshop on Parallel Matrix Algorithms and Applications, July 2014
    [WWW] [PDF]
  • Cédric Augonnet, Olivier Aumage, Nathalie Furmento, Samuel Thibault, and Raymond Namyst
    StarPU-MPI: Task Programming over Clusters of Machines Enhanced with Accelerators
    Research Report RR-8538, INRIA, May 2014
    [WWW] [PDF]
  • Cédric Augonnet, Olivier Aumage, Nathalie Furmento, Raymond Namyst, and Samuel Thibault
    StarPU-MPI: Task Programming over Clusters of Machines Enhanced with Accelerators
    In Siegfried Benkner, Jesper Larsson Träff, and Jack Dongarra, editors, EuroMPI 2012, volume 7490 of LNCS, September 2012
    Springer
    Note: Poster Session
    [WWW] [PDF]

On Memory Control

  • Arthur Chevalier
    Critical resources management and scheduling under StarPU
    Master Thesis, Université de Bordeaux, September 2017
    [WWW] [PDF]
  • Marc Sergent, David Goudin, Samuel Thibault, and Olivier Aumage
    Controlling the Memory Subscription of Distributed Applications with a Task-Based Runtime System
    In HIPS - 21st International Workshop on High-Level Parallel Programming Models and Supportive Environments, Chicago, USA, May 2016
    [WWW] [PDF] [doi:10.1109/IPDPSW.2016.105]

On Performance Model Tuning

  • Emmanuel Agullo, Bérenger Bramas, Olivier Coulaud, Luka Stanisic, and Samuel Thibault
    Modeling Irregular Kernels of Task-based codes: Illustration with the Fast Multipole Method
    Research Report RR-9036, INRIA Bordeaux, February 2017
    [WWW] [PDF]
  • Cédric Augonnet, Samuel Thibault, and Raymond Namyst
    Automatic Calibration of Performance Models on Heterogeneous Multicore Architectures
    In HPPC - Proceedings of the International Euro-Par Workshops, Highly Parallel Processing on a Chip, volume 6043 of LNCS, Delft, The Netherlands, pages 56-65, August 2009
    Springer
    [WWW] [PDF] [doi:10.1007/978-3-642-14122-5_9]

On The Simulation Support through SimGrid

  • Idriss Daoudi, Philippe Virouleau, Thierry Gautier, Samuel Thibault, and Olivier Aumage
    sOMP: Simulating OpenMP Task-Based Applications with NUMA Effects
    In IWOMP 2020 - 16th International Workshop on OpenMP, volume 12295 of LNCS, Austin, USA, September 2020
    Springer
    [WWW] [PDF] [doi:10.1007/978-3-030-58144-2_13]
  • Samuel Thibault, Luka Stanisic, and Arnaud Legrand
    Faithful Performance Prediction of a Dynamic Task-based Runtime System, an Opportunity for Task Graph Scheduling
    In SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP 2020), Seattle, USA, February 2020
    [WWW] [PDF]
  • Luka Stanisic, Samuel Thibault, Arnaud Legrand, Brice Videau, and Jean-François Méhaut
    Faithful Performance Prediction of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures
    CCPE - Concurrency and Computation: Practice and Experience, pp 16, May 2015
    [WWW] [PDF] [doi:10.1002/cpe.3555]
  • Luka Stanisic, Emmanuel Agullo, Alfredo Buttari, Abdou Guermouche, Arnaud Legrand, Florent Lopez, and Brice Videau
    Fast and Accurate Simulation of Multithreaded Sparse Linear Algebra Solvers
    In The 21st IEEE International Conference on Parallel and Distributed Systems, Melbourne, Australia, December 2015
    [WWW] [PDF] [doi:10.1109/ICPADS.2015.67]
  • Luka Stanisic, Samuel Thibault, Arnaud Legrand, Brice Videau, and Jean-François Méhaut
    Modeling and Simulation of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures
    In Euro-Par - 20th International Conference on Parallel Processing, Porto, Portugal, August 2014
    Springer-Verlag
    [WWW] [PDF] [doi:10.1007/978-3-319-09873-9_5]

On The Cell Support

  • Cédric Augonnet, Samuel Thibault, Raymond Namyst, and Maik Nijhuis
    Exploiting the Cell/BE architecture with the StarPU unified runtime system
    In SAMOS Workshop - International Workshop on Systems, Architectures, Modeling, and Simulation, volume 5657 of LNCS, Samos, Greece, July 2009
    [WWW] [PDF] [doi:10.1007/978-3-642-03138-0_36]

Papers related to StarPU

  • Lucas Leandro Nesi, Vinicius Garcia Pinto, Lucas Mello Schnorr, and Arnaud Legrand
    Summarizing task-based applications behavior over many nodes through progression clustering
    In PDP 2023 - 31st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, Naples, Italy, pages 1-8, March 2023
    [WWW] [PDF]
  • Marcelo Cogo Miletto, Lucas Leandro Nesi, Lucas Mello Schnorr, and Arnaud Legrand
    Performance Analysis of Irregular Task-Based Applications on Hybrid Platforms: Structure Matters
    Future Generation Computer Systems, 135, October 2022
    [WWW] [PDF]
  • Emmanuel Agullo, Alfredo Buttari, Abdou Guermouche, Julien Herrmann, and Antoine Jego
    Task-Based Parallel Programming for Scalable Algorithms: application to Matrix Multiplication
    Research Report 9461, Inria Bordeaux - Sud-Ouest, February 2022
    [WWW] [PDF]
  • Alexandre Denis, Emmanuel Jeannot, and Philippe Swartvagher
    Interferences between Communications and Computations in Distributed HPC Systems
    In ICPP 2021 - 50th International Conference on Parallel Processing, Chicago / Virtual, United States, pages 11, August 2021
    [WWW] [PDF] [doi:10.1145/3472456.3473516]
  • Elliott Slaughter, Wei Wu, Yuankun Fu, Legend Brandenburg, Nicolai Garcia, Wilhem Kautz, Emily Marx, Kaleb S. Morris, Qinglei Cao, George Bosilca, Seema Mirchandaney, Wonchan Lee, Sean Treichler, Patrick McCormick, and Alex Aiken
    Task Bench: A Parameterized Benchmark for Evaluating Parallel Runtime Performance
    In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '20, 2020
    IEEE Press
    ISBN: 9781728199986
    [WWW] [PDF] [doi:10.5555/3433701.3433783]
  • Peter Thoman, Kiril Dichev, Thomas Heller, Roman Iakymchuk, Xavier Aguilar, Khalid Hasanov, Philipp Gschwandtner, Pierre Lemarinier, Stefano Markidis, Herbert Jordan, and others
    A taxonomy of task-based parallel programming technologies for high-performance computing
    The Journal of Supercomputing, 74(4):1422-1434, 2018
    [WWW] [PDF] [doi:10.1007/s11227-018-2238-4]
  • I. D. Mironescu and L. Vintan
    Coloured Petri Net modelling of task scheduling on a heterogeneous computational node
    In 2014 IEEE 10th International Conference on Intelligent Computer Communication and Processing (ICCP), pages 323-330, 2014
    [WWW] [PDF] [doi:10.1109/ICCP.2014.6937016]

On Applications

  • Emmanuel Agullo, Olivier Coulaud, Alexandre Denis, Mathieu Faverge, Alain A. Franc, Jean-Marc Frigerio, Nathalie Furmento, Samuel Thibault, Adrien Guilbaud, Emmanuel Jeannot, Romain Peressoni, and Florent Pruvost
    Task-based randomized singular value decomposition and multidimensional scaling
    Research Report 9482, Inria Bordeaux - Sud Ouest ; Inrae - BioGeCo, September 2022
    [WWW] [PDF]
  • Lazaros Papadopoulos, Dimitrios Soudris, Christoph Kessler, August Ernstsson, Johan Ahlqvist, Nikos Vasilas, Athanasios I Papadopoulos, Panos Seferlis, Charles Prouveur, Matthieu Haefele, Samuel Thibault, Athanasios Salamanis, Theodoros Ioakimidis, and Dionysios Kehagias
    EXA2PRO: A Framework for High Development Productivity on Heterogeneous Computing Systems
    IEEE Transactions on Parallel and Distributed Systems, August 2021
    [WWW] [PDF] [doi:10.1109/TPDS.2021.3104257]
  • Rafael Alvares da Silva Lopes, Samuel Thibault, and Alba Cristina Magalhães Alves de Melo
    MASA-StarPU: Parallel Sequence Comparison with Multiple Scheduling Policies and Pruning
    In SBAC-PAD 2020 - IEEE 32nd International Symposium on Computer Architecture and High Performance Computing, Porto, Portugal, September 2020
    [WWW] [PDF] [doi:10.1109/SBAC-PAD49847.2020.00039]
  • Georgios Tzanos, Vineet Soni, Charles Prouveur, Matthieu Haefele, Stavroula Zouzoula, Lazaros Papadopoulos, Samuel Thibault, Nicolas Vandenbergen, Dirk Pleiter, and Dimitrios Soudris
    Applying StarPU runtime system to scientific applications: Experiences and lessons learned
    In Parallel Optimization using/for Multi and Many-core High Performance Computing (POMCO), Barcelona, Spain, December 2020
    [WWW] [PDF]
  • A. AlOnazi, H. Ltaief, D. Keyes, I. Said, and Samuel Thibault
    Asynchronous Task-Based Execution of the Reverse Time Migration for the Oil and Gas Industry
    In 2019 IEEE International Conference on Cluster Computing (CLUSTER), Albuquerque, USA, pages 1-11, September 2019
    IEEE
    [WWW] [PDF] [doi:10.1109/CLUSTER.2019.8891054]
  • Mohamed Essadki, Jonathan Jung, Adam Larat, Milan Pelletier, and Vincent Perrier
    A Task-Driven Implementation of a Simple Numerical Solver for Hyperbolic Conservation Laws
    ESAIM: ProcS, 63:228-247, 2018
    [WWW] [PDF] [doi:10.1051/proc/201863228]
  • Dimitrios Soudris, Lazaros Papadopoulos, Christoph W Kessler, Dionysios D Kehagias, Athanasios Papadopoulos, Panos Seferlis, Alexander Chatzigeorgiou, Apostolos Ampatzoglou, Samuel Thibault, Raymond Namyst, Dirk Pleiter, Georgi Gaydadjiev, Tobias Becker, and Matthieu Haefele
    EXA2PRO programming environment
    In SAMOS XVIII: Architectures, Modeling, and Simulation, Pythagorion, Greece, pages 202-209, July 2018
    ACM
    [WWW] [PDF] [doi:10.1145/3229631.3239369]
  • Jean Marie Couteyen Carpaye, Jean Roman, and Pierre Brenner
    Design and Analysis of a Task-based Parallelization over a Runtime System of an Explicit Finite-Volume CFD Code with Adaptive Time Stepping
    International Journal of Computational Science and Engineering, pp 1-22, 2017
    [WWW] [PDF] [doi:10.1016/j.jocs.2017.03.008]
  • Olivier Aumage, Julien Bigot, Hélène Coullon, Christian Pérez, and Jérôme Richard
    Combining Both a Component Model and a Task-Based Model for HPC Applications: A Feasibility Study on GYSELA
    In 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), pages 635-644, 2017
    [doi:10.1109/CCGRID.2017.88]
  • Emmanuel Agullo, Alfredo Buttari, Mikko Byckling, Abdou Guermouche, and Ian Masliah
    Achieving high-performance with a sparse direct solver on Intel KNL
    Research Report RR-9035, Inria Bordeaux Sud-Ouest ; CNRS-IRIT ; Intel corporation ; Université Bordeaux, February 2017
    [WWW] [PDF]
  • Nolwenn Balin, Guillaume Sylvand, and Jérôme Robert
    Fast methods applied to BEM solvers for acoustic propagation problems
    In 22nd AIAA/CEAS Aeroacoustics Conference, pages 2712, 2016
  • Emmanuel Agullo, Bérenger Bramas, Olivier Coulaud, Martin Khannouz, and Luka Stanisic
    Task-based fast multipole method for clusters of multicore processors
    Research Report RR-8970, Inria Bordeaux Sud-Ouest, October 2016
    [WWW] [PDF]
  • E Agullo, L Giraud, A Guermouche, S Nakov, and Jean Roman
    Task-based Conjugate Gradient: from multi-GPU towards heterogeneous architectures
    Research Report 8912, Inria Bordeaux Sud-Ouest, May 2016
    [WWW] [PDF]
  • Corentin Rossignon
    A fine grain model programming for parallelization of sparse linear solver
    PhD thesis, Université de Bordeaux, July 2015
    [WWW] [PDF]
  • Víctor Martínez, David Michéa, Fabrice Dupros, Olivier Aumage, Samuel Thibault, Hideo Aochi, and Philippe Olivier Alexandre Navaux
    Towards seismic wave modeling on heterogeneous many-core architectures using task-based runtime system
    In SBAC-PAD - 27th International Symposium on Computer Architecture and High Performance Computing, Florianopolis, Brazil, October 2015
    [WWW] [PDF] [doi:10.1109/SBAC-PAD.2015.33]
  • Emmanuel Agullo, Bérenger Bramas, Olivier Coulaud, Eric Darve, Matthias Messner, and Toru Takahashi
    Task-Based FMM for Multicore Architectures
    SIAM Journal on Scientific Computing, 36(1):66-93, 2014
    [WWW] [PDF] [doi:10.1137/130915662]
  • Sylvain Henry, Alexandre Denis, Denis Barthou, Marie-Christine Counilh, and Raymond Namyst
    Toward OpenCL Automatic Multi-Device Support
    In Fernando Silva, Ines Dutra, and Vitor Santos Costa, editors, Euro-Par 2014, Porto, Portugal, August 2014
    Springer
    [WWW] [PDF]
  • Xavier Lacoste, Mathieu Faverge, Pierre Ramet, Samuel Thibault, and George Bosilca
    Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes
    In HCW'2014 - Heterogeneity in Computing Workshop of IPDPS, Phoenix, USA, May 2014
    IEEE
    Note: RR-8446
    [WWW] [PDF] [doi:10.1109/IPDPSW.2014.9]
  • Emmanuel Agullo, Berenger Bramas, Olivier Coulaud, Eric Darve, Matthias Messner, and Toru Takahashi
    Task-based FMM for heterogeneous architectures
    Research Report RR-8513, Inria Bordeaux - Sud-Ouest, April 2014
    [WWW] [PDF]
  • Xavier Lacoste, Mathieu Faverge, Pierre Ramet, Samuel Thibault, and George Bosilca
    Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes
    Research Report RR-8446, INRIA, January 2014
    [WWW] [PDF]
  • Emmanuel Agullo, Olivier Aumage, Mathieu Faverge, Nathalie Furmento, Florent Pruvost, Marc Sergent, and Samuel Thibault
    Overview of Distributed Linear Algebra on Hybrid Nodes over the StarPU Runtime
    SIAM Conference on Parallel Processing for Scientific Computing, February 2014
    [WWW] [PDF]
  • Cyril Bordage
    Ordonnancement dynamique, adapté aux architectures hétérogènes, de la méthode multipôle pour les équations de Maxwell, en électromagnétisme
    PhD thesis, Université de Bordeaux, December 2013
    [WWW] [PDF]
  • Sylvain Henry
    Modèles de programmation et supports exécutifs pour architectures hétérogènes
    PhD thesis, Université de Bordeaux, November 2013
    [WWW] [PDF]
  • Sylvain Henry
    ViperVM: a Runtime System for Parallel Functional High-Performance Computing on Heterogeneous Architectures
    In 2nd Workshop on Functional High-Performance Computing (FHPC'13), Boston, USA, September 2013
    [WWW] [PDF]
  • Tetsuya Odajima, Taisuke Boku, Mitsuhisa Sato, Toshihiro Hanawa, Yuetsu Kodama, Raymond Namyst, Samuel Thibault, and Olivier Aumage
    Adaptive Task Size Control on High Level Programming for GPU/CPU Work Sharing
    In ICA3PP-2013 - The 13th International Conference on Algorithms and Architectures for Parallel Processing, Vietri sul Mare, Italy, December 2013
    [WWW] [PDF] [doi:10.1007/978-3-319-03889-6_7]
  • Satoshi Ohshima, Satoshi Katagiri, Kengo Nakajima, Samuel Thibault, and Raymond Namyst
    Implementation of FEM Application on GPU with StarPU
    In SIAM CSE13 - SIAM Conference on Computational Science and Engineering 2013, Boston, USA, February 2013
    SIAM
    [WWW]
  • Corentin Rossignon
    Optimisation du produit matrice-vecteur creux sur architecture GPU pour un simulateur de reservoir
    In 21èmes Rencontres Francophones du Parallélisme (RenPar'21), Grenoble, France, January 2013
    [WWW] [PDF]
  • Corentin Rossignon, Pascal Hénon, Olivier Aumage, and Samuel Thibault
    A NUMA-aware fine grain parallelization framework for multi-core architecture
    In PDSEC - 14th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing - 2013, Boston, USA, May 2013
    [WWW] [PDF]
  • Sylvain Henry, Alexandre Denis, and Denis Barthou
    Programmation unifiée multi-accélérateur OpenCL
    Technique et Science Informatiques, (8-9-10):1233-1249, 2012
    [WWW] [PDF]
  • Sidi Ahmed Mahmoudi, Pierre Manneback, Cédric Augonnet, and Samuel Thibault
    Traitements d'Images sur Architectures Parallèles et Hétérogènes
    Technique et Science Informatiques, 31(8-10):1183-1203, 2012
    [WWW] [PDF] [doi:10.3166/tsi.31.1183-1203]
  • Siegfried Benkner, Enes Bajrovic, Erich Marth, Martin Sandrieser, Raymond Namyst, and Samuel Thibault
    High-Level Support for Pipeline Parallelism on Many-Core Architectures
    In Euro-Par - 18th International Conference on Parallel Processing, Rhodes Island, Greece, August 2012
    [WWW] [PDF] [doi:10.1007/978-3-642-32820-6_61]
  • Christoph Kessler, Usman Dastgeer, Samuel Thibault, Raymond Namyst, Andrew Richards, Uwe Dolinsky, Siegfried Benkner, Jesper Larsson Träff, and Sabri Pllana
    Programmability and Performance Portability Aspects of Heterogeneous Multi-/Manycore Systems
    In DATE - Design, Automation and Test in Europe, Dresden, Germany, March 2012
    ISBN: 978-3-9810801-8-6
    [WWW] [PDF] [doi:10.1109/DATE.2012.6176582]
  • Siegfried Benkner, Sabri Pllana, Jesper Larsson Träff, Philippas Tsigas, Uwe Dolinsky, Cédric Augonnet, Beverly Bachmayer, Christoph Kessler, David Moloney, and Vitaly Osipov
    PEPPHER: Efficient and Productive Usage of Hybrid Computing Systems
    IEEE Micro, 31(5):28-41, September 2011
    ISSN: 0272-1732
    [WWW] [PDF] [doi:10.1109/MM.2011.67]
  • Emmanuel Agullo, Cédric Augonnet, Jack Dongarra, Mathieu Faverge, Julien Langou, Hatem Ltaief, and Stanimire Tomov
    LU factorization for accelerator-based systems
    In 9th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA 11), Sharm El-Sheikh, Egypt, June 2011
    [WWW] [PDF]
  • Emmanuel Agullo, Cédric Augonnet, Jack Dongarra, Mathieu Faverge, Hatem Ltaief, Samuel Thibault, and Stanimire Tomov
    QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators
    In 25th IEEE International Parallel & Distributed Processing Symposium (IEEE IPDPS 2011), Anchorage, Alaska, USA, May 2011
    [WWW] [PDF] [doi:10.1109/IPDPS.2011.90]
  • Usman Dastgeer, Christoph Kessler, and Samuel Thibault
    Flexible runtime support for efficient skeleton programming on hybrid systems
    In ParCo - Proceedings of the International Conference on Parallel Computing, volume 22 of Advances of Parallel Computing, Gent, Belgium, pages 159-166, August 2011
    [WWW] [PDF] [doi:10.3233/978-1-61499-041-3-159]
  • Sylvain Henry
    Programmation multi-accélérateurs unifiée en OpenCL
    In 20èmes Rencontres Francophones du Parallélisme (RenPar'20), Saint Malo, France, May 2011
    [WWW] [PDF]
  • Sidi Ahmed Mahmoudi, Pierre Manneback, Cédric Augonnet, and Samuel Thibault
    Détection optimale des coins et contours dans des bases d'images volumineuses sur architectures multicoeurs hétérogènes
    In RenPar'20 - 20èmes Rencontres Francophones du Parallélisme, Saint-Malo, France, May 2011
    [WWW] [PDF]
  • Emmanuel Agullo, Cédric Augonnet, Jack Dongarra, Hatem Ltaief, Raymond Namyst, Samuel Thibault, and Stanimire Tomov
    A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs
    In Wen-mei W. Hwu, editor, GPU Computing Gems, volume 2
    Morgan Kaufmann, September 2010
    [WWW] [PDF] [doi:10.1016/B978-0-12-385963-1.00034-4]
  • Emmanuel Agullo, Cédric Augonnet, Jack Dongarra, Hatem Ltaief, Raymond Namyst, Jean Roman, Samuel Thibault, and Stanimire Tomov
    Dynamically scheduled Cholesky factorization on multicore architectures with GPU accelerators
    In SAAHPC - Symposium on Application Accelerators in High Performance Computing, Knoxville, USA, July 2010
    [WWW] [PDF]

Author: root

Created: 2023-05-25 Thu 16:09
