Bibliography
Major publications by the team in recent years
-
1C. Augonnet, S. Thibault, R. Namyst, P.-A. Wacrenier.
StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, in: Concurrency and Computation: Practice and Experience, Special Issue: Euro-Par 2009, February 2011, vol. 23, pp. 187–198. [ DOI : 10.1002/cpe.1631 ]
http://hal.inria.fr/inria-00550877 -
2F. Broquedis, J. Clet-Ortega, S. Moreaud, N. Furmento, B. Goglin, G. Mercier, S. Thibault, R. Namyst.
hwloc: a Generic Framework for Managing Hardware Affinities in HPC Applications, in: Proceedings of the 18th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP2010), Pisa, Italy, IEEE Computer Society Press, February 2010, pp. 180–186. [ DOI : 10.1109/PDP.2010.67 ]
http://hal.inria.fr/inria-00429889 -
3F. Broquedis, N. Furmento, B. Goglin, P.-A. Wacrenier, R. Namyst.
ForestGOMP: an efficient OpenMP environment for NUMA architectures, in: International Journal on Parallel Programming, Special Issue on OpenMP; Guest Editors: Matthias S. Müller and Eduard Ayguadé, 2010, vol. 38, no 5, pp. 418-439. [ DOI : 10.1007/s10766-010-0136-3 ]
http://hal.inria.fr/inria-00496295 -
4D. Buntinas, G. Mercier, W. Gropp.
Implementation and Shared-Memory Evaluation of MPICH2 over the Nemesis Communication Subsystem, in: Recent Advances in Parallel Virtual Machine and Message Passing Interface: Proc. 13th European PVM/MPI Users Group Meeting, Bonn, Germany, September 2006. -
5B. Goglin, N. Furmento.
Finding a Tradeoff between Host Interrupt Load and MPI Latency over Ethernet, in: Proceedings of the IEEE International Conference on Cluster Computing, New Orleans, LA, IEEE Computer Society Press, September 2009.
http://hal.inria.fr/inria-00397328 -
6B. Goglin.
High-Performance Message Passing over generic Ethernet Hardware with Open-MX, in: Journal of Parallel Computing, February 2011, vol. 37, no 2, pp. 85-100. [ DOI : 10.1016/j.parco.2010.11.001 ]
http://hal.inria.fr/inria-00533058/en -
7S. Thibault, R. Namyst, P.-A. Wacrenier.
Building Portable Thread Schedulers for Hierarchical Multiprocessors: the BubbleSched Framework, in: EuroPar, Rennes, France, ACM, August 2007.
http://hal.inria.fr/inria-00154506 -
8F. Trahay, É. Brunet, A. Denis, R. Namyst.
A multithreaded communication engine for multicore architectures, in: CAC 2008: Workshop on Communication Architecture for Clusters, held in conjunction with IPDPS 2008, Miami, FL, IEEE Computer Society Press, April 2008.
http://hal.inria.fr/inria-00224999
Doctoral Dissertations and Habilitation Theses
-
9B. Goglin.
Towards generic Communication Mechanisms and better Affinity Management in Clusters of Hierarchical Nodes, Université de Bordeaux, April 2014, Habilitation à diriger des recherches.
https://tel.archives-ouvertes.fr/tel-00979512 -
10B. Putigny.
Benchmark-driven Approaches to Performance Modeling of Multi-Core Architectures, Université Sciences et Technologies - Bordeaux I, March 2014.
https://tel.archives-ouvertes.fr/tel-00984791
Articles in International Peer-Reviewed Journals
-
11P.-A. Arras, D. Fuin, E. Jeannot, A. Stoutchinin, S. Thibault.
List Scheduling in Embedded Systems Under Memory Constraints, in: International Journal of Parallel Programming, November 2014. [ DOI : 10.1007/s10766-014-0338-1 ]
https://hal.inria.fr/hal-01087067 -
12D. Barthou, O. Brand-Foissac, O. Pene, G. Grosdidier, R. Dolbeau, C. Eisenbeis, M. Kruse, K. Petrov, C. Tadonki.
Automated Code Generation for Lattice Quantum Chromodynamics and beyond, in: Journal of Physics: Conference Series, 2014, vol. 510, 11 p, LPT-Orsay-13-142. [ DOI : 10.1088/1742-6596/510/1/012005 ]
https://hal.inria.fr/hal-00926513 -
13A. Hugo, A. Guermouche, P.-A. Wacrenier, R. Namyst.
Composing multiple StarPU applications over heterogeneous machines: A supervised approach, in: The International Journal of High Performance Computing Applications, February 2014, vol. 28, pp. 285-300. [ DOI : 10.1177/1094342014527575 ]
https://hal.inria.fr/hal-01101045 -
14E. Jeannot, G. Mercier, F. Tessier.
Process Placement in Multicore Clusters: Algorithmic Issues and Practical Techniques, in: IEEE Transactions on Parallel and Distributed Systems, April 2014, vol. 25, no 4, pp. 993-1002. [ DOI : 10.1109/TPDS.2013.104 ]
https://hal.inria.fr/hal-01109978 -
15E. Saillard, P. Carribault, D. Barthou.
PARCOACH: Combining static and dynamic validation of MPI collective communications, in: International Journal of High Performance Computing Applications, 2014. [ DOI : 10.1177/1094342014552204 ]
https://hal.archives-ouvertes.fr/hal-01078762
International Conferences with Proceedings
-
16M. Alaniz, S. Nesmachnow, B. Goglin, S. Iturriaga, V. Gil Costa, M. Printista.
MBSPDiscover: An Automatic Benchmark for MultiBSP Performance Analysis, in: First HPCLATAM - CLCAR Joint Latin American High Performance Computing Conference, Valparaiso, Chile, Communications in Computer and Information Science (CCIS), Springer, October 2014, vol. 485, pp. 158-172.
https://hal.inria.fr/hal-01062528 -
17D. Barthou, E. Jeannot.
SPAGHETtI: Scheduling/Placement Approach for Task-Graphs on HETerogeneous archItecture, in: Euro-Par, Porto, Portugal, LNCS, August 2014, vol. 8632, pp. 174-185. [ DOI : 10.1007/978-3-319-09873-9_15 ]
https://hal.archives-ouvertes.fr/hal-01100948 -
18A. Denis.
pioman: a Generic Framework for Asynchronous Progression and Multithreaded Communications, in: IEEE International Conference on Cluster Computing (IEEE Cluster), Madrid, Spain, September 2014.
https://hal.inria.fr/hal-01064652 -
19A. Denis.
pioman: a pthread-based Multithreaded Communication Engine, in: Euromicro International Conference on Parallel, Distributed and Network-based Processing, Turku, Finland, March 2015.
https://hal.inria.fr/hal-01087775 -
20B. Goglin.
Managing the Topology of Heterogeneous Cluster Nodes with Hardware Locality (hwloc), in: International Conference on High Performance Computing & Simulation (HPCS 2014), Bologna, Italy, IEEE, July 2014.
https://hal.inria.fr/hal-00985096 -
21B. Goglin, J. Hursey, J. M. Squyres.
netloc: Towards a Comprehensive View of the HPC System Topology, in: Fifth International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2014), Minneapolis, United States, IEEE, September 2014.
https://hal.inria.fr/hal-01010599 -
22C. Haine, O. Aumage, P. Enguerrand, D. Barthou.
Exploring and Evaluating Array Layout Restructuration for SIMDization, in: The 27th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2014), Hillsboro, United States, Intel Corporation, September 2014.
https://hal.inria.fr/hal-01070467 -
23S. Henry, A. Denis, D. Barthou, M.-C. Counilh, R. Namyst.
Toward OpenCL Automatic Multi-Device Support, in: Euro-Par 2014, Porto, Portugal, F. Silva, I. Dutra, V. S. Costa (editors), Springer, August 2014.
https://hal.inria.fr/hal-01005765 -
24A.-E. Hugo, A. Guermouche, P.-A. Wacrenier, R. Namyst.
A runtime approach to dynamic resource allocation for sparse direct solvers, in: 2014 43rd International Conference on Parallel Processing, Minneapolis, United States, September 2014. [ DOI : 10.1109/ICPP.2014.57 ]
https://hal.inria.fr/hal-01101054 -
25X. Lacoste, M. Faverge, P. Ramet, S. Thibault, G. Bosilca.
Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes, in: HCW'2014 workshop of IPDPS, Phoenix, United States, IEEE, May 2014.
https://hal.inria.fr/hal-00987094 -
26B. Putigny, B. Goglin, D. Barthou.
A Benchmark-based Performance Model for Memory-bound HPC Applications, in: International Conference on High Performance Computing & Simulation (HPCS 2014), Bologna, Italy, IEEE, July 2014.
https://hal.inria.fr/hal-00985598 -
27B. Putigny, B. Ruelle, B. Goglin.
Analysis of MPI Shared-Memory Communication Performance from a Cache Coherence Perspective, in: PDSEC - The 15th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing, held in conjunction with IPDPS, Phoenix, AZ, United States, IEEE, May 2014.
https://hal.inria.fr/hal-00956307 -
28E. Saillard, P. Carribault, D. Barthou.
Static Validation of Barriers and Worksharing Constructs in OpenMP Applications, in: IWOMP, Salvador, Brazil, September 2014, pp. 73-86. [ DOI : 10.1007/978-3-319-11454-5_6 ]
https://hal.archives-ouvertes.fr/hal-01078759 -
29M. Sergent, S. Archipoff.
Modulariser les ordonnanceurs de tâches : une approche structurelle, in: ComPAS 2014 : conférence en parallélisme, architecture et systèmes, Neuchâtel, Switzerland, P. Felber, L. Philippe, E. Riviere, A. Tisserand (editors), April 2014.
https://hal.inria.fr/hal-00978364 -
30L. Stanisic, S. Thibault, A. Legrand, B. Videau, J.-F. Méhaut.
Modeling and Simulation of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures, in: Euro-Par - 20th International Conference on Parallel Processing, Porto, Portugal, Euro-Par 2014, LNCS 8632, Springer International Publishing Switzerland, August 2014, pp. 50-62.
https://hal.inria.fr/hal-01011633 -
31G. Vaumourin, D. Thomas, G. Alexandre, D. Barthou.
Specific Read Only Data Management for Memory Hierarchy Optimization, in: EWiLi 2014 - Workshop Embed With Linux, Lisbon, Portugal, J. Boukhobza, J. P. Diguet, P. Ficheux, J. Rufino, F. Singhoff (editors), Proceedings of the Embed With Linux 2014 Workshop, November 2014, vol. 1291, Session 2.
https://hal.archives-ouvertes.fr/hal-01090218 -
32P. Virouleau, P. Brunet, F. Broquedis, N. Furmento, S. Thibault, O. Aumage, T. Gautier.
Evaluation of OpenMP Dependent Tasks with the KASTORS Benchmark Suite, in: IWOMP - 10th International Workshop on OpenMP, Salvador, Brazil, Springer, September 2014, pp. 16-29. [ DOI : 10.1007/978-3-319-11454-5_2 ]
https://hal.inria.fr/hal-01081974
Conferences without Proceedings
-
33E. Jeannot, G. Mercier, F. Tessier.
Matching communication pattern with underlying hardware architecture, in: 6th European Conference on Computational Fluid Dynamics, Barcelona, Spain, July 2014.
https://hal.inria.fr/hal-01087611
Scientific Books (or Scientific Book chapters)
-
34P. De Oliveira Castro, S. Louise, D. Barthou.
DSL Stream Programming on Multicore Architectures, in: Programming multi-core and many-core computing systems, John Wiley and Sons, 2014, chapter 12.
https://hal.archives-ouvertes.fr/hal-00952318 -
35T. Hoefler, E. Jeannot, G. Mercier.
An Overview of Process Mapping Techniques and Algorithms in High-Performance Computing, in: High Performance Computing on Complex Environments, E. Jeannot, J. Žilinskas (editors), Wiley, June 2014, pp. 75-94.
https://hal.inria.fr/hal-00921626 -
36L. Lopez, J. Žilinskas, A. Costan, R. G. Cascella, G. Kecskemeti, E. Jeannot, M. Cannataro, L. Ricci, S. Benkner, S. Petit, V. Scarano, J. Gracia, S. Hunold, S. L. Scott, S. Lankes, C. Lengauer, J. Carretero, J. Breitbart, M. Alexander.
Euro-Par 2014: Parallel Processing Workshops, Part I, Lecture Notes in Computer Science, Springer, December 2014, vol. 8805.
https://hal.inria.fr/hal-01110069 -
37L. Lopez, J. Žilinskas, A. Costan, R. G. Cascella, G. Kecskemeti, E. Jeannot, M. Cannataro, L. Ricci, S. Benkner, S. Petit, V. Scarano, J. Gracia, S. Hunold, S. L. Scott, S. Lankes, C. Lengauer, J. Carretero, J. Breitbart, M. Alexander.
Euro-Par 2014: Parallel Processing Workshops, Part II, Lecture Notes in Computer Science, Springer, December 2014, vol. 8806.
https://hal.inria.fr/hal-01110071
Books or Proceedings Editing
-
38E. Jeannot, J. Žilinskas (editors)
High Performance Computing on Complex Environments, Wiley, June 2014, 512 p.
https://hal.inria.fr/hal-00921619
Internal Reports
-
39C. Augonnet, O. Aumage, N. Furmento, S. Thibault, R. Namyst.
StarPU-MPI: Task Programming over Clusters of Machines Enhanced with Accelerators, May 2014, no RR-8538.
https://hal.inria.fr/hal-00992208 -
40X. Lacoste, M. Faverge, P. Ramet, S. Thibault, G. Bosilca.
Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes, January 2014, no RR-8446, 25 p.
https://hal.inria.fr/hal-00925017 -
41L. Stanisic, S. Thibault, A. Legrand, B. Videau, J.-F. Méhaut.
Modeling and Simulation of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures, March 2014, no RR-8509.
https://hal.inria.fr/hal-00966862 -
42A. Tate, A. Kamil, A. Dubey, A. Größlinger, B. Chamberlain, B. Goglin, C. Edwards, C. J. Newburn, D. Padua, D. Unat, E. Jeannot, F. Hannig, T. Gysi, H. Ltaief, J. Sexton, J. Labarta, J. Shalf, K. Fürlinger, K. O’Brien, L. Linardakis, M. Besta, M.-C. Sawley, M. Abraham, M. Bianco, M. Pericàs, N. Maruyama, P. H. J. Kelly, P. Messmer, R. B. Ross, R. Cledat, S. Matsuoka, T. Schulthess, T. Hoefler, V. J. Leung.
Programming Abstractions for Data Locality, PADAL Workshop 2014, April 28–29, Swiss National Supercomputing Center (CSCS), Lugano, Switzerland, November 2014, 54 p.
https://hal.inria.fr/hal-01083080
Scientific Popularization
-
43E. Agullo, O. Aumage, M. Faverge, N. Furmento, F. Pruvost, M. Sergent, S. Thibault.
Overview of Distributed Linear Algebra on Hybrid Nodes over the StarPU Runtime, February 2014, SIAM Conference on Parallel Processing for Scientific Computing.
https://hal.inria.fr/hal-00978602
-
44P. Balaji, H.-W. Jin, K. Vaidyanathan, D. K. Panda.
Supporting iWARP Compatibility and Features for Regular Network Adapters, in: Proceedings of the Workshop on Remote Direct Memory Access (RDMA): Applications, Implementations, and Technologies (RAIT); held in conjunction with the IEEE International Conference on Cluster Computing, Boston, MA, September 2005. -
45G. Ciaccio, G. Chiola.
GAMMA and MPI/GAMMA on GigabitEthernet, in: Proceedings of 7th EuroPVM-MPI conference, Balatonfured, Hungary, Lecture Notes in Computer Science, Springer Verlag, September 2000, vol. 1908. -
46G. R. Gao, T. Sterling, R. Stevens, M. Hereld, W. Zhu.
Hierarchical multithreading: programming model and system software, in: 20th International Parallel and Distributed Processing Symposium (IPDPS), April 2006.