Bibliography

Major publications by the team in recent years

[1] C. Augonnet, S. Thibault, R. Namyst, P.-A. Wacrenier.

StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, in: Concurrency and Computation: Practice and Experience, Special Issue: Euro-Par 2009, February 2011, vol. 23, pp. 187–198. [ DOI : 10.1002/cpe.1631 ]

http://hal.inria.fr/inria-00550877
[2] F. Broquedis, J. Clet-Ortega, S. Moreaud, N. Furmento, B. Goglin, G. Mercier, S. Thibault, R. Namyst.

hwloc: a Generic Framework for Managing Hardware Affinities in HPC Applications, in: Proceedings of the 18th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP2010), Pisa, Italy, IEEE Computer Society Press, February 2010, pp. 180–186. [ DOI : 10.1109/PDP.2010.67 ]

http://hal.inria.fr/inria-00429889
[3] F. Broquedis, N. Furmento, B. Goglin, P.-A. Wacrenier, R. Namyst.

ForestGOMP: an efficient OpenMP environment for NUMA architectures, in: International Journal of Parallel Programming, Special Issue on OpenMP; Guest Editors: Matthias S. Müller and Eduard Ayguadé, 2010, vol. 38, no. 5, pp. 418-439. [ DOI : 10.1007/s10766-010-0136-3 ]

http://hal.inria.fr/inria-00496295
[4] D. Buntinas, G. Mercier, W. Gropp.

Implementation and Shared-Memory Evaluation of MPICH2 over the Nemesis Communication Subsystem, in: Recent Advances in Parallel Virtual Machine and Message Passing Interface: Proc. 13th European PVM/MPI Users Group Meeting, Bonn, Germany, September 2006.
[5] B. Goglin, N. Furmento.

Finding a Tradeoff between Host Interrupt Load and MPI Latency over Ethernet, in: Proceedings of the IEEE International Conference on Cluster Computing, New Orleans, LA, IEEE Computer Society Press, September 2009.

http://hal.inria.fr/inria-00397328
[6] B. Goglin.

High-Performance Message Passing over generic Ethernet Hardware with Open-MX, in: Journal of Parallel Computing, February 2011, vol. 37, no. 2, pp. 85-100. [ DOI : 10.1016/j.parco.2010.11.001 ]

http://hal.inria.fr/inria-00533058/en
[7] S. Thibault, R. Namyst, P.-A. Wacrenier.

Building Portable Thread Schedulers for Hierarchical Multiprocessors: the BubbleSched Framework, in: Euro-Par, Rennes, France, ACM, August 2007.

http://hal.inria.fr/inria-00154506
[8] F. Trahay, É. Brunet, A. Denis, R. Namyst.

A multithreaded communication engine for multicore architectures, in: CAC 2008: Workshop on Communication Architecture for Clusters, held in conjunction with IPDPS 2008, Miami, FL, IEEE Computer Society Press, April 2008.

http://hal.inria.fr/inria-00224999

Publications of the year

Doctoral Dissertations and Habilitation Theses

[9] B. Goglin.

Towards generic Communication Mechanisms and better Affinity Management in Clusters of Hierarchical Nodes, Université de Bordeaux, April 2014, Habilitation à diriger des recherches.

https://tel.archives-ouvertes.fr/tel-00979512
[10] B. Putigny.

Benchmark-driven Approaches to Performance Modeling of Multi-Core Architectures, Université Sciences et Technologies - Bordeaux I, March 2014.

https://tel.archives-ouvertes.fr/tel-00984791

Articles in International Peer-Reviewed Journals

[11] P.-A. Arras, D. Fuin, E. Jeannot, A. Stoutchinin, S. Thibault.

List Scheduling in Embedded Systems Under Memory Constraints, in: International Journal of Parallel Programming, November 2014. [ DOI : 10.1007/s10766-014-0338-1 ]

https://hal.inria.fr/hal-01087067
[12] D. Barthou, O. Brand-Foissac, O. Pene, G. Grosdidier, R. Dolbeau, C. Eisenbeis, M. Kruse, K. Petrov, C. Tadonki.

Automated Code Generation for Lattice Quantum Chromodynamics and beyond, in: Journal of Physics: Conference Series, 2014, vol. 510, 11 p, LPT-Orsay-13-142. [ DOI : 10.1088/1742-6596/510/1/012005 ]

https://hal.inria.fr/hal-00926513
[13] A. Hugo, A. Guermouche, P.-A. Wacrenier, R. Namyst.

Composing multiple StarPU applications over heterogeneous machines: A supervised approach, in: The International Journal of High Performance Computing Applications, February 2014, vol. 28, pp. 285 - 300. [ DOI : 10.1177/1094342014527575 ]

https://hal.inria.fr/hal-01101045
[14] E. Jeannot, G. Mercier, F. Tessier.

Process Placement in Multicore Clusters: Algorithmic Issues and Practical Techniques, in: IEEE Transactions on Parallel and Distributed Systems, April 2014, vol. 25, no. 4, pp. 993-1002. [ DOI : 10.1109/TPDS.2013.104 ]

https://hal.inria.fr/hal-01109978
[15] E. Saillard, P. Carribault, D. Barthou.

PARCOACH: Combining static and dynamic validation of MPI collective communications, in: International Journal of High Performance Computing Applications, 2014. [ DOI : 10.1177/1094342014552204 ]

https://hal.archives-ouvertes.fr/hal-01078762

International Conferences with Proceedings

[16] M. Alaniz, S. Nesmachnow, B. Goglin, S. Iturriaga, V. Gil Costa, M. Printista.

MBSPDiscover: An Automatic Benchmark for MultiBSP Performance Analysis, in: First HPCLATAM - CLCAR Joint Latin American High Performance Computing Conference, Valparaiso, Chile, Communications in Computer and Information Science (CCIS), Springer, October 2014, vol. 485, pp. 158-172.

https://hal.inria.fr/hal-01062528
[17] D. Barthou, E. Jeannot.

SPAGHETtI: Scheduling/Placement Approach for Task-Graphs on HETerogeneous archItecture, in: Euro-Par, Porto, Portugal, LNCS, August 2014, vol. 8632, pp. 174 - 185. [ DOI : 10.1007/978-3-319-09873-9_15 ]

https://hal.archives-ouvertes.fr/hal-01100948
[18] A. Denis.

pioman: a Generic Framework for Asynchronous Progression and Multithreaded Communications, in: IEEE International Conference on Cluster Computing (IEEE Cluster), Madrid, Spain, September 2014.

https://hal.inria.fr/hal-01064652
[19] A. Denis.

pioman: a pthread-based Multithreaded Communication Engine, in: Euromicro International Conference on Parallel, Distributed and Network-based Processing, Turku, Finland, March 2015.

https://hal.inria.fr/hal-01087775
[20] B. Goglin.

Managing the Topology of Heterogeneous Cluster Nodes with Hardware Locality (hwloc), in: International Conference on High Performance Computing & Simulation (HPCS 2014), Bologna, Italy, IEEE, July 2014.

https://hal.inria.fr/hal-00985096
[21] B. Goglin, J. Hursey, J. M. Squyres.

netloc: Towards a Comprehensive View of the HPC System Topology, in: Fifth International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2014), Minneapolis, United States, IEEE, September 2014.

https://hal.inria.fr/hal-01010599
[22] C. Haine, O. Aumage, P. Enguerrand, D. Barthou.

Exploring and Evaluating Array Layout Restructuration for SIMDization, in: The 27th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2014), Hillsboro, United States, Intel Corporation, September 2014.

https://hal.inria.fr/hal-01070467
[23] S. Henry, A. Denis, D. Barthou, M.-C. Counilh, R. Namyst.

Toward OpenCL Automatic Multi-Device Support, in: Euro-Par 2014, Porto, Portugal, F. Silva, I. Dutra, V. S. Costa (editors), Springer, August 2014.

https://hal.inria.fr/hal-01005765
[24] A.-E. Hugo, A. Guermouche, P.-A. Wacrenier, R. Namyst.

A runtime approach to dynamic resource allocation for sparse direct solvers, in: 2014 43rd International Conference on Parallel Processing, Minneapolis, United States, September 2014. [ DOI : 10.1109/ICPP.2014.57 ]

https://hal.inria.fr/hal-01101054
[25] X. Lacoste, M. Faverge, P. Ramet, S. Thibault, G. Bosilca.

Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes, in: HCW'2014 workshop of IPDPS, Phoenix, United States, IEEE, May 2014.

https://hal.inria.fr/hal-00987094
[26] B. Putigny, B. Goglin, D. Barthou.

A Benchmark-based Performance Model for Memory-bound HPC Applications, in: International Conference on High Performance Computing & Simulation (HPCS 2014), Bologna, Italy, IEEE, July 2014.

https://hal.inria.fr/hal-00985598
[27] B. Putigny, B. Ruelle, B. Goglin.

Analysis of MPI Shared-Memory Communication Performance from a Cache Coherence Perspective, in: PDSEC - The 15th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing, held in conjunction with IPDPS, Phoenix, AZ, United States, IEEE, May 2014.

https://hal.inria.fr/hal-00956307
[28] E. Saillard, P. Carribault, D. Barthou.

Static Validation of Barriers and Worksharing Constructs in OpenMP Applications, in: IWOMP, Salvador, Brazil, September 2014, pp. 73 - 86. [ DOI : 10.1007/978-3-319-11454-5_6 ]

https://hal.archives-ouvertes.fr/hal-01078759
[29] M. Sergent, S. Archipoff.

Modulariser les ordonnanceurs de tâches : une approche structurelle, in: ComPAS 2014 : conférence en parallélisme, architecture et systèmes, Neuchâtel, Switzerland, P. Felber, L. Philippe, E. Riviere, A. Tisserand (editors), April 2014.

https://hal.inria.fr/hal-00978364
[30] L. Stanisic, S. Thibault, A. Legrand, B. Videau, J.-F. Méhaut.

Modeling and Simulation of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures, in: Euro-Par - 20th International Conference on Parallel Processing, Porto, Portugal, Euro-Par 2014, LNCS 8632, Springer International Publishing Switzerland, August 2014, pp. 50-62.

https://hal.inria.fr/hal-01011633
[31] G. Vaumourin, D. Thomas, G. Alexandre, D. Barthou.

Specific Read Only Data Management for Memory Hierarchy Optimization, in: EWiLi 2014 - Workshop Embed With Linux, Lisbon, Portugal, J. Boukhobza, J. P. Diguet, P. Ficheux, J. Rufino, F. Singhoff (editors), Proceedings of the Embed With Linux 2014 Workshop, November 2014, vol. 1291, Session 2.

https://hal.archives-ouvertes.fr/hal-01090218
[32] P. Virouleau, P. Brunet, F. Broquedis, N. Furmento, S. Thibault, O. Aumage, T. Gautier.

Evaluation of OpenMP Dependent Tasks with the KASTORS Benchmark Suite, in: IWOMP - 10th International Workshop on OpenMP, Salvador, Brazil, Springer, September 2014, pp. 16 - 29. [ DOI : 10.1007/978-3-319-11454-5_2 ]

https://hal.inria.fr/hal-01081974

Conferences without Proceedings

[33] E. Jeannot, G. Mercier, F. Tessier.

Matching communication pattern with underlying hardware architecture, in: 6th European Conference on Computational Fluid Dynamics, Barcelona, Spain, July 2014.

https://hal.inria.fr/hal-01087611

Scientific Books (or Scientific Book chapters)

[34] P. De Oliveira Castro, S. Louise, D. Barthou.

DSL Stream Programming on Multicore Architectures, in: Programming multi-core and many-core computing systems, John Wiley and Sons, 2014, chapter 12.

https://hal.archives-ouvertes.fr/hal-00952318
[35] T. Hoefler, E. Jeannot, G. Mercier.

An Overview of Process Mapping Techniques and Algorithms in High-Performance Computing, in: High Performance Computing on Complex Environments, E. Jeannot, J. Žilinskas (editors), Wiley, June 2014, pp. 75-94.

https://hal.inria.fr/hal-00921626
[36] L. Lopez, J. Žilinskas, A. Costan, R. G. Cascella, G. Kecskemeti, E. Jeannot, M. Cannataro, L. Ricci, S. Benkner, S. Petit, V. Scarano, J. Gracia, S. Hunold, S. L. Scott, S. Lankes, C. Lengauer, J. Carretero, J. Breitbart, M. Alexander.

Euro-Par 2014: Parallel Processing Workshops, Part I, Lecture Notes in Computer Science, Springer, December 2014, vol. 8805.

https://hal.inria.fr/hal-01110069
[37] L. Lopez, J. Žilinskas, A. Costan, R. G. Cascella, G. Kecskemeti, E. Jeannot, M. Cannataro, L. Ricci, S. Benkner, S. Petit, V. Scarano, J. Gracia, S. Hunold, S. L. Scott, S. Lankes, C. Lengauer, J. Carretero, J. Breitbart, M. Alexander.

Euro-Par 2014: Parallel Processing Workshops, Part II, Lecture Notes in Computer Science, Springer, December 2014, vol. 8806.

https://hal.inria.fr/hal-01110071

Books or Proceedings Editing

[38] E. Jeannot, J. Žilinskas (editors)

High Performance Computing on Complex Environments, Wiley, June 2014, 512 p.

https://hal.inria.fr/hal-00921619

Internal Reports

[39] C. Augonnet, O. Aumage, N. Furmento, S. Thibault, R. Namyst.

StarPU-MPI: Task Programming over Clusters of Machines Enhanced with Accelerators, May 2014, no. RR-8538.

https://hal.inria.fr/hal-00992208
[40] X. Lacoste, M. Faverge, P. Ramet, S. Thibault, G. Bosilca.

Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes, January 2014, no. RR-8446, 25 p.

https://hal.inria.fr/hal-00925017
[41] L. Stanisic, S. Thibault, A. Legrand, B. Videau, J.-F. Méhaut.

Modeling and Simulation of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures, March 2014, no. RR-8509.

https://hal.inria.fr/hal-00966862
[42] A. Tate, A. Kamil, A. Dubey, A. Größlinger, B. Chamberlain, B. Goglin, C. Edwards, C. J. Newburn, D. Padua, D. Unat, E. Jeannot, F. Hannig, T. Gysi, H. Ltaief, J. Sexton, J. Labarta, J. Shalf, K. Fürlinger, K. O’Brien, L. Linardakis, M. Besta, M.-C. Sawley, M. Abraham, M. Bianco, M. Pericàs, N. Maruyama, P. H. J. Kelly, P. Messmer, R. B. Ross, R. Cledat, S. Matsuoka, T. Schulthess, T. Hoefler, V. J. Leung.

Programming Abstractions for Data Locality, PADAL Workshop 2014, April 28–29, Swiss National Supercomputing Center (CSCS), Lugano, Switzerland, November 2014, 54 p.

https://hal.inria.fr/hal-01083080

Scientific Popularization

[43] E. Agullo, O. Aumage, M. Faverge, N. Furmento, F. Pruvost, M. Sergent, S. Thibault.

Overview of Distributed Linear Algebra on Hybrid Nodes over the StarPU Runtime, February 2014, SIAM Conference on Parallel Processing for Scientific Computing.

https://hal.inria.fr/hal-00978602

References in notes

[44] P. Balaji, H.-W. Jin, K. Vaidyanathan, D. K. Panda.

Supporting iWARP Compatibility and Features for Regular Network Adapters, in: Proceedings of the Workshop on Remote Direct Memory Access (RDMA): Applications, Implementations, and Technologies (RAIT); held in conjunction with the IEEE International Conference on Cluster Computing, Boston, MA, September 2005.
[45] G. Ciaccio, G. Chiola.

GAMMA and MPI/GAMMA on Gigabit Ethernet, in: Proceedings of 7th EuroPVM-MPI conference, Balatonfured, Hungary, Lecture Notes in Computer Science, Springer Verlag, September 2000, vol. 1908.
[46] G. R. Gao, T. Sterling, R. Stevens, M. Hereld, W. Zhu.

Hierarchical multithreading: programming model and system software, in: 20th International Parallel and Distributed Processing Symposium (IPDPS), April 2006.
