Bibliography

Major publications by the team in recent years

[1] F. Bodin, T. Kisuki, P. M. W. Knijnenburg, M. F. P. O'Boyle, E. Rohou.

Iterative Compilation in a Non-Linear Optimisation Space, in: Workshop on Profile and Feedback-Directed Compilation (FDO-1), in conjunction with PACT '98, October 1998.
[2] A. Cohen, E. Rohou.

Processor Virtualization and Split Compilation for Heterogeneous Multicore Embedded Systems, in: DAC, June 2010, pp. 102–107.
[3] N. Hallou, E. Rohou, P. Clauss, A. Ketterlin.

Dynamic Re-Vectorization of Binary Code, in: SAMOS, July 2015.

https://hal.inria.fr/hal-01155207
[4] D. Hardy, I. Puaut.

Static probabilistic Worst Case Execution Time Estimation for architectures with Faulty Instruction Caches, in: 21st International Conference on Real-Time Networks and Systems, Sophia Antipolis, France, October 2013. [ DOI : 10.1145/2516821.2516842 ]

https://hal.inria.fr/hal-00862604
[5] D. Hardy, I. Sideris, N. Ladas, Y. Sazeides.

The performance vulnerability of architectural and non-architectural arrays to permanent faults, in: MICRO 45, Vancouver, Canada, December 2012.

https://hal.inria.fr/hal-00747488
[6] D. Hardy, I. Sideris, A. Saidi, Y. Sazeides.

EETCO: A tool to estimate and explore the implications of datacenter design choices on the TCO and the environmental impact, in: Workshop on Energy-efficient Computing for a Sustainable World in conjunction with the 44th Annual IEEE/ACM International Symposium on Microarchitecture (Micro-44), 2011.
[7] S. Kalathingal, S. Collange, B. Narasimha Swamy, A. Seznec.

Dynamic Inter-Thread Vectorization Architecture: extracting DLP from TLP, in: International Symposium on Computer Architecture and High-Performance Computing (SBAC-PAD), Los Angeles, United States, October 2016.

https://hal.inria.fr/hal-01356202
[8] M.-K. Lee, P. Michaud, J. S. Sim, D. Nyang.

A simple proof of optimality for the MIN cache replacement policy, in: Information Processing Letters, September 2015, 3 p. [ DOI : 10.1016/j.ipl.2015.09.004 ]

https://hal.inria.fr/hal-01199424
[9] P. Michaud.

A Best-Offset Prefetcher Champion, in: 2nd Data Prefetching Championship, Portland, OR, USA, June 2015.

https://hal.inria.fr/hal-01165600
[10] P. Michaud, A. Mondelli, A. Seznec.

Revisiting Clustered Microarchitecture for Future Superscalar Cores: A Case for Wide Issue Clusters, in: ACM Transactions on Architecture and Code Optimization (TACO), August 2015, vol. 13, no. 3, 22 p. [ DOI : 10.1145/2800787 ]

https://hal.inria.fr/hal-01193178
[11] P. Michaud, A. Seznec.

Pushing the branch predictability limits with the multi-poTAGE+SC predictor: Champion in the unlimited category, in: 4th JILP Workshop on Computer Architecture Competitions (JWAC-4): Championship Branch Prediction (CBP-4), Minneapolis, United States, June 2014.

https://hal.archives-ouvertes.fr/hal-01087719
[12] A. Perais.

Increasing the performance of superscalar processors through value prediction, Université Rennes 1, September 2015.

https://tel.archives-ouvertes.fr/tel-01282474
[13] A. Perais, A. Seznec.

EOLE: Paving the Way for an Effective Implementation of Value Prediction, in: International Symposium on Computer Architecture, Minneapolis, MN, United States, ACM/IEEE, June 2014, vol. 42, pp. 481–492. [ DOI : 10.1109/ISCA.2014.6853205 ]

https://hal.inria.fr/hal-01088130
[14] A. Perais, A. Seznec.

Practical data value speculation for future high-end processors, in: International Symposium on High Performance Computer Architecture, Orlando, FL, United States, IEEE, February 2014, pp. 428–439. [ DOI : 10.1109/HPCA.2014.6835952 ]

https://hal.inria.fr/hal-01088116
[15] A. Perais, A. Seznec.

EOLE: Toward a Practical Implementation of Value Prediction, in: IEEE Micro, June 2015, vol. 35, no. 3, pp. 114–124. [ DOI : 10.1109/MM.2015.45 ]

https://hal.inria.fr/hal-01193287
[16] E. Riou, E. Rohou, P. Clauss, N. Hallou, A. Ketterlin.

PADRONE: a Platform for Online Profiling, Analysis, and Optimization, in: Dynamic Compilation Everywhere, Vienna, Austria, January 2014.
[17] S. Sardashti, A. Seznec, D. A. Wood.

Skewed Compressed Caches, in: 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014, Minneapolis, United States, December 2014.

https://hal.inria.fr/hal-01088050
[18] A. Sembrant, T. Carlson, E. Hagersten, D. Black-Shaffer, A. Perais, A. Seznec, P. Michaud.

Long Term Parking (LTP): Criticality-aware Resource Allocation in OOO Processors, in: International Symposium on Microarchitecture, Micro 2015, Honolulu, United States, Proceeding of the International Symposium on Microarchitecture, Micro 2015, ACM, December 2015.

https://hal.inria.fr/hal-01225019
[19] A. Seznec, P. Michaud.

A case for (partially)-tagged geometric history length predictors, in: Journal of Instruction Level Parallelism, April 2006.

http://www.jilp.org/vol8
[20] A. Seznec, J. San Miguel, J. Albericio.

The Inner Most Loop Iteration counter: a new dimension in branch history, in: 48th International Symposium On Microarchitecture, Honolulu, United States, ACM, December 2015, 11 p.

https://hal.inria.fr/hal-01208347
[21] A. Seznec.

A New Case for the TAGE Branch Predictor, in: MICRO 2011: The 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011, Porto Alegre, Brazil, ACM (editor), ACM-IEEE, December 2011.

https://hal.inria.fr/hal-00639193
[22] A. Seznec.

TAGE-SC-L Branch Predictors: Champion in 32Kbits and 256 Kbits category, in: JILP - Championship Branch Prediction, Minneapolis, United States, June 2014.

https://hal.inria.fr/hal-01086920
[23] A. Suresh, B. Narasimha Swamy, E. Rohou, A. Seznec.

Intercepting Functions for Memoization: A Case Study Using Transcendental Functions, in: ACM Transactions on Architecture and Code Optimization (TACO), July 2015, vol. 12, no. 2, 23 p. [ DOI : 10.1145/2751559 ]

https://hal.inria.fr/hal-01178085
[24] A. Suresh.

Intercepting functions for memoization, Université Rennes 1, May 2016.

https://tel.archives-ouvertes.fr/tel-01410539
[25] D. D. C. Teixeira, S. Collange, F. M. Q. Pereira.

Fusion of calling sites, in: International Symposium on Computer Architecture and High-Performance Computing (SBAC-PAD), Florianópolis, Santa Catarina, Brazil, October 2015. [ DOI : 10.1109/SBAC-PAD.2015.16 ]

https://hal.archives-ouvertes.fr/hal-01410221

Publications of the year

Doctoral Dissertations and Habilitation Theses

[26] N. Hallou.

Runtime Optimization of Binary Through Vectorization Transformations, Université de Rennes 1 [UR1], December 2017.

https://hal.inria.fr/tel-01672263
[27] A. Mondelli.

Revisiting Wide Superscalar Microarchitecture, Université de Rennes 1, September 2017.

https://hal.inria.fr/tel-01597752

Articles in International Peer-Reviewed Journals

[28] F. Endo, A. Perais, A. Seznec.

On the Interactions Between Value Prediction and Compiler Optimizations in the Context of EOLE, in: ACM Transactions on Architecture and Code Optimization, June 2017.

https://hal.inria.fr/hal-01519869
[29] N. Hallou, E. Rohou, P. Clauss.

Runtime Vectorization Transformations of Binary Code, in: International Journal of Parallel Programming, June 2017, vol. 8, no. 6, pp. 1536–1565. [ DOI : 10.1007/s10766-016-0480-z ]

https://hal.inria.fr/hal-01593216
[30] S. Kalathingal, S. Collange, B. Swamy, A. Seznec.

DITVA: Dynamic Inter-Thread Vectorization Architecture, in: Journal of Parallel and Distributed Computing, 2017, pp. 1–32, forthcoming. [ DOI : 10.1016/j.jpdc.2017.11.006 ]

https://hal.archives-ouvertes.fr/hal-01655904
[31] B. Rouxel, S. Derrien, I. Puaut.

Tightening Contention Delays While Scheduling Parallel Applications on Multi-core Architectures, in: ACM Transactions on Embedded Computing Systems (TECS), October 2017, vol. 16, no. 5s, pp. 1–20. [ DOI : 10.1145/3126496 ]

https://hal.archives-ouvertes.fr/hal-01655383
[32] A. Sridharan, B. Panda, A. Seznec.

A Band-pass Prefetching: An Effective Prefetch Management Mechanism using Prefetch-fraction Metric in Multi-core Systems, in: ACM Transactions on Architecture and Code Optimization, June 2017.

https://hal.inria.fr/hal-01519648
[33] A. Sridharan, A. Seznec.

Dynamic and Discrete Cache Insertion Policies for Managing Shared Last Level Caches in Large Multicores, in: Journal of Parallel and Distributed Computing, May 2017, vol. 106, pp. 215–226. [ DOI : 10.1016/j.jpdc.2017.02.004 ]

https://hal.inria.fr/hal-01519650

Invited Conferences

[34] C. Silvano, A. Bartolini, A. Beccari, C. Manelfi, C. Cavazzoni, D. Gadioli, E. Rohou, G. Palermo, G. Agosta, J. Martinovič, J. Bispo, J. M. P. Cardoso, J. Barbosa, K. Slaninová, L. Benini, M. Palkovič, N. Sanna, P. Pinto, R. Cmar, R. Nobre, S. Cherubin.

The ANTAREX Tool Flow for Monitoring and Autotuning Energy Efficient HPC Systems, in: SAMOS 2017 - International Conference on Embedded Computer Systems: Architecture, Modeling and Simulation, Pythagorion, Greece, July 2017.

https://hal.inria.fr/hal-01615945

International Conferences with Proceedings

[35] V. A. Anh Nguyen, D. Hardy, I. Puaut.

Cache-conscious offline real-time task scheduling for multi-core processors, in: 29th Euromicro Conference on Real-Time Systems (ECRTS17), Dubrovnik, Croatia, Euromicro Conference on Real-Time Systems, June 2017. [ DOI : 10.4230/LIPIcs.ECRTS.2017.14 ]

http://hal.upmc.fr/hal-01590421
[36] A. A. Ap, E. Rohou.

Dynamic Function Specialization, in: International Conference on Embedded Computer Systems: Architectures, MOdeling and Simulation, Pythagorion, Samos, Greece, July 2017.

https://hal.inria.fr/hal-01597880
[37] R. Bouziane, E. Rohou, A. Gamatié.

How Could Compile-Time Program Analysis help Leveraging Emerging NVM Features?, in: EDIS 2017 - First international conference on Embedded & Distributed Systems, Oran, Algeria, December 2017, pp. 1–6.

https://hal.inria.fr/hal-01655195
[38] R. Bouziane, E. Rohou, A. Gamatié.

Compile-Time Silent-Store Elimination for Energy Efficiency: an Analytic Evaluation for Non-Volatile Cache Memory, in: RAPIDO 2018 - 10th Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools, Manchester, United Kingdom, HiPEAC, January 2018, pp. 1–8.

https://hal.inria.fr/hal-01660686
[39] S. Cherubin, G. Agosta, I. Lasri, E. Rohou, O. Sentieys.

Implications of Reduced-Precision Computations in HPC: Performance, Energy and Error, in: International Conference on Parallel Computing (ParCo), Bologna, Italy, September 2017.

https://hal.inria.fr/hal-01633790
[40] S. Collange.

Simty: generalized SIMT execution on RISC-V, in: First Workshop on Computer Architecture Research with RISC-V (CARRV 2017), Boston, United States, First Workshop on Computer Architecture Research with RISC-V, October 2017, 6 p.

https://hal.inria.fr/hal-01622208
[41] S. Derrien, I. Puaut, P. Alefragis, M. Bednara, H. Bucher, C. David, Y. Debray, U. Durak, I. Fassi, C. Ferdinand, D. Hardy, A. Kritikakou, G. Rauwerda, S. Reder, M. Sicks, T. Stripf, K. Sunesen, T. Ter Braak, N. Voros, J. Becker.

WCET-aware parallelization of model-based applications for multi-cores: The ARGO approach, in: Design Automation and Test in Europe (DATE), 2017, Lausanne, Switzerland, March 2017, pp. 286–289. [ DOI : 10.23919/DATE.2017.7927000 ]

http://hal.upmc.fr/hal-01590418
[42] D. Hardy, B. Rouxel, I. Puaut.

The Heptane Static Worst-Case Execution Time Estimation Tool, in: 17th International Workshop on Worst-Case Execution Time Analysis (WCET 2017), Dubrovnik, Croatia, International Workshop on Worst-Case Execution Time Analysis, June 2017, vol. 8, 12 p. [ DOI : 10.4230/OASIcs.WCET.2017.8 ]

http://hal.upmc.fr/hal-01590444
[43] C. Maiza, P. Raymond, C. Parent-Vigouroux, A. Bonenfant, F. Carrier, H. Cassé, P. Cuenot, D. Claraz, N. Halbwachs, E. Jahier, H. Li, M. De Michiel, V. Mussot, I. Puaut, C. Rochange, E. Rohou, J. Ruiz, P. Sotin, W.-T. Su.

The W-SEPT Project: Towards Semantic-Aware WCET Estimation, in: 17th International Workshop on Worst-Case Execution Time Analysis (WCET 2017), Dubrovnik, Croatia, International Workshop on Worst-Case Execution Time Analysis, June 2017, 13 p. [ DOI : 10.4230/OASIcs.WCET.2017.9 ]

http://hal.upmc.fr/hal-01590442
[44] S. Martinez, D. Hardy, I. Puaut.

Quantifying WCET reduction of parallel applications by introducing slack time to limit resource contention, in: International Conference on Real-Time Networks and Systems (RTNS), 2017, Grenoble, France, International Conference on Real-Time Networks and Systems, October 2017. [ DOI : 10.475/123_4 ]

http://hal.upmc.fr/hal-01590532
[45] R. E. A. Moreira, S. Collange, F. M. Q. Pereira.

Function Call Re-Vectorization, in: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), Austin, Texas, United States, February 2017.

https://hal.archives-ouvertes.fr/hal-01410186
[46] S. Rokicki, E. Rohou, S. Derrien.

Hardware-Accelerated Dynamic Binary Translation, in: IEEE/ACM Design, Automation & Test in Europe Conference & Exhibition (DATE), Lausanne, Switzerland, March 2017.

https://hal.inria.fr/hal-01423639
[47] B. Rouxel, S. Derrien, I. Puaut.

Tightening contention delays while scheduling parallel applications on multi-core architectures, in: International Conference on Embedded Software (EMSOFT), 2017, Seoul, South Korea, International Conference on Embedded Software, October 2017, 20 p. [ DOI : 10.1145/3126496 ]

http://hal.upmc.fr/hal-01590508
[48] B. Rouxel, I. Puaut.

STR2RTS: Refactored StreamIT benchmarks into statically analyzable parallel benchmarks for WCET estimation & real-time scheduling, in: 17th International Workshop on Worst-Case Execution Time Analysis (WCET 2017), Dubrovnik, Croatia, June 2017. [ DOI : 10.4230/OASIcs.WCET.2017.1 ]

http://hal.upmc.fr/hal-01590446
[49] M. Y. Siraichi, V. F. d. Santos, S. Collange, F. M. Q. Pereira.

Qubit Allocation, in: CGO 2018 - International Symposium on Code Generation and Optimization, Vienna, Austria, February 2018, pp. 1–12. [ DOI : 10.1145/3168822 ]

https://hal.archives-ouvertes.fr/hal-01655951
[50] A. Suresh, E. Rohou, A. Seznec.

Compile-Time Function Memoization, in: 26th International Conference on Compiler Construction, Austin, United States, February 2017.

https://hal.inria.fr/hal-01423811

National Conferences with Proceedings

[51] S. Collange, N. Brunie.

Parcours par liste de chemins : une nouvelle classe de mécanismes de suivi de flot SIMT, in: Conférence d’informatique en Parallélisme, Architecture et Système (ComPAS), Sophia Antipolis, France, June 2017.

https://hal.inria.fr/hal-01522901

Internal Reports

[52] S. Collange, N. Brunie.

Path list traversal: a new class of SIMT flow tracking mechanisms, Inria Rennes - Bretagne Atlantique, June 2017, no. RR-9073.

https://hal.inria.fr/hal-01533085

Other Publications

[53] S. Rokicki, E. Rohou, S. Derrien.

Supporting Runtime Reconfigurable VLIWs Cores Through Dynamic Binary Translation, December 2017, working paper or preprint.

https://hal.archives-ouvertes.fr/hal-01653110

References in notes

[54] M. Hataba, A. El-Mahdy, E. Rohou.

OJIT: A Novel Obfuscation Approach Using Standard Just-In-Time Compiler Transformations, in: International Workshop on Dynamic Compilation Everywhere, January 2015.
[55] R. Kumar, D. M. Tullsen, N. P. Jouppi, P. Ranganathan.

Heterogeneous chip multiprocessors, in: IEEE Computer, November 2005, vol. 38, no. 11, pp. 32–38.
[56] S. Nassif, N. Mehta, Y. Cao.

A resilience roadmap, in: Design, Automation & Test in Europe Conference & Exhibition (DATE), 2010, March 2010, pp. 1011–1016.
[57] J. Nickolls, W. J. Dally.

The GPU computing era, in: Micro, IEEE, 2010, vol. 30, no. 2, pp. 56–69.
[58] R. Omar, A. El-Mahdy, E. Rohou.

Arbitrary control-flow embedding into multiple threads for obfuscation: a preliminary complexity and performance analysis, in: Proceedings of the 2nd international workshop on Security in cloud computing, ACM, 2014, pp. 51–58.
[59] A. Seznec, N. Sendrier.

HAVEGE: A user-level software heuristic for generating empirically strong random numbers, in: ACM Transactions on Modeling and Computer Simulation (TOMACS), 2003, vol. 13, no. 4, pp. 334–346.
[60] H. Wong, T. M. Aamodt.

The Performance Potential for Single Application Heterogeneous Systems, in: 8th Workshop on Duplicating, Deconstructing, and Debunking, 2009.
