Bibliography
Major publications by the team in recent years
[1] O. Cappé, A. Garivier, O.-A. Maillard, R. Munos, G. Stoltz.
Kullback-Leibler Upper Confidence Bounds for Optimal Sequential Allocation, in: Annals of Statistics, 2013, vol. 41, no 3, pp. 1516-1541.
https://hal.archives-ouvertes.fr/hal-00738209
[2] A. Carpentier, M. Valko.
Revealing graph bandits for maximizing local influence, in: International Conference on Artificial Intelligence and Statistics, Seville, Spain, May 2016.
https://hal.inria.fr/hal-01304020
[3] N. Gatti, A. Lazaric, M. Rocco, F. Trovò.
Truthful Learning Mechanisms for Multi-Slot Sponsored Search Auctions with Externalities, in: Artificial Intelligence, October 2015, vol. 227, pp. 93-139.
https://hal.inria.fr/hal-01237670
[4] M. Ghavamzadeh, Y. Engel, M. Valko.
Bayesian Policy Gradient and Actor-Critic Algorithms, in: Journal of Machine Learning Research, January 2016, vol. 17, no 66, pp. 1-53.
https://hal.inria.fr/hal-00776608
[5] H. Kadri, E. Duflos, P. Preux, S. Canu, A. Rakotomamonjy, J. Audiffren.
Operator-valued Kernels for Learning from Functional Response Data, in: Journal of Machine Learning Research (JMLR), 2016.
https://hal.archives-ouvertes.fr/hal-01221329
[6] E. Kaufmann, O. Cappé, A. Garivier.
On the Complexity of Best Arm Identification in Multi-Armed Bandit Models, in: Journal of Machine Learning Research, January 2016, vol. 17, pp. 1-42.
https://hal.archives-ouvertes.fr/hal-01024894
[7] A. Lazaric, M. Ghavamzadeh, R. Munos.
Analysis of Classification-based Policy Iteration Algorithms, in: Journal of Machine Learning Research, 2016, vol. 17, pp. 1-30.
https://hal.inria.fr/hal-01401513
[8] R. Munos.
From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning, 2014, 130 pages.
https://hal.archives-ouvertes.fr/hal-00747575
[9] R. Ortner, D. Ryabko, P. Auer, R. Munos.
Regret bounds for restless Markov bandits, in: Theoretical Computer Science, 2014, vol. 558, pp. 62-76. [ DOI : 10.1016/j.tcs.2014.09.026 ]
https://hal.inria.fr/hal-01074077
[10] D. Ryabko, J. Mary.
A Binary-Classification-Based Metric between Time-Series Distributions and Its Use in Statistical and Learning Problems, in: Journal of Machine Learning Research, 2013, vol. 14, pp. 2837-2856.
https://hal.inria.fr/hal-00913240
Doctoral Dissertations and Habilitation Theses
[11] H. Glaude.
Learning rational linear sequential systems using the method of moments, Université de Lille 1 - Sciences et Technologies, July 2016.
https://tel.archives-ouvertes.fr/tel-01374080
[12] F. Guillou.
On Recommendation Systems in a Sequential Context, Université Lille 3, December 2016.
https://tel.archives-ouvertes.fr/tel-01407336
[13] V. Musco.
Propagation Analysis based on Software Graphs and Synthetic Data, Université Lille 3, November 2016.
https://tel.archives-ouvertes.fr/tel-01398903
[14] M. Valko.
Bandits on graphs and structures, École normale supérieure de Cachan - ENS Cachan, June 2016, Habilitation à diriger des recherches.
https://hal.inria.fr/tel-01359757
Articles in International Peer-Reviewed Journals
[15] M. Ghavamzadeh, Y. Engel, M. Valko.
Bayesian Policy Gradient and Actor-Critic Algorithms, in: Journal of Machine Learning Research, January 2016, vol. 17, no 66, pp. 1-53.
https://hal.inria.fr/hal-00776608
[16] H. Kadri, E. Duflos, P. Preux, S. Canu, A. Rakotomamonjy, J. Audiffren.
Operator-valued Kernels for Learning from Functional Response Data, in: Journal of Machine Learning Research (JMLR), 2016.
https://hal.archives-ouvertes.fr/hal-01221329
[17] E. Kaufmann, O. Cappé, A. Garivier.
On the Complexity of Best Arm Identification in Multi-Armed Bandit Models, in: Journal of Machine Learning Research, January 2016, vol. 17, pp. 1-42.
https://hal.archives-ouvertes.fr/hal-01024894
[18] A. Khaleghi, D. Ryabko.
Nonparametric multiple change point estimation in highly dependent time series, in: Theoretical Computer Science, 2016, vol. 620, pp. 119-133. [ DOI : 10.1016/j.tcs.2015.10.041 ]
https://hal.inria.fr/hal-01235330
[19] A. Khaleghi, D. Ryabko, J. Mary, P. Preux.
Consistent Algorithms for Clustering Time Series, in: Journal of Machine Learning Research, 2016, vol. 17, no 3, pp. 1-32.
https://hal.inria.fr/hal-01399613
[20] A. Lazaric, M. Ghavamzadeh, R. Munos.
Analysis of Classification-based Policy Iteration Algorithms, in: Journal of Machine Learning Research, 2016, vol. 17, pp. 1-30.
https://hal.inria.fr/hal-01401513
[21] V. Musco, M. Monperrus, P. Preux.
A Large-scale Study of Call Graph-based Impact Prediction using Mutation Testing, in: Software Quality Journal, 2016. [ DOI : 10.1007/s11219-016-9332-8 ]
https://hal.inria.fr/hal-01346046
[22] G. Neu, G. Bartók.
Importance Weighting Without Importance Weights: An Efficient Algorithm for Combinatorial Semi-Bandits, in: Journal of Machine Learning Research, August 2016, vol. 17, no 154, pp. 1-21.
https://hal.archives-ouvertes.fr/hal-01380278
International Conferences with Proceedings
[23] K. Azizzadenesheli, A. Lazaric, A. Anandkumar.
Reinforcement Learning of POMDPs using Spectral Methods, in: Proceedings of the 29th Annual Conference on Learning Theory (COLT2016), New York City, United States, June 2016.
https://hal.inria.fr/hal-01322207
[24] M. Barlier, R. Laroche, O. Pietquin.
A Stochastic Model for Computer-Aided Human-Human Dialogue, in: Interspeech 2016, San Francisco, United States, September 2016, vol. 2016, pp. 2051-2055.
https://hal.inria.fr/hal-01406894
[25] M. Barlier, R. Laroche, O. Pietquin.
Learning Dialogue Dynamics with the Method of Moments, in: Workshop on Spoken Language Technology (SLT 2016), San Diego, United States, December 2016.
https://hal.inria.fr/hal-01406904
[26] D. Calandriello, A. Lazaric, M. Valko.
Analysis of Nyström method with sequential ridge leverage score sampling, in: Uncertainty in Artificial Intelligence, New York City, United States, June 2016.
https://hal.inria.fr/hal-01343674
[27] A. Carpentier, M. Valko.
Revealing graph bandits for maximizing local influence, in: International Conference on Artificial Intelligence and Statistics, Seville, Spain, May 2016.
https://hal.inria.fr/hal-01304020
[28] L. El Asri, R. Laroche, O. Pietquin.
Compact and Interpretable Dialogue State Representation with Genetic Sparse Distributed Memory, in: 7th International Workshop on Spoken Dialogue Systems (IWSDS 2016), Saariselka, Finland, January 2016.
https://hal.inria.fr/hal-01406873
[29] L. El Asri, B. Piot, M. Geist, R. Laroche, O. Pietquin.
Score-based Inverse Reinforcement Learning, in: International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2016), Singapore, Singapore, May 2016.
https://hal.inria.fr/hal-01406886
[30] A. Erraqabi, M. Valko, A. Carpentier, O.-A. Maillard.
Pliable rejection sampling, in: International Conference on Machine Learning, New York City, United States, June 2016.
https://hal.inria.fr/hal-01322168
[31] C. Z. Felício, K. V. R. Paixão, C. A. Z. Barcelos, P. Preux.
Preference-like Score to Cope with Cold-Start User in Recommender Systems, in: Proceedings of the IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI), San Jose, United States, November 2016.
https://hal.inria.fr/hal-01390762
[32] V. Gabillon, A. Lazaric, M. Ghavamzadeh, R. Ortner, P. Bartlett.
Improved Learning Complexity in Combinatorial Pure Exploration Bandits, in: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), Cadiz, Spain, May 2016.
https://hal.inria.fr/hal-01322198
[33] A. Garivier, E. Kaufmann.
Optimal Best Arm Identification with Fixed Confidence, in: 29th Annual Conference on Learning Theory (COLT), New York, United States, JMLR Workshop and Conference Proceedings, June 2016, vol. 49.
https://hal.archives-ouvertes.fr/hal-01273838
[34] A. Garivier, E. Kaufmann, W. M. Koolen.
Maximin Action Identification: A New Bandit Framework for Games, in: 29th Annual Conference on Learning Theory (COLT), New York, United States, JMLR Workshop and Conference Proceedings, June 2016, vol. 49.
https://hal.archives-ouvertes.fr/hal-01273842
[35] A. Garivier, E. Kaufmann, T. Lattimore.
On Explore-Then-Commit Strategies, in: NIPS, Barcelona, Spain, Advances in Neural Information Processing Systems (NIPS), December 2016, vol. 29.
https://hal.archives-ouvertes.fr/hal-01322906
[36] H. Glaude, O. Pietquin.
PAC learning of Probabilistic Automaton based on the Method of Moments, in: International Conference on Machine Learning (ICML 2016), New York, United States, June 2016.
https://hal.inria.fr/hal-01406889
[37] J.-B. Grill, M. Valko, R. Munos.
Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning, in: NIPS 2016 - Thirtieth Annual Conference on Neural Information Processing Systems, Barcelona, Spain, December 2016.
https://hal.inria.fr/hal-01389107
[38] F. Guillou, R. Gaudel, P. Preux.
Large-scale Bandit Recommender System, in: the 2nd International Workshop on Machine Learning, Optimization and Big Data (MOD'16), Volterra, Italy, August 2016.
https://hal.inria.fr/hal-01406389
[39] F. Guillou, R. Gaudel, P. Preux.
Scalable explore-exploit Collaborative Filtering, in: Pacific Asia Conference on Information Systems (PACIS'16), Chiayi, Taiwan, 2016.
https://hal.inria.fr/hal-01406418
[40] F. Guillou, R. Gaudel, P. Preux.
Sequential Collaborative Ranking Using (No-)Click Implicit Feedback, in: The 23rd International Conference on Neural Information Processing (ICONIP'16), Kyoto, Japan, Lecture Notes in Computer Science, October 2016, vol. 9948, pp. 288-296. [ DOI : 10.1007/978-3-319-46672-9_33 ]
https://hal.inria.fr/hal-01406338
[41] E. Kaufmann, T. Bonald, M. Lelarge.
A Spectral Algorithm with Additive Clustering for the Recovery of Overlapping Communities in Networks, in: ALT 2016 - Algorithmic Learning Theory, Bari, Italy, R. Ortner, H. U. Simon, S. Zilles (editors), Lecture Notes in Computer Science, Springer, October 2016, vol. 9925, pp. 355-370. [ DOI : 10.1007/978-3-319-46379-7_24 ]
https://hal.archives-ouvertes.fr/hal-01163147
[42] T. Kocák, G. Neu, M. Valko.
Online learning with Erdős-Rényi side-observation graphs, in: Uncertainty in Artificial Intelligence, New York City, United States, June 2016.
https://hal.inria.fr/hal-01320588
[43] T. Kocák, G. Neu, M. Valko.
Online learning with noisy side observations, in: International Conference on Artificial Intelligence and Statistics, Seville, Spain, May 2016.
https://hal.inria.fr/hal-01303377
[44] V. Musco, A. Carette, M. Monperrus, P. Preux.
A Learning Algorithm for Change Impact Prediction, in: 5th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering, Austin, United States, May 2016.
https://hal.inria.fr/hal-01279620
[45] V. Musco, M. Monperrus, P. Preux.
Mutation-Based Graph Inference for Fault Localization, in: International Working Conference on Source Code Analysis and Manipulation, Raleigh, United States, October 2016.
https://hal.inria.fr/hal-01350515
[46] J. Pérolat, B. Piot, M. Geist, B. Scherrer, O. Pietquin.
Softened Approximate Policy Iteration for Markov Games, in: ICML 2016 - 33rd International Conference on Machine Learning, New York City, United States, June 2016.
https://hal.inria.fr/hal-01393328
[47] J. Pérolat, B. Piot, B. Scherrer, O. Pietquin.
On the Use of Non-Stationary Strategies for Solving Two-Player Zero-Sum Markov Games, in: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016), Cadiz, Spain, May 2016.
https://hal.inria.fr/hal-01291495
[48] D. Ryabko.
Things Bayes can't do, in: Proceedings of the 27th International Conference on Algorithmic Learning Theory (ALT'16), Bari, Italy, Lecture Notes in Computer Science, October 2016, vol. 9925, pp. 253-260. [ DOI : 10.1007/978-3-319-46379-7_17 ]
https://hal.inria.fr/hal-01380063
[49] F. Strub, R. Gaudel, J. Mary.
Hybrid Recommender System based on Autoencoders, in: the 1st Workshop on Deep Learning for Recommender Systems, Boston, United States, September 2016, pp. 11-16. [ DOI : 10.1145/2988450.2988456 ]
https://hal.inria.fr/hal-01336912
[50] A. C. Y. Tossou, C. Dimitrakakis.
Algorithms for Differentially Private Multi-Armed Bandits, in: AAAI 2016, Phoenix, Arizona, United States, February 2016.
https://hal.inria.fr/hal-01234427
[51] Z. Zhang, B. Rubinstein, C. Dimitrakakis.
On the Differential Privacy of Bayesian Inference, in: AAAI 2016, Phoenix, Arizona, United States, February 2016.
https://hal.inria.fr/hal-01234215
Conferences without Proceedings
[52] A. Bérard, C. Servan, O. Pietquin, L. Besacier.
MultiVec: a Multilingual and Multilevel Representation Learning Toolkit for NLP, in: The 10th edition of the Language Resources and Evaluation Conference (LREC), Portoroz, Slovenia, May 2016.
https://hal.archives-ouvertes.fr/hal-01335930
[53] F. Guillou, R. Gaudel, P. Preux.
Compromis exploration-exploitation pour système de recommandation à grande échelle, in: Conférence francophone sur l'Apprentissage Automatique (CAp'16), Marseille, France, July 2016.
https://hal.inria.fr/hal-01406439
[54] F. Strub, J. Mary, R. Gaudel.
Filtrage Collaboratif Hybride avec des Auto-encodeurs, in: Conférence francophone sur l'Apprentissage Automatique (CAp'16), Marseille, France, July 2016.
https://hal.inria.fr/hal-01406432
Internal Reports
[55] B. Danglot, P. Preux, B. Baudry, M. Monperrus.
Correctness Attraction: A Study of Stability of Software Behavior Under Runtime Perturbation, HAL, 2016, no hal-01378523.
https://hal.archives-ouvertes.fr/hal-01378523
Other Publications
[56] S. Bubeck, R. Eldan, J. Lehec.
Sampling from a log-concave distribution with Projected Langevin Monte Carlo, January 2017, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01428950
[57] C. Dimitrakakis, F. Jarboui, D. Parkes, L. Seeman.
Multi-view Sequential Games: The Helper-Agent Problem, December 2016, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01408294
[58] E. Kaufmann.
On Bayesian index policies for sequential resource allocation, September 2016, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01251606
[59] A. R. Luedtke, E. Kaufmann, A. Chambaz.
Asymptotically Optimal Algorithms for Multiple Play Bandits with Partial Feedback, June 2016, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01338733
[60] F. Strub, J. Mary, R. Gaudel.
Hybrid Collaborative Filtering with Autoencoders, July 2016, working paper or preprint.
https://hal.archives-ouvertes.fr/hal-01281794
References in notes
[61] P. Auer, N. Cesa-Bianchi, P. Fischer.
Finite-time analysis of the multi-armed bandit problem, in: Machine Learning, 2002, vol. 47, no 2/3, pp. 235-256.
[62] R. Bellman.
Dynamic Programming, Princeton University Press, 1957.
[63] D. Bertsekas, S. Shreve.
Stochastic Optimal Control (The Discrete Time Case), Academic Press, New York, 1978.
[64] D. Bertsekas, J. Tsitsiklis.
Neuro-Dynamic Programming, Athena Scientific, 1996.
[65] M. Puterman.
Markov Decision Processes: Discrete Stochastic Dynamic Programming, John Wiley and Sons, 1994.
[66] H. Robbins.
Some aspects of the sequential design of experiments, in: Bull. Amer. Math. Soc., 1952, vol. 58, pp. 527-535.
[67] R. Sutton, A. Barto.
Reinforcement learning: an introduction, MIT Press, 1998.
[68] P. Werbos.
ADP: Goals, Opportunities and Principles, in: Handbook of Learning and Approximate Dynamic Programming, IEEE Press, 2004, pp. 3-44.