Bibliography

Publications of the year

Doctoral Dissertations and Habilitation Theses

[1] A. Sani.

Machine Learning for Decision Making, Université de Lille 1, May 2015.

https://tel.archives-ouvertes.fr/tel-01256178
[2] M. Soare.

Sequential Resource Allocation in Linear Stochastic Bandits, Université Lille 1 - Sciences et Technologies, December 2015.

https://hal.archives-ouvertes.fr/tel-01249224

Articles in International Peer-Reviewed Journals

[3] T. Collet, O. Pietquin.

Optimism in Active Learning, in: Computational Intelligence and Neuroscience, August 2015.

https://hal.inria.fr/hal-01225798
[4] L. Devroye, G. Lugosi, G. Neu.

Random-Walk Perturbations for Online Combinatorial Optimization, in: IEEE Transactions on Information Theory, June 2015, vol. 61, no. 7, pp. 4099-4106. [ DOI : 10.1109/TIT.2015.2428253 ]

https://hal.inria.fr/hal-01214987
[5] N. Gatti, A. Lazaric, M. Rocco, F. Trovò.

Truthful Learning Mechanisms for Multi-Slot Sponsored Search Auctions with Externalities, in: Artificial Intelligence, October 2015, vol. 227, pp. 93-139.

https://hal.inria.fr/hal-01237670
[6] H. Kadri, E. Duflos, P. Preux, S. Canu, A. Rakotomamonjy, J. Audiffren.

Operator-valued Kernels for Learning from Functional Response Data, in: Journal of Machine Learning Research (JMLR), 2015.

https://hal.archives-ouvertes.fr/hal-01221329
[7] A. Khaleghi, D. Ryabko.

Nonparametric multiple change point estimation in highly dependent time series, in: Theoretical Computer Science, November 2015. [ DOI : 10.1016/j.tcs.2015.10.041 ]

https://hal.inria.fr/hal-01235330
[8] B. Scherrer, M. Ghavamzadeh, V. Gabillon, B. Lesner, M. Geist.

Approximate Modified Policy Iteration and its Application to the Game of Tetris, in: Journal of Machine Learning Research, 2015, vol. 16, pp. 1629-1676, to appear.

https://hal.inria.fr/hal-01091341

International Conferences with Proceedings

[9] J. Audiffren, M. Valko, A. Lazaric, M. Ghavamzadeh.

Maximum Entropy Semi-Supervised Inverse Reinforcement Learning, in: International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, July 2015.

https://hal.inria.fr/hal-01146187
[10] M. Barlier, J. Perolat, R. Laroche, O. Pietquin.

Human-Machine Dialogue as a Stochastic Game, in: 16th Annual SIGdial Meeting on Discourse and Dialogue (SIGDIAL 2015), Prague, Czech Republic, September 2015.

https://hal.inria.fr/hal-01225848
[11] A. Carpentier, M. Valko.

Simple regret for infinitely many armed bandits, in: International Conference on Machine Learning, Lille, France, July 2015.

https://hal.inria.fr/hal-01153538
[12] J. Chemali, A. Lazaric.

Direct Policy Iteration with Demonstrations, in: IJCAI - 24th International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, July 2015.

https://hal.inria.fr/hal-01237659
[13] T. Collet, O. Pietquin.

Bayesian Credible Intervals for Online and Active Learning of Classification Trees, in: ADPRL 2015 - Symposium on Adaptive Dynamic Programming and Reinforcement Learning, Cape Town, South Africa, Proceedings of the Symposium Series on Computational Intelligence, IEEE, December 2015.

https://hal.inria.fr/hal-01225850
[14] T. Collet, O. Pietquin.

Optimism in Active Learning with Gaussian Processes, in: 22nd International Conference on Neural Information Processing (ICONIP2015), Istanbul, Turkey, November 2015.

https://hal.inria.fr/hal-01225826
[15] B. Derbel, P. Preux.

Simultaneous Optimistic Optimization on the Noiseless BBOB Testbed, in: The 17th IEEE Congress on Evolutionary Computation (CEC), Sendai, Japan, May 2015.

https://hal.inria.fr/hal-01246420
[16] C. Dhanjal, R. Gaudel, S. Clémençon.

Collaborative Filtering with Localised Ranking, in: Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI'15), Austin, United States, January 2015, 7 p.

https://hal.inria.fr/hal-01255890
[17] H. Glaude, C. Enderli, J.-F. Grandin, O. Pietquin.

Learning of scanning strategies for electronic support using predictive state representations, in: International Workshop on Machine Learning for Signal Processing (MLSP 2015), Boston, United States, September 2015.

https://hal.inria.fr/hal-01225807
[18] H. Glaude, C. Enderli, O. Pietquin.

Non-negative Spectral Learning for Linear Sequential Systems, in: 22nd International Conference on Neural Information Processing (ICONIP2015), Istanbul, Turkey, November 2015.

https://hal.inria.fr/hal-01225838
[19] H. Glaude, C. Enderli, O. Pietquin.

Spectral learning with proper probabilities for finite state automation, in: ASRU 2015 - Automatic Speech Recognition and Understanding Workshop, Scottsdale, United States, Proceedings of the Automatic Speech Recognition and Understanding Workshop, IEEE, December 2015.

https://hal.inria.fr/hal-01225810
[20] J.-B. Grill, M. Valko, R. Munos.

Black-box optimization of noisy functions with unknown smoothness, in: Neural Information Processing Systems, Montréal, Canada, December 2015.

https://hal.inria.fr/hal-01222915
[21] M. K. H. Hanawal, V. Saligrama, M. Valko, R. Munos.

Cheap Bandits, in: International Conference on Machine Learning, Lille, France, 2015.

https://hal.inria.fr/hal-01153540
[22] K. Lakshmanan, R. Ortner, D. Ryabko.

Improved Regret Bounds for Undiscounted Continuous Reinforcement Learning, in: International Conference on Machine Learning (ICML), Lille, France, July 2015.

https://hal.inria.fr/hal-01165966
[23] J. Mary, R. Gaudel, P. Preux.

Bandits and Recommender Systems, in: First International Workshop on Machine Learning, Optimization, and Big Data (MOD'15), Taormina, Italy, Lecture Notes in Computer Science, Springer International Publishing, July 2015, vol. 9432, pp. 325-336. [ DOI : 10.1007/978-3-319-27926-8_29 ]

https://hal.inria.fr/hal-01256033
[24] T. Munzer, B. Piot, M. Geist, O. Pietquin, M. Lopes.

Inverse Reinforcement Learning in Relational Domains, in: International Joint Conferences on Artificial Intelligence, Buenos Aires, Argentina, July 2015.

https://hal.archives-ouvertes.fr/hal-01154650
[25] V. Musco, M. Monperrus, P. Preux.

An Experimental Protocol for Analyzing the Accuracy of Software Error Impact Analysis, in: Tenth IEEE/ACM International Workshop on Automation of Software Test, Florence, Italy, May 2015.

https://hal.inria.fr/hal-01120913
[26] G. Neu.

Explore no more: Improved high-probability regret bounds for non-stochastic bandits, in: Advances in Neural Information Processing Systems 28 (NIPS 2015), Montreal, Canada, December 2015, pp. 3150-3158.

https://hal.inria.fr/hal-01223501
[27] G. Neu.

First-order regret bounds for combinatorial semi-bandits, in: Proceedings of the 28th Annual Conference on Learning Theory (COLT), Paris, France, JMLR Workshop and Conference Proceedings, July 2015, vol. 40, pp. 1360-1375.

https://hal.inria.fr/hal-01215001
[28] J. Perolat, B. Scherrer, B. Piot, O. Pietquin.

Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games, in: International Conference on Machine Learning (ICML 2015), Lille, France, July 2015.

https://hal.inria.fr/hal-01153270
[29] B. Piot, M. Geist, O. Pietquin.

Imitation Learning Applied to Embodied Conversational Agents, in: 4th Workshop on Machine Learning for Interactive Systems (MLIS 2015), Lille, France, JMLR Workshop and Conference Proceedings, July 2015, vol. 43.

https://hal.inria.fr/hal-01225816
[30] D. Ryabko, B. Ryabko.

Predicting the outcomes of every process for which an asymptotically accurate stationary predictor exists is impossible, in: International Symposium on Information Theory, Hong Kong, Hong Kong SAR China, IEEE, June 2015, pp. 1204-1206.

https://hal.inria.fr/hal-01165876
[31] A. Sani, A. Lazaric, D. Ryabko.

The Replacement Bootstrap for Dependent Data, in: Proceedings of the IEEE International Symposium on Information Theory, Hong Kong, Hong Kong SAR China, June 2015.

https://hal.inria.fr/hal-01144547
[32] B. Szorenyi, R. Busa-Fekete, P. Weng, E. Hüllermeier.

Qualitative Multi-Armed Bandits: A Quantile-Based Approach, in: Proceedings of the 32nd International Conference on Machine Learning, Lille, France, July 2015, pp. 1660-1668.

https://hal.inria.fr/hal-01204708
[33] A. C. Y. Tossou, C. Dimitrakakis.

Algorithms for Differentially Private Multi-Armed Bandits, in: AAAI 2016, Phoenix, Arizona, United States, February 2016.

https://hal.inria.fr/hal-01234427
[34] Z. Zhang, B. Rubinstein, C. Dimitrakakis.

On the Differential Privacy of Bayesian Inference, in: AAAI 2016, Phoenix, Arizona, United States, February 2016.

https://hal.inria.fr/hal-01234215

Conferences without Proceedings

[35] F. Guillou, R. Gaudel, P. Preux.

Collaborative Filtering as a Multi-Armed Bandit, in: NIPS'15 Workshop: Machine Learning for eCommerce, Montréal, Canada, December 2015.

https://hal.inria.fr/hal-01256254
[36] F. Strub, J. Mary.

Collaborative Filtering with Stacked Denoising AutoEncoders and Sparse Inputs, in: NIPS Workshop on Machine Learning for eCommerce, Montreal, Canada, December 2015.

https://hal.inria.fr/hal-01256422

Scientific Books (or Scientific Book chapters)

[37] A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits, 2015, vol. 37, pp. 218-227.

https://hal.inria.fr/hal-01225614

Scientific Popularization

[38] P. Philippe, M. Tommasi, T. Vieville, C. De La Higuera.

L’apprentissage automatique : le diable n’est pas dans l’algorithme, June 2015, article published on http://binaire.blog.lemonde.fr.

https://hal.inria.fr/hal-01246178

Other Publications

[39] C. Dhanjal, R. Gaudel, S. Clémençon.

AUC Optimisation and Collaborative Filtering, August 2015, working paper or preprint.

https://hal.archives-ouvertes.fr/hal-01185836
[40] E. Kaufmann.

On Bayesian index policies for sequential resource allocation, January 2016, working paper or preprint.

https://hal.archives-ouvertes.fr/hal-01251606
[41] V. Musco, A. Carette, M. Monperrus, P. Preux.

A Learning Algorithm for Change Impact Prediction: Experimentation on 7 Java Applications, December 2015, working paper or preprint.

https://hal.archives-ouvertes.fr/hal-01248241

References in notes

[42] P. Auer, N. Cesa-Bianchi, P. Fischer.

Finite-time analysis of the multi-armed bandit problem, in: Machine Learning, 2002, vol. 47, no. 2/3, pp. 235-256.
[43] R. Bellman.

Dynamic Programming, Princeton University Press, 1957.
[44] D. Bertsekas, S. Shreve.

Stochastic Optimal Control (The Discrete Time Case), Academic Press, New York, 1978.
[45] D. Bertsekas, J. Tsitsiklis.

Neuro-Dynamic Programming, Athena Scientific, 1996.
[46] M. Puterman.

Markov Decision Processes: Discrete Stochastic Dynamic Programming, John Wiley and Sons, 1994.
[47] H. Robbins.

Some aspects of the sequential design of experiments, in: Bull. Amer. Math. Soc., 1952, vol. 58, pp. 527-535.
[48] R. Sutton, A. Barto.

Reinforcement learning: an introduction, MIT Press, 1998.
[49] P. Werbos.

ADP: Goals, Opportunities and Principles, in: Handbook of Learning and Approximate Dynamic Programming, IEEE Press, 2004, pp. 3-44.
