Course on Reinforcement Learning


Lecture 0: Introduction to the Course


This course introduces the models and mathematical tools used to formalize the problem of learning and decision-making under uncertainty. In particular, we focus on the frameworks of reinforcement learning and multi-armed bandits. The main topics studied during the course are:

-Historical multi-disciplinary basis of reinforcement learning

-Markov decision processes and dynamic programming

-Stochastic approximation and Monte-Carlo methods

-Function approximation and statistical learning theory

-Approximate dynamic programming

-Introduction to stochastic and adversarial multi-armed bandits

-Learning rates and finite-sample analysis

Where and When

The course on “Reinforcement Learning” will be held at the Department of Mathematics at ENS Cachan, on Tuesdays from 11:00 to 13:00, from October 4th to December 13th, following the schedule below.


  1. 04/10 -- Markov Decision Processes [Salle Condorcet, d’Alembert]

  2. 11/10 -- Dynamic Programming [Salle Condorcet, d’Alembert]

  3. 18/10 -- Reinforcement Learning [Salle Condorcet, d’Alembert]

  4. 25/10 -- Practical session on Dynamic Programming and Reinforcement Learning [Salle Condorcet, d’Alembert]

  5. 08/11 -- Multi-armed Bandit (1) [Salle Condorcet, d’Alembert]

  6. 15/11 -- Practical session on Multi-armed Bandit [Amphi Curie, d’Alembert]

  7. 22/11 -- Multi-armed Bandit (2) [Salle Condorcet, d’Alembert]

  8. 29/11 -- Practical session on ADP [Salle Condorcet, d’Alembert]

  9. 13/12 -- Approximate Dynamic Programming AND Practical session on ADP [Salle Condorcet, d’Alembert]

  1. 10/01/2017 -- Deadline for report submission

  2. 17/01/2017 -- Presentations


Evaluation is based on the points collected in the practical sessions and on a final project. Project proposals, internships, and PhD positions will be announced towards mid-November.


** Slides will be uploaded before each class (otherwise see slides from last year).

** The lecture notes are somewhat outdated; if you want to consult them, refer to the material from last year.


  1. Project proposals are available at

  2. Report submission deadline: 10 January 2017

  3. Presentation day: 17 January 2017

  4. Evaluation: 2 points per practical session (TP) and 12 points for the project

Lecture 1: A Bit of History

Lecture 2: MDP and Dynamic Programming

Lecture 3: Reinforcement Learning Algorithms

Lecture 4: The Multi-Armed Bandit Framework

Lecture 5: Approximate Dynamic Programming