Computational Statistics and Optimization

Syllabus

Ce cours a pour but de décrire les outils mathématiques nécessaires à l’optimisation de modèles d’apprentissage statistique (étape dite d'”entraînement” du modèle). Il donnera les bases mathématiques de l’optimisation convexe, et décrira les différentes approches utilisées pour la construction d’algorithme d’optimisation convexes efficaces : algorithmes du premier et second ordre batch, algorithmes stochastique et algorithmes distribués. Des séances de TP permettront la mise en oeuvre de ces algorithmes sur des données réelles.

Plan du cours

Partie 1 [21h cours + 3h TP]

Introduction : modèles de base, Analyse convexe de base : sous-gradient, Fenchel, dualité de Lagrange, gap de dualité, KKT [12h cours, 2/10, 9/10, 16/10, 23/10 : 10h15 - 13h30, salle PC18 à l’X, enseignant : Anne Sabourin]
- Course 1: Introduction – projection – separating hyperplanes – subdifferential
- Course 2: Subdifferentials – lsc functions – Fenchel-Legendre transform – biconjugation
- Course 3: Subdifferential calculus rules, Introduction to Lagrangian duality
- Course material (updated 18/10/25) :
  - Mixed version ( French, then English) : LnotesCvxAn_FrenchEnglish
  - English version : Lnotes_CvxAn_FullEn

Méthodes de 1er ordre batch : gradient (gradient, conjugué, coordinate descent, line search, accélération Nesterov) [3h cours, 27/10, 9h – 12h15, salle B551 à Telecom ParisTech (rue Barrault), enseignant : Joseph Salmon [slides]
Méthodes proximales : motivation des pénalités / overfitting, exemples d’opérateurs, Fista. [6h cours + 3h TP, enseignants : Joseph Salmon et Stéphane Gaïffas]
- 6 novembre 10h15 – 13h30, salle PC18 à l’X [slides]
- 13 novembre 10h15 – 13h30, salle PC18 à l’X [slides]
- 21 novembre 14h – 17h, salle C128 à Telecom, séance de TP [notebook]

Validation d’acquis 1 : examen écrit

27 novembre 10h15 – 13h30, amphithéâtre Grenat à Telecom

Partie 2 [21h cours]

Practical optimization : optimizing with the intercept, solving ridge regression with dense or sparse data, in the primal or in the dual, conjugate gradient and the use of warm start [3h cours, 4/12, 10h15 - 13h30, salle PC18 à l’X, enseignant : Alexandre Gramfort] [slides + notes_cq + notebooks.zip]
Second-order batch methods : from Newton, to variable metric, quasi-Newton methods (BFGS, L-BGFS), Gauss-Newton and Levenberg-Marquardt [3h cours, 11/12, 10h15 - 13h30, salle PC18 à l’X, enseignant : Alexandre Gramfort] [notes_quasi_newton.pdf]
Algorithmes stochastiques : SGD et versions plus récentes [6h cours, 18/12, 8/01, 10h15 - 13h30, salle PC18 à l’X, enseignant : Stéphane Gaïffas], slides partie 1
Algorithmes primo-duaux : compléments sur la dualité. ADMM, optimisation distribuée [9h cours, 14/01, 22/01, 29/01, 10h15 - 13h30, salle PC18 à l’X, enseignant : Pascal Bianchi]
Validation finale du cours : examen écrit [5/02, 10h15 - 13h30, salle PC18 à l’X]

Projet en binôme

Le sujet et les données sont disponibles dans ce notebook ipython : projet.ipynb

Evaluation

Examen écrit pour la partie 1 le 27/11 + comptes rendus de TP à faire en binôme
Examen écrit pour la partie 2 le 5/02 + Devoir à la maison d’implémentation

Pré-requis

Résolution de systèmes linéaires
Optimisation sous contraintes

Réferences bibliographiques

Livres de référence

Boyd and Vandenberghe, Convex optimization, Cambridge university press (2009)
Bauschke and Combettes, Convex analysis and monotone operator theory in Hilbert spaces, Springer (2011)
Hiriart-Urruty and Lemaréchal, Convex Analysis and Minimization Algorithms, Springer (1996)
Borwein and Lewis, Convex analysis and nonlinear optimization: theory and examples, Springer (2010)

Articles d’état de l’art

Boyd et al., Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, Fondation and Trends in Machine Learning (2010)
Boyd and Parikh, Proximal Algorithms, Fondation and Trends in Optimization (2013)
Bach et al, Optimization with Sparsity-Inducing Penalties, Fondation and Trends in Machine Learning (2011)

Data Sciences in the Mathematics with Applications track

(former - Mathbigdata)

Computational Statistics and Optimisation