Archive 2019
Requirements: Optimisation M1, Statistique M1
Program requirements: continuous assessment (CC) + exam
Teacher: Stéphane Gaïffas
Weekly hours: 3 h lectures (CM), 2 h tutorials (TD)
Years: M2 Data Science (opening in 2020)

Syllabus

Presentation of supervised and unsupervised learning methods: from generative models to neural networks, by way of tree-based techniques.

Contents

  1. Introduction to supervised learning (3 weeks). Binary classification and regression, standard metrics and recipes (overfitting, cross-validation), LDA / QDA, logistic regression, Generalized Linear Models, regularization (Ridge, Lasso), Support Vector Machines and the hinge loss. Kernel methods. Decision trees, CART, Boosting. (A first illustrative sketch follows this list.)
  2. Optimization for Machine Learning (2 weeks). Proximal gradient descent, coordinate descent / coordinate gradient descent, Quasi-Newton methods, stochastic gradient descent and beyond. (A second sketch after this list illustrates proximal gradient descent.)
  3. Neural Networks (2 weeks). Introduction to neural networks. The perceptron, multilayer neural networks, deep learning. Adaptive-rate stochastic gradient descent, back-propagation. Convolutional neural networks.
  4. Unsupervised learning (2 sessions). Gaussian mixtures and EM, Matrix Factorization, Non-negative Matrix Factorization, Factorization Machines, embedding methods.
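
As a quick illustration of the recipes listed in item 1 (regularization and cross-validation for binary classification), here is a minimal sketch in Python with scikit-learn; the synthetic dataset and parameter values (C=1.0, 5 folds) are illustrative assumptions, not course material.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Synthetic binary classification problem (illustrative only)
    X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)

    # Ridge-penalized (L2) logistic regression after feature standardization
    clf = make_pipeline(StandardScaler(), LogisticRegression(penalty="l2", C=1.0, max_iter=1000))

    # 5-fold cross-validated accuracy: the standard recipe for detecting overfitting
    scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
    print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")

Lowering C strengthens the L2 penalty; switching to penalty="l1" (with a compatible solver such as "liblinear") gives the Lasso-style sparsity discussed in the same item.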
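
Similarly, here is a minimal sketch (assuming NumPy only) of proximal gradient descent from item 2, in the form of the ISTA iteration for the Lasso; the function names lasso_ista and soft_threshold are hypothetical, chosen for this example.

    import numpy as np

    def soft_threshold(x, t):
        # Proximal operator of t * ||.||_1 (soft-thresholding)
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    def lasso_ista(X, y, lam, n_iter=500):
        # Proximal gradient descent (ISTA) for
        #   min_w  (1 / (2n)) * ||y - X w||^2 + lam * ||w||_1
        n, d = X.shape
        w = np.zeros(d)
        # Step size 1/L, with L the Lipschitz constant of the smooth part
        L = np.linalg.norm(X, 2) ** 2 / n
        step = 1.0 / L
        for _ in range(n_iter):
            grad = X.T @ (X @ w - y) / n                      # gradient of the smooth part
            w = soft_threshold(w - step * grad, step * lam)   # proximal step on the L1 part
        return w

    # Tiny usage example on synthetic sparse data
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 20))
    w_true = np.zeros(20)
    w_true[:3] = [2.0, -1.5, 1.0]
    y = X @ w_true + 0.1 * rng.standard_normal(100)
    w_hat = lasso_ista(X, y, lam=0.1)

Each iteration takes a gradient step on the smooth least-squares term and then applies the proximal operator of the L1 penalty, which is exactly the soft-thresholding that produces sparse coefficients.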

Bibliography

  • Murphy, K.P. (2012). Machine Learning: A Probabilistic Perspective. MIT Press.
  • Mohri, M., Rostamizadeh, A., and Talwalkar, A. (2012). Foundations of Machine Learning. MIT Press.
  • Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press.
  • McKinney, W. (2012). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly.
  • Bühlmann, P., and van de Geer, S. (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer-Verlag.