This course introduces statistical learning, which provides the theoretical underpinnings and foundations of machine learning (and of artificial intelligence in general). This is a second course in machine learning, and we assume that you have already taken an introductory machine learning class (such as CS 6140, DS 5220, or DS 4400). The course will involve a mix of materials from different subjects such as learning theory, statistics, neural networks and deep learning, information theory, and reinforcement learning.
We will mostly draw on mathematical analysis to rigorously characterize the behavior of machine learning models and algorithms (though we will emphasize their practical implications throughout the course).
Prerequisites
Students are expected to be familiar with basic calculus and linear algebra, and to be comfortable reading and writing proofs.
Prior knowledge of probability.
Having taken an introductory machine learning class.
Week 1, Jan 6: Overview; Jan 8: Uniform convergence
What is the course about
Basic setup of supervised learning, empirical risk minimization, and uniform convergence (sketched after this week's topics)
Basic setup of neural networks
Statistical transfer learning
Learning finite, realizable hypothesis spaces
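As a preview of the Week 1 setup, here is one standard way to write empirical risk minimization and the uniform convergence requirement; the notation below is a common convention, not necessarily the one used in lecture.

$$L(h) = \mathbb{E}_{(x,y) \sim P}\left[\ell(h(x), y)\right], \qquad \hat{L}(h) = \frac{1}{n} \sum_{i=1}^{n} \ell(h(x_i), y_i), \qquad \hat{h} = \arg\min_{h \in \mathcal{H}} \hat{L}(h)$$

Uniform convergence asks that $\sup_{h \in \mathcal{H}} |L(h) - \hat{L}(h)|$ be small with high probability; when it holds, the excess risk of $\hat{h}$ over the best hypothesis in $\mathcal{H}$ is at most twice that supremum.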
Week 2, Jan 13: Concentration estimates; Jan 15: Rademacher complexity
Markov's inequality, Chebyshev's inequality, and the Chernoff bound (stated below)
Moment generating function
Sub-Gaussian random variables
Rademacher complexity (definition and properties)
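For reference, standard statements of the Week 2 tail bounds and the (empirical) Rademacher complexity; constants and conventions may differ slightly from lecture.

$$\Pr[X \ge a] \le \frac{\mathbb{E}[X]}{a} \;\; (X \ge 0,\ a > 0), \qquad \Pr\left[|X - \mathbb{E}[X]| \ge a\right] \le \frac{\mathrm{Var}(X)}{a^2}, \qquad \Pr[X \ge a] \le e^{-\lambda a}\, \mathbb{E}\left[e^{\lambda X}\right] \;\; (\lambda > 0)$$

$$\hat{\mathfrak{R}}_n(\mathcal{F}) = \mathbb{E}_{\sigma}\left[\sup_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \sigma_i f(z_i)\right], \qquad \sigma_1, \dots, \sigma_n \text{ i.i.d. uniform on } \{\pm 1\}$$

A random variable $X$ is $\sigma^2$-sub-Gaussian if $\mathbb{E}\left[e^{\lambda (X - \mathbb{E}[X])}\right] \le e^{\lambda^2 \sigma^2 / 2}$ for all $\lambda \in \mathbb{R}$; optimizing the Chernoff bound over $\lambda$ then gives Gaussian-like tails.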
Week 3, Jan 22: Examples of Rademacher complexity
Learning finite hypothesis classes
L2/L1-norm-constrained hypothesis classes (representative bounds for both below)
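The following are standard forms of the Week 3 bounds for a sample $z_1, \dots, z_n$ (respectively $x_1, \dots, x_n \in \mathbb{R}^d$); the exact constants in lecture may differ. For a finite class $\mathcal{F}$ with $|f(z_i)| \le c$, and for linear predictors with L2- or L1-bounded weights:

$$\hat{\mathfrak{R}}_n(\mathcal{F}) \le c \sqrt{\frac{2 \ln |\mathcal{F}|}{n}}, \qquad \hat{\mathfrak{R}}_n\left(\{x \mapsto \langle w, x \rangle : \|w\|_2 \le B\}\right) \le \frac{B \max_i \|x_i\|_2}{\sqrt{n}}$$

$$\hat{\mathfrak{R}}_n\left(\{x \mapsto \langle w, x \rangle : \|w\|_1 \le B\}\right) \le B \max_i \|x_i\|_\infty \sqrt{\frac{2 \ln(2d)}{n}}$$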
Week 4, Jan 27: Matrix completion
Wrapping up the proof of the Rademacher complexity-based generalization bound (statement recalled below)
Matrix completion
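For reference, the generalization bound in question, in a standard form for a class $\mathcal{F}$ of functions taking values in $[0, 1]$: with probability at least $1 - \delta$, for every $f \in \mathcal{F}$,

$$\mathbb{E}[f(z)] \le \frac{1}{n} \sum_{i=1}^{n} f(z_i) + 2\, \mathfrak{R}_n(\mathcal{F}) + \sqrt{\frac{\ln(1/\delta)}{2n}}$$

Matrix completion is commonly formalized as ERM over a nuclear-norm ball (one standard formulation; the lecture's exact setup may differ): given observed entries $\Omega$ of an unknown matrix $M$,

$$\hat{M} = \arg\min_{\|X\|_* \le B} \; \frac{1}{|\Omega|} \sum_{(i,j) \in \Omega} \left(X_{ij} - M_{ij}\right)^2,$$

where $\|\cdot\|_*$ is the nuclear (trace) norm; bounding the Rademacher complexity of the nuclear-norm ball yields the generalization guarantee.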
There will be three homeworks, worth a total of 40% of the overall grade. The homeworks should be done and submitted individually.
The course project consists of an in-class presentation, worth 40% of the total grade, and a final project report, worth 20% of the total grade.
There isn’t a single textbook that covers all of the lectures, though the following are good references for the course materials.
Statistical learning theory lecture notes, Percy Liang (Stanford)
Mathematical analysis of machine learning algorithms, Tong Zhang (UIUC)
Learning theory from first principles, Francis Bach (INRIA)