This course introduces students to the field of machine learning, an essential toolset for making sense of the vast and complex datasets that have emerged over the past twenty years in fields ranging from biology to finance to marketing to astrophysics.
This class will present a number of important modeling and prediction techniques that are staples in the fields of machine learning, artificial intelligence, and data science. In addition, this course will cover the statistical underpinnings of the methodology. The tentative list of topics includes:
Regression and classification as predictive tasks and general model fitting: a review of linear regression, cross-validation, the bootstrap, regularized regression (ridge, LASSO), and tree-based methods.
Neural networks and deep learning: convolutional neural networks, backpropagation, transformer neural networks, language modeling.
Causality, reasoning, inference: potential outcomes, inverse propensity weighting, matching, difference-in-differences.
Unsupervised learning: Principal component analysis, clustering.
Week 1, Sep 6: Introduction
Examples covering linear regression, image classification, and language modeling (naive Bayes).
Week 2, Sep 10: Linear regression and estimation, Sep 13: Bias-variance tradeoff; K-nearest neighbors
Simple linear regression and multiple linear regression, with some review of linear algebra.
The bias-variance tradeoff in supervised learning, learning polynomial functions, and K-nearest neighbors.
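To make the K-nearest-neighbors idea concrete, here is a minimal NumPy sketch of a KNN classifier; the toy data and the choice k=3 are illustrative assumptions, not course materials.

```python
# A minimal k-nearest-neighbors classifier, sketched with NumPy only.
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """Predict labels by majority vote among the k nearest training points."""
    preds = []
    for x in X_query:
        # Euclidean distances from the query point to every training point.
        dists = np.linalg.norm(X_train - x, axis=1)
        nearest = np.argsort(dists)[:k]            # indices of the k closest points
        preds.append(np.bincount(y_train[nearest]).argmax())  # majority label
    return np.array(preds)

# Toy usage: two Gaussian clusters in 2-D (illustrative data).
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[0, 0], scale=0.5, size=(20, 2))
X1 = rng.normal(loc=[1, 1], scale=0.5, size=(20, 2))
X_train = np.vstack([X0, X1])
y_train = np.array([0] * 20 + [1] * 20)
print(knn_predict(X_train, y_train, np.array([[0.9, 1.1]])))  # likely [1]
```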
Week 3, Sep 17: Logistic regression and linear discriminant analysis, Sep 20: LDA and QDA
Logistic function, logistic loss (log-loss), maximum likelihood estimation, and cross-entropy loss. See the notes on logistic regression using gradient descent; a minimal sketch appears after this list.
Mixture of Gaussians, estimation of linear discriminant analysis (LDA).
Quadratic discriminant analysis, estimation of QDA.
Logistic regression vs. LDA vs. QDA.
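As a companion to the notes mentioned above, here is a minimal NumPy sketch of logistic regression fit by batch gradient descent; the learning rate, iteration count, and toy data are illustrative assumptions.

```python
# Logistic regression via batch gradient descent on the average log-loss.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iters=1000):
    """X: (n, d) design matrix (include a column of ones for an intercept);
    y: (n,) labels in {0, 1}."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        p = sigmoid(X @ w)          # predicted probabilities
        grad = X.T @ (p - y) / n    # gradient of the average log-loss
        w -= lr * grad
    return w

# Toy usage with an intercept column (illustrative data).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
Xb = np.hstack([np.ones((100, 1)), X])
print(fit_logistic(Xb, y))  # intercept near 0, positive feature weights
```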
Week 4, Sep 24: Cross validation, bootstrap, and subset selection, Sep 27: Ridge regression and LASSO
Leave-one-out cross validation, k-fold cross validation.
Bootstrap.
Forward subset selection.
Regularization: Ridge regression, LASSO.
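The following sketch ties the week's two threads together: ridge regression in closed form, with k-fold cross-validation used to compare penalty values. The lambda grid, fold count, and toy data are illustrative assumptions.

```python
# Ridge regression with k-fold cross-validation over the penalty lambda.
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: w = (X^T X + lam I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def kfold_mse(X, y, lam, k=5):
    """Average held-out mean squared error over k folds."""
    idx = np.arange(len(y))
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        w = ridge_fit(X[train], y[train], lam)
        errs.append(np.mean((X[fold] @ w - y[fold]) ** 2))
    return np.mean(errs)

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 10))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=80)
for lam in [0.01, 0.1, 1.0, 10.0]:
    print(lam, kfold_mse(X, y, lam))  # pick the lambda with lowest CV error
```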
Week 5, Oct 1: Decision trees
Regression tree; classification tree.
Bagging.
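Here is a minimal sketch of bagging for regression, using scikit-learn's DecisionTreeRegressor as the base learner; the number of trees and the toy data are illustrative assumptions.

```python
# Bagging: average many trees, each fit on a bootstrap sample of the data.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def bagged_trees(X, y, n_trees=50, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)  # bootstrap sample (with replacement)
        trees.append(DecisionTreeRegressor().fit(X[idx], y[idx]))
    return trees

def bagged_predict(trees, X):
    # Averaging the trees' predictions reduces variance.
    return np.mean([t.predict(X) for t in trees], axis=0)

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=200)
trees = bagged_trees(X, y)
print(bagged_predict(trees, np.array([[0.0], [1.5]])))  # near sin(0), sin(1.5)
```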
Week 6, Oct 8: Random forests, boosting, Oct 11: Introduction to neural networks
Cross-validation in bagging.
Random forests.
Gradient boosting.
The MNIST dataset of handwritten digits.
Artificial neuron (perceptron), activation functions, and feedforward neural networks.
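A minimal NumPy sketch of the forward pass of a one-hidden-layer feedforward network with ReLU activation follows; the layer sizes are illustrative assumptions.

```python
# Forward pass of a one-hidden-layer feedforward network.
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, W1, b1, W2, b2):
    """x: (d,) input; returns W2 @ relu(W1 @ x + b1) + b2."""
    h = relu(W1 @ x + b1)   # hidden representation
    return W2 @ h + b2      # output layer: raw scores, no activation

rng = np.random.default_rng(0)
d, hdim, out = 4, 8, 3      # illustrative layer sizes
W1 = rng.normal(scale=0.1, size=(hdim, d)); b1 = np.zeros(hdim)
W2 = rng.normal(scale=0.1, size=(out, hdim)); b2 = np.zeros(out)
print(forward(rng.normal(size=d), W1, b1, W2, b2))  # 3 raw class scores
```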
Week 7, Oct 15: Convolutional neural networks, Oct 18: Implementations of neural networks in PyTorch
Convolution layer, pooling, parameter sharing in convolution.
Implementation of a simple CNN in PyTorch for classifying the MNIST digits (see the sketch after this list).
The forward pass, and a PyTorch implementation of SGD.
PyTorch implementation of linear classifiers.
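Here is a minimal PyTorch sketch of a small CNN for 28x28 grayscale inputs and one SGD step on a fake batch; the layer sizes are illustrative assumptions, not necessarily the architecture used in class.

```python
# A small CNN for MNIST-shaped inputs, plus one SGD step on a fake batch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)   # 28x28 -> 28x28
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)  # 14x14 -> 14x14
        self.fc = nn.Linear(32 * 7 * 7, n_classes)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)  # -> (16, 14, 14)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)  # -> (32, 7, 7)
        return self.fc(x.flatten(1))                # raw class scores

model = SimpleCNN()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(8, 1, 28, 28)          # fake batch standing in for MNIST
y = torch.randint(0, 10, (8,))
loss = F.cross_entropy(model(x), y)
loss.backward()                        # backprop through the whole network
opt.step()                             # one SGD update
print(loss.item())
```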
Week 8, Oct 22: Backpropagation, Oct 25: Backpropagation derivation (vanishing gradients)
Computing the gradient of a two-layer ReLU network (a worked sketch follows this list).
Deriving the backward pass in a multi-layer linear network in one dimension.
The vanishing gradients problem.
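Here is the worked sketch referenced above: backpropagation through a two-layer ReLU network with a scalar output and squared loss (both illustrative simplifications), with the hand-derived gradients checked against torch.autograd.

```python
# Hand-derived backprop for f(x) = w2^T relu(W1 x), checked against autograd.
import torch

torch.manual_seed(0)
x = torch.randn(5)
W1 = torch.randn(4, 5, requires_grad=True)
w2 = torch.randn(4, requires_grad=True)
y = torch.tensor(1.0)

# Forward pass and autograd backward pass.
z = W1 @ x
h = torch.relu(z)
f = w2 @ h
loss = (f - y) ** 2
loss.backward()

# Hand-derived backward pass (chain rule):
#   dL/df  = 2 (f - y)
#   dL/dw2 = dL/df * h
#   dL/dz  = dL/df * w2 * 1[z > 0]   (ReLU gate)
#   dL/dW1 = dL/dz x^T
with torch.no_grad():
    df = 2 * (f - y)
    dw2 = df * h
    dz = df * w2 * (z > 0).float()
    dW1 = torch.outer(dz, x)
    print(torch.allclose(dw2, w2.grad), torch.allclose(dW1, W1.grad))  # True True
```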
Week 9: Class project proposal presentations.
Week 10, Nov 5: Midterm review
Week 11, Nov 12: Foundation models I, Nov 15: Foundation models II
Introduction to language models and GPT-3.
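To make the "predict the next token" idea concrete, here is a toy character-level bigram language model; the corpus is an illustrative assumption, and real models such as GPT-3 use transformers rather than bigram counts.

```python
# A toy character-level bigram language model built from raw counts.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat"   # illustrative toy corpus
counts = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    counts[a][b] += 1               # count each observed bigram (a -> b)

def next_char_probs(c):
    """P(next char | current char) estimated from bigram counts."""
    total = sum(counts[c].values())
    return {ch: n / total for ch, n in counts[c].items()}

print(next_char_probs("t"))         # {'h': 0.5, ' ': 0.5} on this corpus
```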
Week 12, Nov 19: Dimension reduction I, Nov 22: Dimension reduction II
Principal component analysis (a minimal sketch via the SVD appears after this list).
Singular value decomposition.
Clustering methods; contrastive learning.
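Here is the PCA sketch referenced above, computed via the SVD of the centered data matrix; the toy data and the choice of two components are illustrative assumptions.

```python
# PCA via the singular value decomposition of the centered data matrix.
import numpy as np

def pca(X, n_components=2):
    """Return the top principal directions, projected data, and variances."""
    Xc = X - X.mean(axis=0)                  # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]           # rows: principal directions
    scores = Xc @ components.T               # coordinates in the new basis
    explained_var = S[:n_components] ** 2 / (len(X) - 1)
    return components, scores, explained_var

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))  # correlated features
comps, scores, var = pca(X)
print(var)  # variance captured by each of the top two components
```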
Week 13, Nov 26: A gentle introduction to cause and effect
The potential outcomes framework.
Average treatment effect, difference-in-means (a minimal sketch follows this list).
Selection bias, randomized experiments, nearest neighbor matching estimators.
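Here is the sketch referenced above: two treatment-effect estimators on simulated data with a known true effect of 2; the data-generating process is an illustrative assumption.

```python
# Difference-in-means and 1-NN matching estimators of the ATE.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)                          # a covariate
t = rng.integers(0, 2, size=n)                  # randomized treatment
y = 2 * t + x + rng.normal(scale=0.5, size=n)   # outcome; true ATE = 2

# Difference-in-means (valid here because treatment is randomized).
ate_dim = y[t == 1].mean() - y[t == 0].mean()

# 1-NN matching: for each treated unit, find the control closest in x.
treated, control = np.where(t == 1)[0], np.where(t == 0)[0]
matches = control[np.abs(x[treated][:, None] - x[control][None, :]).argmin(axis=1)]
ate_match = (y[treated] - y[matches]).mean()

print(ate_dim, ate_match)  # both should be near the true effect of 2
```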
Week 14, Dec 3: Introduction to probabilistic diffusion models / Conclusion
The forward and reverse processes in diffusion models (the forward noising step is sketched after this list).
Illustration of DDPM, DDIM, and Stable Diffusion.
Conclusion: Summary of the course.
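For the forward noising step referenced above, here is a minimal DDPM-style sketch; the linear beta schedule and T = 1000 are common illustrative choices, not necessarily the course's exact setup.

```python
# Forward (noising) process of a DDPM-style diffusion model:
# x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)          # linear noise schedule
alpha_bars = torch.cumprod(1.0 - betas, dim=0)

def q_sample(x0, t):
    """Sample x_t from q(x_t | x_0) in closed form."""
    eps = torch.randn_like(x0)
    ab = alpha_bars[t]
    return ab.sqrt() * x0 + (1 - ab).sqrt() * eps

x0 = torch.randn(4, 3)              # a fake batch of "clean" data
print(q_sample(x0, t=999).std())    # near 1: almost pure noise at t = T-1
```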
You are responsible for keeping up with all announcements made in class and for all changes in the schedule that are posted on the Canvas website.
The grade will be based on the following:
Homeworks (5): 40%
Exam (take-home; choose any 24-hour window): 30%
Course project report (submitted on GitHub): 15%
Course project presentations (one proposal and one final presentation): 15%
Textbooks for reference:
An Introduction to Statistical Learning (ISL), by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani.
The Elements of Statistical Learning (ESL), by Trevor Hastie, Robert Tibshirani, and Jerome Friedman.