DS5220: Supervised Machine Learning and Learning Theory

Course overview

This course is designed to introduce students to the field of machine learning, an essential toolset for making sense of the vast and complex datasets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years.

This class will present a number of important modeling and prediction techniques that are staples in the fields of machine learning, artificial intelligence, and data science. In addition, this course will cover the statistical underpinnings of the methodology. The tentative list of topics is given in the syllabus below.

Course syllabus

Week 1, Sep 6: Introduction

Week 2, Sep 10: Linear regression and estimation, Sep 13: Bias-variance tradeoff; K-nearest neighbors

Week 3, Sep 17: Logistic regression and linear discriminant analysis, Sep 20: LDA and QDA

Week 4, Sep 24: Cross validation, bootstrap, and subset selection, Sep 27: Ridge regression and LASSO

Week 5, Oct 1: Decision trees

Week 6, Oct 8: Random forests, boosting, Oct 11: Introduction to neural networks

Week 7, Oct 15: Convolutional neural networks, Oct 18: Implementations of neural networks in PyTorch

Week 8, Oct 22: Backpropagation, Oct 25: Backpropagation derivation (vanishing gradients)

Week 9: Class project proposal presentations

Week 10, Nov 5: Midterm review

Week 11, Nov 12: Foundation models I, Nov 15: Foundation models II

Week 12, Nov 19: Dimension reduction I, Nov 22: Dimension reduction II

Week 13, Nov 26: A gentle introduction to cause and effect

Week 14, Dec 3: Introduction to probabilistic diffusion models / Conclusion

Course information

You are responsible for keeping up with all announcements made in class and for all changes in the schedule that are posted on the Canvas website.

The grade will be based on the following:

Textbooks for reference: