CS7180: Algorithmic and Statistical Aspects of Deep Learning
Course overview
Deep neural networks are increasingly used across a wide range of applications and have made significant strides in understanding data in many formats, from images to text and video. This wave of work has raised numerous new theoretical and methodological questions that depart from our traditional understanding. This course will explore some of the recent work from the machine learning community that addresses these questions. There will be two main directions:
The first direction concerns the challenge of understanding how complex neural networks work. Why do gradient-based methods work well for the highly non-convex loss functions of neural network models? How does a network with tens of millions of parameters generalize to unseen data when it is trained on an order of magnitude fewer labeled examples?
The second direction will explore emerging methods that address the challenge of acquiring labeled data for training neural networks. The labeling process is often both labor-intensive and costly (e.g., ImageNet). One approach is to learn from related tasks that have large amounts of labeled data available, using multi-task and transfer learning. Another approach is to automate the labeling process through pre-specified domain knowledge, using weak supervision or data augmentation.
Prerequisite:
Logistics
Office hours
Grading: three homework sets (40%), a research project (40%), research paper presentations (15%), and attendance (5%).
Announcements