CS7020 - Advances in Theory of Deep Learning

Course Data :

Description : To gain a theoretical understanding of why and when various components of deep neural networks succeed or fail. By the end of the course, students should largely be able to give reasoned arguments for the success or failure of any deep neural network architecture on a given task, and to formulate hypotheses and design experiments to test them.

Course Content :

Prerequisites: Deep Learning Basics.

1. Architectures: Multilayer perceptrons, convolutional neural networks and recurrent neural networks.
2. Tasks: Classification, regression, detection, structured prediction, data generation and embedding.

Part I: Representation Power of Deep Neural Networks
1. Classic universal approximation (stated informally below).
2. Representing complex functions with many layers.
3. Convolutional layers and scattering filters.

Part II: Optimising/Learning Deep Neural Networks
1. Reparameterisation and normalisation approaches.
2. Vanishing and exploding gradients.
3. Effect of weight sharing and convolutional layers in optimisation.
4. Variants of gradient descent.

Part III: Generalisation of Deep Neural Networks
1. Gradient descent as a “regulariser”.
2. Normalised margin approaches.

Part IV: Special Layers and their Benefits
1. Residual layers.
2. Dropout.
3. Attention mechanisms.

Part V: Other Paradigms for Deep Networks
1. Deep linear networks.
2. Information bottlenecks.
3. Sum-product networks.
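
As background for Part I, the classic universal approximation theorem (Cybenko 1989 and Hornik et al. 1989 for sigmoidal activations; Leshno et al. 1993 for general non-polynomial ones) can be stated informally as follows: for any continuous function f on a compact set K in R^d, any ε > 0, and any fixed continuous non-polynomial activation σ, there exist a width N and parameters α_i, b_i in R and w_i in R^d such that

    \sup_{x \in K} \Big| f(x) - \sum_{i=1}^{N} \alpha_i \, \sigma(w_i^\top x + b_i) \Big| < \varepsilon.

That is, a single hidden layer already suffices for approximation; the remaining Part I topics concern what additional layers buy beyond this.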

Text Books : Ian Goodfellow, Yoshua Bengio and Aaron Courville. Deep learning. MIT Press. 2016.

Reference Books : Recent papers from NIPS, ICML, JMLR, Neural Computation, the Machine Learning journal, ICLR and arXiv.

Prerequisite : CS5691 AND CS7015


Parameters

Credits : 3-0-0-0-9-12
Type : Elective
Date of Introduction : Jul 2019
