Mitesh Khapra and Pratyush Kumar

Mitesh and Pratyush are Assistant Professors in the Department of Computer Science and Engineering at IIT Madras. They have both industry and academic experience in deep learning and related areas, and they are passionate about teaching and contributing to nation building.

Deep Learning Fee Structure

Driven by our passion for teaching and our interest in nation building,
we offer all PadhAI One courses at very affordable prices.

For students/faculty
• Who: Students enrolled in schools/colleges and faculty members
• Requirement: Applicants must provide a valid ID card indicating present affiliation
• Fee: Rs 1,000 + 18% GST for each course

For professionals
• Who: Working professionals and those looking to up-skill
• Requirement: No pre-requisites
• Fee: Rs 5,000 + 18% GST for each course

Course curriculum

• 1

Welcome

• Hello!

• 2

• Overview of the course

• FAQ on the course

• 3

• KickOff

• 4

Python Basics

• Basic: Basic Data Types

• Basic: List

• Basic: Tuple, Set, Dictionary

• Basic: Packages

• Basic: File Handling

• Basic: Class

• Basic: Numpy

• Basic: Plotting

• Feedback: Python Basics

• 5

Expert Systems - 6 Jars

• Expert Systems

• Say Hi To ML

• Introduction

• Data

• Models

• Quiz: Models

• Loss Function

• Quiz: Loss function

• Learning Algorithm

• Quiz: Learning Algorithm

• Evaluation

• Feedback: Expert Systems, 6 Jars

• 6

Vectors and Matrices

• Introduction to Vectors

• Dot product of vectors

• Unit Vectors

• Projection of one vector onto another

• Angle between two vectors

• Why do we care about vectors?

• Introduction to Matrices

• Multiplying a vector by a matrix

• Multiplying a matrix by another matrix

• An alternate way of multiplying two matrices

• Why do we care about matrices?

• Quiz: Vectors and Matrices

• Feedback: Vectors and Matrices
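
The vector and matrix operations listed in this module can all be tried in a few lines of NumPy. This is an illustrative sketch, not material from the course; the variable names are my own:

```python
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([1.0, 0.0])

# Dot product of two vectors
dot = np.dot(a, b)                      # 3.0

# Unit vector: divide by the Euclidean norm
a_hat = a / np.linalg.norm(a)           # [0.6, 0.8]

# Projection of a onto b
proj = (np.dot(a, b) / np.dot(b, b)) * b

# Angle between the two vectors, in radians
cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
angle = np.arccos(cos_theta)

# Multiplying a vector by a matrix
M = np.array([[2.0, 0.0], [0.0, 3.0]])
Ma = M @ a                              # [6.0, 12.0]
```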

• 7

Python More Basics + Linear Algebra

• Google Drive and Colab Integration

• Pandas

• Python Debugger

• Plotting Vectors

• Vector Dot Product

• Feedback: Python more basics + Linear Algebra

• 8

MP Neuron

• Errata

• Six Jars Summary - Part 1

• Six Jars Summary - Part 2

• Introduction

• MP Neuron Model

• MP Neuron Loss

• MP Neuron Learning

• MP Neuron Evaluation

• MP Neuron Geometry Basics

• MP Neuron Geometric Interpretation

• Summary

• Feedback: MP Neuron
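
As the lecture titles above suggest, the MP (McCulloch-Pitts) neuron takes binary inputs and fires when enough of them are on; its only parameter is the threshold b, so "learning" reduces to a brute-force search over b. A minimal sketch (my own illustration, not the course's code):

```python
def mp_neuron(x, b):
    """McCulloch-Pitts neuron: output 1 when at least b of the
    binary inputs are on, else 0. b is the only parameter."""
    return 1 if sum(x) >= b else 0

def fit_mp_neuron(X, y):
    """Brute-force search over the threshold b, keeping the value
    with the highest training accuracy."""
    best_b, best_acc = 0, -1.0
    for b in range(len(X[0]) + 1):
        acc = sum(mp_neuron(x, b) == t for x, t in zip(X, y)) / len(y)
        if acc > best_acc:
            best_b, best_acc = b, acc
    return best_b, best_acc

# AND gate: fires only when both inputs are on
X = [[1, 1], [1, 0], [0, 1], [0, 0]]
y = [1, 0, 0, 0]
b, acc = fit_mp_neuron(X, y)
```

On this toy data the search recovers the threshold b = 2 with perfect training accuracy.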

• 9

Perceptron

• Introduction

• Perceptron Model

• Geometric Interpretation

• Perceptron Loss Function

• Perceptron Learning - General Recipe

• Perceptron Learning Algorithm

• Perceptron Learning - Why It Works?

• Perceptron Learning - Will It Always Work?

• Perceptron Evaluation

• Summary

• Feedback: Perceptron
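
The perceptron learning algorithm named above can be sketched compactly. This is an illustrative implementation with labels in {0, 1}, not the course's notebook:

```python
import numpy as np

def train_perceptron(X, y, epochs=10):
    """Perceptron learning algorithm: the weights change only
    when a point is misclassified."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            pred = 1 if np.dot(w, x) + b >= 0 else 0
            if pred == 0 and t == 1:    # missed a positive: add the point
                w, b = w + x, b + 1
            elif pred == 1 and t == 0:  # false alarm: subtract the point
                w, b = w - x, b - 1
    return w, b

# The AND function is linearly separable, so the algorithm converges
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = [0, 0, 0, 1]
w, b = train_perceptron(X, y, epochs=20)
```

For linearly separable data like this, the perceptron convergence theorem guarantees the loop stops making updates after finitely many mistakes.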

• 10

Python: MP Neuron, Perceptron, Test/Train

• IMPORTANT: Erratum

• Perceptron: Toy Example

• Train-Test Split

• Binarisation

• Inference And Search

• Inference

• Class

• Perceptron Class

• Epochs

• Checkpointing

• Learning Rate

• Weight Animation

• Exercises

• Feedback: MP neuron, Perceptron, Test/Train

• Account Details

• 11

Contests

• Contests intro

• Creating a Kaggle account

• Data preprocessing

• Submitting Entries

• Clarifications

• 12

Contest 1.1: Mobile phone like/dislike predictor

• Sample solution file

• 13

Sigmoid Neuron

• Recap from last lecture

• Revisiting limitations of perceptron model

• Sigmoid Model Part 1

• Sigmoid Model Part 2

• Sigmoid Model Part 3

• Sigmoid Model Part 4

• Quiz: Sigmoid - Model

• Quiz: Sigmoid - Data and Tasks

• Sigmoid: Loss Function

• Quiz: Sigmoid - Loss function

• Learning: Intro to Learning Algorithm

• Learning: Learning by guessing

• Learning: Error Surfaces for learning

• Learning: Mathematical setup for the learning algorithm

• Learning: The math-free version of learning algorithm

• Learning: Introducing Taylor Series

• Learning: More intuitions about Taylor series

• Quiz: Sigmoid - Taylor Series

• Learning: Deriving the Gradient Descent Update rule

• Learning: The complete learning algorithm

• Learning: Computing Partial Derivatives

• Quiz: Sigmoid Learning algorithm

• Learning: Writing the code

• Sigmoid: Dealing with more than 2 parameters

• Sigmoid: Evaluation

• Quiz: Sigmoid - Evaluation

• Summary and take-aways

• Feedback: Sigmoid Neuron
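
The gradient-descent update derived in this module can be sketched as follows. This assumes the squared-error loss used at this stage (cross entropy comes later in the course), and the code and dataset are illustrative rather than official:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_sigmoid_neuron(X, y, lr=1.0, epochs=2000):
    """Batch gradient descent on the squared-error loss for one
    sigmoid neuron: dL/dw = (y_hat - y) * y_hat * (1 - y_hat) * x,
    summed over the data."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        dw, db = np.zeros_like(w), 0.0
        for x, t in zip(X, y):
            y_hat = sigmoid(np.dot(w, x) + b)
            g = (y_hat - t) * y_hat * (1 - y_hat)
            dw, db = dw + g * x, db + g
        w, b = w - lr * dw, b - lr * db
    return w, b

# A tiny 1-D dataset: inputs below 1.5 labelled 0, above labelled 1
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = [0, 0, 1, 1]
w, b = train_sigmoid_neuron(X, y)
```

After training, the neuron's output crosses 0.5 between the two classes, i.e. it predicts below 0.5 at x = 0 and above 0.5 at x = 3.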

• 14

Python: Sigmoid, Gradient Descent

• Plotting Sigmoid 2D

• Plotting Sigmoid 3D

• Plotting Loss

• Contour Plot

• Class

• Toy Data Fit

• Toy Data Plot: 1/2

• Toy Data Plot: 2/2

• Feedback: Python - Sigmoid, Gradient Descent

• 15

Python: Sigmoid, Gradient Descent (contd)

• Standardisation

• Test/Train Split (1/2)

• Test/Train Split (2/2)

• Fitting Data

• Loss Plot

• Progress Bar

• Exercises

• Feedback: Python: Sigmoid, Gradient Descent (contd)

• 16

Basic: Probability Theory

• Introduction

• Random Variable: Intuition

• Random Variable: Formal Definition

• Random Variable: Continuous and Discrete

• Probability Distribution

• True and Predicted Distribution

• Certain Events

• Why Do we Care About Distributions

• Feedback: Probability Theory

• 17

Information Theory

• Expectation

• Quiz: Expectation

• Information Content

• Quiz: Information Content

• Entropy

• Quiz: Entropy

• Relation To Number Of Bits

• Quiz: Number of bits

• KL-Divergence and Cross Entropy

• Quiz: KL Divergence

• Feedback: Information Theory
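
The quantities covered above (entropy, cross entropy, KL divergence) are one-liners to compute. A hedged sketch in plain Python, using log base 2 so the results are in bits; the example distributions are my own:

```python
import math

def entropy(p):
    """H(p) = -sum p_i log2(p_i), in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """H(p, q) = -sum p_i log2(q_i): expected bits when events drawn
    from p are coded with a scheme optimal for q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """KL(p || q) = H(p, q) - H(p): the extra bits paid for using the
    wrong distribution. Zero exactly when p and q match."""
    return cross_entropy(p, q) - entropy(p)

p = [0.5, 0.25, 0.25]
q = [0.25, 0.25, 0.5]
```

For p above, H(p) = 0.5·1 + 0.25·2 + 0.25·2 = 1.5 bits, KL(p || p) = 0, and KL(p || q) > 0.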

• 18

Sigmoid Neuron and Cross Entropy

• Sigmoid Neuron and Cross Entropy

• Using Cross Entropy With Sigmoid Neuron

• Learning Algorithm for Cross Entropy loss function

• Quiz: Loss function

• Computing partial derivatives with cross entropy loss

• Code for Cross Entropy Loss function

• Feedback

• 19

Contest 1.1 discussion

• Analysis of the solutions

• 20

Contest 1.2: Binary Text/NoText Classification

• Boilerplate code

• Explanation video on both contests

• Explanation on Contest 1.2

• 21

Contest 1.3 (Advanced): Binary Text/NoText Classification

• Explanation video on Contest 1.3

• Boilerplate code

• 22

Representation Power of Functions

• Why do we need complex functions

• Complex functions in real world examples

• A simple recipe for building complex functions

• Illustrative Proof of Universal Approximation Theorem

• Summary

• Feedback: Representation Power

• 23

Feedforward Neural Networks

• Setting the context

• Model: A simple deep neural network

• Model: A generic deep neural network

• Quiz: A generic deep neural network

• Model: Understanding the computations in a deep neural network

• Model: The output layer of a deep neural network

• Model: Output layer of a multi-class classification problem

• Quiz: Output layer for multi-class classification

• Model: How do you choose the right network configuration

• Loss function for binary classification

• Loss function for multi-class classification

• Quiz: Loss function for multi-class classification

• Learning Algorithm (non-mathy version)

• Evaluation

• Summary

• Feedback: Feedforward Neural Networks

• 24

Python: Feed Forward Networks

• Outline

• Generating Data

• Classification with Sigmoid Neuron

• Classification with FF Network

• Generic Class of FF Neuron

• Multi Class Classification with FF Network

• Exercise

• Feedback: Python Feed Forward Network

• 25

Backpropagation (light math)

• Setting the context

• Revisiting Basic Calculus

• Why do we care about the chain rule of derivatives

• Quiz: Derivatives and functions

• Applying chain rule across multiple paths

• Applying Chain rule in a neural network

• Computing Partial Derivatives w.r.t. a weight - Part 1

• Computing Partial Derivatives w.r.t. a weight - Part 2

• Computing Partial Derivatives w.r.t. a weight - Part 3

• Computing Partial Derivatives w.r.t. a weight when there are multiple paths

• Quiz: Derivatives across multiple paths

• Takeaways and what next?

• Feedback: Backpropagation (light math)

• 26

Python: Scalar Backpropagation

• Outline

• Single Weight Update

• Single Weight Training

• Multiple Weight Update

• Visualising Outputs

• Visualising Weights

• Backpropagation for Multiple Class Classification

• Shortened Backpropagation Code

• Exercises

• Feedback: Python Scalar Backpropagation

• 27

Backpropagation (vectorized)

• Backpropagation (vectorized)

• Errata from last theory slot

• Setting the Context

• Intuition behind backpropagation

• Understanding the dimensions of gradients

• Computing Derivatives w.r.t. Output Layer - Part 1

• Computing Derivatives w.r.t. Output Layer - Part 2

• Computing Derivatives w.r.t. Output Layer - Part 3

• Quick recap of the story so far

• Computing Derivatives w.r.t. Hidden Layers - Part 1

• Computing Derivatives w.r.t. Hidden Layers - Part 2

• Computing Derivatives w.r.t. Hidden Layers - Part 3

• Computing derivatives w.r.t. one weight in any layer

• Computing derivatives w.r.t. all weights in any layer

• A running example of backpropagation

• Summary

• Feedback: Backpropagation (vectorized)

• 28

Python: Vectorised Feed Forward Networks

• Outline

• Benefits of Vectorisation

• Scalar Class - Recap

• Vectorising weights

• Vectorising inputs and weights

• Evaluation of Classes

• Exercises

• Feedback: Python Vectorized FFNs

• 29

Optimization Algorithms (Part 1)

• A quick history of DL to set the context

• Highlighting a limitation of Gradient Descent

• A deeper look into the limitation of gradient descent

• Introducing contour maps

• Exercise: Guess the 3D surface

• Visualizing gradient descent on a 2D contour map

• Intuition for momentum based gradient descent

• Dissecting the update rule for momentum based gradient descent

• Running and visualizing momentum based gradient descent

• Intuition behind nesterov accelerated gradient descent

• Running and visualizing nesterov accelerated gradient descent

• Summary and what next

• Feedback: Optimization Algorithms (Part 1)
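
The momentum update rule dissected above keeps an exponentially decaying history of past gradients. A toy sketch on the quadratic f(w) = w², chosen purely for illustration (not a function from the course):

```python
def momentum_gd(grad_fn, w0, lr=0.1, gamma=0.9, steps=300):
    """Momentum-based gradient descent:
    update_t = gamma * update_{t-1} + lr * grad(w_t);  w_{t+1} = w_t - update_t.
    gamma controls how much of the accumulated history is kept."""
    w, update = w0, 0.0
    history = [w]
    for _ in range(steps):
        update = gamma * update + lr * grad_fn(w)
        w = w - update
        history.append(w)
    return w, history

# Toy objective f(w) = w**2, whose gradient is 2w; the minimum is at w = 0
w_final, history = momentum_gd(lambda w: 2 * w, w0=5.0)
```

Plotting `history` shows the characteristic overshooting past the minimum before settling, which plain gradient descent with the same learning rate does not exhibit.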

• 30

Optimization Algorithms (Part 2)

• The idea of stochastic and mini-batch gradient descent

• Epochs and Steps

• Why do we need an adaptive learning rate?

• Running and visualizing RMSProp

• Summary

• Feedback: Optimization Algorithms (Part 2)

• 31

Python: Optimization Algorithms

• Outline

• Modified Sigmoid Neuron Class

• Setup for Plotting

• GD Algorithm - Contour Plot

• Momentum

• Nesterov Accelerated GD

• Mini-Batch GD

• Feedback: Python Optimisation Algorithms

• 32

Python: Optimization Algorithms 2

• RMSProp

• Vectorised Class Recap

• Vectorised GD Algorithms

• Performance of Different Algorithms

• Good solutions and Exercise

• Feedback: Python Optimization Algorithms 2

• 33

• 34

Activation Functions and Initialization Methods

• Setting the context

• Saturation in logistic neuron

• Zero centered functions

• Introducing Tanh and ReLU activation functions

• Tanh and ReLU Activation Functions

• Symmetry Breaking Problem

• Xavier and He initialization

• Summary and what next

• Feedback: Activation functions and weight initialization

• 35

Python: Activation Functions and Initialisation Methods

• Introduction and Activation Functions

• Activation Functions

• Plotting Setup

• Sigmoid

• Tanh

• ReLU

• Leaky ReLU

• Exercises

• Feedback: Python Activation Functions and Initialisation

• 36

Regularization Methods

• Simple v/s complex models

• Analysing the behavior of simple and complex models

• Bias and Variance

• Test error due to high bias and high variance

• Overfitting in deep neural networks

• A detour into hyperparameter tuning

• L2 regularization

• Dataset Augmentation and Early Stopping

• Summary

• Feedback: Regularization

• 37

Python: Overfitting and Regularisation

• Outline and Libraries

• L2 Regularisation in Code

• Bias on Increasing Model Complexity

• L2 Regularisation in Action

• Adding Noise to Input Features

• Early Stopping and Exercises

• Feedback: Python Overfitting and Regularisation

• 38

Python: PyTorch Intro

• Outline

• PyTorch Tensors

• Simple Tensor Operations

• NumPy vs PyTorch

• GPU PyTorch

• Automatic Differentiation

• Loss Function with AutoDiff

• Learning Loop GPU

• Feedback: PyTorch Intro

• 39

PyTorch: Feed Forward Networks

• Outline

• Forward Pass With Tensors

• Functions for Loss, Accuracy, Backpropagation

• PyTorch Modules: NN and Optim

• NN Sequential and Code Structure

• GPU Execution

• Exercises and Recap

• Feedback: PyTorch Feedforward Networks

• 40

The convolution operation

• Setting the Context

• The 1D convolution operation

• The 2D Convolution Operation

• Examples of 2D convolution

• 2D convolution with a 3D filter

• Terminology

• Feedback: Convolution Operation
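
The 2D convolution operation introduced above slides a small filter over the image and takes a weighted sum at each position. A naive sketch (as is conventional in CNNs, this is cross-correlation: no kernel flipping, "valid" padding, stride 1); the edge-detector example is my own:

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2D convolution, stride 1, no padding.
    Output size: (H - kh + 1, W - kw + 1)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Weighted sum of the patch under the kernel
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A tiny vertical-edge detector: left half dark, right half bright
img = np.array([[0.0, 0.0, 1.0, 1.0]] * 4)
k = np.array([[1.0, -1.0]])
edges = conv2d(img, k)
```

The output is non-zero only at the column where the intensity jumps, which is exactly the edge the filter is designed to respond to.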

• 41

Convolutional Neural Networks

• How is the convolution operation related to Neural Networks - Part 1

• How is the convolution operation related to Neural Networks - Part 2

• How is the convolution operation related to Neural Networks - Part 3

• Understanding the input/output dimensions

• Sparse Connectivity and Weight Sharing

• Max Pooling and Non-Linearities

• Our First Convolutional Neural Network (CNN)

• Training CNNs

• Summary and what next

• Feedback: CNNs

• 42

PyTorch: CNN

• Outline

• Visualising Weights

• Single Convolutional Layer

• Deep CNNs

• LeNet

• Training LeNet

• Visualising Intermediate Layers, Exercises

• Feedback: PyTorch CNN

• 43

CNN architectures

• Setting the context

• The Imagenet Challenge

• Understanding the first layer of AlexNet

• Understanding all layers of AlexNet

• ZFNet

• VGGNet

• Summary

• Feedback: CNN architectures

• 44

CNN Architectures (Part 2)

• Setting the context

• Number of computations in a convolution layer

• 1x1 Convolutions

• The Inception Module

• Average Pooling

• Auxiliary Loss for training a deep network

• ResNet

• Feedback: CNN architectures (Part 2)

• 45

Python: CNN Architectures

• Outline

• Image Transforms

• VGG

• Training VGG

• Pre-trained Models

• Checkpointing Models

• ResNet

• Inception Part 1

• Inception Part 2

• Exercises

• Feedback: Python CNN Architectures

• 46

Visualising CNNs

• Receptive field of a neuron

• Identifying images which cause certain neurons to fire

• Visualising filters

• Occlusion experiments

• Feedback: Visualizing CNNs

• 47

Python: Visualising CNNs

• Outline

• Custom Torchvision Dataset

• Visualising inputs

• Occlusion

• Visualising filters

• Visualising filters - code

• Feedback: Python Visualising CNNs

• 48

Batch Normalization and Dropout

• Normalizing inputs

• Why should we normalize the inputs

• Batch Normalization

• Learning Mu and Sigma

• Ensemble Methods

• The idea of dropout

• Training without dropout

• How does weight sharing help?

• Using dropout at test time

• How does dropout act as a regularizer?

• Summary and what next?

• Feedback: Batch Normalization and Dropout
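
Both operations covered above are short to write down. A sketch of the training-time forward passes only; the learnable gamma/beta and the inverted-dropout scaling follow the standard formulations, not code copied from the course:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Training-time batch normalisation (rows of x = examples in the
    mini-batch): normalise each feature to zero mean and unit variance,
    then apply the learnable scale gamma and shift beta."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

def dropout(x, p_drop, rng):
    """Inverted dropout: zero each activation with probability p_drop
    and scale the survivors by 1/(1 - p_drop), so no rescaling is
    needed at test time."""
    mask = (rng.random(x.shape) >= p_drop) / (1.0 - p_drop)
    return x * mask

batch = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
normed = batch_norm(batch, gamma=1.0, beta=0.0)
```

With gamma = 1 and beta = 0, every column of `normed` has (approximately) zero mean and unit variance, and the dropout output matches the input in expectation.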

• 49

PyTorch: BatchNorm and Dropout

• Batch Norm Layer

• Outline and Dataset

• Batch Norm Visualisation

• Batch Norm 2d

• Dropout layer

• Dropout Visualisation and Exercises

• Feedback: PyTorch Batch Norm and Dropout

• 50

Hyperparameter Tuning and MLFlow

• Outline

• Colab on Local Runtime

• MLFlow installation and basic usage

• Hyperparameter Tuning

• Refined Search for Hyperparameters

• Logging Image Artifacts

• One Last Visualisation

• Feedback: Hyperparameter tuning and MLflow

• 51

Practice problem: CNN and FNN

• Details of problem

• 52

Sequence Learning Problems

• Setting the context

• Introduction to sequence learning problems

• Some more examples of sequence learning problems

• Sequence learning problems using video and speech data

• A wishlist for modelling sequence learning problems

• Intuition behind RNNs - Part 1

• Intuition behind RNNs - Part 2

• Introducing RNNs

• Summary and what next

• Feedback: Sequence Learning Problems

• 53

Recurrent Neural Networks

• Setting the context

• Data and Tasks - Sequence Classification - Part 1

• Data and Tasks - Sequence Classification - Part 2

• Data and Tasks - Sequence Labelling

• Model

• Loss Function

• Learning Algorithm

• Learning Algorithm - Derivatives w.r.t. V

• Learning Algorithm - Derivatives w.r.t. W

• Evaluation

• Summary and what next

• Feedback: Recurrent Neural Networks

• 54

Vanishing and Exploding Gradients

• Revisiting the gradient w.r.t. W

• Zooming into one element of the chain rule - Part 1

• Zooming into one element of the chain rule - Part 2

• A small detour to calculus

• Looking at the magnitude of the derivative

• Summary and what next

• Feedback: Vanishing and exploding gradients

• 55

LSTMs and GRUs

• Dealing with longer sequences

• The white board analogy

• Real world example of longer sequences

• Going back to RNNs

• Selective Write - Part 1

• Selective Write - Part 2

• Selective forget

• An example computation with LSTMs

• Gated recurrent units

• Summary and what next

• Feedback: LSTMs and GRUs

• 56

Sequence Models in PyTorch

• Outline

• RNN Model

• Inference on RNN

• Training RNN

• Training Setup

• LSTM

• GRU and Exercises

• Feedback: Sequence Models in PyTorch

• 57

Vanishing and Exploding gradients and LSTMs

• Apology

• Quick Recap

• Intuition: How gates help to solve the problem of vanishing gradients

• Revisiting vanishing gradients in RNNs

• Dependency diagram for LSTMs

• When do the gradients vanish?

• Summary and what next

• Feedback: LSTMs and vanishing and exploding gradients

• 58

Encoder Decoder Models

• Setting the context

• Revisiting the task of language modelling

• Using RNNs for language modelling

• Introducing Encoder Decoder Model

• Connecting encoder decoder models to the six jars

• A compact notation for RNNs, LSTMs and GRUs

• Encoder decoder model for image captioning

• Six jars for image captioning

• Encoder decoder for Machine translation

• Encoder decoder model for transliteration

• Summary

• Feedback: Encoder decoder models

• 59

Attention Mechanism

• Motivation for attention mechanism

• Attention mechanism with an oracle

• A model for attention

• The attention function

• Machine translation with attention

• Summary and what next

• Feedback: Attention Mechanism

• 60

Batching for Sequence Models in PyTorch

• Overview

• Recap on Sequence Models

• Batching for Sequence Models

• Packing in PyTorch

• Training with Batched Input

• Feedback: Batching for Sequence Models in PyTorch

• 61

Object Detection

• Setting the context

• A typical pipeline for object detection

• More clarity on regression

• RCNN - Region Proposal

• RCNN - Feature Extraction

• RCNN - Classification

• RCNN - Regression

• RCNN - Training

• Introduction to YOLO

• The Output of YOLO

• Training YOLO

• Summary and what next

• Feedback: Object Detection

• 62

Encoder and Decoder Models, Attention in PyTorch

• Outline

• Data Ingestion - XML processing

• Encoder Decoder Model - 1

• Encoder Decoder Model - 2

• Model Evaluation and Exercises

• Feedback: Encoder Decoder Models, Attention in PyTorch

• 63

Capstone Project

• Dataset for capstone project

• Project Details

• End of course Feedback

• 64

Thank You

• Thank You

This course is part of the three-course PadhAI One series

• Foundations of Data Science

Open for registrations
5 months
Rs 1,000 for students

• Machine Learning

Coming soon!
5 months
Rs 1,000 for students

• Deep Learning

Open for registrations
5 months
Rs 1,000 for students

This course

Testimonials

Student Testimonials from this Deep Learning course

Best Course for Hands-On and Theory

Praveen R

This course is the best, as it focuses on both theory and hands-on practice. It introduced me to Kaggle competitions, and I got addicted to them. After taking this course, I feel more confident that I can contribute to real-world projects involving deep learning.

Perfectly Balanced Course

Tulasi Ram Laghumavarapu

I have never seen a course like this, with such a perfect balance between theory and hands-on work.

Effective and Efficient

Jonath S

PadhAI is one of the best places to learn. The courses offered by PadhAI are inexpensive, very effective, and worth spending time on. I would love to take up more courses from PadhAI.

Beginner Friendly

Vudata Rohit

I learned a lot from this course. Before starting it, I had no knowledge of deep learning, but after completing it I am pretty confident.

Highly Recommend!

Deepak Kumar G S

Highly recommended for deep learning beginners and pros alike.

Don't Miss this Course!

Srivathsan Vijayaraghavan

I would suggest this course to all of my colleagues and students.

Transformative Learning Experience

Shashank Mishra

There was a paradigm shift in my knowledge of deep learning after pursuing this course.

Beautifully Taught

Deepak Choudhary

It is the best hands-on, practical deep learning course, taught in a beautiful way.

Excellent Focus on Fundamentals

Ari Ezhilarasan

Focussing more on fundamentals is what makes this course so special. I can still picture how I transitioned from an MP neuron to an encoder-decoder architecture. Hats off to both the professors.

FAQ

• What is the time commitment required?

Each week, there will be 3-4 hours of video content. We recommend 2 to 3 hours of self-learning and practice, so a weekly commitment of 4 to 6 hours is required. The course runs for 18 to 20 weeks.
However, if you are unable to find this time due to other commitments, you can do the course at your own pace and complete it any time within one year.

• Will I get a certificate at the end of the course?

Yes, if you complete the entire course and finish the assignments, you will receive a certificate from One Fourth Labs. This is digitally signed and can be shared on LinkedIn and other websites.
Each course in the PadhAI One series will have a separate certificate.

You will have access to the course content (videos, assignments, community) for 1 year from the start of the course.

• I am interested in Machine Learning, should I still do the Deep Learning course?

The Deep Learning course introduces the basics of maths and Python and then proceeds to cover all the topics mentioned in the syllabus. These fundamentals are required for many job roles in AI.
Also, the Machine Learning course will assume a background in these areas. If you are confident about the topics listed in the syllabus, you can directly join the Machine Learning course that begins later this year.

• Will I get computational resources for doing my projects?

No, we do not provide any computational resources. The course platform only hosts the video lectures and assignments. All programming assignments and projects will be done on Google Colaboratory, which is a freely available resource. In the course, we provide a tutorial on how to use Google Colaboratory. It is therefore sufficient to have a standard computer and a good internet connection. You can also make use of Kaggle kernels if Google Colab is not sufficient for your purposes.

• How can I clear my doubts? 