Your instructors

Mitesh Khapra and Pratyush Kumar

Mitesh and Pratyush are Assistant Professors in the Department of Computer Science and Engineering at IIT Madras. Both have industry and academic experience in deep learning and related areas, and both are passionate about teaching and contributing to nation building.

Data Science Fee Structure

Driven by our passion for teaching and interest in nation building, all PadhAI One courses are offered at very affordable prices.


For students/faculty

  • Students enrolled in schools/colleges and faculty members

  • Applicants must provide a valid ID card indicating present affiliation.

  • Rs 1,000 + 18% GST for each course

For professionals

  • Working professionals and those looking to up-skill

  • No pre-requisites

  • Rs 5,000 + 18% GST for each course


Course curriculum

  • 1

    Welcome

    • Hello!

  • 2

    Intro: About the course

    • Overview of the course

    • FAQ on the course

  • 3

    Account Details

    • Account Details

  • 4

    KickOff

    • KickOff

    • Tentative course schedule

  • 5

    Python Basics

    • Python Basics: About this chapter

    • Basic: Google Colaboratory

    • Basic: Basic Data Types

    • Basic: List

    • Basic: Tuple, Set, Dictionary

    • Basic: Packages

    • Basic: File Handling

    • Basic: Class

    • Basic: Numpy

    • Basic: Plotting

    • Download materials

    • Feedback: Python Basics

  • 6

    Expert Systems - 6 Jars

    • Expert Systems

    • Say Hi To ML

    • Introduction

    • Data

    • Tasks

    • Quiz: Data + Tasks

    • Models

    • Quiz: Models

    • Loss Function

    • Quiz: Loss function

    • Learning Algorithm

    • Quiz: Learning Algorithm

    • Evaluation

    • Feedback: Expert Systems, 6 Jars

  • 7

    Vectors and Matrices

    • Introduction to Vectors

    • Dot product of vectors

    • Unit Vectors

    • Projection of one vector onto another

    • Angle between two vectors

    • Why do we care about vectors?

    • Introduction to Matrices

    • Multiplying a vector by a matrix

    • Multiplying a matrix by another matrix

    • An alternate way of multiplying two matrices

    • Why do we care about matrices?

    • Quiz: Vectors and Matrices

    • Feedback: Vectors and Matrices

  • 8

    Python More Basics + Linear Algebra

    • Google Drive and Colab Integration

    • Pandas

    • Python Debugger

    • Plotting Vectors

    • Vector Addition and Subtraction

    • Vector Dot Product

    • Download Materials

    • Feedback: Python more basics + Linear Algebra

  • 9

    MP Neuron

    • Errata

    • Six Jars Summary - Part 1

    • Six Jars Summary - Part 2

    • Introduction

    • MP Neuron Model

    • MP Neuron Data Task

    • MP Neuron Loss

    • MP Neuron Learning

    • MP Neuron Evaluation

    • MP Neuron Geometry Basics

    • MP Neuron Geometric Interpretation

    • Summary

    • Feedback: MP Neuron

  • 10

    Perceptron

    • Introduction

    • Perceptron Data Task

    • Perceptron Model

    • Geometric Interpretation

    • Perceptron Loss Function

    • Perceptron Learning - General Recipe

    • Perceptron Learning Algorithm

    • Perceptron Learning - Why It Works?

    • Perceptron Learning - Will It Always Work?

    • Perceptron Evaluation

    • Summary

    • Feedback: Perceptron

  • 11

    Python: MP Neuron, Perceptron, Test/Train

    • IMPORTANT: Erratum

    • Perceptron: Toy Example

    • Loading Data

    • Train-Test Split

    • Binarisation

    • Inference And Search

    • Inference

    • Class

    • Perceptron Class

    • Epochs

    • Checkpointing

    • Learning Rate

    • Weight Animation

    • Exercises

    • Download Material

    • Feedback: MP neuron, Perceptron, Test/Train

    • Account Details

  • 12

    Contests

    • Contests intro

    • Creating a Kaggle account

    • Data preprocessing

    • Submitting Entries

    • Clarifications

  • 13

    Contest 1.1: Mobile phone like/dislike predictor

    • Contest Links

    • Sample solution file

  • 14

    Sigmoid Neuron, Gradient Descent

    • Recap from last lecture

    • Revisiting limitations of perceptron model

    • Sigmoid Model Part 1

    • Sigmoid Model Part 2

    • Sigmoid Model Part 3

    • Sigmoid Model Part 4

    • Quiz: Sigmoid - Model

    • Sigmoid: Data and Tasks

    • Quiz: Sigmoid - Data and Tasks

    • Sigmoid: Loss Function

    • Quiz: Sigmoid - Loss function

    • Learning: Intro to Learning Algorithm

    • Learning: Learning by guessing

    • Learning: Error Surfaces for learning

    • Learning: Mathematical setup for the learning algorithm

    • Learning: The math-free version of learning algorithm

    • Learning: Introducing Taylor Series

    • Learning: More intuitions about Taylor series

    • Quiz: Sigmoid - Taylor Series

    • Learning: Deriving the Gradient Descent Update rule

    • Learning: The complete learning algorithm

    • Learning: Computing Partial Derivatives

    • Quiz: Sigmoid Learning algorithm

    • Learning: Writing the code

    • Sigmoid: Dealing with more than 2 parameters

    • Sigmoid: Evaluation

    • Quiz: Sigmoid - Evaluation

    • Summary and take-aways

    • Feedback: Sigmoid Neuron

  • 15

    Python: Sigmoid, Gradient Descent

    • Plotting Sigmoid 2D

    • Plotting Sigmoid 3D

    • Plotting Loss

    • Contour Plot

    • Class

    • Toy Data Fit

    • Toy Data Plot: 1/2

    • Toy Data Plot: 2/2

    • Download Materials

    • Feedback: Python - Sigmoid, Gradient Descent

  • 16

    Python: Sigmoid, Gradient Descent (contd)

    • Loading Data

    • Standardisation

    • Test/Train Split (1/2)

    • Test/Train Split (2/2)

    • Fitting Data

    • Loss Plot

    • Progress Bar

    • Exercises

    • Download Materials

    • Feedback: Python: Sigmoid, Gradient Descent (contd)

  • 17

    Basic: Probability Theory

    • Introduction

    • Random Variable: Intuition

    • Random Variable: Formal Definition

    • Random Variable: Continuous and Discrete

    • Probability Distribution

    • True and Predicted Distribution

    • Certain Events

    • Why Do we Care About Distributions

    • Feedback: Probability Theory

  • 18

    Information Theory

    • Expectation

    • Quiz: Expectation

    • Information Content

    • Quiz: Information Content

    • Entropy

    • Quiz: Entropy

    • Relation To Number Of Bits

    • Quiz: Number of bits

    • KL-Divergence and Cross Entropy

    • Quiz: KL Divergence

    • Feedback: Information Theory

  • 19

    Sigmoid Neuron and Cross Entropy

    • Sigmoid Neuron and Cross Entropy

    • Using Cross Entropy With Sigmoid Neuron

    • Learning Algorithm for Cross Entropy loss function

    • Quiz: Loss function

    • Computing partial derivatives with cross entropy loss

    • Code for Cross Entropy Loss function

    • Feedback

  • 20

    Contest 1.1 discussion

    • Analysis of the solutions

  • 21

    Contest 1.2: Binary Text/NoText Classification

    • Kaggle Contest Links

    • Boilerplate code

    • Explanation video on both contests

    • Explanation on Contest 1.2

  • 22

    Contest 1.3 (Advanced): Binary Text/NoText Classification

    • Kaggle Contest Links

    • Explanation video on Contest 1.3

    • Boilerplate code

  • 23

    Representation Power of Functions

    • Why do we need complex functions

    • Complex functions in real world examples

    • A simple recipe for building complex functions

    • Illustrative Proof of Universal Approximation Theorem

    • Summary

    • Feedback: Representation Power

  • 24

    Feedforward Neural Networks

    • Setting the context

    • Data and Tasks

    • Model: A simple deep neural network

    • Model: A generic deep neural network

    • Quiz: A generic deep neural network

    • Model: Understanding the computations in a deep neural network

    • Model: The output layer of a deep neural network

    • Model: Output layer of a multi-class classification problem

    • Quiz: Output layer for multi-class classification

    • Model: How do you choose the right network configuration

    • Loss function for binary classification

    • Loss function for multi-class classification

    • Quiz: Loss function for multi-class classification

    • Learning Algorithm (non-mathy version)

    • Evaluation

    • Summary

    • Feedback: Feedforward Neural Networks

  • 25

    Python: Feed Forward Networks

    • Outline

    • Generating Data

    • Classification with Sigmoid Neuron

    • Classification with FF Network

    • Generic Class of FF Neuron

    • Multi Class Classification with FF Network

    • Exercise

    • Download Materials

    • Feedback: Python Feed Forward Network

  • 26

    Backpropagation (light math)

    • Setting the context

    • Revisiting Basic Calculus

    • Why do we care about the chain rule of derivatives

    • Quiz: Derivatives and functions

    • Applying chain rule across multiple paths

    • Applying Chain rule in a neural network

    • Computing Partial Derivatives w.r.t. a weight - Part 1

    • Computing Partial Derivatives w.r.t. a weight - Part 2

    • Computing Partial Derivatives w.r.t. a weight - Part 3

    • Computing Partial Derivatives w.r.t. a weight when there are multiple paths

    • Quiz: Derivatives across multiple paths

    • Takeaways and what next?

    • Feedback: Backpropagation (light math)

  • 27

    Python: Scalar Backpropagation

    • Outline

    • Single Weight Update

    • Single Weight Training

    • Multiple Weight Update

    • Visualising Outputs

    • Visualising Weights

    • Backpropagation for Multiple Class Classification

    • Shortened Backpropagation Code

    • Exercises

    • Download Materials

    • Feedback: Python Scalar Backpropagation

  • 28

    Backpropagation (vectorized)

    • Backpropagation (vectorized)

    • Errata from last theory slot

    • Setting the Context

    • Intuition behind backpropagation

    • Understanding the dimensions of gradients

    • Computing Derivatives w.r.t. Output Layer - Part 1

    • Computing Derivatives w.r.t. Output Layer - Part 2

    • Computing Derivatives w.r.t. Output Layer - Part 3

    • Quick recap of the story so far

    • Computing Derivatives w.r.t. Hidden Layers - Part 1

    • Computing Derivatives w.r.t. Hidden Layers - Part 2

    • Computing Derivatives w.r.t. Hidden Layers - Part 3

    • Computing derivatives w.r.t. one weight in any layer

    • Computing derivatives w.r.t. all weights in any layer

    • A running example of backpropagation

    • Summary

    • Feedback: Backpropagation (vectorized)

  • 29

    Python: Vectorised Feed Forward Networks

    • Outline

    • Benefits of Vectorisation

    • Scalar Class - Recap

    • Vectorising weights

    • Vectorising inputs and weights

    • Evaluation of Classes

    • Exercises

    • Feedback: Python Vectorized FFNs

    • Download Materials

  • 30

    Optimization Algorithms (Part 1)

    • A quick history of DL to set the context

    • Highlighting a limitation of Gradient Descent

    • A deeper look into the limitation of gradient descent

    • Introducing contour maps

    • Exercise: Guess the 3D surface

    • Visualizing gradient descent on a 2D contour map

    • Intuition for momentum based gradient descent

    • Dissecting the update rule for momentum based gradient descent

    • Running and visualizing momentum based gradient descent

    • A disadvantage of momentum based gradient descent

    • Intuition behind Nesterov accelerated gradient descent

    • Running and visualizing Nesterov accelerated gradient descent

    • Summary and what next

    • Feedback: Optimization Algorithms (Part 1)

  • 31

    Optimization Algorithms (Part 2)

    • The idea of stochastic and mini-batch gradient descent

    • Running stochastic gradient descent

    • Running mini-batch gradient descent

    • Epochs and Steps

    • Why do we need an adaptive learning rate?

    • Introducing Adagrad

    • Running and Visualizing Adagrad

    • A limitation of Adagrad

    • Running and visualizing RMSProp

    • Running and visualizing Adam

    • Summary

    • Feedback: Optimization Algorithms (Part 2)

  • 32

    Python: Optimization Algorithms

    • Outline

    • Modified Sigmoid Neuron Class

    • Setup for Plotting

    • Gradient Descent Algorithm

    • GD Algorithm - Contour Plot

    • Momentum

    • Nesterov Accelerated GD

    • Mini-Batch GD

    • Download Materials

    • Feedback: Python Optimisation Algorithms

  • 33

    Python: Optimization Algorithms 2

    • AdaGrad

    • RMSProp

    • Adam

    • Vectorised Class Recap

    • Vectorised GD Algorithms

    • Performance of Different Algorithms

    • Good solutions and Exercise

    • Download Materials

    • Feedback: Python Optimization Algorithms 2

  • 34

    Contest 1.3 (Advanced): analysis

    • Contest 1.3(advanced): analysis

  • 35

    Activation Functions and Initialization Methods

    • Setting the context

    • Saturation in logistic neuron

    • Zero centered functions

    • Introducing Tanh and ReLU activation functions

    • Tanh and ReLU Activation Functions

    • Symmetry Breaking Problem

    • Xavier and He initialization

    • Summary and what next

    • Feedback: Activation functions and weight initialization

  • 36

    Python: Activation Functions and Initialisation Methods

    • Introduction and Activation Functions

    • Activation Functions

    • Plotting Setup

    • Sigmoid

    • Tanh

    • ReLU

    • Leaky ReLU

    • Exercises

    • Download Materials

    • Feedback: Python Activation Functions and Initialisation

  • 37

    Regularization Methods

    • Simple v/s complex models

    • Analysing the behavior of simple and complex models

    • Bias and Variance

    • Test error due to high bias and high variance

    • Overfitting in deep neural networks

    • A detour into hyperparameter tuning

    • L2 regularization

    • Dataset Augmentation and Early Stopping

    • Summary

    • Feedback: Regularization

  • 38

    Python: Overfitting and Regularisation

    • Outline and Libraries

    • L2 Regularisation in Code

    • Bias on Increasing Model Complexity

    • L2 Regularisation in Action

    • Adding Noise to Input Features

    • Early Stopping and Exercises

    • Download Materials

    • Feedback: Python Overfitting and Regularisation

  • 39

    Python: PyTorch Intro

    • Outline

    • PyTorch Tensors

    • Simple Tensor Operations

    • NumPy vs PyTorch

    • GPU PyTorch

    • Automatic Differentiation

    • Loss Function with AutoDiff

    • Learning Loop GPU

    • Download Materials

    • Feedback: PyTorch Intro

  • 40

    PyTorch: Feed Forward Networks

    • Outline

    • Forward Pass With Tensors

    • Functions for Loss, Accuracy, Backpropagation

    • PyTorch Modules: NN and Optim

    • NN Sequential and Code Structure

    • GPU Execution

    • Exercises and Recap

    • Download Materials

    • Feedback: PyTorch Feedforward Networks

  • 41

    The convolution operation

    • Setting the Context

    • The 1D convolution operation

    • The 2D Convolution Operation

    • Examples of 2D convolution

    • 2D convolution with a 3D filter

    • Terminology

    • Padding and Stride

    • Feedback: Convolution Operation

  • 42

    Convolutional Neural Networks

    • How is the convolution operation related to Neural Networks - Part 1

    • How is the convolution operation related to Neural Networks - Part 2

    • How is the convolution operation related to Neural Networks - Part 3

    • Understanding the input/output dimensions

    • Sparse Connectivity and Weight Sharing

    • Max Pooling and Non-Linearities

    • Our First Convolutional Neural Network (CNN)

    • Training CNNs

    • Summary and what next

    • Feedback: CNNs

  • 43

    PyTorch: CNN

    • Outline

    • Loading Data Sets

    • Visualising Weights

    • Single Convolutional Layer

    • Deep CNNs

    • LeNet

    • Training LeNet

    • Visualising Intermediate Layers, Exercises

    • Download Materials

    • Feedback: PyTorch CNN

  • 44

    CNN architectures

    • Setting the context

    • The ImageNet Challenge

    • Understanding the first layer of AlexNet

    • Understanding all layers of AlexNet

    • ZFNet

    • VGGNet

    • Summary

    • Feedback: CNN architectures

  • 45

    CNN Architectures (Part 2)

    • Setting the context

    • Number of computations in a convolution layer

    • 1x1 Convolutions

    • The Intuition behind GoogLeNet

    • The Inception Module

    • The GoogLeNet Architecture

    • Average Pooling

    • Auxiliary Loss for training a deep network

    • ResNet

    • Feedback: CNN architectures (Part 2)

  • 46

    Python: CNN Architectures

    • Outline

    • Image Transforms

    • VGG

    • Training VGG

    • Pre-trained Models

    • Checkpointing Models

    • ResNet

    • Inception Part 1

    • Inception Part 2

    • Exercises

    • Download materials

    • Feedback: Python CNN Architectures

  • 47

    Visualising CNNs

    • Receptive field of a neuron

    • Identifying images which cause certain neurons to fire

    • Visualising filters

    • Occlusion experiments

    • Feedback: Visualizing CNNs

  • 48

    Python: Visualising CNNs

    • Outline

    • Custom Torchvision Dataset

    • Visualising inputs

    • Occlusion

    • Visualising filters

    • Visualising filters - code

    • Download materials

    • Feedback: Python Visualising CNNs

  • 49

    Batch Normalization and Dropout

    • Normalizing inputs

    • Why should we normalize the inputs

    • Batch Normalization

    • Learning Mu and Sigma

    • Ensemble Methods

    • The idea of dropout

    • Training without dropout

    • How does weight sharing help?

    • Using dropout at test time

    • How does dropout act as a regularizer?

    • Summary and what next?

    • Feedback: Batch Normalization and Dropout

  • 50

    PyTorch: BatchNorm and Dropout

    • Batch Norm Layer

    • Outline and Dataset

    • Batch Norm Visualisation

    • Batch Norm 2d

    • Dropout layer

    • Dropout Visualisation and Exercises

    • Download Materials

    • Feedback: PyTorch Batch Norm and Dropout

  • 51

    Hyperparameter Tuning and MLflow

    • Outline

    • Colab on Local Runtime

    • MLflow installation and basic usage

    • Hyperparameter Tuning

    • Refined Search for Hyperparameters

    • Logging Image Artifacts

    • Logging and Loading Models

    • One Last Visualisation

    • Download Materials

    • Feedback: Hyperparameter tuning and MLflow

  • 52

    Practice problem: CNN and FNN

    • Details of problem

  • 53

    Sequence Learning Problems

    • Setting the context

    • Introduction to sequence learning problems

    • Some more examples of sequence learning problems

    • Sequence learning problems using video and speech data

    • A wishlist for modelling sequence learning problems

    • Intuition behind RNNs - Part 1

    • Intuition behind RNNs - Part 2

    • Introducing RNNs

    • Summary and what next

    • Feedback: Sequence Learning Problems

  • 54

    Recurrent Neural Networks

    • Setting the context

    • Data and Tasks - Sequence Classification - Part 1

    • Data and Tasks - Sequence Classification - Part 2

    • A clarification about padding

    • Data and Tasks - Sequence Labelling

    • Model

    • Loss Function

    • Learning Algorithm

    • Learning Algorithm - Derivatives w.r.t. V

    • Learning Algorithm - Derivatives w.r.t. W

    • Evaluation

    • Summary and what next

    • Feedback: Recurrent Neural Networks

  • 55

    Vanishing and exploding gradients

    • Revisiting the gradient w.r.t. W

    • Zooming into one element of the chain rule - Part 1

    • Zooming into one element of the chain rule - Part 2

    • A small detour to calculus

    • Looking at the magnitude of the derivative

    • Exploding and vanishing gradients

    • Summary and what next

    • Feedback: Vanishing and exploding gradients

  • 56

    LSTMs and GRUs

    • Dealing with longer sequences

    • The white board analogy

    • Real world example of longer sequences

    • Going back to RNNs

    • Selective Write - Part 1

    • Selective Write - Part 2

    • Selective Read

    • Selective forget

    • An example computation with LSTMs

    • Gated recurrent units

    • Summary and what next

    • Feedback: LSTMs and GRUs

  • 57

    Sequence Models in PyTorch

    • Outline

    • Dataset and Task

    • RNN Model

    • Inference on RNN

    • Training RNN

    • Training Setup

    • LSTM

    • GRU and Exercises

    • Download Materials

    • Feedback: Sequence Models in PyTorch

  • 58

    Vanishing and Exploding gradients and LSTMs

    • Apology

    • Quick Recap

    • Intuition: How gates help to solve the problem of vanishing gradients

    • Revisiting vanishing gradients in RNNs

    • Dependency diagram for LSTMs

    • Computing the gradient

    • When do the gradients vanish?

    • Dealing with exploding gradients

    • Summary and what next

    • Feedback: LSTMs and vanishing and exploding gradients

  • 59

    Encoder Decoder Models

    • Setting the context

    • Revisiting the task of language modelling

    • Using RNNs for language modelling

    • Introducing Encoder Decoder Model

    • Connecting encoder decoder models to the six jars

    • A compact notation for RNNs, LSTMs and GRUs

    • Encoder decoder model for image captioning

    • Six jars for image captioning

    • Encoder decoder for Machine translation

    • Encoder decoder model for transliteration

    • Summary

    • Feedback: Encoder decoder models

  • 60

    Attention Mechanism

    • Motivation for attention mechanism

    • Attention mechanism with an oracle

    • A model for attention

    • The attention function

    • Machine translation with attention

    • Summary and what next

    • Feedback: Attention Mechanism

  • 61

    Batching for Sequence Models in PyTorch

    • Overview

    • Recap on Sequence Models

    • Batching for Sequence Models

    • Padding Vector Representations

    • Packing in PyTorch

    • Training with Batched Input

    • Download Materials

    • Feedback: Batching for Sequence Models in PyTorch

  • 62

    Object Detection

    • Setting the context

    • A typical pipeline for object detection

    • More clarity on regression

    • RCNN - Region Proposal

    • RCNN - Feature Extraction

    • RCNN - Classification

    • RCNN - Regression

    • RCNN - Training

    • Introduction to YOLO

    • The Output of YOLO

    • Training YOLO

    • Summary and what next

    • Feedback: Object Detection

  • 63

    Encoder and Decoder Models, Attention in PyTorch

    • Outline

    • Data set and Task

    • Data Ingestion - XML processing

    • Encoder Decoder Model - 1

    • Encoder Decoder Model - 2

    • Adding Attention - 1

    • Adding Attention - 2

    • Model Evaluation and Exercises

    • Download Materials

    • Feedback: Encoder Decoder Models, Attention in PyTorch

  • 64

    Capstone Project

    • Dataset for capstone project

    • Project Details

    • End of course Feedback

  • 65

    Thank You

    • Thank You