Why Learn Data Science?

By reading this, you've already taken your first step on the path to becoming a data scientist. Here are a few reasons to stick around!

  • Data Science is fast becoming one of the most sought-after professions in India and around the world.

  • More than 1.5 lakh job openings for Data Scientists were projected for 2020, a 62% increase over 2019.

  • Data is everywhere; it is a universal currency. Learning how to gain insights from data is an invaluable skill.

What is Data Science?

Data is the new oil and Data Science is its combustion engine! While there are many definitions as to what data science really is, we have found it best to describe it as a field revolving around 5 data-related operations.

  • Collection

    Data Collection is the process of gathering data (numerical, text, video, audio, etc.). It is shaped by two major factors: the question that the data scientist needs to answer and the environment in which the data scientist works.

  • Storing

    Storing data involves maintaining the collected data for use throughout the data science pipeline. Structured data is typically stored in relational databases and aggregated in data warehouses. With the advent of Big Data, data lakes are now used to store multimodal structured and unstructured data.

  • Processing

    Data Processing consists of 3 main sub-processes: Data Wrangling (extraction, transformation, and loading of the data); Data Cleaning (handling missing values, outliers, etc.); and Data Scaling, Normalization, and Standardization.

  • Describing

    Data Description has two aspects. Data Visualisation involves representing processed data using graphs, charts, diagrams, and other visualisations. Data Summarisation involves calculating summary statistics such as the mean, median, mode, standard deviation, and variance.

  • Modelling

    Statistical Modelling of data involves modelling the underlying data distribution and relations in the data and then making inferences on top of the model. Algorithmic Modelling involves using large volumes of data and optimization techniques to best estimate the distribution and relations of the data, e.g. Machine Learning and Deep Learning.
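To make the Processing and Describing operations above a little more concrete, here is a minimal Python sketch. It is only an illustration, not material from the course: the sample values are made up, and it uses Python's standard `statistics` module rather than any particular library taught in the lectures. It computes the summary statistics named above and then applies two common rescaling steps, min-max normalization and z-score standardization.

```python
import statistics

# Hypothetical sample (made up for illustration): daily temperatures in °C.
values = [21.0, 23.0, 23.0, 25.0, 28.0]

# Describing: summary statistics of the sample.
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)
variance = statistics.pvariance(values)  # population variance
std_dev = statistics.pstdev(values)      # population standard deviation

# Processing: min-max normalization rescales every value into [0, 1].
lowest, highest = min(values), max(values)
normalized = [(v - lowest) / (highest - lowest) for v in values]

# Processing: standardization (z-scores) shifts the data to mean 0
# and rescales it to standard deviation 1.
standardized = [(v - mean) / std_dev for v in values]

print("mean:", mean, "median:", median, "mode:", mode)
print("variance:", variance, "std dev:", std_dev)
print("normalized:", normalized)
print("standardized:", standardized)
```

Normalization is convenient when values must lie in a fixed range, while standardization is convenient when features measured in different units need to be compared on a common scale.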

Is this the right data science course for me?

If you are familiar with programming (in any language) and comfortable with mathematics at 12th standard (high school) level, then you should be able to follow along with this course. This course is well suited for the following learning objectives:

  • Understand the value of data science and the process behind using it.

  • Learn the fundamentals of statistics and probability required for data science.

  • Use Python to gather, store, clean, analyse, and visualise datasets.

  • Apply statistical methods to formulate and test hypotheses about data.

  • Apply statistical inference to uncover relationships within datasets.

  • Understand the role of ML and DL in the data science pipeline.

  • Understand real-world challenges through several case studies.

Your instructors

Mitesh Khapra and Pratyush Kumar

Mitesh and Pratyush are Assistant Professors at the Department of Computer Science and Engineering at IIT Madras. They have both industry and academic experience in working with deep learning and related areas. They are both passionate about teaching and contributing to nation building.

Data Science Course curriculum

  • 1

    Instructions for this course

    • Hello!

    • Information Questionnaire

  • 2

    Week 1: Introduction

    • Introduction

    • What is Data Science?

    • Collecting Data

    • Storing Data

    • Processing Data

    • Describing Data

    • Statistical Modelling

    • Algorithmic Modelling

    • Why is Data Science so popular today

    • Are AI and Data Science related?

    • Problem Solving

    • Knowledge Representation & Reasoning

    • Decision Making

    • Communication, Perception & Actuation

    • The Myths of Data Science

    • The Path to Data Science

    • Feedback: Introduction

    • Week-1 Quiz Test (Graded & compulsory)

    • Week-1 Quiz Explained (optional)

  • 3

    Week 2 - Part 1: Engineering Data Science Systems

    • Engineering Aspects of Data Science

    • System Perspective of Data Science

    • CRISP - DM_Business Understanding

    • CRISP - DM_Data Understanding, Preparation & Modelling

    • CRISP - DM_Evaluation & Deployment

    • Programming Tools

    • Why Python?

    • Python - Libraries

    • Summary

    • Feedback: Engineering Data Science Systems

    • Week 2 Part-1 Quiz Test (Graded & Compulsory)

    • Week 2 Part-1 Quiz Explained (Optional)

  • 4

    Week 2 - Part 2 : What is Statistics?

    • Introduction to Statistics

    • What is Statistics

    • How to select a Sample

    • How to Design an Experiment

    • How to Describe & Summarise Data

    • Why do we need Probability Theory?

    • How do we give guarantees for estimates made from sample

    • What is a hypothesis & How do we test it?

    • How to model relationship between variables?

    • How well does the model fit the data?

    • Summary

    • Feedback: What is Statistics

    • Week-2 Part-2 Quiz Test (Graded & compulsory)

    • Week-2 Part-2 Quiz Explained (Optional)

  • 5

    Week 3: Getting started with Python

    • Getting started with Python

    • Google Colab

    • Printing & Basic Data Types

    • Variables

    • Integers, Floating Points, Boolean types & Input

    • Processing Strings, Integers & Floating Points

    • If, For, While Blocks

    • Functions

    • Download: Week 3 Course NoteBook

    • Assignment Problems

    • Week 3 Assignment 1 Questions

    • Week 3 Quiz Test (Graded & Compulsory)

    • Week 3 Quiz Explained (Optional)

    • Solution to Assignment Problem 1 - Part 1

    • Solution to Assignment Problem 1 - Part 2

    • Solution to Assignment Problem 2 - Part 1

    • Solution to Assignment Problem 2 - Part 2

    • Download: Week 3 Assignment Solutions

    • Feedback: Getting started with Python

  • 6

    Blog Contest - 1

    • Blog Contest 1 - Winners

  • 7

    Week 4: Descriptive Statistics (Part 1)

    • Introduction to Descriptive Statistics

    • Different types of Data

    • How to describe Qualitative Data?

    • Course Insights

    • How to describe Quantitative Data? Histograms

    • Histograms Continued...

    • Typical Trends in Histograms

    • Uses of Histograms in ML

    • Stem and Leaf Plots

    • How to describe relationship between variables? Scatter Plots

    • Uses of Scatter Plots in ML

    • Summary

    • Feedback: Descriptive Statistics Part 1

    • Week 4 Quiz test (Graded & Compulsory)

    • Week 4 Quiz Explained (optional)

  • 8

    Week 5: Python (continued)

    • Commenting and Error Handling

    • Lists

    • Lists - Continued

    • Solution - Exercise problem on Lists

    • Tuples & Sets

    • Dictionaries

    • (Solution - Exercise problem on Dictionaries ) & (Exercise problem in Design Thinking)

    • Solution - Exercise problem on Design Thinking

    • File Handling - Read

    • File Handling - Write

    • Solution Parts 1, 2 (Exercise on most common words)

    • Solution Part 3 (Exercise on most common 2-grams)

    • Feedback: Python (contd)

    • Week 5 Quiz test (Graded & Compulsory)

    • Week 5 Quiz Explained (Optional)

    • Download: Week 5 course notebook, sample text file

    • Week 5 Assignment - Download (Compulsory)

    • Week 5 Assignment Solutions - Download (Optional)

    • Python Data Objects Reference NoteBook

  • 9

    Week 6: Descriptive Statistics (Part 2)

    • Introduction - Measures of Centrality and Spread

    • Different measures of Centrality

    • Characteristics of Measures of Centrality

    • Sensitivity of the Measures of Centrality to Outliers

    • What do the measures of Centrality look like for different types of distributions?

    • Compute median from a Histogram

    • Compute mean from a Histogram

    • Compute mode from a Histogram

    • Effect of Transformations on the measures of centrality

    • Summary

    • Feedback: Descriptive Statistics Part 2

    • Week 6 Quiz Test (Graded & Compulsory)

    • Week 6 Quiz Explained (Optional)

  • 10

    Week 7: Descriptive Statistics (Part 3)

    • Introduction to Measures of Spread - Percentiles

    • Procedure for Computing Percentile

    • Alternative methods for Computing Percentile - Part - 1

    • Alternative methods for Computing Percentile - Part - 2

    • Frequently used Percentile

    • Compute the Percentile rank of a value in the data

    • Effect of Transformation on Percentiles

    • Summary Percentiles

    • Measures of Spread

    • Measures of Spread (Variance)

    • Why do we square the Deviations?

    • What does the variance tell us about the data?

    • Effect of Transformations on Measures of spread

    • How do you use mean & Variance to Standardise data?

    • Summary Measures of Spread

    • What are Box Plots?

    • Feedback: Descriptive Statistics Part 3

    • Week 7 Quiz Test (Graded & Compulsory)

    • Week 7 Quiz Explained (Optional)

  • 11

    Week 8: Numpy

    • Python Data Containers - Reference

    • W8 Data Files - Download

    • NumPy

    • High Dimensional Array & Creating NumPy Array

    • Indexing

    • Numpy Operations

    • Problem Solution

    • Broadcasting

    • File handling

    • Stats with Numpy

    • Rules of Statistics

    • Case Study & Problems

    • Problem Solution Part 1

    • Problem Solution Part 2

    • Problem Solution Part 3

    • Lecture Notebooks - Download

    • Feedback: Numpy

    • Numpy - Additional Exercises

    • Week-8 Quiz test (Graded & Compulsory)

    • Week-8 Quiz Explained (Optional)

  • 12

    Week 9: Pandas

    • W9 Data File - Download

    • Introduction - Pandas

    • Creating Series Object

    • iLoc & Loc

    • Simple Operations

    • Solution - Task 1

    • Solution - Task 2 & 3

    • NIFTY case study

    • Case Study Solution

    • W9 Lecture Notebook

    • Week -9 Quiz Test (Compulsory)

    • Week -9 Quiz Explained (Optional)

    • Feedback: Pandas

  • 13

    Week 10: Pandas (continued)

    • Dataframe Object

    • Task on creating Dataframes

    • Creating Mean row

    • Working with Planetary dataset

    • Dropping Null Values

    • Querying from dataframe

    • Applying functions to dataframes

    • Use of groupby method

    • Filter, Split, Apply, Aggregate

    • Working with Nifty50 Dataset

    • Nifty data - Download

    • Tasks on NIFTY datasets

    • W10 - Pandas (continued) Notebook

    • Feedback: Pandas (continued)

    • Week 10 Quiz Test (Compulsory)

    • Week 10 Quiz Explained (Optional)

  • 14

    Week 11: Visualisation

    • Data Visualisation

    • Read Complex JSON files

    • Styling Tabulation

    • Distribution of Data - Histogram

    • Box Plot

    • Distribution of a categorical variable

    • Joint Distribution of two variables

    • Swarm Plot

    • Violin Plot

    • Multiple Violin Plots

    • Paired Violin Plot

    • Faceted plotting

    • Pair Plot

    • Boxen Plots

    • Feedback: Visualization

    • Week 11 - Quiz Test (Compulsory)

    • Week 11 - Quiz Explained (optional)

    • W11 - Visualisation Notebook

  • 15

    Week 12: Visualisation (Continued)

    • Data Visualization - Recap

    • Pie Chart

    • Donut Chart

    • Stacked Bar Plot

    • Relative Stacked Bar Plot

    • Time-Varying composition of data

    • Stacked Area Plot

    • Scatter Plots

    • Bar Plot

    • Continuous vs Continuous Plot

    • Line Plot

    • Line Plot Covid Data

    • Heat Map

    • Summary & Task on open-ended visualisation

    • W12 - Visualisation (cont.) Notebook

    • Feedback: Visualization (cont.)

    • Week 12 Quiz Test (Compulsory)

    • Week 12 Quiz Explained (optional)

  • 16

    Week 13: Approaching Open ended DS problems

    • Pandas Recap

    • Handling missing data

    • Missing data with Pandas

    • Open ended descriptive statistics

    • Agriculture Example Part 1

    • Agriculture Example Part 2

    • Week 13 Lecture NB

    • Feedback: Approaching Open Ended DS Problems

    • Week 13 Quiz Test (Compulsory)

    • Week 13 Quiz Explained (Optional)

  • 17

    Week 14: Counting

    • Why do we need Counting and Probability Theory?

    • Very Simple Counting

    • The Multiplication Principle

    • Multiplication Principle Special Case: Sequences with Repetition

    • Multiplication Principle Special Case: Sequences without Repetition

    • Example: A Different Kind of Sequence

    • Multiplication Principle Special Case: Sequence Length Equals the Number of Objects

    • The Subtraction Principle

    • Collections

    • Collections (Some Examples)

    • Collections with Repetitions

    • Collections (+ multiplication principle)

    • Collections (+ subtraction principle)

    • Summary

    • Week 14 Quiz Test (Compulsory)

    • Week 14 Quiz Explained (Optional)

    • Feedback: Counting

  • 18

    Week 15: Sample spaces & Events

    • Introduction

    • The Element of Chance (Nothing in life is certain)

    • A brief overview of Set Theory

    • Properties of Set Operations

    • Experiments & Sample spaces

    • Events of an Experiment

    • Axioms of Probability

    • Some properties of Probability

    • Example problems (Probability Theory)

    • Designing Probability functions (as relative frequency)

    • Designing Probability functions (equally likely outcomes)

    • Summary - 1

    • Conditional Probabilities

    • Examples (Conditional Probabilities)

    • The Multiplication Principle

    • Total Probability Theorem

    • Bayes' Theorem

    • Independent Events

    • Summary - 2

    • Week 15 Quiz Test (Compulsory)

    • Week 15 Quiz Explained (Optional)

    • Feedback: Sample spaces & Events

  • 19

    Week 16: Random Variables

    • Introduction

    • Random Variable

    • Probability Mass Functions

    • Properties of PMF

    • Discrete distributions

    • Bernoulli Distribution

    • Binomial Distribution

    • Example (Binomial Distribution)

    • More Examples (Binomial Distribution)

    • Is Binomial Distribution a valid distribution?

    • Geometric Distribution

    • Is Geometric distribution a valid distribution?

    • Uniform Distribution

    • Expectation

    • Examples - Expectation

    • Properties of Expectation

    • Function of a Random Variable

    • Variance of a Random Variable

    • Properties of Variance

    • Summary

    • Week 16 Quiz Test (Compulsory)

    • Week 16 Quiz Explained (Optional)

    • Feedback: Random Variables

  • 20

    Week 17: Distributions & Sampling Strategies

    • Introduction

    • Continuous Random Variable

    • Intuition: Density vs Mass

    • Uniform Distribution (Continuous)

    • Some Fun with Functions

    • Normal Distributions

    • Probability Density Function

    • Standard Normal Distribution

    • Sampling Methods

    • Experimental Studies

    • Week 17 Quiz Test (Compulsory)

    • Week 17 Quiz Explained (Optional)

    • Feedback: Distributions & Sampling Strategies

  • 21

    Week 18: Distributions of Sample Statistics

    • Introduction - Inferential Statistics

    • Distribution of Sample Statistics

    • Parameter

    • Sample

    • Why do we Compute Statistics?

    • Estimate Population Parameters

    • Random Sample

    • Recap : Probability

    • Probability Space

    • What kind of random variables?

    • What is inferential statistics?

    • Our Roadmap

    • Demo 01

    • Demo 02

    • Demo Problems

    • Exercise - Part 1

    • Exercise - Part 2

    • Week 18 Quiz Test (Compulsory)

    • Week 18 Quiz Explained (Optional)

    • Feedback: Distributions of Sample Statistics

  • 22

    Week 19: Central Limit Theorem

    • Central Limit Theorem

    • Demo 01

    • Alternative version of CLT

    • CLT - Attempt at Proof

    • Implications of CLT

    • Computing area under N

    • Demo 02

    • Special Significance for N

    • Likelihood of sample mean

    • Super-Impose N

    • Approximating Distributions

    • Demo 03

    • Normal Approximation of Binomial Distribution

    • Week 19 Quiz Test (Compulsory)

    • Week 19 Quiz Explained (Optional)

    • Feedback: Central Limit Theorem

  • 23

    Week 20: Chi Square Distribution

    • Chi Square Distribution

    • Estimating E[S²]

    • Estimating E[S²] - Exercise

    • Geometric argument

    • Algebraic argument

    • Find Expected value of the error

    • Estimating Var[S²]

    • Distribution of sum of squares of standard normal variables

    • Distribution for N>1

    • k degrees of freedom

    • Variance of χ²(k)

    • Recap & Statistics of S²

    • On to Experiments

    • Expectation of Proportion

    • Variance of Proportion

    • Week 20 Quiz Test (Compulsory)

    • Week 20 Quiz Explained (Optional)

    • Feedback: Chi-square Distribution

  • 24

    Week 21 : Point and Interval Estimators

    • Point and Interval Estimators

    • Examples to Solve

    • What are Estimators?

    • Properties of Estimator

    • Point Estimator for Mean & Proportion

    • Point Estimator for Sample Variance

    • Example Estimation with TimeSeries

    • Real World Problem

    • On to Interval Estimators

    • Interval Estimator of μ with known σ

    • Examples of Estimator

    • Examples of Estimation

    • Lower and Upper Bounds

    • Upper Confidence Bound

    • Interval Estimator of μ with unknown σ

    • T Distribution Plots

    • Comparing interval bounds with z- and t- variables

    • Examples with T Statistics

    • Computing interval bounds for population proportion p

    • Week 21 Quiz Test (Compulsory)

    • Week 21 Quiz Explained (Optional)

    • Feedback: Point and Interval Estimators

  • 25

    Week 22: Hypothesis Testing

    • Hypothesis Testing Case Study - 1

    • Case Study 2

    • Case Study 3 & 4

    • Case Study 5 & 6

    • Three Cases

    • Variance: Known - Case Study 1

    • Variance: Known - Case Study 2

    • Effect of n, σ, and α

    • Variance: Known - Case Study 3 & 4

    • Variance: Known - Case Study 5 & 6

    • z-test vs t-test

    • Variance: Unknown - Case Study 1 & 2

    • Hypothesis testing proportion(p) instead of mean

    • Type 1 & Type 2 errors

    • Two tailed & One tailed z- test

    • Two tailed & One tailed t- test

    • Plotting Distribution

    • Chi-Square test of independence (case studies)

    • Chi-Square test of independence (case study -2)

    • Summary

    • Week 22 Lecture Notebook

    • Week 22 Quiz Test (Compulsory)

    • Week 22 Quiz Explained (Optional)

    • Feedback: Week 22

    • End of course Feedback

    • Obtaining Certificate

Data Science Fee Structure

Driven by our passion for teaching and interest in nation building, all PadhAI One courses are offered at very affordable prices.


  • For students/faculty

    Students enrolled in schools/colleges and faculty members
    Applicants must provide a valid ID card indicating present affiliation.
    Rs 1,000 + 18% GST for each course

  • For professionals

    Working professionals and those looking to up-skill
    No pre-requisites
    Rs 5,000 + 18% GST for each course


The PadhAI One Series

This course is part of the three-course PadhAI One series.

  • Foundations of Data Science

    Open for registrations
    5 months
    Rs 1,000 for students

    This course

  • Machine Learning

    Coming soon!
    5 months
    Rs 1,000 for students

  • Deep Learning

    Open for registrations
    5 months
    Rs 1,000 for students


Testimonials

Student Testimonials from our previous offering: PadhAI Deep Learning

Best Course for Hands-On and Theory

Praveen R

This course is the best as it focuses both on the theory and hands-on. This course introduced me to Kaggle competitions and I got addicted to it. I feel more confident that I can contribute to real world projects involving deep learning after taking this course.

Perfectly Balanced Course

Tulasi Ram Laghumavarapu

Never seen a course like this which has a perfect balance between theory and hands-on.

Effective and Efficient

Jonath S

Padhai is among the best places to learn. The courses offered by Padhai are cheap, very effective and worth spending time on. I would love to take up more courses in Padhai.

Beginner Friendly

Vudata Rohit

Learned a lot from this course. Before starting this course, I have no knowledge of deep learning but after learning this course I am pretty confident.

Highly Recommend!

Deepak Kumar G S

Highly recommended for deep learning beginners and pros alike.

Don't Miss this Course!

Srivathsan Vijayaraghavan

I would suggest this course to all of my colleagues and students.

Transformative Learning Experience

Shashank Mishra

There was paradigm shift in knowledge of deep Learning after pursuing this course.

Beautifully Taught

Deepak Choudhary

It is the best hands-on practical Deep learning course taught in a beautiful way.

Excellent Focus on Fundamentals

Ari Ezhilarasan

Focussing more on fundamentals is what makes this course so special. I could still imagine how I transitioned from an MP neuron to Encoder-Decoder architecture. Hats off to both the professors.

FAQ

  • What is the time commitment required?

    Each week, we will release 2 to 3 hours of video content. We recommend a further 2 to 3 hours of self-learning and practice, so a weekly commitment of 4 to 6 hours is required. The course runs for 18 to 20 weeks.
    However, if you cannot find this time due to other commitments, you can do the course at your own pace and complete it at any time within one year.

  • Will I get a certificate at the end of the course?

    Yes, if you complete the entire course and finish the assignments, you will receive a certificate from One Fourth Labs. This is digitally signed and can be shared on LinkedIn and other websites.
    Each course in the PadhAI One Data Science series will have a separate certificate.

  • How long will I have access to the course?

    You will have access to the course content (videos, assignments, community) for 1 year from the start of the course.

  • I am interested in Machine Learning, should I still do the Foundations of Data Science course?

    The Foundations of Data Science course focuses on the basics of statistics and Python programming for data science. These fundamentals are required for many job roles.
    Also, in the Machine Learning course, we will assume a background in these areas. If you are confident about the topics listed in the syllabus, then you can directly join the Machine Learning course that begins later this year.

  • Will I get computational resources for doing my projects?

    No, we do not provide any computational resources. The course platform only hosts the video lectures and assignments. All programming assignments and projects will be done on Google Colaboratory, which is a freely available resource. In the course, we provide a tutorial on how to use Google Colaboratory. It is therefore sufficient to have a standard computer and a good internet connection.

  • How can I clear my doubts?

    You will have access to the PadhAI course community where you can post your queries. Dedicated TAs will answer them. You are also encouraged to interact with your peers and learn together.

  • Do you provide any support with finding jobs?

    While data science is a highly sought-after job role, we do not provide any placement guarantee or support.

Sample Certificate

This is a sample of the course completion certificate from our Deep Learning Course. You will receive a similar certificate upon completion of this course titled PadhAI One: Foundations of Data Science.