Why Learn Data Science?

By reading this, you've already taken your first steps on the path to becoming a data scientist. Here are a few reasons to stick around!

  • Data Science is fast becoming one of the most sought-after professions in India and around the world.

  • More than 1.5 lakh job openings for data scientists were projected for 2020, an increase of 62% over 2019.

  • Data is everywhere; it is a universal currency. Learning how to gain insights from data is an invaluable skill.

What is Data Science?

Data is the new oil and Data Science is its combustion engine! While there are many definitions of what data science really is, we find it best to describe it as a field revolving around five data-related operations.

  • Collection

    Data Collection is the process of gathering data (numerical, text, video, audio, etc.). It is influenced by two major factors: the question the data scientist needs to answer and the environment the data scientist is working in.

  • Storing

    Storing data involves maintaining the collected data for use throughout the data science pipeline. Structured data is typically stored in relational databases and aggregated in data warehouses. With the advent of Big Data, data lakes are now used to store multimodal structured and unstructured data.

  • Processing

    Data Processing consists of three main sub-processes: Data Wrangling (extraction, transformation, and loading of the data), Data Cleaning (handling missing values, outliers, etc.), and Data Scaling (normalization and standardization).

  • Describing

    Data Description has two aspects. Data Visualisation involves representing processed data using graphs, charts, diagrams, and other visualisations. Data Summarisation involves calculating summary statistics such as the mean, median, mode, standard deviation, and variance.

  • Modelling

    Statistical Modelling involves modelling the underlying distribution of the data and the relations within it, and then making inferences on top of the model. Algorithmic Modelling involves using large volumes of data and optimization techniques to best estimate the distribution and relations of the data, e.g. Machine Learning and Deep Learning.
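The Processing and Describing operations above can be sketched in a few lines of Python. This is only a minimal illustration, not material from the course: the temperature readings are made-up values, and the sketch assumes nothing beyond Python's standard-library statistics module.

```python
import statistics

# Hypothetical sample: daily temperatures in degrees Celsius,
# with one missing reading recorded as None.
raw = [21.0, 23.5, None, 22.0, 35.0, 21.5, 22.0]

# Processing (cleaning): drop the missing value.
clean = [x for x in raw if x is not None]

# Describing (summarisation): common summary statistics.
mean = statistics.mean(clean)
median = statistics.median(clean)
mode = statistics.mode(clean)
variance = statistics.pvariance(clean)  # population variance
std_dev = statistics.pstdev(clean)      # population standard deviation

# Processing (scaling): min-max normalisation maps values into [0, 1].
lowest, highest = min(clean), max(clean)
normalised = [(x - lowest) / (highest - lowest) for x in clean]

# Processing (scaling): standardisation gives mean 0 and standard deviation 1.
standardised = [(x - mean) / std_dev for x in clean]

print("mean:", mean, "median:", median, "mode:", mode)
print("variance:", variance, "standard deviation:", std_dev)
```

Note how the single large reading (35.0) pulls the mean above the median; this sensitivity to outliers is one reason several summary statistics are usually reported together.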

Is this the right data science course for me?

If you are familiar with programming (in any language) and comfortable with mathematics at 12th standard (high school) level, then you should be able to follow along with this course. This course is well suited for the following learning objectives:

  • Understand the value of data science and the process behind using it.

  • Learn the fundamentals of statistics and probability required for data science.

  • Use Python to gather, store, clean, analyse, and visualise data-sets.

  • Apply statistical methods to formulate and test data hypotheses.

  • Apply statistical inference to uncover relationships within data-sets.

  • Understand the role of ML and DL in the data science pipeline.

  • Understand real-world challenges with several case studies.

Your instructors

Mitesh Khapra and Pratyush Kumar

Mitesh and Pratyush are Assistant Professors at the Department of Computer Science and Engineering at IIT Madras. They have both industry and academic experience in working with deep learning and related areas. They are both passionate about teaching and contributing to nation building.

Data Science Course curriculum

  • 1

    Instructions for this course

    • Hello!

    • Course Discussion Forum

    • Information Questionnaire

  • 2

    Week 1: Introduction

    • Introduction

    • What is Data Science?

    • Collecting Data

    • Storing Data

    • Processing Data

    • Describing Data

    • Statistical Modelling

    • Algorithmic Modelling

    • Why is Data Science so popular today

    • Are AI and Data Science related?

    • Problem Solving

    • Knowledge Representation & Reasoning

    • Decision Making

    • Communication, Perception & Actuation

    • The Myths of Data Science

    • The Path to Data Science

    • Feedback: Introduction

    • Week-1 Quiz Test (Graded & compulsory)

    • Week-1 Quiz Explained (optional)

  • 3

    Week 2 - Part 1: Engineering Data Science Systems

    • Engineering Aspects of Data Science

    • System Perspective of Data Science

    • CRISP - DM_Business Understanding

    • CRISP - DM_Data Understanding, Preparation & Modelling

    • CRISP - DM_Evaluation & Deployment

    • Programming Tools

    • Why Python?

    • Python - Libraries

    • Summary

    • Feedback: Engineering Data Science Systems

    • Week 2 Part-1 Quiz Test (Graded & Compulsory)

    • Week 2 Part-1 Quiz Explained (Optional)

  • 4

    Week 2 - Part 2 : What is Statistics?

    • Introduction to Statistics

    • What is Statistics

    • How to select a Sample

    • How to Design an Experiment

    • How to Describe & Summarise Data

    • Why do we need Probability Theory?

    • How do we give guarantees for estimates made from sample

    • What is a hypothesis & How do we test it?

    • How to model relationship between variables?

    • How well does the model fit the data?

    • Summary

    • Feedback: What is Statistics

    • Week-2 Part-2 Quiz Test (Graded & compulsory)

    • Week-2 Part-2 Quiz Explained (Optional)

  • 5

    Week 3: Getting started with Python

    • Join the course forum and discuss

    • Getting started with Python

    • Google Colab

    • Printing & Basic Data Types

    • Variables

    • Integers, Floating Points, Boolean types & Input

    • Processing Strings, Integers & Floating Points

    • If, For, While Blocks

    • Functions

    • Download: Week 3 Course NoteBook

    • Assignment Problems

    • Week 3 Assignment 1 Questions

    • Week 3 Quiz Test (Graded & Compulsory)

    • Week 3 Quiz Explained (Optional)

    • Solution to Assignment Problem 1 - Part 1

    • Solution to Assignment Problem 1 - Part 2

    • Solution to Assignment Problem 2 - Part 1

    • Solution to Assignment Problem 2 - Part 2

    • Download: Week 3 Assignment Solutions

    • Feedback: Getting started with Python

  • 6

    Blog Contest - 1

    • Blog Contest 1 - Winners

  • 7

    Week 4: Descriptive Statistics (Part 1)

    • Introduction to Descriptive Statistics

    • Different types of Data

    • How to describe Qualitative Data?

    • Course Insights

    • How to describe Quantitative Data? Histograms

    • Histograms Continued...

    • Typical Trends in Histograms

    • Uses of Histograms in ML

    • Stem and Leaf Plots

    • How to describe relationship between variables? Scatter Plots

    • Uses of Scatter Plots in ML

    • Summary

    • Feedback: Descriptive Statistics Part 1

    • Week 4 Quiz test (Graded & Compulsory)

    • Week 4 Quiz Explained (optional)

  • 8

    Week 5: Python (continued)

    • Commenting and Error Handling

    • Lists

    • Lists - Continued

    • Solution - Exercise problem on Lists

    • Tuples & Sets

    • Dictionaries

    • (Solution - Exercise problem on Dictionaries) & (Exercise problem in Design Thinking)

    • Solution - Exercise problem on Design Thinking

    • File Handling - Read

    • File Handling - Write

    • Solution Parts 1, 2 (Exercise on most common words)

    • Solution Part 3 (Exercise on most common 2-grams)

    • Feedback: Python (contd)

    • Week 5 Quiz test (Graded & Compulsory)

    • Week 5 Quiz Explained (Optional)

    • Download: Week 5 course notebook, sample text file

    • Week 5 Assignment - Download (Compulsory)

    • Week 5 Assignment Solutions - Download (Optional)

    • Python Data Objects Reference NoteBook

  • 9

    Week 6: Descriptive Statistics (Part 2)

    • Introduction - Measures of Centrality and Spread

    • Different measures of Centrality

    • Characteristics of Measures of Centrality

    • Sensitivity of the Measures of Centrality to Outliers

    • What do the measures of Centrality look like for different types of distributions?

    • Compute median from a Histogram

    • Compute mean from a Histogram

    • Compute mode from a Histogram

    • Effect of Transformations on the measures of centrality

    • Summary

    • Feedback: Descriptive Statistics Part 2

    • Week 6 Quiz Test (Graded & Compulsory)

    • Week 6 Quiz Explained (Optional)

  • 10

    Week 7: Descriptive Statistics (Part 3)

    • Introduction to Measures of Spread - Percentiles

    • Procedure for Computing Percentile

    • Alternative methods for Computing Percentile - Part - 1

    • Alternative methods for Computing Percentile - Part - 2

    • Frequently used Percentile

    • Compute the Percentile rank of a value in the data

    • Effect of Transformation on Percentiles

    • Summary Percentiles

    • Measures of Spread

    • Measures of Spread (Variance)

    • Why we square the Deviations?

    • What does the variance tell us about the data?

    • Effect of Transformations on Measures of spread

    • How do you use mean & Variance to Standardise data?

    • Summary Measures of Spread

    • What are Box Plots?

    • Feedback: Descriptive Statistics Part 3

    • Week 7 Quiz Test (Graded & Compulsory)

    • Week 7 Quiz Explained (Optional)

  • 11

    Week 8: Numpy

    • Python Data Containers - Reference

    • W8 Data Files - Download

    • NumPy

    • High Dimensional Array & Creating NumPy Array

    • Indexing

    • Numpy Operations

    • Problem Solution

    • Broadcasting

    • File handling

    • Stats with Numpy

    • Rules of Statistics

    • Case Study & Problems

    • Problem Solution Part 1

    • Problem Solution Part 2

    • Problem Solution Part 3

    • Lecture Notebooks - Download

    • Feedback: Numpy

    • Numpy - Additional Exercises

    • Week-8 Quiz test (Graded & Compulsory)

    • Week-8 Quiz Explained (Optional)

  • 12

    Break due to Covid19 lockdown

    • Instructions

    • Form for accessing the Deep Learning Course (2 month validity)

  • 13

    Week 9: Probability Distributions

    • Bernoulli, Binomial, and Poisson distributions

    • Continuous random variable

    • Uniform and normal distributions

  • 14

    Week 10: Data visualisation (continued)

    • Data visualisation with Seaborn

    • Simulating probabilistic events

  • 15

    Week 11: Break

    • Break

  • 16

    Week 12: Sampling and Sampling Statistics

    • Sampling strategies

    • Distribution of sampling statistics (mean, variance, proportion)

    • Central Limit Theorem

  • 17

    Week 13: Sampling with Python

    • Sampling strategies with Python

    • Demonstration of central limit theorem

    • Practice case study

  • 18

    Week 14: Interval Estimation

    • Interval estimation for mean (variance known)

    • Interval estimation for mean (variance unknown)

    • Demonstration in Python

  • 19

    Week 15: Hypothesis testing

    • Anatomy of Hypothesis Testing

    • Type I and Type II Errors

    • Single sample mean with known variance

    • Single sample mean with unknown variance

    • Demonstration in Python

  • 20

    Week 16: Hypothesis testing (continued)

    • Single sample variance

    • Single sample proportion

    • Demonstration in Python

  • 21

    Week 17: Hypothesis testing (continued)

    • Two population mean, known variance

    • Two population mean, known variance, small sample

    • Two population mean, known variance, large sample

    • Paired t-test

    • Two population, proportion

    • Demonstration in Python

  • 22

    Week 18: Analysis of variance

    • One factor analysis

    • Two factor analysis

    • Demonstration in Python

  • 23

    Week 19: Linear Regression

    • Model

    • Estimating parameters

    • Measuring goodness of fit

Data Science Fee Structure

Driven by our passion for teaching and interest in nation building, all PadhAI One courses are offered at very affordable prices.

  • For students/faculty

    Students enrolled in schools/colleges and faculty members
    Applicants must provide a valid ID card indicating present affiliation.
    Rs 1,000 + 18% GST for each course

  • For professionals

    Working professionals and those looking to up-skill
    No pre-requisites
    Rs 5,000 + 18% GST for each course


The PadhAI One Series

This course is part of the three-course PadhAI One series.

  • Foundations of Data Science

    Open for registrations
    5 months
    Rs 1,000 for students

    This course

  • Machine Learning

    Coming soon!
    4 months
    Rs 1,000 for students

  • Deep Learning

    Coming soon!
    4 months
    Rs 1,000 for students

Testimonials

Student Testimonials from our previous offering: PadhAI Deep Learning

Best Course for Hands-On and Theory

Praveen R

This course is the best as it focuses on both theory and hands-on practice. This course introduced me to Kaggle competitions and I got addicted to them. I feel more confident that I can contribute to real-world projects involving deep learning after taking this course.

Perfectly Balanced Course

Tulasi Ram Laghumavarapu

Never seen a course like this which has a perfect balance between theory and hands-on.

Effective and Efficient

Jonath S

Padhai is one of the best places to learn. The courses offered by Padhai are cheap, very effective and worth spending time on. I would love to take up more courses in Padhai.

Beginner Friendly

Vudata Rohit

Learned a lot from this course. Before starting this course, I had no knowledge of deep learning, but after completing it I am pretty confident.

Highly Recommend!

Deepak Kumar G S

Highly recommended for deep learning beginners and pros alike.

Don't Miss this Course!

Srivathsan Vijayaraghavan

I would suggest this course to all of my colleagues and students.

Transformative Learning Experience

Shashank Mishra

There was a paradigm shift in my knowledge of deep learning after pursuing this course.

Beautifully Taught

Deepak Choudhary

It is the best hands-on practical Deep learning course taught in a beautiful way.

Excellent Focus on Fundamentals

Ari Ezhilarasan

Focussing more on fundamentals is what makes this course so special. I could still imagine how I transitioned from an MP neuron to Encoder-Decoder architecture. Hats off to both the professors.

FAQ

  • What is the time commitment required?

    Each week, we will release 2 to 3 hours of video content. We recommend a further 2 to 3 hours of self-learning and practice, so a weekly commitment of 4 to 6 hours is required. The course runs for 18 to 20 weeks.
    However, if you are unable to find this time due to other commitments, you can do the course at your own pace and complete it at any time within one year.

  • Will I get a certificate at the end of the course?

    Yes, if you complete the entire course and finish the assignments, you will receive a certificate from One Fourth Labs. This is digitally signed and can be shared on LinkedIn and other websites.
    Each course in the PadhAI One Data Science series will have a separate certificate.

  • How long will I have access to the course?

    You will have access to the course content (videos, assignments, community) for 1 year from the start of the course.

  • I am interested in Machine Learning, should I still do the Foundations of Data Science course?

    The Foundations of Data Science course focuses on the basics of statistics and Python programming for data science. These fundamentals are required for many job roles.
    Also, in the Machine Learning course, we will assume a background in these areas. If you are confident about the topics listed in the syllabus, then you can directly join the Machine Learning course that begins later this year.

  • Will I get computational resources for doing my projects?

    No, we do not provide any computational resources. The course platform only hosts the video lectures and assignments. All programming assignments and projects will be done on Google Colaboratory, which is a freely available resource. In the course, we provide a tutorial on how to use Google Colaboratory. It is therefore sufficient to have a standard computer and a good internet connection.

  • How can I clear my doubts?

    You will have access to the PadhAI course community where you can post your queries. Dedicated TAs will answer them. You are also encouraged to interact with your peers and learn together.

  • Do you provide any support with finding jobs?

    While data science is a highly sought-after job role, we do not provide any placement guarantee or support.

Sample Certificate

This is a sample of the course completion certificate from our Deep Learning Course. You will receive a similar certificate upon completion of this course titled PadhAI One: Foundations of Data Science.