A Guide On How To Become A Data Scientist (Step By Step Approach) - KDnuggets - Deepstash

Curated from: kdnuggets.com

Becoming A Data Scientist

Becoming a data scientist is an exciting path, but you cannot learn data science in six months or a year. Instead, it is a lifelong process that demands dedication and hard work.

To guide your journey, the skills outlined here are the first you must acquire to become a data scientist.


Choose A Programming Language (Python / R)

Python is the language most data scientists prefer and use. It is easy to understand, versatile, and backed by a large ecosystem of libraries such as NumPy, pandas, Matplotlib, seaborn, SciPy, and many more.

  • FreeCodeCamp’s Python Tutorial (Recommended)
  • Kaggle’s Python Course
  • Krish Naik’s Python Tutorial (Recommended)
  • Udemy’s Python for Data Science and Machine Learning Bootcamp
  • Coursera Python Course

While learning Python, make sure you cover variables, data types, and object-oriented programming (OOP) concepts, along with NumPy, pandas, Matplotlib, and seaborn.
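As a rough illustration of the basics listed above, the sketch below touches variables, the core data types, and a small class. The names used here (`Dataset`, `temps`, `scores`) are hypothetical and chosen only for the example; it uses the standard library alone.

```python
# Variables and core built-in data types
name = "Ada"                          # str
age = 36                              # int
height_m = 1.68                       # float
is_analyst = True                     # bool
skills = ["python", "sql"]            # list (mutable, ordered)
scores = {"python": 90, "sql": 80}    # dict (key -> value)


class Dataset:
    """A tiny, hypothetical class used only to illustrate OOP basics."""

    def __init__(self, name, values):
        self.name = name              # instance attribute
        self.values = list(values)    # copy so the caller's list is untouched

    def mean(self):
        """Return the arithmetic mean of the stored values."""
        return sum(self.values) / len(self.values)

    def __len__(self):
        """Let len(dataset) report the number of stored values."""
        return len(self.values)


temps = Dataset("temperatures", [21.5, 22.0, 19.5])
average_score = sum(scores.values()) / len(scores)

print(f"{temps.name}: n={len(temps)}, mean={temps.mean()}")
print(f"average score: {average_score}")
```

Once these basics feel comfortable, the same ideas carry over to NumPy arrays and pandas DataFrames, which are objects with their own attributes and methods.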


Knowledge Of Statistics And Probability

To become a data scientist, knowledge of statistics and probability is as essential as salt is to food. It helps data scientists interpret large data sets, draw insights from them, and analyze them better.

  • Krish Naik’s Statistics Playlist (Recommended)
  • Coursera Statistics Course
  • Khan Academy Statistics And Probability Course
  • FreeCodeCamp Statistics Course (Recommended)
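To make the statistics and probability ideas above a little more concrete, here is a minimal sketch using only Python's standard library. The sample of heights is made up for illustration, and the dice experiment is just one simple way to compare an exact probability with a simulated estimate.

```python
import random
import statistics

# Descriptive statistics on a small, made-up sample (heights in cm)
heights_cm = [150, 160, 165, 170, 175, 180, 190]
mean = statistics.mean(heights_cm)
median = statistics.median(heights_cm)
stdev = statistics.stdev(heights_cm)   # sample standard deviation (n - 1)

# Exact probability that two fair dice sum to 7, by enumerating outcomes
outcomes = [(a, b) for a in range(1, 7) for b in range(1, 7)]
exact_p = sum(1 for a, b in outcomes if a + b == 7) / len(outcomes)

# Monte Carlo estimate of the same probability
rng = random.Random(0)                 # fixed seed so runs are repeatable
trials = 100_000
hits = sum(
    1 for _ in range(trials)
    if rng.randint(1, 6) + rng.randint(1, 6) == 7
)
estimated_p = hits / trials

print(f"mean={mean}, median={median}, stdev={stdev:.2f}")
print(f"P(sum = 7): exact={exact_p:.4f}, simulated={estimated_p:.4f}")
```

The simulated value should land close to the exact one, and the gap generally shrinks as the number of trials grows.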


Learn SQL

Structured Query Language (SQL) is used to communicate with large databases and extract data from them. Focus on understanding the different types of normalization and on writing nested queries, correlated subqueries, GROUP BY clauses, and join operations to extract the data in raw form. This raw data is then cleaned, either in Microsoft Excel or with Python libraries.

  • FreeCodeCamp SQL (Recommended)
  • Intro To SQL By Kaggle (Recommended)
  • Advanced SQL By Kaggle
  • Edureka’s SQL Playlist
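The query types mentioned above can be tried out without installing a database server. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `customers` and `orders` tables and their rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")     # throwaway in-memory database
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    amount REAL
);
INSERT INTO customers VALUES (1, 'Asha', 'Pune'), (2, 'Ben', 'Leeds'), (3, 'Chen', 'Pune');
INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 80.0), (3, 2, 50.0), (4, 3, 200.0);
""")

# JOIN + GROUP BY: total order amount per customer
totals = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY total DESC, c.name
""").fetchall()

# Nested query: orders larger than the average order amount
above_avg = conn.execute("""
    SELECT id, amount
    FROM orders
    WHERE amount > (SELECT AVG(amount) FROM orders)
    ORDER BY id
""").fetchall()

# Correlated subquery: each customer's largest order
largest = conn.execute("""
    SELECT o.customer_id, o.amount
    FROM orders AS o
    WHERE o.amount = (
        SELECT MAX(o2.amount) FROM orders AS o2
        WHERE o2.customer_id = o.customer_id
    )
    ORDER BY o.customer_id
""").fetchall()

print(totals)
print(above_avg)
print(largest)
conn.close()
```

Keeping customers and orders in separate tables linked by `customer_id` is also a small example of a normalized design: customer details are stored once rather than repeated on every order.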


Data Cleaning

When a data scientist is given a project, most of the time goes into cleaning the data set: removing unwanted values and handling missing values. This can be done with Python libraries such as pandas and NumPy.

One should also know how to manipulate data using Microsoft Excel.

  • Blog — Cleaning Data Using Python (Recommended)
  • Edureka’s Microsoft Excel Course
  • Learning Pandas By Kaggle (Recommended)
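A minimal sketch of the cleaning steps described above, assuming pandas and NumPy are installed. The data frame is made up, and the choices here (treating negative ages as invalid, filling numeric gaps with the median, labelling missing cities as "Unknown") are illustrative assumptions rather than fixed rules.

```python
import numpy as np
import pandas as pd

# Made-up raw data with a duplicate row, missing values, and an invalid age
raw = pd.DataFrame({
    "age": [25, 25, np.nan, 31, -1],
    "city": ["Pune", "Pune", "Leeds", None, "Pune"],
    "income": [50000.0, 50000.0, 62000.0, np.nan, 58000.0],
})

# 1. Remove exact duplicate rows
df = raw.drop_duplicates()

# 2. Remove unwanted values: keep rows whose age is missing or non-negative
df = df[df["age"].isna() | (df["age"] >= 0)]

# 3. Handle missing values: median for numeric columns, a label for text
df = df.assign(
    age=df["age"].fillna(df["age"].median()),
    income=df["income"].fillna(df["income"].median()),
    city=df["city"].fillna("Unknown"),
).reset_index(drop=True)

print(df)
```

Which strategy is appropriate depends on the data set; dropping rows, filling with a mean or median, or flagging missing values in a separate column are all common options.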


Exploratory Data Analysis

Exploratory data analysis (EDA) is an essential part of data science. The data scientist's tasks include finding patterns in the data, analyzing it, identifying trends, and obtaining valuable insights with the help of various graphical and statistical methods, including:

A) Data Analysis using pandas and NumPy

B) Data Manipulation

C) Data Visualization

  • Intro To EDA By Code Heroku (Recommended)
  • Blog — Performing EDA on Iris Data Set (Recommended)
  • Coursera Course On EDA, Statistics, Probability
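A first pass at the analysis and manipulation steps listed above might look like the following sketch, assuming pandas is installed. The small table is invented for illustration (it is not the Iris data set), and plotting is left as a comment so the example runs without Matplotlib.

```python
import pandas as pd

# Made-up measurements for two hypothetical groups
df = pd.DataFrame({
    "species": ["a", "a", "a", "b", "b", "b"],
    "length": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "width": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
})

# Summary statistics (count, mean, std, quartiles) for numeric columns
summary = df.describe()

# Group-wise comparison: mean length per species
mean_length = df.groupby("species")["length"].mean()

# Relationship between two variables: Pearson correlation
corr = df["length"].corr(df["width"])

print(summary)
print(mean_length)
print(f"correlation(length, width) = {corr:.2f}")

# With Matplotlib installed, df.plot.scatter(x="length", y="width")
# would draw the same relationship as a scatter plot.
```

In practice these summaries are usually the starting point: they suggest which columns to plot, which groups to compare, and where the data may need further cleaning.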


Learn Machine Learning Algorithms

According to Google, “Machine learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention.”

This is the most crucial step in a data scientist's life cycle: one has to build models using machine learning algorithms, make predictions with them, and arrive at the best solution to the problem at hand.

  • Machine Learning By Andrew Ng (Recommended)
  • Deep Learning By Krish Naik
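To show what "learning from data" can look like in its simplest form, the sketch below fits a straight line by ordinary least squares using NumPy, then checks it on held-out data. The data is synthetic (generated from a known line plus noise), and linear regression is chosen here only as an example; it is one of many algorithms the courses above cover.

```python
import numpy as np

# Synthetic data: y = 3x + 2 plus Gaussian noise (an assumption for the demo)
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 2.0 + rng.normal(0, 0.5, size=200)

# Hold out the last 50 points to evaluate the fitted model
x_train, x_test = x[:150], x[150:]
y_train, y_test = y[:150], y[150:]

# Design matrix with a column of ones so the model can learn an intercept
A = np.column_stack([x_train, np.ones_like(x_train)])

# Ordinary least squares: find slope and intercept minimizing squared error
(slope, intercept), *_ = np.linalg.lstsq(A, y_train, rcond=None)

# Predict on unseen data and measure the root mean squared error
predictions = slope * x_test + intercept
rmse = float(np.sqrt(np.mean((predictions - y_test) ** 2)))

print(f"slope={slope:.2f}, intercept={intercept:.2f}, test RMSE={rmse:.2f}")
```

The fitted slope and intercept should come out near the values used to generate the data, and the test error should be close to the noise level. Libraries such as scikit-learn wrap this same fit-then-evaluate pattern behind a common interface.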


Practice On Analytics Vidhya and Kaggle

After acquiring the basics of data science, it's time to get hands-on experience. Online platforms such as Kaggle and Analytics Vidhya offer practice on both beginner and advanced data sets. They can help you understand various machine learning algorithms, different analysis techniques, and more.

