So you want to teach computers - Deepstash

The Nine Circles of Machine Learning

Are you interested in Machine Learning? Unsure if you should invest the time into learning more about it?

Well, here are the key steps in any Machine Learning workflow:

  • The Dreaded Dataset
  • Data Preprocessing from Hell
  • Finally Develop the Model
  • Give the Model a Workout
  • Evaluate your Unfortunate Results
  • Do it all over again. But better.

Sorry, there aren’t nine. But the joke will make sense later!

Define the Problem

I lied, there’s one more step. Before anything else, you must define the problem your machine learning algorithm is going to solve. Or try to solve.

Is it related to text, images, video? Sound? You can mix and match, too.

And is it classification, segmentation? Detection?

This will inform the dataset you’re looking for or — God forbid — building. And also which type of model you should use.

The Dreaded Dataset

Here’s where to look for a dataset:

  • Dataset search engines: Kaggle and Google Dataset Search.
  • Papers with similar problems: Google Scholar and Papers with Code (PwC is a surprise tool that will help us later).

If Kaggle shrugged and Google laughed at you, if none of the researchers made their datasets available, or if you’re a trailblazer thinking of using data no one has gathered before… Time to get your hands dirty. Set up some crawlers, scour Google and YouTube. And may the force be with you.

Data Preprocessing from Hell

Preprocessing involves preparing the data, making sure it’s in the format the model requires.

If you’re working with images, always normalise! This isn’t the first or the most important step, but it’s the one that always gets me. And use OpenCV to turn images into arrays.

If you’re not, I can’t help you here. Google and Stack Overflow will guide you.
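
For the image case, here is a minimal sketch of what normalising can look like, assuming 8-bit images (pixel values 0 to 255) scaled to the range 0 to 1. The function name `normalise_image` is made up for illustration, and a tiny hand-built NumPy array stands in for what OpenCV's `cv2.imread` would return, so the snippet runs without an image file. Your model may expect a different range or per-channel statistics, so check what it was trained with.

```python
import numpy as np


def normalise_image(image: np.ndarray) -> np.ndarray:
    """Scale 8-bit pixel values (0-255) to floats in [0, 1]."""
    # Cast to float before dividing so the result is a float32 array
    return image.astype(np.float32) / 255.0


# Stand-in for `cv2.imread("some_image.png")`, which returns a uint8 array
# of shape (height, width, channels). This 2x2 three-channel "image" is invented.
fake_image = np.array(
    [[[0, 128, 255], [255, 255, 255]],
     [[64, 64, 64], [0, 0, 0]]],
    dtype=np.uint8,
)

normalised = normalise_image(fake_image)
print(normalised.dtype, normalised.min(), normalised.max())
```

The original array is left untouched, which makes it easy to try a different scaling later.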

Finally Develop the Model

Congratulations! Like Dante, you have muddled through the 9 rings of eternal damnation and survived to tell the tale. These are the pearly gates, this is the fun part.

Someone said once that there’s nothing new under the Sun, I think. And they were right. The first thing you should do is try and replicate models from Papers with Code. See, I told you it would help us.

Papers with Code is kind enough to sort papers by performance and provide GitHub links. All you have to do is clone the repository and get the model working, try it out with your dataset, and adjust it as needed.

Give the Model a Workout

It’s time to train the model. Hopefully you set data aside for testing. My bad, I should have mentioned that earlier.

Usually we use 70% of the data for training and 30% for testing. You can also split that 30% further: 20% for testing while you work on the model, and 10% as a hold-out set you only touch at the very end.
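
The split can be sketched in a few lines of plain Python. This is only an illustration: `split_dataset`, its default fractions, and the seed are assumptions of mine, and the list of numbers is a placeholder for real samples. Libraries such as scikit-learn ship their own splitting helpers, which you would likely use in practice.

```python
import random


def split_dataset(samples, train_frac=0.7, test_frac=0.2, seed=42):
    """Shuffle samples and split them into train, test, and hold-out lists."""
    shuffled = list(samples)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable

    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)

    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    holdout = shuffled[n_train + n_test:]  # whatever remains (10% with the defaults)
    return train, test, holdout


data = list(range(100))  # placeholder for real samples
train, test, holdout = split_dataset(data)
print(len(train), len(test), len(holdout))  # 70 20 10
```

Shuffling before slicing matters: if the data is sorted by class or by date, a plain slice would give the model a lopsided view.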

Training can happen in seconds, or it can take days. It all depends on how complex your data and your model are.

This is the part where you sit back and relax. Grab some coffee, take a nap.

Evaluate your Unfortunate Results

Hush, now! It’s okay. If it were perfect that would mean there’s no more fun to be had!

As far as metrics go, I like to go all out and catch them all. Keep in mind that if you have an unbalanced dataset (where one class has much less data), accuracy can be misleading: a model that only ever predicts the bigger class still scores well. Look at the sensitivity, specificity, all of those.
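
To see why accuracy misleads on unbalanced data, here is a small sketch with invented labels: 95 negatives, 5 positives, and a lazy "model" that always predicts the majority class. The helper `binary_metrics` is hypothetical and only handles 0/1 labels.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity (recall): share of actual positives the model caught
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of actual negatives the model correctly rejected
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Invented unbalanced dataset: 95 negatives, 5 positives
y_true = [0] * 95 + [1] * 5
# A lazy "model" that always predicts the majority class
y_pred = [0] * 100

metrics = binary_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.95, sensitivity 0.0, specificity 1.0
```

An accuracy of 95% looks great until the sensitivity of 0 shows the model never found a single positive.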

Do it all over again. But better.

Check the training metrics.

Are they much better than the test metrics? That means the model learned a bit too well. It memorised the data and doesn’t perform as well on new information. You need more data, or a different model altogether.

Now if the training metrics suck? That means the model sucked. You need to tweak it. Add more epochs or layers, change the loss or activation function. If nothing works, it might be time to switch out the model.

These are called overfitting and underfitting, respectively, if you want to be fancy.
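
The two checks above can be written as a rough rule of thumb. This is a sketch, not a standard recipe: `diagnose_fit` is a made-up helper, the scores are assumed to be "higher is better" values between 0 and 1 (accuracy, for example), and both thresholds are arbitrary numbers you would tune for your own problem.

```python
def diagnose_fit(train_score, test_score, good_enough=0.8, max_gap=0.1):
    """Label a model as underfitting, overfitting, or reasonable.

    Scores are assumed to be "higher is better" values between 0 and 1.
    Both thresholds are arbitrary illustration values.
    """
    if train_score < good_enough:
        # Poor results even on data the model has already seen
        return "underfitting"
    if train_score - test_score > max_gap:
        # Strong on training data, noticeably weaker on unseen data
        return "overfitting"
    return "looks reasonable"


print(diagnose_fit(train_score=0.99, test_score=0.70))  # overfitting
print(diagnose_fit(train_score=0.55, test_score=0.53))  # underfitting
print(diagnose_fit(train_score=0.90, test_score=0.87))  # looks reasonable
```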

Go forth and experiment!

All jokes aside, the joyous world of Machine Learning is well within your reach. It’s not as hard as it seems, and it is just as fun as I made it seem. Pinky promise.

And you’re sure to have a large community of people banging their heads over the same exact problems you’re having.

IDEAS CURATED BY

morgan.g

comp sci student just trying to survive

CURATOR'S NOTE

This is as much for me as it is for you. I love ranting about my job.
