Data preprocessing - Deepstash
Machine Learning With Google

Data preprocessing

In the preprocessing step, you transform valid, clean data into the format that best suits the needs of your model. Here are some examples of data preprocessing:

  • Normalizing numeric data to a common scale.
  • Applying formatting rules to data. For example, removing the HTML tagging from a text feature.
  • Reducing data redundancy through simplification. For example, converting a text feature to a bag of words representation.
  • Representing text numerically. For example, assigning a numeric value to each possible value in a categorical feature.
  • Assigning key values to data instances.
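The examples above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the method used by any particular Google Cloud service: the function names (min_max_scale, strip_html, bag_of_words, encode_categories, assign_keys) are hypothetical, min-max scaling is just one possible normalization, and the regex-based tag removal is adequate for simple markup only.

```python
import re
from collections import Counter
from html import unescape


def min_max_scale(values):
    """Normalize numbers to the common scale [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def strip_html(text):
    """Remove HTML tags, decode entities, and collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", text)
    return " ".join(unescape(no_tags).split())


def bag_of_words(text):
    """Reduce a text to lowercase word counts, discarding word order."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def encode_categories(values):
    """Assign an integer to each distinct category, in order of first appearance."""
    mapping = {}
    for v in values:
        mapping.setdefault(v, len(mapping))
    return [mapping[v] for v in values], mapping


def assign_keys(rows):
    """Give each data instance a key so it can be tracked through later steps."""
    return {key: row for key, row in enumerate(rows)}


# Hypothetical sample values, invented for this illustration.
scaled = min_max_scale([800, 1200, 1600])
clean = strip_html("<p>Cozy <b>cottage</b> &amp; garden</p>")
words = bag_of_words(clean)
codes, mapping = encode_categories(["house", "flat", "house"])
keyed = assign_keys(["first row", "second row"])
```

In practice a library such as scikit-learn or tf.transform would usually handle these steps; the sketch only shows what each transformation does to the data.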


MORE IDEAS ON THIS

Train, evaluate, and tune your model

AI Platform provides the services you need to train and evaluate your model in the cloud. In addition, AI Platform offers hyperparameter tuning functionality to optimize the training process.

When training your model, you feed it data for which you already know the value for your target dat...
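As a rough local illustration of training on data with known targets, evaluating on held-out data, and tuning a hyperparameter, the sketch below uses a tiny nearest-neighbour regressor in plain Python. It does not call AI Platform or its hyperparameter tuning service; the model choice, the function names (knn_predict, mean_abs_error, tune_k), and the data are all assumptions made for the example.

```python
def knn_predict(train, x, k):
    """Predict by averaging the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mean_abs_error(train, held_out, k):
    """Evaluate: average absolute error on data the model was not fitted to."""
    errors = [abs(knn_predict(train, x, k) - y) for x, y in held_out]
    return sum(errors) / len(errors)


def tune_k(train, validation, candidates):
    """Tune: pick the candidate k with the lowest validation error."""
    return min(candidates, key=lambda k: mean_abs_error(train, validation, k))


# Hypothetical (feature, known target) pairs following y = 2x.
train = [(x, 2.0 * x) for x in range(10)]
validation = [(2.5, 5.0), (6.5, 13.0)]

best_k = tune_k(train, validation, candidates=[1, 2, 5])
```

Here k plays the role of the hyperparameter: it is not learned from the training data, so it is chosen by comparing validation error across candidate values.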


Data analysis

Having sourced your data, you must analyze and understand the data and prepare it to be the input to the training process. For example, you may need to perform the following steps:

  • Join data from multiple sources and rationalize it into one dataset.
  • Visualize the data to look f...


Source and prepare your data

You must have access to a large set of training data that includes the attribute (called a feature in ML) that you want to be able to infer (predict) based on the other features.

For example, assume you want your model to predict the sale price of a house. Begin with a large set of data des...
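To make the distinction between the target and the other features concrete, here is a minimal sketch that separates the value to predict from the remaining attributes. The records, the field names (sqft, bedrooms, sale_price), and the helper split_features_target are hypothetical and are not taken from the source.

```python
def split_features_target(records, target):
    """Separate the attribute to predict from the features used to predict it."""
    features = [{k: v for k, v in r.items() if k != target} for r in records]
    targets = [r[target] for r in records]
    return features, targets


# Hypothetical training records, invented for this illustration.
houses = [
    {"sqft": 1400, "bedrooms": 3, "sale_price": 310000},
    {"sqft": 900, "bedrooms": 2, "sale_price": 205000},
]

X, y = split_features_target(houses, target="sale_price")
```

After the split, X holds the inputs the model sees and y holds the known answers it is trained to reproduce.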


Code your model

Develop your model using established ML techniques or by defining new operations and approaches.

Start learning by working through TensorFlow's getting started guide. You can also follow the scikit-learn documentation or the XGBoost documentation to create your model. Then examine some code...


A brief description of machine learning

Machine learning (ML) is a subfield of artificial intelligence (AI). The goal of ML is to make computers learn from the data that you give them. Instead of writing code that describes the action the computer should take, your code provides an algorithm that adapts based on examples of intended be...


Manage your models and model versions

AI Platform provides various interfaces for managing your model and versions, including a REST API, the gcloud ai-platform command-line tool, and...


Testing your model

During training, you apply the model to known data to adjust the settings to improve the results. When your results are good enough for the needs of your application, you should deploy the model to whatever system your application uses and test it.

To test your model, run data through it in...


Google Cloud services

  • Vertex AI Workbench user-managed notebooks are Deep Learning VM Images instances pre-packaged with JupyterLab notebooks and optimized for deep learning data science tasks, from data preparation and exploration to quick prototype development.
  • BigQuery is a fully managed data warehouse...


Monitor your prediction service

Monitor the predictions on an ongoing basis. AI Platform provides APIs to examine running jobs. In addition, various Google Cloud tools support the operation of your deployed model, such as Cloud Logging and Cloud Monitoring.


Host your model in the cloud

AI Platform provides tools to upload your trained ML model to the cloud, so that you can send prediction requests to the model.

In order to deploy your trained model on AI Platform, you must save your trained model using the tools provided by your machine learning framework. This involves s...


Evaluate the problem

Before you start thinking about how to solve a problem with ML, take some time to think about the problem you are trying to solve. Ask yourself the following questions:

  • Do you have a well-defined problem to solve? Many different approaches are possible when using ML ...


Google Cloud support for data exploration and preparation

TensorFlow has several preprocessing libraries that you can use with AI Platform, for example tf.transform.

You can deploy and serve scikit-learn pipelines on AI Platform to apply built-in transforms for training and online prediction. Applying custom transformations is in beta.

You ...


Send prediction requests to your model

AI Platform provides the services you need to request predictions from your model in the cloud.

There are two ways to get predictions from trained models: online prediction (sometimes called HTTP prediction) and batch prediction. In both cases, you pass input data to a cloud-hosted machine-...


The ML workflow

To develop and manage a production-ready model, you must work through the following stages:

  • Source and prepare your data.
  • Develop your model.
  • Train an ML model on your data:
      • Train model
      • Evaluate model accuracy
      • Tune hyper...


CURATED BY

leverett
