Machine learning workflow | AI Platform | Google Cloud

Curated from: cloud.google.com

A brief description of machine learning

Machine learning (ML) is a subfield of artificial intelligence (AI). The goal of ML is to make computers learn from the data that you give them. Instead of writing code that describes the action the computer should take, your code provides an algorithm that adapts based on examples of intended behavior.

The resulting program, consisting of the algorithm and associated learned parameters, is called a trained model.

The ML workflow

To develop and manage a production-ready model, you must work through the following stages:

Source and prepare your data.
Develop your model.
Train an ML model on your data:
    Train model
    Evaluate model accuracy
    Tune hyperparameters
Deploy your trained model.
Send prediction requests to your model:
    Online prediction
    Batch prediction
Monitor the predictions on an ongoing basis.
Manage your models and model versions.

These stages are iterative. You may need to reevaluate and go back to a previous step at any point in the process.

Evaluate the problem

Before you start thinking about how to solve a problem with ML, take some time to think about the problem you are trying to solve. Ask yourself the following questions:

Do you have a well-defined problem to solve? Many different approaches are possible when using ML to recognize patterns in data.
Is ML the best solution for the problem? Supervised ML (the style of ML described in this documentation) is well suited to certain kinds of problems.
How can you measure the model's success? One of the biggest challenges of creating an ML model is knowing when the model development phase is complete.
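
One way to make the last question concrete is to agree on a metric and a threshold before development starts. The sketch below is only an illustration: mean absolute error is one possible metric among many, and the function names and the sample numbers are made up for this example.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and known values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("predicted and actual must be non-empty and the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def meets_goal(predicted, actual, max_error):
    """True when the model's average error is within the agreed threshold."""
    return mean_absolute_error(predicted, actual) <= max_error


# Illustrative numbers only: two predictions checked against two known values.
error = mean_absolute_error([100, 200], [110, 180])
```

With a threshold agreed in advance, "is development complete?" becomes a check such as `meets_goal(predictions, known_values, max_error=20)`.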

Source and prepare your data

You must have access to a large set of training data that includes the attribute (called a feature in ML) that you want to be able to infer (predict) based on the other features.

For example, assume you want your model to predict the sale price of a house. Begin with a large set of data describing the characteristics of houses in a given area, including the sale price of each house.
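
As a rough sketch of what such a dataset might look like in code, each house can be a record whose sale price is the target and whose other attributes are the input features. The field names and values below are invented for illustration and do not come from any real dataset.

```python
# Hypothetical rows: each dict describes one house; "sale_price" is the target.
houses = [
    {"square_feet": 1400, "bedrooms": 3, "sale_price": 240000},
    {"square_feet": 2000, "bedrooms": 4, "sale_price": 330000},
    {"square_feet": 850, "bedrooms": 2, "sale_price": 150000},
]

TARGET = "sale_price"


def split_features_and_target(rows, target):
    """Separate each row into its input features and its target value."""
    features = [{k: v for k, v in row.items() if k != target} for row in rows]
    targets = [row[target] for row in rows]
    return features, targets


features, targets = split_features_and_target(houses, TARGET)
```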

Data analysis

Having sourced your data, you must analyze and understand the data and prepare it to be the input to the training process. For example, you may need to perform the following steps:

Join data from multiple sources and rationalize it into one dataset.
Visualize the data to look for trends.
Use data-centric languages and tools to find patterns in the data.
Identify features in your data. Features comprise the subset of data attributes that you use in your model.
Clean the data: find and correct or remove any anomalous values caused by errors in data entry or measurement.
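
The joining and cleaning steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions: the two sources, the shared house ID, and the "more than 10 times away from the median" rule are all hypothetical choices made for this example, not a recommended anomaly test.

```python
from statistics import median

# Hypothetical sources, both keyed by a shared house ID.
listings = {
    1: {"square_feet": 1400},
    2: {"square_feet": 2000},
    3: {"square_feet": 850},
    4: {"square_feet": 1600},
}
sales = {
    1: {"sale_price": 240000},
    2: {"sale_price": 330000},
    3: {"sale_price": 15},  # likely a data-entry error
    4: {"sale_price": 275000},
}


def join_sources(left, right):
    """Inner-join two dicts of records on the keys they share."""
    shared = sorted(left.keys() & right.keys())
    return [{"id": key, **left[key], **right[key]} for key in shared]


def flag_anomalies(rows, field, factor=10):
    """Return rows whose value is more than `factor` times above or below the median."""
    mid = median(row[field] for row in rows)
    return [row for row in rows if row[field] > mid * factor or row[field] < mid / factor]


dataset = join_sources(listings, sales)
anomalies = flag_anomalies(dataset, "sale_price")
```

Flagged rows still need a human decision: correct the value if the true one is known, or drop the row.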

Data preprocessing

In the preprocessing step, you transform valid, clean data into the format that best suits the needs of your model. Here are some examples of data preprocessing:

Normalizing numeric data to a common scale.
Applying formatting rules to data. For example, removing the HTML tagging from a text feature.
Reducing data redundancy through simplification. For example, converting a text feature to a bag of words representation.
Representing text numerically. For example, assigning a numeric value to each possible value of a categorical feature.
Assigning key values to data instances.
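
Several of the preprocessing examples above can be sketched with the Python standard library. These helpers are simplified illustrations with made-up names. A production pipeline would more likely use a library such as tf.transform or scikit-learn, and the regex-based tag stripping here is deliberately crude.

```python
import re
from collections import Counter


def min_max_scale(values):
    """Rescale numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def strip_html(text):
    """Remove anything that looks like an HTML tag (a crude regex, not a parser)."""
    return re.sub(r"<[^>]+>", "", text)


def bag_of_words(text):
    """Count lowercase word occurrences, ignoring word order."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def encode_categories(values):
    """Map each distinct category to an integer, assigned in sorted order."""
    mapping = {category: index for index, category in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


scaled = min_max_scale([850, 1400, 2000])
clean_text = strip_html("<p>Sunny <b>garden</b></p>")
word_counts = bag_of_words("Sunny garden, sunny porch")
codes, mapping = encode_categories(["condo", "house", "condo"])
```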

Google Cloud support for data exploration and preparation

TensorFlow has several preprocessing libraries that you can use with AI Platform, such as tf.transform.

You can deploy and serve scikit-learn pipelines on AI Platform to apply built-in transforms for training and online prediction. Applying custom transformations is in beta.

You can deploy a custom prediction routine (beta) to make sure AI Platform preprocesses input at prediction time in the same way that you preprocessed data during training.

Google Cloud services

Vertex AI Workbench user-managed notebooks are Deep Learning VM Images instances pre-packaged with JupyterLab notebooks and optimized for deep learning data science tasks, from data preparation and exploration to quick prototype development.
BigQuery is a fully managed data warehouse service that allows ad hoc analysis on real-time data with standard SQL.
Dataproc is a fully managed cloud service for running Apache Spark and Apache Hadoop clusters.

Code your model

Develop your model using established ML techniques or by defining new operations and approaches.

Start learning by working through TensorFlow's getting started guide. You can also follow the scikit-learn documentation or the XGBoost documentation to create your model. Then examine some code samples designed to work with AI Platform.

Train, evaluate, and tune your model

AI Platform provides the services you need to train and evaluate your model in the cloud. In addition, AI Platform offers hyperparameter tuning functionality to optimize the training process.

When training your model, you feed it data for which you already know the value for your target data attribute (feature). You run the model to predict those target values for your training data, so that the model can adjust its settings to better fit the data and thus to predict the target value more accurately.
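
To illustrate the "predict, then adjust" loop described above, here is a minimal sketch that fits a one-feature linear model by gradient descent. It is not how AI Platform trains models. The model form, the learning rate, the epoch count, and the toy data are all assumptions chosen to keep the example small.

```python
def train_linear_model(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = weight * x + bias by gradient descent on mean squared error."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Predict the known targets with the current settings and measure the error.
        errors = [(weight * x + bias) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to each setting.
        grad_weight = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_bias = 2 * sum(errors) / n
        # Adjust the settings in the direction that reduces the error.
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias


# Toy training data generated from y = 2x + 1, so the known targets are exact.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
weight, bias = train_linear_model(xs, ys)
```

The learned `weight` and `bias` play the role of the "settings" in the text. Together with the prediction rule, they make up the trained model.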

Testing your model

During training, you apply the model to known data and adjust its settings to improve the results. When your results are good enough for the needs of your application, you should deploy the model to whatever system your application uses and test it.

To test your model, run data through it in a context as close as possible to your final application and your production infrastructure.

Use a different dataset from those used for training and evaluation. Ideally, you should use a separate set of data each time you test, so that your model is tested with data that it has never processed before.
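
One common way to keep test data separate is to split the dataset once, up front, into training, evaluation, and test portions. The sketch below assumes a simple random split. The 60/20/20 proportions and the function name are illustrative choices, not a recommendation from the source.

```python
import random


def split_dataset(rows, eval_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and split them into train, evaluation, and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_eval = int(len(shuffled) * eval_fraction)
    test = shuffled[:n_test]
    evaluation = shuffled[n_test:n_test + n_eval]
    train = shuffled[n_test + n_eval:]
    return train, evaluation, test


train, evaluation, test = split_dataset(range(10))
```

Because the three portions never overlap, the test set contains only rows the model did not see during training or evaluation.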

Host your model in the cloud

AI Platform provides tools to upload your trained ML model to the cloud, so that you can send prediction requests to the model.

In order to deploy your trained model on AI Platform, you must save your trained model using the tools provided by your machine learning framework. This involves serializing the information that represents your trained model into a file which you can deploy for prediction in the cloud.

Then you upload the saved model to a Cloud Storage bucket, and create a model resource on AI Platform, specifying the Cloud Storage path to your saved model.
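
The serialization step can be sketched with Python's built-in pickle module. Here a plain dict of made-up parameters stands in for a trained model. Real frameworks ship their own export tools, and the file name AI Platform expects depends on the framework, so treat `model.pkl` as an assumption. The upload to Cloud Storage and the creation of the model resource are not shown.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model: learned parameters only (hypothetical values).
trained_model = {"weight": 2.0, "bias": 1.0}


def save_model(model, directory, filename="model.pkl"):
    """Serialize the model to a file and return the file's path."""
    path = os.path.join(directory, filename)
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return path


def load_model(path):
    """Read a serialized model back from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Round trip through a temporary directory to confirm the file is usable.
with tempfile.TemporaryDirectory() as export_dir:
    model_path = save_model(trained_model, export_dir)
    restored = load_model(model_path)
```

Loading the file back before uploading it is a cheap check that the saved artifact actually contains the model you trained.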

Send prediction requests to your model

AI Platform provides the services you need to request predictions from your model in the cloud.

There are two ways to get predictions from trained models: online prediction (sometimes called HTTP prediction) and batch prediction. In both cases, you pass input data to a cloud-hosted machine-learning model and get inferences for each data instance.
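
The difference between the two modes shows up in how the input is packaged. The sketch below builds an online request body using the `instances` field from AI Platform's documented prediction request format, and, as an assumption about batch input, writes one JSON instance per line. The feature names are invented, and no request is actually sent.

```python
import json

# Hypothetical instances; the feature names must match what the model expects.
instances = [
    {"square_feet": 1400, "bedrooms": 3},
    {"square_feet": 2000, "bedrooms": 4},
]


def online_request_body(instances):
    """Build a JSON body that carries all instances in a single request."""
    return json.dumps({"instances": instances})


def batch_input_lines(instances):
    """Build newline-delimited JSON, one instance per line, for a batch input file."""
    return "\n".join(json.dumps(instance) for instance in instances)


body = online_request_body(instances)
batch_text = batch_input_lines(instances)
```

Online prediction suits small, latency-sensitive requests. Batch prediction suits large input files where results can arrive later.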

Monitor your prediction service

Monitor the predictions on an ongoing basis. AI Platform provides APIs to examine running jobs. In addition, various Google Cloud tools support the operation of your deployed model, such as Cloud Logging and Cloud Monitoring.

Manage your models and model versions

AI Platform provides various interfaces for managing your model and versions, including a REST API, the gcloud ai-platform command-line tool, and the Cloud Console.

IDEAS CURATED BY

Lise Everett

@leverett
