Dataset - Deepstash
Machine Learning With Google

Learn more about artificialintelligence with this collection

Understanding machine learning models

Improving data analysis and decision-making

How Google uses logic in machine learning

Machine Learning With Google

Discover 95 similar ideas in

It takes just

14 mins to read

Dataset

For an ML model to learn and generalize, there is a need for a large and high-quality dataset. Therefore, we created the ManyTypes4Py dataset which contains 5.2K Python projects and 4.2M type annotations. To avoid data leakage from the training set to the test set, we used our CD4Py tool to perform code de-duplication in the dataset. To know more about how we created the dataset, check out its paper .

4

25 reads

MORE IDEAS ON THIS

Feature Extraction

As with most ML tasks, we need to find a set of relevant features in order to predict type annotations. Here, we consider features as type hints. Specifically, we extract three kinds of type hints, namely, identifiers, code context, and visible type hints (VTHs).

To learn from the extracted...

4

14 reads

Releasing

We have two different environments for development and production as it is common in software development. In the development env., we test, debug, and profile Type4Py’s server-side components before releasing new features/fixes into the production code.

Also, whenever we train a new Type4...

4

6 reads

Implementation

To extract type hints, we first extract Abstract Syntax Trees (ASTs) and perform light-weight static analysis using our LibSA4Py package. NLP tasks are applied using NLTK. To train the Word2Vec model, w...

4

7 reads

Development and Release of Type4Py: Machine Learning-based Type Auto-completion for Python

Dynamic programming languages like Python and TypeScript allows developers to optionally define type annotations and benefit from the advantages of static typing such as better code completion, early bug detection.

However, retrofitting types is a cumbersome and error-prone process. To addr...

4

46 reads

Overview

Overview

Before going into the detail of the Type4Py’s model and its implementation, it would be helpful to see the overview of Type4Py. In general, there is a VSCode extension at the client-side (developers) and the Type4Py model and its pipeline are deployed on our servers. Simply, the extension sends a...

4

28 reads

Roadmap

So far, I have described the current state of Type4Py. For future work, here is our roadmap:

  • Enabling the type-checking process for the Type4Py’s predictions using mypy , preferably at the client-side.
  • Releasing a local v...

4

11 reads

Deployment

To deploy the Type4Py model for the production environment, we convert the pre-trained PyTorch model to an ONNX model which allows us to query the model on both GPUs and CPUs with very fast inference speed and low...

4

5 reads

Model Architecture & Training

Model Architecture & Training

The basic idea is that two RNN captures different aspects of input sequences from both identifiers and code context.

Next, the output of two RNNs is concatenated into a single vector, which is passed through a fully-connected linear layer.

The final linear layer maps the learned typ...

4

12 reads

VSCode Extension

VSCode Extension

The Type4Py’s VSCode extension is small and simple.

Type slots are functions parameters, return types, and variables, which are located based on the line and column numbers. Currently, type prediction can be triggered via Command Pallete or by enabling the AutoInfer setting, which predicts ...

4

14 reads

CURATED FROM

CURATED BY

decebaldobrica

#engineering, #machinelearning and #crypto

Related collections

More like this

Python data processing with pandas

Pandas is a Python language package, which is used for data processing. This is a very common basic programming library when we use Python language for machine learning programming. This article is an introductory tutorial to it. Pandas provide fast, flexible and expressive data structures with t...

stash-superman-illustration

Explore the World’s

Best Ideas

200,000+ ideas on pretty much any topic. Created by the smartest people around & well-organized so you can explore at will.

An Idea for Everything

Explore the biggest library of insights. And we've infused it with powerful filtering tools so you can easily find what you need.

Knowledge Library

Powerful Saving & Organizational Tools

Save ideas for later reading, for personalized stashes, or for remembering it later.

# Personal Growth

Take Your Ideas

Anywhere

Organize your ideas & listen on the go. And with Pro, there are no limits.

Listen on the go

Just press play and we take care of the words.

Never worry about spotty connections

No Internet access? No problem. Within the mobile app, all your ideas are available, even when offline.

Get Organized with Stashes

Ideas for your next work project? Quotes that inspire you? Put them in the right place so you never lose them.

Join

2 Million Stashers

4.8

5,740 Reviews

App Store

4.7

72,690 Reviews

Google Play

Ashley Anthony

This app is LOADED with RELEVANT, HELPFUL, AND EDUCATIONAL material. It is creatively intellectual, yet minimal enough to not overstimulate and create a learning block. I am exceptionally impressed with this app!

Shankul Varada

Best app ever! You heard it right. This app has helped me get back on my quest to get things done while equipping myself with knowledge everyday.

Sean Green

Great interesting short snippets of informative articles. Highly recommended to anyone who loves information and lacks patience.

samz905

Don’t look further if you love learning new things. A refreshing concept that provides quick ideas for busy thought leaders.

Jamyson Haug

Great for quick bits of information and interesting ideas around whatever topics you are interested in. Visually, it looks great as well.

Ghazala Begum

Even five minutes a day will improve your thinking. I've come across new ideas and learnt to improve existing ways to become more motivated, confident and happier.

Laetitia Berton

I have only been using it for a few days now, but I have found answers to questions I had never consciously formulated, or to problems I face everyday at work or at home. I wish I had found this earlier, highly recommended!

Giovanna Scalzone

Brilliant. It feels fresh and encouraging. So many interesting pieces of information that are just enough to absorb and apply. So happy I found this.

Read & Learn

20x Faster

without
deepstash

with
deepstash

with

deepstash

Access to 200,000+ ideas

Access to the mobile app

Unlimited idea saving & library

Unlimited history

Unlimited listening to ideas

Downloading & offline access

Personalized recommendations

Supercharge your mind with one idea per day

Enter your email and spend 1 minute every day to learn something new.

Email

I agree to receive email updates