9: Term Frequency-Inverse Document Frequency - Deepstash


Term Frequency (TF) is the number of times a word appears in a document divided by the total number of words in that document. Thus the word "seal" appearing once in a thousand-word article has a term frequency of 0.001. By itself, TF is a poor indicator of term importance, because common function words (such as a, and, the, and it) predominate.
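The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a production tokenizer: the function name `term_frequency` and the synthetic 1,000-word document are assumptions made up for the example.

```python
from collections import Counter

def term_frequency(tokens):
    """Return {word: count / total_words} for one tokenized document."""
    total = len(tokens)
    counts = Counter(tokens)
    return {word: count / total for word, count in counts.items()}

# Hypothetical 1,000-word document: "seal" once, "the" 60 times,
# and a placeholder token for the remaining 939 words.
doc = ["seal"] + ["the"] * 60 + ["filler"] * 939
tf = term_frequency(doc)

print(tf["seal"])  # 0.001
print(tf["the"])   # 0.06
```

Even in this toy document, "the" scores sixty times higher than "seal", which is why raw TF alone says little about which words matter.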

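TF-IDF corrects for this by weighting each word's term frequency by its Inverse Document Frequency (IDF), which discounts words that appear in most documents of a collection. It is a popular method for initial filtering passes in Natural Language Processing frameworks.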
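One way to make the TF-IDF idea concrete is the sketch below. It assumes the common textbook weighting idf = log(N / df), where N is the number of documents and df is the number of documents containing the word; the source does not specify a formula, and real libraries often use smoothed variants. The function name `tf_idf` and the three-sentence corpus are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {word: score} dict per document.

    Assumes the unsmoothed form: score = tf * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return scores

# Hypothetical toy corpus, tokenized by whitespace.
corpus = [
    "the seal swam in the sea".split(),
    "the cat sat on the mat".split(),
    "the dog and the cat".split(),
]
scores = tf_idf(corpus)

print(scores[0]["the"])       # 0.0, because "the" occurs in every document
print(scores[0]["seal"] > 0)  # True, because "seal" occurs in only one document
```

Under this weighting a word found in every document gets a score of zero regardless of how often it appears, while a word confined to one document keeps a positive score.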

The Math Of Machine Learning
