Term Frequency (TF) divides the number of times a word appears in a document by the total number of words in that document. Thus the word "seal" appearing once in a thousand-word article has a term frequency of 0.001. By itself, TF is a poor indicator of term importance, because common function words that carry little meaning (such as a, and, the, and it) dominate the counts.
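The calculation can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `term_frequency` and the toy document are assumptions made for the example, and the input is assumed to be already split into lowercase tokens.

```python
from collections import Counter


def term_frequency(tokens):
    """Return a dict mapping each token to count / total number of tokens."""
    total = len(tokens)
    counts = Counter(tokens)
    return {word: count / total for word, count in counts.items()}


# Hypothetical thousand-token document: "seal" appears once,
# and the remaining 999 tokens are common function words.
tokens = ["seal"] + ["the"] * 600 + ["a"] * 250 + ["and"] * 149
tf = term_frequency(tokens)

print(tf["seal"])  # 1 occurrence out of 1000 tokens
print(tf["the"])   # 600 occurrences out of 1000 tokens
```

In this toy document "the" scores 600 times higher than "seal", which shows why raw TF tends to rank function words above the terms that describe the topic.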
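TF-IDF addresses this weakness by multiplying each term's TF by its inverse document frequency (IDF), which shrinks the score of words that appear in most documents of a collection. It is a popular method for initial filtering passes in Natural Language Processing frameworks.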
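As a rough sketch of how the weighting behaves, the Python below scores three toy documents. It assumes one common textbook variant, IDF = log(N / df), where N is the number of documents and df is the number of documents containing the term; libraries often use smoothed variants instead. The function name `tf_idf` and the sample sentences are made up for the example.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Score every term in every tokenized document with TF * log(N / df)."""
    n_docs = len(documents)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in documents:
        df.update(set(tokens))

    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return scores


# Hypothetical three-document collection, already lowercased.
docs = [
    "the seal swam in the sea".split(),
    "the dog ran in the park".split(),
    "the cat sat on the mat".split(),
]
scores = tf_idf(docs)

# "the" occurs in every document, so log(3 / 3) = 0 cancels its high TF.
# "seal" occurs in one document only, so it keeps a positive score.
print(scores[0]["the"], scores[0]["seal"])
```

Under this variant, a word found in every document gets a score of exactly zero regardless of how often it occurs, which is the filtering effect that makes the method useful as a first pass.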
The Math Of Machine Learning