Natural Language Processing: Word Formation - Deepstash

Morphology is the study of word structure.

Words consist of one or more morphemes (cats = cat + s). A morpheme is the smallest meaning-bearing unit of language.

Morphemes fall into two main types:

  • stem (can stand on its own)
  • affix (can't stand on its own)

Words are formed through:

  • inflection: produces different forms of the same word. E.g. word, words; work, worked
  • derivation: produces a new word with a changed meaning; it does not apply to every word in a class. E.g. act -> actor
  • compounding: combines two stems. E.g. stardust = star + dust
  • cliticization: word + clitic. E.g. we're, you're

For counting words:

  • tokens: individual occurrences of word strings in a text; every repeat is counted
  • types: distinct words
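The difference between tokens and types can be sketched in a few lines of Python. The tokenizer here is a deliberately naive assumption (lowercase, then runs of letters and apostrophes), and the function name is illustrative only.

```python
import re
from collections import Counter

def count_tokens_and_types(text):
    # Naive tokenizer (an assumption): lowercase, then take runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Each distinct token string is one type.
    types = Counter(tokens)
    return len(tokens), len(types)

n_tokens, n_types = count_tokens_and_types("the cat sat on the mat")
print(n_tokens, n_types)  # 6 5
```

"the" occurs twice, so it adds two tokens but only one type.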

Lemmatization is a vocabulary reduction process that maps inflected word forms to their lemma (dictionary form). E.g. sang, sung, sings to sing
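As a rough illustration, lemmatization can be pictured as a dictionary lookup. The table below is a hypothetical, hand-written one covering only the example above; real lemmatizers rely on a full dictionary and morphological analysis.

```python
# Hypothetical hand-written table, for illustration only.
LEMMAS = {"sang": "sing", "sung": "sing", "sings": "sing", "sing": "sing"}

def lemmatize(word):
    # Fall back to the lowercased word itself when it is not in the table.
    key = word.lower()
    return LEMMAS.get(key, key)

print([lemmatize(w) for w in ["sang", "sung", "sings"]])  # ['sing', 'sing', 'sing']
```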

Stemming is the process of reducing words to stems by stripping affixes; the resulting stem need not be a valid word. E.g. information to inform, retrieval to retriev
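A minimal sketch of stemming by suffix stripping, assuming a tiny illustrative suffix list. This is not the rule set of any real stemmer; the list and the `min_stem_len` parameter are assumptions chosen to reproduce the examples above.

```python
SUFFIXES = ("ation", "al", "ing", "ed", "s")  # illustrative, not a real rule set

def stem(word, min_stem_len=3):
    word = word.lower()
    for suffix in SUFFIXES:
        # Strip the first matching suffix, but never leave a very short stem.
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word

print(stem("information"), stem("retrieval"))  # inform retriev
```

Unlike a lemma, the output ("retriev") does not have to be a dictionary word.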

Types of stemming errors:

  • omission: related words are not reduced to the same stem. E.g. European and Europe
  • commission: unrelated words are reduced to the same stem. E.g. policy and police are both reduced to polic
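Both error types can be reproduced with a toy stemmer. The single rule used here (drop one trailing "e" or "y") is a hypothetical stand-in picked to mirror the examples above, not the behaviour of any particular published stemmer.

```python
def toy_stem(word):
    # Hypothetical rule: lowercase, then drop one trailing "e" or "y".
    word = word.lower()
    return word[:-1] if word.endswith(("e", "y")) else word

def conflated(a, b, stemmer=toy_stem):
    # True when the stemmer maps both words to the same stem.
    return stemmer(a) == stemmer(b)

print(conflated("European", "Europe"))  # False: related words kept apart (omission)
print(conflated("policy", "police"))    # True: unrelated words merged (commission)
```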

Note: the minimum edit distance algorithm can be used to measure the distance between two words.
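A sketch of minimum edit distance using dynamic programming, assuming a cost of 1 for each insertion, deletion and substitution (the Levenshtein variant). Other cost schemes exist, so the numbers below hold only under that assumption.

```python
def min_edit_distance(source, target):
    # prev[j] is the distance between the source prefix processed so far and target[:j].
    prev = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        curr = [i]  # turning source[:i] into the empty string takes i deletions
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

print(min_edit_distance("policy", "police"))   # 1
print(min_edit_distance("kitten", "sitting"))  # 3
```

A small distance only says that two strings look alike; as policy and police show, it does not mean the words are related in meaning.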

IDEAS CURATED BY

magdamihalache

User Researcher, passionate about behaviours and building the right products. I 'stash' about research, self-development and education.
