Assignment Two


Jimma University

Jimma Institute of Technology


Faculty of Computing
Department of Information Technology
MSc in Information Technology
Course Code: - CMIT 6122
Course Title: - Natural Language Processing
Given Work: - Assignment Two

Submitted To: - Getachew Mammo (PhD)


Prepared by: - Zerihun Tadesse Gebre
ID No: - RM 2903/12-0

Why Is Morphological Analysis Important?

Morphological analysis is the process of providing grammatical information about a word on
the basis of the properties of the morphemes it contains. It is an integral part of larger natural
language processing projects such as text-to-speech synthesis, information extraction, and
machine translation. Morphology is the subdiscipline of linguistics that deals with the internal
structure of words. For example, consider the following set of English word pairs:

Verb     Noun
Bake     Baker
Eat      Eater
Run      Runner
Write    Writer

In these word pairs we observe a systematic form-meaning correspondence: the presence of –er
in the words in the right column correlates with the meaning component ‘one who Vs’ where V
stands for the meaning of the corresponding verb in the left column.
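
The regularity in the table can be sketched in a few lines of Python. This is only an illustration: the function name agent_noun and its two spelling rules (dropping a final silent e, doubling the final consonant of a short consonant-vowel-consonant verb) are assumptions chosen to cover the four pairs above, and the doubling rule would over-apply to longer verbs such as "open".

```python
VOWELS = set("aeiou")

def agent_noun(verb: str) -> str:
    """Attach the agentive suffix -er to a verb (toy rules, hypothetical helper)."""
    v = verb.lower()
    # Rule 1: a final silent "e" merges with the suffix (bake -> baker).
    if v.endswith("e"):
        return v + "r"
    # Rule 2: a consonant-vowel-consonant ending doubles its final consonant
    # (run -> runner). This is an approximation that suits short verbs only.
    if (len(v) >= 3 and v[-1] not in VOWELS and v[-1] not in "wxy"
            and v[-2] in VOWELS and v[-3] not in VOWELS):
        return v + v[-1] + "er"
    # Default: plain concatenation (eat -> eater).
    return v + "er"

for verb in ["Bake", "Eat", "Run", "Write"]:
    print(verb, "->", agent_noun(verb))
```

Each derived form carries the meaning 'one who Vs' for the corresponding verb, which is exactly the form-meaning correspondence described above.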

Morphological analysis also helps determine part-of-speech structure in syntactic parsing and
in the analysis of a sentence. Information about verbal inflection is especially important for
word order. Moreover, a single word may consist of two or more meaningful parts. These parts
are the smallest units of meaning, known as morphemes.

Morphology, the study of the nature of words, starts from morphemes. For example, the word
precancellation can be morphologically analyzed into three separate morphemes: the prefix pre-,
the root cancella, and the suffix -tion. The interpretation of a morpheme stays the same across
all words, so humans can break an unknown word into morphemes to understand its meaning.
For example, adding the suffix -ed to a verb conveys that the action of the verb took place in
the past. Morphemes that cannot be divided further and have meaning by themselves are called
lexical morphemes (e.g. table, chair). Morphemes that are combined with a lexical morpheme
(e.g. -ed, -ing, -est, -ly, -ful) are known as grammatical morphemes (e.g. worked, consulting,
smallest, likely, useful). Grammatical morphemes that occur only in combination with other
morphemes are called bound morphemes (e.g. -ed, -ing).

Morphological analyzers and generators are two essential, basic tools for building natural
language processing applications. They supply information about the morphosyntactic
properties of the words they analyze or construct.

A morphological analyzer is a program that analyzes the morphology of an input word. It reads
the inflected surface form of each word in a text and provides its lexical form together with
grammatical information: for nouns, gender, number, and case; for verbs, tense, aspect, and
modality. Generation is the inverse process: given a root and its grammatical features, a
generator produces the corresponding word form.
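
The inverse relationship between analysis and generation can be illustrated with a minimal Python sketch. Everything named here (LEXICON, SUFFIX_FEATURES, analyze, generate) is hypothetical, the lexicon holds only a handful of words, and only regular English suffixes are handled; a real analyzer would also need spelling rules and irregular forms.

```python
# Hypothetical toy lexicon: lemma -> part of speech.
LEXICON = {"work": "verb", "consult": "verb", "table": "noun", "chair": "noun"}

# Suffix -> grammatical features it signals (regular forms only, an assumption).
SUFFIX_FEATURES = {
    "verb": {"ed": {"tense": "past"}, "ing": {"aspect": "progressive"}},
    "noun": {"s": {"number": "plural"}},
}

def analyze(surface):
    """Map a surface form to (lemma, pos, features), or None if unknown."""
    word = surface.lower()
    if word in LEXICON:
        return word, LEXICON[word], {}
    for lemma, pos in LEXICON.items():
        for suffix, features in SUFFIX_FEATURES[pos].items():
            if word == lemma + suffix:
                return lemma, pos, dict(features)
    return None

def generate(lemma, features):
    """Inverse of analyze: build a surface form from a lemma and its features."""
    pos = LEXICON[lemma]
    if not features:
        return lemma
    for suffix, suffix_features in SUFFIX_FEATURES[pos].items():
        if suffix_features == features:
            return lemma + suffix
    raise ValueError(f"no regular form of {lemma!r} for {features}")

print(analyze("worked"))                       # ('work', 'verb', {'tense': 'past'})
print(generate("chair", {"number": "plural"}))  # chairs
```

Feeding the output of analyze back into generate returns the original word, which is what "inverse process" means here.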

The analyzer includes a recognition engine together with algorithms for identifying suffixes
and for finding the stem within the input word. It takes a complete word form as its input and
returns the syntactic and morphological properties of that word. Morphological analyzers are
composed of three parts:

• A morpheme lexicon
• A set of rules governing the spelling and composition of morphologically complex words
• A decision algorithm
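
A minimal Python sketch can show how three such parts (a morpheme inventory, spelling rules, and a decision procedure) might fit together. The stems, suffix labels, and function names are assumptions made for illustration; the two spelling rules cover only e-deletion and consonant doubling, and the label for -er ignores its comparative use.

```python
# Part 1: morpheme inventory (hypothetical toy entries).
STEMS = {"bake", "run", "work", "small", "write"}
SUFFIXES = {"ed": "PAST", "ing": "PROG", "er": "AGENT", "est": "SUPERLATIVE"}

# Part 2: spelling rules that undo changes at the stem/suffix boundary.
def candidate_stems(base):
    """Yield possible stems for the string left after removing a suffix."""
    yield base                       # no change: work+ed -> worked
    yield base + "e"                 # e-deletion: bake+ed -> baked
    if len(base) >= 2 and base[-1] == base[-2]:
        yield base[:-1]              # consonant doubling: run+ing -> running

# Part 3: decision algorithm: try each suffix and keep only the analyses
# whose stem is found in the inventory.
def analyze(word):
    word = word.lower()
    analyses = []
    if word in STEMS:
        analyses.append((word, None))
    for suffix, label in SUFFIXES.items():
        if word.endswith(suffix) and len(word) > len(suffix):
            base = word[: -len(suffix)]
            for stem in candidate_stems(base):
                if stem in STEMS:
                    analyses.append((stem, label))
    return analyses

print(analyze("baked"))     # [('bake', 'PAST')]
print(analyze("running"))   # [('run', 'PROG')]
```

Returning a list rather than a single answer leaves room for words that have more than one valid analysis.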

Why Is Part-of-Speech Tagging Important?


Part-of-speech tagging identifies a token’s functional role within a sentence and is a
fundamental step in many NLP pipelines. It is the process of assigning a part of speech to
each word in a sentence.

Example

Word      Tag
Heat      verb (noun)
Water     noun (verb)
In        prep (noun, adv)
A         det (noun)
Large     adj (noun)
Vessel    noun

Part-of-speech tagging is the process of assigning a part-of-speech marker to each word in an
input text. The input to a tagging algorithm is a sequence of (tokenized) words and a tagset,
and the output is a sequence of tags, one per token. Tagging is a disambiguation task: words
are ambiguous (they have more than one possible part of speech), and the goal is to find the
correct tag for the situation. For example, book can be a verb (book that flight) or a noun
(hand me that book). That can be a determiner (Does that flight serve dinner) or a
complementizer (I thought that your flight was earlier). The goal of POS tagging is to resolve
these ambiguities, choosing the proper tag for the context.
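
As a rough illustration of tagging as disambiguation, the Python sketch below tags the two "book" examples with a tiny hand-written lexicon and two context rules. The lexicon, tag names, and rules are all assumptions made for this example; it treats "that" as a determiner only and is not how statistical or neural taggers work.

```python
# Hypothetical lexicon: each word lists the tags it can take.
POSSIBLE_TAGS = {
    "book": ["NOUN", "VERB"],
    "hand": ["VERB", "NOUN"],
    "that": ["DET"],
    "flight": ["NOUN"],
    "me": ["PRON"],
}

def tag(tokens):
    """Return one tag per token, resolving noun/verb ambiguity from context."""
    tags = []
    for i, token in enumerate(tokens):
        options = POSSIBLE_TAGS[token.lower()]
        if len(options) == 1:
            choice = options[0]              # unambiguous word
        elif tags and tags[-1] == "DET" and "NOUN" in options:
            choice = "NOUN"                  # after a determiner: noun reading
        elif i == 0 and "VERB" in options:
            choice = "VERB"                  # sentence-initial: imperative verb
        else:
            choice = options[0]              # fall back to the first listed tag
        tags.append(choice)
    return tags

print(tag("book that flight".split()))    # ['VERB', 'DET', 'NOUN']
print(tag("hand me that book".split()))   # ['VERB', 'PRON', 'DET', 'NOUN']
```

The input is a token sequence plus a tagset, and the output is one tag per token, matching the definition above.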

Part-of-speech tagging in itself may not solve any particular NLP problem. It is, however, a
prerequisite that simplifies many different problems. Part-of-speech (hereafter POS) tags are
useful for building parse trees, which are used in building named entity recognizers (most
named entities are nouns) and in extracting relations between words. POS tagging is also
essential for building lemmatizers, which reduce a word to its root form.

Example

Let us consider a few applications of POS tagging in various NLP tasks.

Text to Speech Conversion

• They refuse to permit us to obtain the refuse permit.


The word refuse is used twice in this sentence with two different meanings. refUSE
(/rəˈfyo͞oz/) is a verb meaning “deny,” while REFuse (/ˈrefˌyo͞os/) is a noun meaning “trash”
(that is, they are not homophones). Thus, we need to know which word is being used in order
to pronounce the text correctly. (For this reason, text-to-speech systems usually perform POS
tagging.)
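
A minimal Python sketch of this idea follows. It assumes the sentence has already been tagged by some POS tagger (the tags are written by hand here), and the PRONUNCIATIONS table and pronounce function are hypothetical. Capital letters mark the stressed syllable, as in the respellings above.

```python
# Hypothetical pronunciation table keyed by (word, POS tag).
PRONUNCIATIONS = {
    ("refuse", "VERB"): "refUSE",
    ("refuse", "NOUN"): "REFuse",
    ("permit", "VERB"): "perMIT",
    ("permit", "NOUN"): "PERmit",
}

def pronounce(tagged_tokens):
    """Pick a pronunciation per token; words not in the table pass through."""
    return [PRONUNCIATIONS.get((word.lower(), tag), word)
            for word, tag in tagged_tokens]

# Tags assigned by hand for illustration; a tagger would supply them.
sentence = [
    ("They", "PRON"), ("refuse", "VERB"), ("to", "PART"), ("permit", "VERB"),
    ("us", "PRON"), ("to", "PART"), ("obtain", "VERB"), ("the", "DET"),
    ("refuse", "NOUN"), ("permit", "NOUN"),
]
print(" ".join(pronounce(sentence)))
# They refUSE to perMIT us to obtain the REFuse PERmit
```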

Word Sense Disambiguation


Words often occur in different senses as different parts of speech. For example:

• She saw a bear.
• Your efforts will bear fruit.

The word bear in the above sentences has completely different senses, but more importantly
one is a noun and the other is a verb. Rudimentary word sense disambiguation is possible if
you can tag words with their POS tags.

Word sense disambiguation (WSD) is the task of identifying which sense of a word (that is,
which meaning) is used in a sentence when the word has multiple meanings.
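
The "rudimentary" disambiguation mentioned above can be sketched in Python as a filter over a sense inventory. The SENSES table, its glosses, and the candidate_senses function are assumptions made for illustration, not entries from any real lexical resource.

```python
# Hypothetical sense inventory: word -> POS tag -> list of sense glosses.
SENSES = {
    "bear": {
        "NOUN": ["large furry mammal"],
        "VERB": ["produce or yield", "carry", "tolerate"],
    },
}

def candidate_senses(word, pos_tag):
    """Return the senses of a word that are compatible with its POS tag."""
    return SENSES.get(word.lower(), {}).get(pos_tag, [])

print(candidate_senses("bear", "NOUN"))   # She saw a bear.
print(candidate_senses("bear", "VERB"))   # Your efforts will bear fruit.
```

For the noun the POS tag alone settles the sense, while for the verb it only narrows the candidates, so further context would still be needed to pick "produce or yield".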
