
NLP Unit 2

Language Syntax and Semantics

Avanti Banginwar
Soham Bhirange
Yash Dhumal
Tanmay Bhagwat
Purnendu Kale
Rutvik Kakde
Aditya Bodake
Syllabus

Morphological Analysis: What is Morphology? Types of Morphemes, Inflectional morphology & Derivational morphology, Morphological parsing with Finite State Transducers (FST)

Syntactic Analysis: Syntactic Representations of Natural Language, Parsing Algorithms, Probabilistic context-free grammars, and Statistical parsing

Semantic Analysis: Lexical Semantics, Relations among lexemes & their senses – Homonymy, Polysemy, Synonymy, Hyponymy, WordNet, Word Sense Disambiguation (WSD), Dictionary-based approach, Latent Semantic Analysis
Morphological Analysis

Morphology is the study of the structure and formation of words in a language. It deals with the smallest units of meaning within a language, known as morphemes, and how these morphemes combine to create words. Morphology plays a crucial role in understanding how words are formed and how their meanings can be modified.
What is Morphology?

● Morphological analysis is a field of linguistics that studies the structure of words. It identifies how a word is produced through the use of morphemes. A morpheme is a basic unit of a language: the smallest element of a word that has grammatical function and meaning.
● Morphology is the study of the internal structure and functions of words, and how words are formed from smaller meaningful units called morphemes.
● Morphology, the study of the correspondences between grammatical information, meaning, and form inside words, is one of the central linguistic disciplines.
Morphemes

The morpheme is the smallest element of a word that has grammatical function and meaning.
There are two main types of morphemes:

● Free Morphemes: These can stand alone as words and carry meaning by themselves (e.g.,
"dog," "run," "happy").

● Bound Morphemes: These cannot stand alone and need to be attached to free morphemes to convey meaning. Bound morphemes include prefixes (e.g., "un-" in "undo"), suffixes (e.g., "-ed" in "walked"), and infixes (inserted within a word, rare in English).
Types of Morphemes

Bound morphemes (affixes) fall into two types:

● Inflectional Morphemes: These morphemes do not change the fundamental meaning of a word but
indicate grammatical information such as tense, number, gender, or case. Examples in English include
the plural "-s" and the past tense "-ed."

● Derivational Morphemes: These morphemes alter the meaning or part of speech of a word. For
example, the addition of the derivational suffix "-er" to "teach" forms "teacher," changing the verb to a
noun.
Inflectional Morphology:

Primarily concerned with grammatical relationships and the modification of a word's form to indicate
aspects like tense, mood, case, etc. Inflectional morphemes are usually suffixes and don't change the
word's category or meaning dramatically.
Function: Inflectional morphology involves the modification of a word to convey grammatical
information such as tense, aspect, mood, number, gender, case, and comparison.

Examples:

Adding "-s" to "cat" to indicate plurality (cat-s).

Changing "run" to "ran" to indicate past tense.


Derivational Morphology:

Focuses on creating new words or altering the meaning or lexical category of existing words.
Derivational morphemes are often prefixes or suffixes that result in a more substantial change in
meaning.

Function: Derivational morphology focuses on the creation of new words by adding affixes (prefixes,
suffixes) to a base word. It often changes the lexical category or meaning of the base word.

Examples:

Adding "-er" to "teach" to form "teacher."

Adding "un-" to "happy" to form "unhappy."


Morphological Parsing Using Finite State Transducers (FST)
Morphological parsing involves breaking down a word into its constituent morphemes. Finite State
Transducers (FSTs) are computational models used in natural language processing for morphological
analysis. FSTs are designed to recognize and generate sequences of symbols, making them suitable
for morphological parsing.

In the context of morphological parsing:

● Recognition: FSTs can recognize whether a given word is valid in a language and decompose it
into its morphemes.
● Generation: FSTs can generate valid words by combining morphemes according to the
language's morphological rules.

Finite State Transducers are employed in various applications, such as spell checking, machine
translation, and information retrieval, where understanding the structure of words is crucial for
processing natural language.
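The sketch below mimics the two directions of an FST (recognition and generation) with hand-written rules rather than a genuine transition table; the tiny lexicon and the +N/+SG/+PL tag notation are illustrative assumptions:

# Toy two-direction morphological transducer (illustrative, not a real FST library)
LEXICON = {"cat", "dog", "fox"}

def analyze(surface):
    # Recognition: surface form -> lexical form, e.g. "cats" -> "cat+N+PL"
    if surface in LEXICON:
        return surface + "+N+SG"
    if surface.endswith("es") and surface[:-2] in LEXICON:
        return surface[:-2] + "+N+PL"   # e.g. "foxes"
    if surface.endswith("s") and surface[:-1] in LEXICON:
        return surface[:-1] + "+N+PL"
    return None  # word rejected: not in the language described

def generate(stem, number):
    # Generation: lexical form -> surface form
    if stem not in LEXICON:
        return None
    if number == "SG":
        return stem
    return stem + "es" if stem.endswith("x") else stem + "s"

print(analyze("cats"))         # cat+N+PL
print(analyze("foxes"))        # fox+N+PL
print(generate("dog", "PL"))   # dogs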
Syntactic Representations of Natural Language

Syntactic representations in natural language refer to the ways we can structure and analyze sentences
based on their grammatical components. Think of it as a way to break down sentences into their
essential parts to understand how words relate to each other.

Syntax: Syntax is like the grammar or structure of a sentence. It's the set of rules that govern how
words can be combined to form meaningful sentences.

Syntactic Representation: This is a way of showing the structure of a sentence using symbols or
diagrams. It helps us visualize how words are connected and organized in a sentence.
Example: Let's take a simple sentence like "The cat is on the mat." In syntactic representation, you might show it as a tree-like structure:

         is
        /  \
     cat    on
             |
            mat
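As a hands-on sketch, the same kind of structure can be built programmatically. This assumes the NLTK library is installed, and the bracketing below is one plausible hand-written analysis of the sentence:

from nltk import Tree

# One plausible constituency analysis of "The cat is on the mat"
t = Tree.fromstring(
    "(S (NP (DT The) (NN cat)) (VP (VBZ is) (PP (IN on) (NP (DT the) (NN mat)))))"
)
t.pretty_print()  # renders the tree as ASCII art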
Parsing algorithms

● Parsing algorithms in Natural Language Processing (NLP) are like detectives that help computers understand the structure and meaning of sentences. They break down sentences into smaller parts, revealing how words are connected and what roles they play in a sentence. Here's a simple explanation:
● Sentence Structure: Sentences have a structure, just like a building has a blueprint. Parsing
algorithms figure out this blueprint for sentences.
● Breaking it Down: Imagine a sentence as a puzzle, and each word is a puzzle piece. Parsing
algorithms work to put these pieces together correctly. They identify the subject, the action, the
object, and so on.
Example: Let's take the sentence "The dog chased the ball." A parsing algorithm might break it down
like this:

Subject: The dog, Action: chased, Object: the ball

The algorithm figures out how these parts fit together to create a meaningful sentence.

Parsing Tree: Sometimes, parsing algorithms represent this structure as a tree, showing the
relationships between words. Each branch of the tree represents a connection between words.

       chased
       /    \
    dog      ball
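For a concrete sketch, NLTK's chart parser can recover this structure from a small hand-written grammar. The grammar below is an assumption made up to cover just this one sentence:

import nltk

# Tiny grammar covering just the example sentence
grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> Det N
    VP -> V NP
    Det -> 'The' | 'the'
    N -> 'dog' | 'ball'
    V -> 'chased'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("The dog chased the ball".split()):
    tree.pretty_print()  # shows subject, verb, and object as one tree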
Probabilistic context-free grammar

Grammar:
● In language, grammar refers to the set of rules that dictate how words can be combined to
create sentences. It includes rules for sentence structure, word order, and relationships
between words.
Context-Free Grammar (CFG):
● This is a type of grammar where the structure of a sentence is defined by rules that apply
regardless of the context. It means the rules for forming sentences remain the same, no
matter where the words appear in a sentence.
Probabilistic:
● Adding "probabilistic" to the mix means that instead of having strict rules, we assign
probabilities to different ways of constructing sentences. It reflects the likelihood of one
grammatical structure over another.
Probabilistic Context-Free Grammar (PCFG):
● So, a PCFG is a type of grammar that takes into account the probabilities of different
grammatical structures. It allows us to not only say what is grammatically correct but also
to express the likelihood of different ways a sentence can be constructed.
Example:
● Let's say we have the sentence "The cat is on the mat." A PCFG might assign a higher probability to the parse in which "is" links the subject "cat" to the predicate "on the mat" than to one in which "is" connects "cat" directly to "mat." This reflects the fact that it's more common for a verb like "is" to link a subject to a predicate than to connect two objects directly.
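A minimal PCFG sketch in NLTK follows. The toy grammar and its probabilities are invented for illustration; note that the probabilities of all rules sharing the same left-hand side must sum to 1:

import nltk

pcfg = nltk.PCFG.fromstring("""
    S -> NP VP        [1.0]
    NP -> Det N       [1.0]
    VP -> V NP        [0.8]
    VP -> V           [0.2]
    Det -> 'the'      [1.0]
    N -> 'dog' [0.5] | 'ball' [0.5]
    V -> 'chased'     [1.0]
""")

parser = nltk.ViterbiParser(pcfg)  # finds the single most probable parse
for tree in parser.parse("the dog chased the ball".split()):
    print(tree)
    print(tree.prob())  # product of the probabilities of the rules used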
Statistical Parsing

● Statistical parsing in Natural Language Processing (NLP) involves using statistical models and
algorithms to automatically analyze and structure sentences based on their syntactic structure.
● The goal of statistical parsing is to generate a syntactic tree or structure that represents the
grammatical relationships between words in a sentence.
● This process is essential for tasks such as information extraction, question answering, and
machine translation.
Applications of Statistical Parsing

Information Extraction:
● One significant application is in Information Extraction, where statistical parsing helps to
discern the relationships between words and entities.
● This is particularly valuable in extracting structured information from unstructured text,
facilitating the identification of key entities, relationships, and events.

Question Answering:
● In Question Answering systems, statistical parsing plays a pivotal role.
● By understanding the syntactic structure of questions, these systems can accurately identify the
components that need to be addressed in generating relevant answers.
● This enhances the overall effectiveness of question-answering applications.
Applications of Statistical Parsing

Grammar Checking:
● The applications extend to Grammar Checking, where statistical parsing is employed to analyze
the syntactic structures of sentences and identify grammatical errors.
● This is particularly useful in providing users with suggestions for improving the correctness of
their written text.

Text Summarization:
● Another application is in Text Summarization, where parsing helps identify the main syntactic
elements and relationships within a document.
● This information is then used to generate concise and coherent summaries, enhancing the
summarization process.
Semantic Analysis

● Lexical Semantics

Lexical semantics is a subfield of natural language processing (NLP) that deals with the meaning of
words and how they combine to form meaningful phrases and sentences.

It focuses on understanding the relationships between words and the meanings they convey in
different contexts.

In the context of NLP, lexical semantics plays a crucial role in tasks such as text understanding,
information retrieval, sentiment analysis, and machine translation.

Lexical semantics helps us figure out what each word means and how they work together in different
situations.
Applications of Lexical Semantics

Information Retrieval:
● In search engines and information retrieval systems, understanding the meanings of words is
essential.
● Lexical semantics helps improve the accuracy of searches by enabling systems to retrieve
documents or information relevant to the intended meaning of user queries.

Question Answering Systems:


● Lexical semantics is vital in developing question-answering systems.
● Understanding the meanings of words in both questions and documents is essential for
accurately retrieving relevant information.
Applications of Lexical Semantics

Text Classification:
● In tasks like document categorization or topic modeling, lexical semantics aids in distinguishing
between different classes or topics based on the meanings of words and their relationships.

Sentiment Analysis:
● Analyzing sentiments in text requires understanding the meanings of words and how they
contribute to the overall sentiment.
● Lexical semantics aids in identifying positive, negative, or neutral sentiments expressed in a
piece of text.
Relations among lexemes & their senses – Homonymy

In natural language processing (NLP), homonymy refers to a situation where two or more words share
the same form (spelling) but have different meanings.

Dealing with homonymy in NLP involves addressing the challenge of distinguishing between these
different meanings when analyzing and understanding text.

Here's how homonymy is relevant in NLP terms:

Named Entity Recognition (NER):

● Homonymy can impact NER tasks, particularly when dealing with entities that share the same
form but represent different things.
● For example, the word "Java" could refer to a programming language or an island in Indonesia.
Relations among lexemes & their senses – Homonymy
Contextual Analysis:

● NLP models often rely on contextual information to understand the meaning of words.
● Homonymy requires systems to analyze the context in which a word appears to discern its
intended sense.

Ambiguity Resolution:

● Homonymy contributes to ambiguity in language, and resolving this ambiguity is crucial for
tasks like information retrieval, sentiment analysis, and machine translation.

Machine Translation:

● Homonymy can introduce challenges in machine translation tasks.
● Translating a sentence accurately requires the correct selection of word senses, especially when homonymous words have different equivalents in the target language.
Polysemy

● Polysemy is another linguistic phenomenon that plays a significant role in natural language
processing (NLP).
● Unlike homonymy, where words with different meanings have the same form, polysemy
involves words with multiple related meanings.
● In the context of NLP, dealing with polysemy is crucial for understanding the fine-grained semantics of words.
● Polysemy makes language flexible and interesting. It allows us to use the same word in various
situations without having to create a new word for every little thing. It's like recycling words for
different jobs.
Aspects of Polysemy in the context of NLP

Word Sense Induction:


● NLP systems often use techniques like word sense induction to automatically identify and
categorize the various senses of a polysemous word.
● This involves analyzing large text corpora to discover patterns associated with different
meanings.

Named Entity Recognition (NER):


● Polysemy can complicate NER tasks. For example, the word "Java" could refer to the
programming language or the island in Indonesia.
● Recognizing the correct entity type depends on understanding the specific context.
Aspects of Polysemy in the context of NLP

Evaluation Metrics:
● Assessing the performance of NLP systems in handling polysemy requires appropriate
evaluation metrics.
● Researchers and practitioners need reliable benchmarks to measure how well a system can
disambiguate meanings in real-world scenarios.

Ambiguity Resolution:
● In applications such as machine translation or information retrieval, polysemy introduces
ambiguity.
● NLP systems need to employ techniques to resolve this ambiguity and accurately interpret the
intended meaning.
Synonymy

● Synonymy refers to the relationship between words that have similar meanings, commonly
known as synonyms.
● Synonyms are words that can be used interchangeably in certain contexts because they convey
comparable or identical meanings.
● For example, "happy" and "joyful" are synonyms, as are "big" and "large."
Applications of Synonymy

Text Similarity and Retrieval:


● Understanding synonymy is crucial in tasks such as information retrieval or text similarity.
● Systems need to recognize that different words or phrases with similar meanings should be
considered relevant when searching for or comparing text.

Data Augmentation:
● Synonym substitution can be used as a data augmentation technique in NLP.
● By replacing words with their synonyms, it's possible to generate additional training
data for models, which can help improve their robustness and generalization.
Applications of Synonymy

Word Embeddings:
● In NLP models, words are often represented as vectors in a high-dimensional space, known as
word embeddings.
● Similar words or synonyms tend to have vectors that are close to each other in this space,
making it easier for models to understand their semantic relationships.
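A toy sketch of that idea follows, with three invented 3-dimensional "embeddings" (real embeddings have hundreds of dimensions and are learned from data):

import numpy as np

def cosine(a, b):
    # Cosine similarity: near 1.0 for similar directions, lower for unrelated ones
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

happy  = np.array([0.90, 0.10, 0.30])  # invented vectors for illustration
joyful = np.array([0.85, 0.15, 0.35])
table  = np.array([0.10, 0.90, 0.20])

print(cosine(happy, joyful))  # high: near-synonyms sit close together
print(cosine(happy, table))   # lower: unrelated words sit farther apart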

Thesaurus Integration:
● Some NLP applications may incorporate the use of thesauri or lexical databases to expand their
understanding of synonym relationships.
● This can help improve the accuracy of tasks like text classification or sentiment analysis.
Hyponymy

● Hyponymy is a linguistic and semantic relationship in which one word represents a more specific or specialized concept (hyponym), and another word represents a more general or inclusive concept (hypernym).
● In NLP, understanding hyponymy is essential for tasks such as natural language
understanding, knowledge representation, and semantic analysis.
● For instance, in a sentence like "A dog is a type of mammal," "dog" is the hyponym,
and "mammal" is the hypernym. NLP models that can grasp such relationships can
perform better in tasks involving understanding and generating human-like language.
Aspects of hyponymy in the context of NLP

Word Hierarchy:
● In a hyponymy relationship, words can be organized into a hierarchical structure, where
the hypernym (more general term) encompasses the hyponym (more specific term).
● For example, "rose" is a hyponym of "flower," and "flower" is a hyponym of "plant."

Semantic Representation:
● Many NLP models use semantic embeddings or vector representations to capture the
meaning of words.
● In a well-designed embedding space, hyponyms are expected to be closer to their
hypernyms. This allows the model to understand the hierarchical structure of concepts.
Aspects of hyponymy in the context of NLP

Ontologies and Knowledge Graphs:


● Hyponymy is often represented in ontologies and knowledge graphs used in NLP
applications.
● These knowledge structures organize concepts and their relationships, enabling systems to
infer information and answer queries more effectively.

Semantic Similarity:
● Recognizing hyponymy is crucial for determining semantic similarity between words.
● If two words share a hyponym-hypernym relationship, they are expected to be more
similar in meaning than if they don't.
Aspects of hyponymy in the context of NLP

Sense Disambiguation:
● Understanding hyponymy aids in word sense disambiguation, where the correct meaning
of a word needs to be identified based on its context.
● Recognizing the hierarchical relationships between word senses can improve the accuracy
of disambiguation.

Textual Inference:
● In tasks like textual entailment or paraphrase detection, recognizing hyponymy can help
systems infer relationships between sentences or phrases with different levels of
specificity.
WordNet

● WordNet is a lexical database of the English language that has been widely used in Natural
Language Processing (NLP) and computational linguistics.

● It was created by researchers at Princeton University and provides a structured hierarchy of words and their relationships, including hypernyms, hyponyms, synonyms, and more.

● WordNet has been influential in various NLP applications and research areas.
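A short sketch using NLTK's WordNet interface (this assumes nltk is installed and the WordNet corpus has been downloaded via nltk.download('wordnet')):

from nltk.corpus import wordnet as wn

dog = wn.synsets("dog")[0]                # first (most common) synset for "dog"
print(dog.definition())                   # the gloss for that sense
print(dog.hypernyms())                    # more general concepts
print(dog.hyponyms()[:3])                 # a few more specific concepts
print([l.name() for l in dog.lemmas()])   # synonyms grouped in the synset

cat = wn.synsets("cat")[0]
print(dog.path_similarity(cat))           # similarity via the hypernym hierarchy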
Applications of WordNet in NLP

Semantic Similarity and Relatedness:


● WordNet organizes words into synsets (sets of synonymous words) and provides links between
them, such as hypernyms and hyponyms.
● NLP applications leverage these relationships to measure semantic similarity or relatedness
between words.

Sense Disambiguation:
● Word sense disambiguation is the task of determining the correct meaning of a word in a
particular context.
● WordNet can be used to identify different senses of a word and their relationships, helping NLP
systems choose the most appropriate sense based on context.
Applications of WordNet in NLP

Ontology Construction:
● WordNet's hierarchical structure serves as a basis for constructing ontologies and
knowledge graphs.
● These structures are valuable for representing and organizing knowledge in various
domains, enhancing the understanding of concepts and their interconnections.

Text Annotation:
● WordNet annotations are used in linguistic and semantic annotation tasks.
● Researchers and developers use WordNet to annotate texts with information about word
senses, relationships, and other semantic features.
Applications of WordNet in NLP

Word Embeddings:
● Word embeddings capture semantic relationships between words in a continuous vector space.
● WordNet information has been used to improve the quality of word embeddings by
incorporating knowledge about synonyms, hypernyms, and hyponyms.

Lexical Databases:
● WordNet serves as a valuable resource for creating and enhancing lexical databases in NLP.
● Its extensive coverage of English words and their relationships makes it a foundational tool for
building lexical resources.
Word Sense Disambiguation (WSD)

Word Sense Disambiguation (WSD) is a process used in natural language processing to figure
out the correct meaning of a word in a particular context.

Example Sentence: "I went to the bank to deposit some money."

Now, the word "bank" has different meanings. It could mean a financial institution where you keep or
deposit money, or it could mean the side of a river. In this sentence, we need to figure out which
meaning of "bank" is intended.
Evaluation of WSD

The evaluation of WSD requires the following two inputs:

1. A Dictionary

The first input for the evaluation of WSD is a dictionary, which is used to specify the senses to be disambiguated.

2. A Test Corpus

The other input required by WSD is a sense-annotated test corpus that has the target or correct senses.
Four conventional methods for WSD

Dictionary-based or Knowledge-based Methods:


● What they use: These methods rely on dictionaries, databases of knowledge, and lexical
information.
● How they work: They don't look at lots of examples (corpora) but instead compare the
meanings of words using definitions in a dictionary. The Lesk method, for example,
measures the overlap between the definitions of different senses of a word in the context of
a sentence or paragraph.
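NLTK ships a simplified Lesk implementation, so the method can be sketched in a few lines. This assumes the WordNet corpus is available, and note that simplified Lesk can still pick an unintended sense:

from nltk.wsd import lesk

context = "I went to the bank to deposit some money".split()
sense = lesk(context, "bank")  # sense whose gloss overlaps the context most
print(sense, "-", sense.definition())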
Supervised Methods:
● What they use: Machine learning methods that need training on datasets where words are
already labeled with their correct meanings.
● How they work: They learn from examples. These methods look at lots of text that's already
labeled with the correct meanings of words and use that knowledge to predict meanings in
new, unseen text. Support vector machine and memory-based learning are examples of
successful supervised methods.
Semi-supervised Methods:
● What they use: Both labeled (with meanings) and unlabeled (without meanings) data.
● How they work: They use a small amount of text that's labeled with word meanings and a
larger amount of text that's not labeled. The idea is to bootstrap or gradually improve their
understanding by learning from the labeled data and then applying that knowledge to the
unlabeled data.

Unsupervised Methods:
● What they assume: Similar meanings occur in similar contexts.
● How they work: These methods don't use labeled data or dictionaries. Instead, they try to
figure out the meanings of words by looking at how they are used in similar contexts. They
group words together based on similarities in how they are used. This is useful because it
doesn't rely on manually labeled data, making it more efficient.
Applications of WSD

1. Machine Translation (MT):

In translation, some words have different meanings. WSD helps the computer choose the right meaning for each word to make the translation accurate. For example, if the word "bank" can mean a place for money or the side of a river, WSD helps the computer decide which one fits the context.

2. Information Retrieval (IR):

IR helps people find information, but sometimes the search terms (queries) are ambiguous. WSD helps clarify these ambiguous queries, making it easier for the computer to find the right information. For instance, if someone searches for "apple," WSD can help decide if they mean the fruit or the company.

3. Text Mining and Information Extraction (IE):

WSD is important for accurate text analysis. For example, in a system that intelligently gathers information, WSD can help identify the correct words. In a medical system, it might be crucial to flag "illegal drugs" rather than "medical drugs."

4. Lexicography:

WSD and lexicography can work together. Modern lexicography, which involves creating
dictionaries based on large collections of texts (corpus), benefits from WSD. WSD provides
rough groupings of word meanings and helps identify contextually significant indicators,
making dictionaries more accurate and reflective of how words are used. They kind of help
each other out in improving the understanding of words and their meanings.
Dictionary based approach

● In a dictionary-based approach, we use a predefined set of words or phrases that are associated
with specific meanings or categories.
● This approach relies on a dictionary or a list of key terms to understand and analyze text.
● Each term in the dictionary is assigned a particular sentiment, topic, or attribute.
Example:

Suppose you have a dictionary of words associated with sentiments:

Positive words: happy, joyful, fantastic

Negative words: sad, disappointed, terrible

If you want to perform sentiment analysis on a sentence like "I feel happy and excited," the
dictionary-based approach would involve checking if the words in the sentence match the positive or
negative words in your dictionary. In this case, "happy" and "excited" match positive words, so the
overall sentiment would be considered positive.
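A minimal sketch of this lookup, using the word lists from the example above (a real system would also handle negation, morphology, and much larger lexicons):

positive = {"happy", "joyful", "fantastic", "excited"}
negative = {"sad", "disappointed", "terrible"}

def sentiment(sentence):
    # Count how many words match each list and compare
    words = sentence.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I feel happy and excited"))  # positive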
Latent Semantic Analysis

● LSA is a more sophisticated technique that involves extracting hidden relationships and
meanings from a large body of text.
● It works by creating a mathematical representation of the relationships between words based on
the contexts in which they appear.
● It's particularly useful for tasks like document similarity analysis.
Assumptions of LSA:

1. The words which are used in the same context are analogous to each other.
2. The hidden semantic structure of the data is unclear due to the ambiguity of the words
chosen.
Homonymy: Two or more lexical terms with the same spelling and different, unrelated meanings.

Polysemy: Two or more terms with the same spelling and related but distinct meanings.

Synonymy: Two or more lexical terms with different spellings and similar meanings.
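A common way to sketch LSA is a TF-IDF term-document matrix followed by truncated SVD. This assumes scikit-learn is installed, and the three toy documents are invented for illustration:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "the cat sat on the mat",
    "a dog chased the cat",
    "stock prices fell at the bank",
]
X = TfidfVectorizer().fit_transform(docs)   # term-document matrix
lsa = TruncatedSVD(n_components=2)          # keep 2 hidden (latent) dimensions
doc_vectors = lsa.fit_transform(X)
print(doc_vectors)  # the two animal documents should land closer together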
