
Natural Language Processing (CS525PE)

B.TECH CSE III YEAR SEM-I


COURSE PLANNER

I. COURSE AIM:
The aim of this course is to give students a comprehensive perspective on inclusive learning
and the ability to learn and implement Natural Language Processing.
II. COURSE OBJECTIVES:
1. Introduce some of the problems and solutions of NLP and their relation to
linguistics and statistics.
III. COURSE OUTCOME:
S.No | Description | Bloom's Taxonomy Level
1 | Able to show sensitivity to linguistic phenomena and an ability to model them with formal grammars. | L1: REMEMBERING
2 | Understand and carry out proper experimental methodology for training and evaluating empirical NLP systems. | L2: UNDERSTANDING
3 | Able to determine probabilities, construct statistical models over strings and trees, and estimate parameters using supervised and unsupervised training methods. | L5: EVALUATING
4 | Able to design, implement, and analyze NLP algorithms. | L6: CREATE
5 | Able to design different language modeling techniques. | L6: CREATE

IV. HOW PROGRAM OUTCOMES ARE ASSESSED:

Program Outcomes (PO) | Level | Proficiency assessed by
PO1 Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems related to Computer Science and Engineering. | 3 | Assignments
PO2 Problem analysis: Identify, formulate, review research literature, and analyze complex engineering problems related to Computer Science and Engineering and reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences. | 2 | Assignments, Tutorials, Mock Tests
PO3 Design/development of solutions: Design solutions for complex engineering problems related to Computer Science and Engineering and design system components or processes that meet the specified needs with appropriate consideration for the public health and safety, and the cultural, societal, and environmental considerations. | 2.5 | Assignments, Tutorials, Mock Tests
PO4 Conduct investigations of complex problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions. | 2.5 | Assignments
PO5 Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations. | 2 | Assignments, Tutorials, Mock Tests
PO6 The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the Computer Science and Engineering professional engineering practice. | 3 | Assignments, Tutorials, Mock Tests
PO7 Environment and sustainability: Understand the impact of the Computer Science and Engineering professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable development. | 1 | Assignments
PO8 Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice. | - | --
PO9 Individual and team work: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings. | - | Assignments, Tutorials, Mock Tests
PO10 Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as, being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions. | - | --
PO11 Project management and finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments. | 3 | Assignments, Tutorials, Mock Tests
PO12 Life-long learning: Recognize the need for, and have the preparation and ability to engage in independent and life-long learning in the broadest context of technological change. | 2 | Assignments, Tutorials

1: Slight (Low) 2: Moderate (Medium) 3: Substantial (High) - : None

Program Specific Outcomes (PSO) | Level | Proficiency assessed by
PSO1 Foundation of mathematical concepts: To use mathematical methodologies to solve problems using suitable mathematical analysis, data structures and suitable algorithms. | 2.8 | Lectures, Assignments, Tutorials, Mock Tests
PSO2 Foundation of Computer System: The ability to interpret the fundamental concepts and methodology of computer systems. Students can understand the functionality of hardware and software aspects of computer systems. | 2 | Lectures, Assignments, Tutorials, Mock Tests
PSO3 Foundations of Software development: The ability to grasp the software development lifecycle and methodologies of software systems. Possess competent skills and knowledge of software design process. Familiarity and practical proficiency with a broad area of programming concepts and provide new ideas and innovations towards research. | 2.4 | Lectures, Assignments, Tutorials, Mock Tests

1: Slight (Low) 2: Moderate (Medium) 3: Substantial (High) - : None

JNTU SYLLABUS
UNIT - I
Finding the Structure of Words:
Words and Their Components, Issues and Challenges, Morphological Models
Finding the Structure of Documents: Introduction, Methods, Complexity of the
Approaches, Performances of the Approaches
UNIT - II
Syntax Analysis: Parsing Natural Language, Treebanks: A Data-Driven Approach
to Syntax, Representation of Syntactic Structure, Parsing Algorithms, Models for
Ambiguity Resolution in Parsing, Multilingual Issues
UNIT - III
Semantic Parsing: Introduction, Semantic Interpretation, System Paradigms, Word
Sense Systems, Software.
UNIT - IV
Predicate-Argument Structure, Meaning Representation Systems, Software.
UNIT - V
Discourse Processing: Cohesion, Reference Resolution, Discourse Cohesion
and Structure
Language Modeling: Introduction, N-Gram Models, Language Model Evaluation, Parameter
Estimation, Language Model Adaptation, Types of Language Models, Language-Specific
Modeling Problems, Multilingual and Crosslingual Language Modeling
TEXT BOOKS:
1. Multilingual Natural Language Processing Applications: From Theory to Practice – Daniel M.
Bikel and Imed Zitouni, Pearson Publication
2. Natural Language Processing and Information Retrieval: Tanveer Siddiqui, U.S. Tiwary
REFERENCE:
1. Speech and Language Processing - Daniel Jurafsky & James H. Martin, Pearson
Publications

LESSON PLAN-COURSE SCHEDULE:

S.No | Week | Topics | Course Learning Outcomes

Unit – 1 (Teaching Methodologies: Black Board & PPT; Text Book: T1)
1 | | Outcome Based Education (OBE) Orientation | Understand OBE
2 | | Finding the Structure of Words: Words and Their Components | Understand the structure of words and their components
3 | | Words and Their Components | Understand the structure of words and their components
4 | 1 | Issues and Challenges | Understand the issues and challenges in words
5 | | Morphological Models | Analyze the morphological models
6 | | Morphological Models | Analyze the morphological models
7 | | Finding the Structure of Documents: Introduction | Understand the documents
8 | 2 | Methods | Understand methods
9 | | Methods | Understand methods
10 | | Complexity of the Approaches | Analyze the complexity of the models
11 | 3 | Complexity of the Approaches | Analyze the complexity of the models
12 | | Performances of the Approaches | Remember the performance of the models
13 | | Mock Test #1 |

Unit – 2 (Teaching Methodologies: Black Board & PPT; Text Book: T1)
14 | | Syntax Analysis: Parsing Natural Language | Understand the syntax analysis
15 | | Parsing Natural Language | Define the parsing of natural language
16 | | Bridge Class #1 |
17 | 4 | Treebanks: A Data-Driven Approach to Syntax | Analyze the treebanks approach
18 | | Treebanks: A Data-Driven Approach to Syntax | Analyze the treebanks approach
19 | | Representation of Syntactic Structure | Understand the representation of syntactic structure
20 | | Bridge Class #2 |
21 | 5 | Parsing Algorithms | Understand the parsing algorithms
22 | | Parsing Algorithms | Understand the parsing algorithms
23 | | Models for Ambiguity Resolution in Parsing, Multilingual Issues | Analyze the ambiguity resolution in parsing and multilingual issues
24 | 6 | Models for Ambiguity Resolution in Parsing, Multilingual Issues | Analyze the ambiguity resolution in parsing and multilingual issues

Unit – 3 (Teaching Methodologies: Black Board & PPT; Text Book: T1)
25, 26 | | Semantic Parsing: Introduction | Understand the semantic parsing
27, 28 | 7 | Semantic Interpretation | Analyze the semantic interpretation
29, 30 | | Semantic Interpretation | Analyze the semantic interpretation
31 | | ** NLP Programming Using Python | Design the coding part for NLP
32 | 8 | Bridge Class #3 |
33, 34 | | System Paradigms | Understand the system paradigms
35, 36 | 9 | System Paradigms | Understand the system paradigms
37 | | Word Sense Systems | Understand the word sense systems
38 | | Word Sense Systems | Understand the word sense systems
39 | 10 | Software related to word sense | Understand the software used in NLP

Unit – 4 (Teaching Methodologies: Black Board & PPT; Text Book: T1 & T2)
40 | | Predicate | Understand the predicate logic
41 | | Argument Structure | Understand the argument structure details
42 | | Argument Structure | Understand the argument structure details
43 | 11 | Argument Structure | Understand the argument structure details
44 | | Seminars by students |
45 | | Meaning Representation Systems | Analyze the meaning representation systems
46 | | Meaning Representation Systems | Analyze the meaning representation systems
47 | 12 | Meaning Representation Systems | Analyze the meaning representation systems
48 | | Software for representation mechanism | Understand the software
49 | | ** NLTK Tool Kit | Understand the tool kits
50 | | ** NLTK Tool Kit | Understand the tool kits
51 | 13 | Bridge Class #4 |
52 | 14 | Mock Test #2 |

Unit – 5 (Teaching Methodologies: Black Board & PPT; Text Book: T1 & T2)
53 | 14 | Discourse Processing: Cohesion, Reference Resolution | Understand cohesion and reference resolution
54, 55 | | Discourse Cohesion and Structure | Understand discourse cohesion and structure
56 | | Language Modeling: Introduction | Understand language modeling
57 | 15 | N-Gram Models | Analyze the N-gram models
58 | | Language Model Evaluation | Determine the language model evaluation
59 | | Parameter Estimation | Analyze the parameter estimation
60 | | Language Model Adaptation, Types of Language Models | Analyze language model adaptation and types of language models
61, 62 | 16 | Language-Specific Modeling Problems, Multilingual and Crosslingual Language Modeling | Illustrate language-specific modeling problems and multilingual and crosslingual language modeling
63 | | Bridge Class #5 |

IX. MAPPING COURSE OUTCOMES LEADING TO THE ACHIEVEMENT OF PROGRAM OUTCOMES AND PROGRAM SPECIFIC OUTCOMES:

Course Outcomes | PO1 | PO2 | PO3 | PO4 | PO5 | PO6 | PO7 | PO8 | PO9 | PO10 | PO11 | PO12 | PSO1 | PSO2 | PSO3
CO1 | - | 2 | 3 | 3 | 3 | - | - | - | - | - | 3 | 2 | 3 | 2 | 3
CO2 | - | 2 | 2 | 2 | 1 | - | - | - | - | - | 3 | 2 | 3 | 2 | 2
CO3 | - | 2 | 3 | 3 | 3 | 3 | - | - | - | - | 3 | 2 | 3 | 2 | 2
CO4 | - | - | 2 | 2 | 1 | - | - | - | - | - | 3 | 2 | 3 | 2 | 2
CO5 | 3 | - | - | - | 2 | 3 | 1 | - | - | - | - | - | 2 | 2 | 3
AVG | 3 | 2 | 2.5 | 2.5 | 2 | 3 | 1 | - | - | - | 3 | 2 | 2.8 | 2 | 2.4

1: Slight (Low) 2: Moderate (Medium) 3: Substantial (High) - : None

QUESTION BANK: (JNTUH)


UNIT-I
I. Short Answer Questions-

S.No | Question | Bloom's Taxonomy Level | Course Outcome
1 | List the methods of word components. | L1 | 1
2 | Define NLP. | L1 | 1
3 | What is Natural Language Processing? Discuss with some applications. | L1 | 1
4 | Analyze the usage of feature structures in NLP. | L1 | 2
5 | What do you mean by an NLP algorithm? | L1 | 2

II. Long Answer Questions-

S.No | Question | Bloom's Taxonomy Level | Course Outcome
1 | Design a finite state transducer with E-insertion orthographic rule that parses from surface level “foxes” to lexical level “fox+N+PL” using FST. | L5 | 2
2 | Analyse how statistical methods can be used in machine translation. | L4 | 3
3 | Explain the complexity of the approaches. | L2 | 3
4 | Explain the performance analysis. | L1 | 2
5 | Explain the structure of documents. | L1 | 1
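
Question 1 in the table above can be illustrated with a small Python sketch. This is not a full finite state transducer; it only undoes the E-insertion rule (an "e" inserted before plural "s" after stems ending in s, x, z, ch or sh) to map a surface form to a lexical form. The tiny lexicon and the function name `parse_plural` are assumptions made for illustration.

```python
# Toy lexicon of noun stems (hypothetical; a real system would use a full lexicon).
LEXICON = {"fox", "cat", "bus", "church", "dog"}

# Stem endings after which English orthography inserts "e" before plural "s".
E_INSERTION_CONTEXTS = ("s", "x", "z", "ch", "sh")


def parse_plural(surface):
    """Map a surface noun to a lexical form such as 'fox+N+PL'.

    Returns None if the word cannot be analysed with the toy lexicon.
    """
    if surface in LEXICON:
        return surface + "+N+SG"
    # Case 1: stem + inserted "e" + "s", e.g. fox -> foxes
    if surface.endswith("es"):
        stem = surface[:-2]
        if stem in LEXICON and stem.endswith(E_INSERTION_CONTEXTS):
            return stem + "+N+PL"
    # Case 2: stem + "s" with no insertion, e.g. cat -> cats
    if surface.endswith("s"):
        stem = surface[:-1]
        if stem in LEXICON and not stem.endswith(E_INSERTION_CONTEXTS):
            return stem + "+N+PL"
    return None


print(parse_plural("foxes"))  # fox+N+PL
print(parse_plural("cats"))   # cat+N+PL
```

A real FST would encode the same rule as states and transitions over an intermediate tape (fox^s#), which also lets the machine run in the generation direction.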

UNIT-2
I. Short Answer Questions-

S.No | Question | Bloom's Taxonomy Level | Course Outcome
1 | Define parsing. | L1 | 2
2 | What is a treebank? | L1 | 3
3 | Define syntax. | L2 | 3
4 | List the parsing algorithms. | L1 | 3
5 | Define multilingual. | L2 | 3

II. Long Answer Questions-

S.No | Question | Bloom's Taxonomy Level | Course Outcome
1 | Explain the parsing of NLP. | L1 | 2
2 | Explain the treebank method with an example. | L2 | 3
3 | Explain the data-driven mechanism. | L3 | 3
4 | Explain the models of ambiguity resolution. | L1 | 3
5 | Explain the multilingual issues. | L2 | 3


UNIT-3
I. Short Answer Questions-

S.No | Question | Bloom's Taxonomy Level | Course Outcome
1 | Define semantic. | L1 | 2
2 | List the semantic rules. | L2 | 3
3 | Define system paradigm. | L1 | 3
4 | What is a word sense system? | L2 | 3


II. Long Answer Questions-

S.No | Question | Bloom's Taxonomy Level | Course Outcome
1 | Explain in detail about semantic interpretation. | L2 | 5
2 | Explain system paradigms. | L1 | 5
3 | Explain the methods of word sense systems. | L2 | 5
4 | Explain the software associated with semantic interpretation. | L4 | 5

UNIT-4
I. Short Answer Questions-

S.No | Question | Bloom's Taxonomy Level | Course Outcome
1 | Define predicate logic. | L1 | 5
2 | Give an example of predicate logic. | L1 | 5
3 | Define argument structure. | L2 | 5
4 | Define structure management. | L2 | 5
5 | Define representation in NLP. | L3 | 5
II. Long Answer Questions-

S.No | Question | Bloom's Taxonomy Level | Course Outcome
1 | Explain in detail about predicate logic with examples. | L1 | 5
2 | Explain in detail about argument structure in NLP. | L2 | 5
3 | Explain in detail about meaning representation system. | L2 | 5
4 | List and explain the meaning representation | L2 | 5
UNIT-5
I. Short Answer Questions-

S.No | Question | Bloom's Taxonomy Level | Course Outcome
1 | Define cohesion. | L1 | 5
2 | Define reference resolution. | L2 | 5
3 | Define discourse cohesion. | L1 | 5
4 | Define modeling. | L2 | 5
5 | What do you mean by crosslingual? | L3 | 5

II. Long Answer Questions-

S.No | Question | Bloom's Taxonomy Level | Course Outcome
1 | Explain in detail about reference resolution. | L1 | 5
2 | Explain in detail about discourse cohesion. | L1 | 5
3 | Explain in detail about N-Gram Models. | L3 | 5
4 | Explain in detail about language-specific models. | L2 | 5
5 | Discuss language model adaptation. | L4 | 5
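
For the N-gram question above (Question 3), the following Python sketch shows one simple case: a bigram model with add-one (Laplace) smoothing, evaluated by perplexity. The three-sentence corpus, the function names, and the choice of add-one smoothing are assumptions for illustration; the textbooks discuss several other estimation methods.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams, padding each sentence with <s> and </s>."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def perplexity(sentence, unigrams, bigrams):
    """Per-token perplexity of one sentence under the bigram model."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        log_prob += math.log(bigram_prob(prev, word, unigrams, bigrams))
    return math.exp(-log_prob / (len(tokens) - 1))


# Hypothetical toy corpus.
corpus = ["the cat sat", "the dog sat", "the cat ran"]
unigrams, bigrams = train_bigram(corpus)

print(bigram_prob("the", "cat", unigrams, bigrams))  # (2 + 1) / (3 + 7) = 0.3
print(perplexity("the cat sat", unigrams, bigrams))
print(perplexity("sat cat the", unigrams, bigrams))
```

The scrambled sentence receives a higher perplexity than the sentence seen in training, which is the behaviour language model evaluation is meant to measure.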

MCQ Questions
Unit – 1
1. What is the field of Natural Language Processing (NLP)?
a) Computer Science
b) Artificial Intelligence
c) Linguistics
d) All of the mentioned
Answer: d
Explanation: None.
2. NLP is concerned with the interactions between computers and human (natural) languages.
a) True
b) False
Answer: a
Explanation: NLP focuses on understanding human spoken/written language and
converting that interpretation into machine-understandable language.
3. What is the main challenge/s of NLP?
a) Handling Ambiguity of Sentences
b) Handling Tokenization
c) Handling POS-Tagging
d) All of the mentioned
Answer: a
Explanation: Enormous ambiguity exists when processing natural language.
4. Modern NLP algorithms are based on machine learning, especially statistical machine
learning.
a) True
b) False
Answer: a
Explanation: None.
5. Choose from the following areas where NLP can be useful.
a) Automatic Text Summarization
b) Automatic Question-Answering Systems
c) Information Retrieval
d) All of the mentioned
Answer: d
Explanation: None.

FILL IN THE BLANKS:


6. Which includes major tasks of NLP? Automatic Summarization
7. What is Coreference Resolution?
Given a sentence or larger chunk of text, determine which words (“mentions”) refer to the
same objects (“entities”)
8. What is Machine Translation? Converts one human language to another
9. The more general task of coreference resolution also includes identifying so-called
“bridging relationships” involving referring expressions.
10. What is Morphological Segmentation?
Separate words into individual morphemes and identify the class of the morphemes

UNIT-2
MULTIPLE CHOICE QUESTIONS:
1. Select a Machine Independent phase of the compiler
a) Syntax Analysis
b) Intermediate Code generation
c) Lexical Analysis
d) All of the mentioned
Answer: d
Explanation: All of them work independent of a machine.
2. A system program that combines the separately compiled modules of a program into a form
suitable for execution?
a) Assembler
b) Compiler
c) Linking Loader
d) Interpreter
Answer: c
Explanation: A loader which combines the functions of a relocating loader with the ability to
combine a number of program segments that have been independently compiled.
3. Which of the following system software resides in the main memory always
a) Text Editor
b) Assembler
c) Linker
d) Loader
Answer: d
Explanation: The loader is used to load programs.
4. The output file of Lex is _____ when the input file is Myfile.
a) Myfile.e
b) Myfile.yy.c
c) Myfile.lex
d) Myfile.obj
Answer: b
Explanation: This produces the file “myfile.yy.c”, which we can then compile with g++.
5. Type checking is normally done during?
a) Lexical Analysis
b) Syntax Analysis
c) Syntax Directed Translation
d) Code generation
Answer: c
Explanation: It is the function of Syntax directed translation.
FILL IN THE BLANKS:
6. Suppose one of the operands is a String and the other is an Integer; then it does not throw an
error, as it only checks whether there are two operands associated with '+' or not.
7. In short, Syntax Analysis generates: Parse Tree
8. By whom is the symbol table created? Compiler
9. What does a Syntactic Analyser do? Create parse tree
10. Semantic Analyser is used for? Generating Object code & Maintaining symbol table


UNIT-3

1. Which of the following is the fastest logic?
a) TTL
b) ECL
c) CMOS
d) LSI
Answer: b
Explanation: In electronics, emitter-coupled logic (ECL) is a high-speed integrated circuit.

2. A bottom-up parser generates
a) Rightmost derivation
b) Rightmost derivation in reverse
c) Leftmost derivation
d) Leftmost derivation in reverse
Answer: b
Explanation: This corresponds to starting at the leaves of the parse tree, also known as
shift-reduce parsing.

3. A grammar that produces more than one parse tree for some sentence is called
a) Ambiguous
b) Unambiguous
c) Regular
d) None of the mentioned
Answer: a
Explanation: An ambiguous grammar has more than one parse tree for some sentence.

4. An optimizing compiler
a) Is optimized to occupy less space
b) Both of the mentioned
c) Optimize the code
d) None of the mentioned
Answer: d
Explanation: In computing, an optimizing compiler is a compiler that tries to minimize or
maximize some attributes of an executable computer program.

5. The linker
a) Is similar to interpreter
b) Uses source code as its input
c) Is required to create a load module
d) None of the mentioned
Answer: c
Explanation: It is a program that takes one or more object files generated by a compiler and
combines them into a single executable file, library file, or another object file.

FILL IN THE BLANKS:

6. A latch is constructed using two cross-coupled NAND gates
7. Peephole optimization: Constant folding
8. The optimization which avoids a test at every iteration is Loop unrolling
9. Scissoring enables A part of data to be displayed
10. Shift-reduce parsers are Bottom-up parsers

1. Given a stream of text, Named Entity Recognition determines which pronoun maps to which
noun.
a) False
b) True
Answer: a
Explanation: Given a stream of text, Named Entity Recognition determines which items in
the text map to proper names.
2. Natural Language generation is the main task of Natural language processing.
a) True
b) False
Answer: a
Explanation: Natural Language Generation is to Convert information from computer
databases into readable human language.
3. OCR (Optical Character Recognition) uses NLP.
a) True
b) False
Answer: a
Explanation: Given an image representing printed text, determines the corresponding text.
4. Parts-of-Speech tagging determines ___________
a) part-of-speech for each word dynamically as per meaning of the sentence
b) part-of-speech for each word dynamically as per sentence structure
c) all part-of-speech for a specific word given as input
d) all of the mentioned
Answer: d
Explanation: Given a sentence, part-of-speech tagging determines the part of speech of each
word from its context.
5. Parsing determines Parse Trees (Grammatical Analysis) for a given sentence.
a) True
b) False
Answer: a
Explanation: Determine the parse tree (grammatical analysis) of a given sentence. The
grammar for natural languages is ambiguous and typical sentences have multiple possible
analyses. In fact, perhaps surprisingly, for a typical sentence there may be thousands of
potential parses (most of which will seem completely nonsensical to a human).
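
The explanation above says that a single sentence can have many parses. A small Python sketch can make this concrete by counting parse trees with a CKY-style chart. The toy grammar in Chomsky normal form, the lexicon, and the example sentences are assumptions for illustration, not taken from the textbooks.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),
    ("PP", "P", "NP"),
]

# Hypothetical lexicon: word -> possible categories.
LEXICON = {
    "I": ["NP"], "saw": ["V"], "the": ["Det"],
    "man": ["N"], "telescope": ["N"], "park": ["N"],
    "with": ["P"], "in": ["P"],
}


def count_parses(words, start="S"):
    """Return the number of distinct parse trees for the word list."""
    n = len(words)
    # chart[i][j][A] = number of trees rooted in A that cover words[i:j]
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for tag in LEXICON.get(word, []):
            chart[i][i + 1][tag] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    left_count = chart[i][k].get(left, 0)
                    right_count = chart[k][j].get(right, 0)
                    if left_count and right_count:
                        chart[i][j][parent] += left_count * right_count
    return chart[0][n].get(start, 0)


print(count_parses("I saw the man with the telescope".split()))              # 2
print(count_parses("I saw the man with the telescope in the park".split()))  # 5
```

Each added prepositional phrase can attach to several earlier constituents, so the count grows quickly with sentence length even for this tiny grammar.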



UNIT-4
MULTIPLE CHOICE QUESTIONS:
1. IR (Information Retrieval) and IE (Information Extraction) are the same thing.
a) True
b) False
Answer: b
Explanation: Information retrieval (IR) – This is concerned with storing, searching and
retrieving information. It is a separate field within computer science (closer to databases), but
IR relies on some NLP methods (for example, stemming). Some current research and
applications seek to bridge the gap between IR and NLP.
Information extraction (IE) – This is concerned in general with the extraction of semantic
information from text. This covers tasks such as named entity recognition, Coreference
resolution, relationship extraction, etc.
2. Many words have more than one meaning; we have to select the meaning which makes the
most sense in context. This can be resolved by ____________
a) Fuzzy Logic
b) Word Sense Disambiguation
c) Shallow Semantic Analysis
d) All of the mentioned
Answer: b
Explanation: Shallow Semantic Analysis doesn't cover word sense disambiguation.
3. Given a sound clip of a person or people speaking, determine the textual representation of the
speech.
a) Text-to-speech
b) Speech-to-text
c) All of the mentioned
d) None of the mentioned
Answer: b
Explanation: NLP is required for linguistic analysis.
4. Speech Segmentation is a subtask of Speech Recognition.
a) True
b) False
Answer: a
Explanation: None.
5. In linguistic morphology _____________ is the process for reducing inflected words to their
root form.
a) Rooting
b) Stemming
c) Text-Proofing
d) Both Rooting & Stemming
Answer: b
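
Question 2 above names word sense disambiguation as the way to pick the meaning that fits the context. One classic approach is the simplified Lesk idea: choose the sense whose dictionary gloss shares the most words with the sentence. The Python sketch below assumes a hand-written two-sense inventory for "bank" and a small stopword list; both are hypothetical and stand in for a real dictionary.

```python
# Small stopword list (hypothetical; real systems use a larger one).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "at",
             "that", "and", "or", "is", "i", "my"}

# Hypothetical hand-written sense inventory, not taken from a real dictionary.
SENSES = {
    "bank": {
        "financial_institution": "an institution that accepts deposits and lends money",
        "river_side": "sloping land beside a river or other body of water",
    }
}


def content_words(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def simplified_lesk(word, sentence):
    """Return the sense of `word` whose gloss overlaps most with the sentence."""
    context = content_words(sentence)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(simplified_lesk("bank", "I deposited money at the bank"))   # financial_institution
print(simplified_lesk("bank", "We sat on the bank of the river"))  # river_side
```

Exact word overlap is brittle ("deposited" does not match "deposits"), which is one reason practical systems combine it with stemming or with statistical models.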
FILL IN THE BLANKS:

6. Which of these is also known as look-ahead LR parser? LALR
7. What is the similarity between LR, LALR and SLR? Use same algorithm, but different
parsing table
8. An LR-parser can detect a syntactic error as soon as it is possible to do so on a left-to-right
scan of the input
9. Which of these is true about LR parsing? It is the most general non-backtracking
shift-reduce parsing and it is still efficient
10. If a state does not know whether it will make a shift operation or a reduction for a terminal,
it is called a Shift/reduce conflict

UNIT-5

MULTIPLE CHOICE QUESTIONS:


1. NLP stands for Natural Language Processing.
a. True
b. False
Answer: True

2. NLP is concerned with the interactions between computers and human (natural) languages.
a. Yes
b. No
Answer: Yes

3. The following are areas where NLP can be useful -
- Automatic Text Summarization
- Information Retrieval
- Automatic Question-Answering Systems
- All of the above
Answer: All of the above

4. Machine Translation converts -
- Human language to machine language
- One human language to another
- Any human language to English
- Machine language to human language
Answer: One human language to another

5. Which of the following is the field of Natural Language Processing (NLP)?
- Computer Science
- Artificial Intelligence
- Computational linguistics
- All of the above
Answer: All of the above

FILL IN THE BLANKS:


6. What is Natural Language Processing good for? Summarize blocks of text

7. You can build a machine learning RSS reader in less than 30-minutes using - ScrapeRSS

8. Natural Language Processing (NLP) is the field of Computer Science

9. NLP is concerned with the interactions between computers and human (natural) languages.

10. One of the main challenges of NLP is Handling Ambiguity of Sentences

JOURNALS:
1. Natural Language Processing Research, ISSN: 2666-0512
2. Journal of Information: Special Issues on NLP, ISSN: 2078-2489
