Welcome


Natural Language Processing
Art integrated project

-Vaibhav Siwach
WHAT IS NLP?
Natural Language Processing (NLP) is the sub-field of Artificial Intelligence (AI)
that focuses on the ability of a computer to understand human language (commands),
whether spoken or written, and to give an output by processing it.
NLP drives computer programs that translate text from one language to
another, respond to spoken commands, and summarize large volumes of
text rapidly—even in real time. There’s a good chance you’ve interacted with
NLP in the form of voice-operated GPS systems, digital assistants, speech-
to-text dictation software, customer service chatbots, and other consumer
conveniences.
Applications Of NLP
Automatic Summarization – Automatic summarization condenses documents and other written
materials into short summaries of their meaning. It is also useful for digesting data
gathered from social media and other online sources.

Sentiment Analysis – Businesses use natural language processing tools such as sentiment
analysis to find out what internet users are saying about their goods and services, and
so to understand customer requirements.

Text Classification – Text classification assigns a document to a category and organises
it, making it easier to find the information you need or to carry out certain tasks. Spam
filtering in email is one example of how text classification is used.

Virtual Assistants – Digital assistants such as Google Assistant, Cortana, Siri, and
Alexa now play a significant role in our lives. Not only can we talk to them, but they
can also make everyday tasks easier.
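
As a rough illustration of the sentiment analysis idea above, the sketch below counts positive and negative words in a piece of text. The word lists and the function names (sentiment_score, sentiment_label) are made up for this example; real tools rely on far larger lexicons or trained models.

```python
import re

# Hypothetical, hand-picked word lists (assumptions for illustration only).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}


def sentiment_score(text):
    """Return (number of positive words) - (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def sentiment_label(text):
    """Turn the score into a coarse label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment_label("I love this phone, the camera is great"))
print(sentiment_label("The delivery was slow and the support was terrible"))
```

Counting words like this ignores negation ("not good") and sarcasm, which is one reason practical sentiment analysis needs more than a word list.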
Human Language vs Computer Language
Humans need language to communicate, and our brains process it
constantly. The brain keeps processing the sounds it hears around us
and works to make sense of them. For example, it processes and stores
what the teacher says while a lesson is being delivered in the
classroom.
A computer, on the other hand, understands only computer language.
All input must be converted to numbers before it is sent to the
machine. If a single mistake is made while typing a program, the
machine reports an error and does not process that part. Machines
communicate only in extremely simple, elementary forms.
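
To make the "converted to numbers" point concrete, here is a minimal Python sketch. The function names text_to_numbers and text_to_bits are chosen for this example; the sketch shows one common encoding (Unicode code points and UTF-8 bytes), not the only way a machine can represent text.

```python
def text_to_numbers(text):
    """Map each character to its Unicode code point."""
    return [ord(ch) for ch in text]


def text_to_bits(text):
    """Encode the text as UTF-8 and show every byte as 8 binary digits."""
    return [format(byte, "08b") for byte in text.encode("utf-8")]


print(text_to_numbers("Hi"))  # [72, 105]
print(text_to_bits("Hi"))     # ['01001000', '01101001']
```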
Data Processing
Data processing is the manipulation of data: the conversion of
raw data into meaningful, machine-readable information.
Since human languages are complex, we first need to simplify
them so that a machine can work with them. Text Normalisation
cleans up the textual data so that its complexity is lower than
that of the original text. Let us go through Text Normalisation
in detail.
Steps Involved
1. Text Normalisation – The process of converting a text into a canonical (standard)
form is known as text normalisation. For instance, the words "gooood" and "gud"
can both be reduced to the canonical form "good".
2. Sentence Segmentation – The whole corpus is divided into sentences.
3. Tokenisation – Each sentence is further divided into tokens. Any word, number,
or special character that appears in a sentence is referred to as a token.
4. Removing Stopwords – Tokens that are not necessary are removed from the token
list. Stopwords are words that occur frequently in a corpus but add little
useful meaning.
5. Converting text to a common case – All tokens are converted to the same case
(usually lower case) so that, for example, "Hello" and "hello" are treated as the
same word.
6. Stemming or Lemmatization – The remaining words are reduced to their root
words. Stemming chops off affixes and may leave a non-word, while lemmatization
returns a meaningful dictionary form.
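
The steps above can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions: the stopword list, the suffix list, and all function names are invented for the example, the stemmer is deliberately crude, and spelling correction such as "gud" to "good" is not covered. Libraries built for NLP handle each of these steps far more carefully.

```python
import re

# Hypothetical, tiny stopword list; real lists are much longer.
STOPWORDS = {"a", "an", "the", "is", "are", "and", "to", "of", "in", "on"}
# Hypothetical suffixes for a very crude stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def segment_sentences(corpus):
    """Split the corpus after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", corpus.strip())
    return [part for part in parts if part]


def tokenise(sentence):
    """Split a sentence into words/numbers and single special characters."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def crude_stem(word):
    """Remove the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalise(corpus):
    """Return one list of normalised tokens per sentence."""
    result = []
    for sentence in segment_sentences(corpus):
        tokens = tokenise(sentence)
        # Remove special characters and stopwords.
        tokens = [t for t in tokens if t.isalnum() and t.lower() not in STOPWORDS]
        # Convert to a common case.
        tokens = [t.lower() for t in tokens]
        # Reduce each word to a rough stem.
        result.append([crude_stem(t) for t in tokens])
    return result


print(normalise("The children are playing in the park. Dogs barked loudly!"))
# [['children', 'play', 'park'], ['dog', 'bark', 'loudly']]
```

Note that the crude stemmer leaves "children" and "loudly" untouched; a lemmatizer backed by a dictionary would map "children" to "child".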
Bag Of Words
A bag-of-words is a representation of text that records which words
occur in a document and how many times each one occurs, ignoring
the order in which they appear.

Steps for implementing Bag of Words:
1. Text Normalisation: Collect the data and pre-process it.
2. Create Dictionary: Make a list of all the unique words
occurring in the corpus (the vocabulary).
3. Create a document vector: For a document in the corpus,
count how many times each word from the vocabulary occurs
in it.
4. Create document vectors for all the documents: Repeat step 3
for every document in the corpus.
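
A minimal sketch of steps 2 to 4, assuming the documents have already been normalised into lists of tokens (step 1). The three sample documents and the function names build_vocabulary and document_vector are made up for illustration.

```python
from collections import Counter


def build_vocabulary(documents):
    """Step 2: list the unique words in order of first appearance."""
    vocabulary = []
    seen = set()
    for tokens in documents:
        for token in tokens:
            if token not in seen:
                seen.add(token)
                vocabulary.append(token)
    return vocabulary


def document_vector(tokens, vocabulary):
    """Step 3: count how often each vocabulary word occurs in one document."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


# Hypothetical corpus, already normalised (step 1).
documents = [
    "aman and anil are stressed".split(),
    "aman went to a therapist".split(),
    "anil went to download a health chatbot".split(),
]

vocabulary = build_vocabulary(documents)
# Step 4: repeat step 3 for every document.
vectors = [document_vector(tokens, vocabulary) for tokens in documents]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Every vector has one entry per vocabulary word, so all documents end up with the same length regardless of how many words they contain.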
Thank You

Bibliography
https://www.ibm.com
https://aiforkids.in
https://cbseskilleducation.com
