
BIG DATA ANALYTICS (2017 REGULATION)

UNIT – 1 INTRODUCTION TO BIG DATA

Evolution of Big data - Best Practices for Big data Analytics - Big data characteristics - Validating - The Promotion of the Value of Big Data - Big Data Use Cases - Characteristics of Big Data Applications - Perception and Quantification of Value - Understanding Big Data Storage - A General Overview of High-Performance Architecture - HDFS - MapReduce and YARN - Map Reduce Programming Model

Data/Information:

- The words "data" and "information" may look similar, but they do not mean the same thing.

Different forms of data:

- Letters, words, numbers, images, sounds, etc.

Data: Data is in a raw and unorganized form.
Information: (After processing the data) Information is organized, structured, or presented in a given context so as to make it useful.

- Data does not depend on information, but information depends on data.

- Data is not specific and does not carry any meaning; information is specific and meaningful.
For example, consider the following:

Data:
Saranya324917628
Rajkumar476193248
Kamal548429344
Gopal551742186
Latha409723145

Information:

KGISL Institute of Technology

Course: IT    Semester: 4

Student Name    ID
Saranya         324917628
Rajkumar        476193248
Kamal           548429344
Gopal           551742186
Latha           409723145
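
The step from raw data to information shown above can be sketched in Python. This is a minimal sketch, assuming every raw string is an alphabetic name followed directly by a numeric ID; the function name `parse_record` is hypothetical and only serves the illustration.

```python
import re

# Raw data from the example above: name and ID run together with no context.
raw_data = [
    "Saranya324917628",
    "Rajkumar476193248",
    "Kamal548429344",
    "Gopal551742186",
    "Latha409723145",
]

def parse_record(raw):
    """Split one raw string into a name and an ID.

    Assumption: the name is alphabetic and the ID is the trailing digits.
    """
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", raw)
    if match is None:
        raise ValueError(f"unrecognized record: {raw!r}")
    return {"name": match.group(1), "id": match.group(2)}

# Processing the data and adding context (column headings) yields information.
records = [parse_record(r) for r in raw_data]

print(f"{'Student Name':<16}{'ID'}")
for rec in records:
    print(f"{rec['name']:<16}{rec['id']}")
```

The raw strings carry no meaning on their own; once they are split and labeled, the same values become usable information.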

- Data can be qualitative or quantitative.

DATA
  Qualitative
    Ex: "My name is hari"
  Quantitative
    Discrete [whole numbers]
      Ex: 5, 6, 7
    Continuous [within a range]
      Ex: 3.25, 5.68

Classification of Data:

Structured data:
This is data that is in an organized form. Data stored in a relational database can be shown as a two-dimensional table with rows and columns.
Example: (DBMS, spreadsheet, etc.) Address details, financial data such as accounting transactions.

Unstructured data:
Unstructured data is data in a more ambiguous format. It may (or may not) be in a well-defined syntactical format, and it is impossible or very hard to fit into two-dimensional tables.
Example: Images, audio, video, Word documents, social media content, etc.

Semi Structured Data:

Data that has some structure but does not conform to the formal structure of data models, in the context of relationships, is semi-structured data.

Examples: XML, JSON, and some NoSQL databases like MongoDB, which store data natively as JSON-like documents.
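
As a rough illustration of semi-structured data, the sketch below parses two JSON documents that do not share a fixed set of fields. The documents are hypothetical (the email address in particular is made up), and the example uses only Python's standard `json` module.

```python
import json

# Two hypothetical student documents. Both are valid JSON, but they do not
# share a fixed set of fields the way rows of a relational table would.
documents = """
[
  {"name": "Saranya", "id": 324917628, "course": "IT"},
  {"name": "Kamal", "id": 548429344, "contacts": {"email": "kamal@example.com"}}
]
"""

students = json.loads(documents)

for student in students:
    # The keys describe the data, so each record is self-describing.
    print(student["name"], sorted(student.keys()))

# A field that is missing from a record has to be handled explicitly.
courses = [s.get("course", "unknown") for s in students]
print(courses)
```

Each record carries its own field names, which is what separates this from a table with a fixed schema.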

Working with Structured Data:

1. Insert/Update/Delete
2. Security – Encryption/Decryption.
3. Indexing – Speeds up data retrieval operations.
4. Scalability – Storage and processing capability.
5. Transaction Processing – (ACID properties)
   - Atomicity – A transaction is atomic: either all of its operations take effect or none do.
   - Consistency – Data moves from one consistent state to another.
   - Isolation – Concurrently running transactions do not interfere with one another.
   - Durability – All changes made to the database by a committed transaction are permanent.
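
Some of the operations listed above (insert, indexing, and an atomic transaction) can be sketched with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: the table `student`, its columns, and the index name are hypothetical, and an in-memory database stands in for a real DBMS.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Indexing: an index on name speeds up lookups by name.
conn.execute("CREATE INDEX idx_student_name ON student (name)")

# Insert: the connection used as a context manager commits on success.
with conn:
    conn.execute("INSERT INTO student VALUES (?, ?)", (324917628, "Saranya"))
    conn.execute("INSERT INTO student VALUES (?, ?)", (476193248, "Rajkumar"))

# Atomicity: the second statement violates the primary key, so the whole
# transaction, including the update before it, is rolled back.
try:
    with conn:
        conn.execute("UPDATE student SET name = 'Changed' WHERE id = 324917628")
        conn.execute("INSERT INTO student VALUES (?, ?)", (476193248, "Duplicate"))
except sqlite3.IntegrityError:
    pass

rows = conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()
print(rows)
```

After the failed transaction the table still holds the two original rows, which shows that the update did not take effect on its own.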

How to deal with unstructured data:

The following techniques are used to find patterns:

Data Mining: It is the analysis step of the "Knowledge Discovery in Databases" process.

Popular mining algorithms:

- Association rule mining – Also called "market basket analysis" or "affinity analysis".

- Regression analysis – It helps to estimate the relationship between two variables, so that one can be predicted from the other.

- Collaborative filtering – It is about predicting a user's preferences from the preferences of other users.
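
To make market basket analysis concrete, the sketch below computes the two standard measures of an association rule: support (how often the items occur together) and confidence (how often the consequent appears when the antecedent does). The baskets are hypothetical, and this shows only the measures, not a full mining algorithm such as Apriori.

```python
# Hypothetical shopping baskets; each set is one transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also hold the consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)

# Rule under test: customers who buy bread also buy butter.
rule_support = support({"bread", "butter"}, baskets)
rule_confidence = confidence({"bread"}, {"butter"}, baskets)
print(rule_support, rule_confidence)
```

A rule is usually considered interesting only when both measures exceed thresholds chosen by the analyst.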

Text analytics or text mining: It is the process of extracting meaningful information from text. It includes tasks such as text categorization, text clustering, sentiment analysis, concept/entity extraction, etc.

Natural Language Processing: It is about enabling computers to understand human or natural language input.

Noisy text analytics: It is the process of extracting structured or semi-structured information from noisy unstructured data such as chats, blogs, wikis, emails, message boards, text messages, etc.
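
A very small example of pulling structured fields out of noisy text is sketched below with regular expressions. The messages are hypothetical, the email pattern is deliberately simplified, and the phone pattern assumes plain 10-digit numbers; real noisy-text pipelines typically need far more robust handling.

```python
import re

# Hypothetical noisy chat messages.
messages = [
    "hey!! pls mail me at latha@example.com asap :)",
    "call gopal on 9876543210... or dont lol",
    "no contact info here",
]

# Simplified patterns (assumptions for illustration only).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\b\d{10}\b")

# Turn each free-form message into a structured record.
extracted = []
for text in messages:
    extracted.append({
        "emails": EMAIL.findall(text),
        "phones": PHONE.findall(text),
    })

for record in extracted:
    print(record)
```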

DATA ANALYSIS

VS

DATA ANALYTICS or DATA SCIENCE

Are they different or the same?

Answer: Different. Why?

Data Analysis -> What is happening.

Data Analytics -> What is going to happen (predict the future).

Without data analysis, there is no data analytics.
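
The distinction can be illustrated with a small sketch, assuming hypothetical monthly sales figures. The average describes what is happening in the existing data (analysis), while a straight-line fit by ordinary least squares is one simple way to predict what is going to happen next (analytics); it is an illustration, not the only predictive technique.

```python
from statistics import mean

# Hypothetical monthly sales figures for months 1 to 5.
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

# Data analysis: describe what is happening in the data we already have.
average_sales = mean(sales)

# Data analytics: fit a straight line (ordinary least squares) and use it
# to predict what is going to happen in month 6.
x_bar, y_bar = mean(months), mean(sales)
numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(months, sales))
denominator = sum((x - x_bar) ** 2 for x in months)
slope = numerator / denominator
intercept = y_bar - slope * x_bar
predicted_month_6 = slope * 6 + intercept

print(average_sales, predicted_month_6)
```

The prediction reuses the means computed during analysis, which mirrors the point that analytics builds on analysis.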
