
IBM ICE (Innovation Centre for Education)

Welcome to:
Data Warehouse

9.1
Unit objectives
IBM Power Systems

After completing this unit, you should be able to:

• Understand the concept of decision support science

• Gain knowledge on the concept of data warehouse

• Gain insight into the steps in building a multidimensional model

• Learn about business reports and queries


Decision support

• Decision support uses information technology to help knowledge workers make faster and
better decisions. Typical questions include:

• “What were the sales volumes by region and product category for the last year?”

• “How did the share price of computer manufacturers correlate with quarterly profits over the
past 10 years?”

• “Which orders should we fill to maximize revenues?”
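
As an illustration of the first question, the following minimal sketch totals sales volume by region and product category for one year. The records, the field layout, and the function name are hypothetical and serve only to show the shape of such a query.

```python
from collections import defaultdict

# Hypothetical sales records: (year, region, product_category, units_sold)
SALES = [
    (2023, "North", "Laptops", 120),
    (2023, "North", "Phones", 200),
    (2023, "South", "Laptops", 80),
    (2023, "South", "Laptops", 40),
    (2022, "North", "Laptops", 95),
]


def sales_volume_by_region_and_category(rows, year):
    """Return total units sold per (region, category) for the given year."""
    totals = defaultdict(int)
    for row_year, region, category, units in rows:
        if row_year == year:
            totals[(region, category)] += units
    return dict(totals)


report = sales_volume_by_region_and_category(SALES, 2023)
for (region, category), units in sorted(report.items()):
    print(f"{region:6} {category:8} {units}")
```

A decision support system answers this kind of question over far larger volumes of historical data, which is one reason the data is kept in a separate warehouse.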


Three-tier decision support systems

• Warehouse database server

• OLAP servers
– Multidimensional OLAP (MOLAP)
– Relational OLAP (ROLAP)

• Clients
– Reporting and query tools
– Data mining tools
– Analysis tools


Exploring and analyzing data

Figure: Exploring and analyzing data


What is a data warehouse?

• What is a data warehouse?

– A decision support database that is maintained separately from the operational database of an
enterprise.
– Supports information processing by providing a solid base of historical data for analysis.
– “A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in
support of management’s decision-making process” (W. H. Inmon).


Self evaluation: Exercise 6

• To continue with the training, apply the concepts you have learned about the steps involved in
Business Analytics to the following activity.

• Carry out the following activities using different tools and open-source applications.

• Exercise 6: Summary statistics


Data warehouse architecture choices

Figure: Various data warehouse architectures


Enterprise data warehouse

Figure: Enterprise data warehouse architecture


Self evaluation: Exercise 7

• To continue with the training, apply the concepts you have learned about the steps involved in
Business Analytics to the following activity.

• Carry out the following activities using different tools and open-source applications.

• Exercise 7: Predictive analytics


Independent data mart architecture

Figure: Data models in an independent data mart architecture


Dependent data mart architecture

Figure: Dependent data mart architecture


Self evaluation: Exercise 8

• To continue with the training, apply the concepts you have learned about the steps involved in
Business Analytics to the following activity.

• Carry out the following activities using different tools and open-source applications.

• Exercise 8: Prescriptive analytics


Data warehouse

• There are three types of data warehouse applications:

– Data mining.
– Analytical processing.
– Information processing.

• The structure of a data warehouse is based on the data cube:

– Data from tables is organized into cubes that give a multidimensional view.
– The multidimensional data model is the foundation of a data warehouse.
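
The idea of building a cube from table rows can be sketched as follows. This is a minimal illustration, not a production design: the dimensions (item, color, city), the sample rows, and the helper names are all assumptions. Each cube cell is addressed by one value per dimension and holds an aggregated measure.

```python
from collections import defaultdict

# Hypothetical dimensions of the cube, in the order used for cell keys
DIMENSIONS = ("item", "color", "city")

# Hypothetical table rows: (item, color, city, units_sold)
ROWS = [
    ("skirt", "dark", "Pune", 8),
    ("skirt", "dark", "Pune", 2),
    ("skirt", "pastel", "Pune", 35),
    ("dress", "dark", "Delhi", 20),
    ("dress", "pastel", "Pune", 10),
]


def build_cube(rows):
    """Aggregate rows into cells keyed by (item, color, city)."""
    cube = defaultdict(int)
    for item, color, city, units in rows:
        cube[(item, color, city)] += units
    return dict(cube)


def slice_cube(cube, dimension, value):
    """Fix one dimension to a value and return the remaining 2-D cells."""
    axis = DIMENSIONS.index(dimension)
    return {
        cell[:axis] + cell[axis + 1:]: units
        for cell, units in cube.items()
        if cell[axis] == value
    }


cube = build_cube(ROWS)
pune_slice = slice_cube(cube, "city", "Pune")
print(pune_slice)
```

Fixing one dimension (a slice) leaves a lower-dimensional view of the same data, which is what makes the cube convenient for interactive analysis.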


Self evaluation: Exercise 9

• To continue with the training, apply the concepts you have learned about the steps involved in
Business Analytics to the following activity.

• Carry out the following activities using different tools and open-source applications.

• Exercise 9: Linear models


Multidimensional data

Figure: Multidimensional data


Conceptual modeling of data warehouses

• Modeling data warehouses:


– Dimensions
– Measures

Figure: Types of dimensional models
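
Dimensions and measures can be illustrated with a small star-shaped layout: one fact table that holds the measures and refers to dimension tables that hold descriptive attributes. The figure is not reproduced here, so this layout, the table names, and the sample data are assumptions made for illustration, using Python's built-in `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive context for analysis
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, name TEXT);

-- Fact table: numeric measures plus keys to each dimension
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    region_id INTEGER REFERENCES dim_region(region_id),
    units_sold INTEGER,
    revenue REAL
);

INSERT INTO dim_product VALUES (1, 'Laptop A', 'Laptops'), (2, 'Phone B', 'Phones');
INSERT INTO dim_region VALUES (1, 'North'), (2, 'South');
INSERT INTO fact_sales VALUES (1, 1, 10, 9000.0), (1, 2, 5, 4500.0), (2, 1, 20, 8000.0);
""")

# Aggregate the revenue measure along the product-category dimension
revenue_by_category = dict(conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
""").fetchall())
print(revenue_by_category)
```

Here `units_sold` and `revenue` are the measures, while product and region are the dimensions along which those measures are analyzed.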


Data warehouse design process

• Two approaches to the data warehouse design process:

• Bottom-up: Starts with experiments and prototypes.

• Top-down: Starts with overall planning and design.

• From the software engineering viewpoint:

– Spiral: Rapid generation of increasingly functional systems, with short turnaround times.


Single-layer architecture

Figure: Single layer architecture


Two-layer architecture

Figure: Two-layer architecture


Three-tier data warehouse architecture

Figure: Three tier data warehouse architecture


Data warehouse development

Figure: Data warehouse development


Multi-tiered architecture

Figure: Multi-tiered architecture


Information pyramid

Figure: Information pyramid


BI reporting tool architectures

Figure: BI reporting tool architectures


Multidimensional analysis techniques

Figure: Drill-down and roll-up analysis
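
Roll-up and drill-down can be sketched with a simple concept hierarchy. The city-to-country hierarchy, the sales figures, and the function names below are hypothetical: roll-up aggregates a measure to a coarser level, and drill-down returns to the finer level for one member.

```python
from collections import defaultdict

# Hypothetical concept hierarchy for the location dimension: city -> country
CITY_TO_COUNTRY = {"Pune": "India", "Delhi": "India", "Lyon": "France"}

# Hypothetical measure at the finest level: units sold per city
SALES_BY_CITY = {"Pune": 53, "Delhi": 24, "Lyon": 30}


def roll_up(sales_by_city, hierarchy):
    """Aggregate city-level sales up to the country level."""
    totals = defaultdict(int)
    for city, units in sales_by_city.items():
        totals[hierarchy[city]] += units
    return dict(totals)


def drill_down(sales_by_city, hierarchy, country):
    """Return the city-level detail behind one country's total."""
    return {
        city: units
        for city, units in sales_by_city.items()
        if hierarchy[city] == country
    }


by_country = roll_up(SALES_BY_CITY, CITY_TO_COUNTRY)
india_detail = drill_down(SALES_BY_CITY, CITY_TO_COUNTRY, "India")
print(by_country)
print(india_detail)
```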


Data analysis and OLAP

Figure: Cross tabulation of sales by item-name and color

Figure: Relational representation of cross-tabs
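
Since the figures are not reproduced, the following sketch shows one way a cross-tab of sales by item-name and color might be computed and then flattened into rows. The sample data and the use of the value `"all"` to mark totals are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical sales rows: (item_name, color, units_sold)
SALES = [
    ("skirt", "dark", 8),
    ("skirt", "pastel", 35),
    ("dress", "dark", 20),
    ("dress", "pastel", 10),
]


def cross_tab(rows):
    """Build a cross-tab with row, column, and grand totals under 'all'."""
    table = defaultdict(lambda: defaultdict(int))
    for item, color, units in rows:
        for row_key in (item, "all"):
            for col_key in (color, "all"):
                table[row_key][col_key] += units
    return {row_key: dict(cols) for row_key, cols in table.items()}


def to_relation(table):
    """Flatten the cross-tab into (item_name, color, units) tuples."""
    return sorted(
        (item, color, units)
        for item, cols in table.items()
        for color, units in cols.items()
    )


tab = cross_tab(SALES)
relation = to_relation(tab)
for row in relation:
    print(row)
```

In the flattened form, each total becomes an ordinary row in which `"all"` replaces the dimension value that was summed over.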


OLAP server architectures

Figure: OLAP server architectures


Data cube

Figure: Data cube


OLTP vs OLAP

Table: OLTP vs. OLAP
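
The comparison table is not reproduced here. As a rough illustration of the contrast, the sketch below runs an OLTP-style operation (a short transaction that updates one record) and an OLAP-style operation (a read-only aggregate over many records) against the same hypothetical `orders` table, using Python's built-in `sqlite3` module. The schema and data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, region TEXT, amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "North", 100.0, "open"),
        (2, "North", 250.0, "shipped"),
        (3, "South", 75.0, "open"),
    ],
)
conn.commit()

# OLTP-style: a short transaction that touches a single, current record
with conn:
    conn.execute("UPDATE orders SET status = 'shipped' WHERE order_id = ?", (1,))

# OLAP-style: a read-only query that summarizes many records
totals_by_region = dict(
    conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region")
)
print(totals_by_region)
```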


Business query

Figure: Business query


Dashboards and scorecards development

• “A dashboard is a visual display of the most important information needed to achieve one or
more objectives, consolidated and arranged on a single screen so that the information can be
monitored at a glance” (Stephen Few).

• Most dashboards share a set of common features:


– Global navigation feature.
– Interactivity.
– Custom-made interface.
– Embedded content.
– Browser-based capabilities.


Metadata model

• The metadata model is defined as follows: “It is used to describe the contents and locations of
the data (or data model) in the data warehouse, relationships between the operational databases
and the data warehouse, and the business views of the data in the warehouse as accessible to
the end-user tools.”

• Metadata is used by users to search for the content areas and data definitions.

• A repository is a place where this data is managed and maintained.

• Thus, a metadata repository is a place to share metadata.
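
A metadata repository can be pictured as a searchable catalog of entries. The sketch below is a minimal in-memory illustration. The field names, class names, and sample entries are assumptions and do not describe any particular BI product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableMetadata:
    """One hypothetical catalog entry for a warehouse table."""
    name: str            # table name in the warehouse
    location: str        # where the data lives
    source_system: str   # operational database the data comes from
    business_view: str   # description shown to end users


class MetadataRepository:
    """Shared place where metadata entries are managed and searched."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def search(self, keyword):
        """Find entries whose name or business description mentions keyword."""
        keyword = keyword.lower()
        return [
            entry
            for entry in self._entries.values()
            if keyword in entry.name.lower()
            or keyword in entry.business_view.lower()
        ]


repo = MetadataRepository()
repo.register(TableMetadata("fact_sales", "warehouse.sales", "orders_db",
                            "Daily sales revenue by product and region"))
repo.register(TableMetadata("dim_customer", "warehouse.shared", "crm_db",
                            "Customer attributes used for segmentation"))
matches = repo.search("revenue")
print([entry.name for entry in matches])
```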


Automated tasks and events

• The primary automated tasks are:

– BI alerts.
– Integration with other business applications.
– Automated scheduling of reports.
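
A BI alert can be thought of as a rule that is evaluated against current metric values. The following is a minimal sketch under that assumption. The rule format, metric names, and thresholds are hypothetical.

```python
import operator

# Hypothetical alert rules: (metric name, comparison, threshold, message)
RULES = [
    ("daily_revenue", operator.lt, 10_000, "Daily revenue below target"),
    ("open_tickets", operator.gt, 50, "Too many open support tickets"),
]


def evaluate_alerts(metrics, rules):
    """Return one message for every rule whose condition is met."""
    alerts = []
    for name, compare, threshold, message in rules:
        value = metrics.get(name)
        # Skip rules whose metric has not been reported
        if value is not None and compare(value, threshold):
            alerts.append(f"{message}: {name}={value} (threshold {threshold})")
    return alerts


alerts = evaluate_alerts({"daily_revenue": 8200, "open_tickets": 12}, RULES)
for alert in alerts:
    print(alert)
```

In practice such a check would run on a schedule and pass its messages to email or to another business application.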


Mobile BI

• Mobile and disconnected BI:

– This is about viewing analytics data, interactive dashboards, and queries even when the mobile
device is not connected to the network, relying on a cache of data in the mobile application.
– This approach offers limited data availability and interactivity because of concerns about
storing the data on the mobile device itself.
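
The caching behavior described above can be sketched as follows. This is an illustration only: the class, the `fetch_report` function, and the simulated `network` flag are hypothetical stand-ins for a real mobile application and BI server.

```python
class OfflineDashboardCache:
    """Serve live data when online and fall back to cached data when offline."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable that raises ConnectionError when offline
        self._cache = {}

    def get(self, report_id):
        try:
            data = self._fetch(report_id)
        except ConnectionError:
            if report_id in self._cache:
                return self._cache[report_id], "cached"
            raise  # nothing was cached, so the report is unavailable offline
        self._cache[report_id] = data
        return data, "live"


# Simulated network state and server call (both hypothetical)
network = {"up": True}


def fetch_report(report_id):
    if not network["up"]:
        raise ConnectionError("device is offline")
    return {"report": report_id, "total_sales": 77}


cache = OfflineDashboardCache(fetch_report)
first = cache.get("sales")    # online: fetched live and stored in the cache
network["up"] = False
second = cache.get("sales")   # offline: served from the cache
print(first[1], second[1])
```

Only reports viewed while connected are available offline, which reflects the limited data availability noted above.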


Software development kit (SDK)

• “Software development kit is a comprehensive development environment that enables easy
customization and integration of the BI platform with enterprise business applications.”

• The SDK allows developers to implement BI applications with highly customized, functional,
and powerful web reports that meet all the BI requirements of the organization.

• The SDK provides a platform-independent automation interface for working with the BI tool’s
services and components.


Setting up data for BI

• ETL tools and processes extract data from one or more source systems, transform data from
many different formats into a common format, and then load that data into a data warehouse
(Schink, 2009).

• The extracted data must be deemed central to the business. ETL tools turn the data into
information that is then used for managerial decision making.

• ETL solutions are divided into three distinct stages (extract, transform, and load) that find and
convert data from various sources and insert the result into a data warehouse.
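
The three ETL stages can be sketched in a few lines. This is a minimal illustration under stated assumptions: the two source formats, the column names, and the `fact_orders` target table are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source extracts in two different formats
CSV_SOURCE = "order_id,amount,date\n1,100.50,2023-01-15\n2,20.00,2023-02-01\n"
LEGACY_SOURCE = [{"id": "3", "amt_cents": "7500", "dt": "15/03/2023"}]


def extract():
    """Extract: read raw records from each source system."""
    csv_rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    return csv_rows, LEGACY_SOURCE


def transform(csv_rows, legacy_rows):
    """Transform: convert both formats to (order_id, amount, ISO date)."""
    common = []
    for row in csv_rows:
        common.append((int(row["order_id"]), float(row["amount"]), row["date"]))
    for row in legacy_rows:
        day, month, year = row["dt"].split("/")
        amount = int(row["amt_cents"]) / 100  # cents to currency units
        common.append((int(row["id"]), amount, f"{year}-{month}-{day}"))
    return common


def load(rows, conn):
    """Load: insert the unified rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        "order_id INTEGER PRIMARY KEY, amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(transform(*extract()), warehouse)
loaded = warehouse.execute(
    "SELECT order_id, amount, order_date FROM fact_orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Keeping the three stages as separate functions means a new source format needs only a new extract and transform path, while the load step stays unchanged.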


Making BI easy to consume

• From the perspective of the information consumer, the main objective of self-service BI is to
make BI results easier to consume and improve.

• This is achieved by publishing the results of BI through enhanced user interfaces.

• The interfaces allow each category of information worker to be familiar with and comfortable
using BI.

• The types of interface are:

– Work group portal
– Web application
– Desktop application


Checkpoint (1 of 2)

Multiple choice questions:

1. __________ is a subject-oriented, integrated, time-variant, nonvolatile collection of data in
support of management decisions.
a) Data Mining
b) Data Warehousing
c) Web Mining
d) Text Mining

2. The data warehouse is __________.
a) Read only
b) Write only
c) Read write only
d) None

3. The expansion of DSS in DW is __________.
a) Decision Support System
b) Decision Single System
c) Data Storable System
d) Data Support System


Checkpoint solutions (1 of 2)

Multiple choice questions:

1. __________ is a subject-oriented, integrated, time-variant, nonvolatile collection of data in
support of management decisions.
a) Data Mining
b) Data Warehousing
c) Web Mining
d) Text Mining
Answer: b) Data Warehousing

2. The data warehouse is __________.
a) Read only
b) Write only
c) Read write only
d) None
Answer: a) Read only

3. The expansion of DSS in DW is __________.
a) Decision Support System
b) Decision Single System
c) Data Storable System
d) Data Support System
Answer: a) Decision Support System


Checkpoint (2 of 2)

Fill in the blanks:

1. A view is a _____ that takes the output of a query and can be used in place of a table.
2. A ______ provides indirect access to table data by storing the results of a query in a
separate schema.
3. ____ stands for Operational Data Store; it is a repository of real-time operational data rather
than long-term trend data.
4. _____ stands for Very Large Database; its size is said to be more than one terabyte.

True or False:

1. Data is stored, retrieved, and updated in OLTP. True/False
2. Metadata describes the data contained in the data warehouse. True/False
3. The data warehouse database server is the heart of the warehouse. True/False


Checkpoint solutions (2 of 2)

Fill in the blanks:

1. A view is a virtual table that takes the output of a query and can be used in place of a table.
2. A materialized view provides indirect access to table data by storing the results of a query in
a separate schema.
3. ODS stands for Operational Data Store; it is a repository of real-time operational data rather
than long-term trend data.
4. VLDB stands for Very Large Database; its size is said to be more than one terabyte.

True or False:

1. Data is stored, retrieved, and updated in OLTP. True
2. Metadata describes the data contained in the data warehouse. True
3. The data warehouse database server is the heart of the warehouse. True
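
The difference between a view and a materialized view can be illustrated with Python's built-in `sqlite3` module. SQLite has no materialized views, so in this sketch a snapshot table created with `CREATE TABLE ... AS SELECT` stands in for one. The table and view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, amount REAL);
INSERT INTO sales VALUES ('North', 100.0), ('South', 50.0);

-- View: a virtual table whose query runs each time it is read
CREATE VIEW v_sales_by_region AS
    SELECT region, SUM(amount) AS total FROM sales GROUP BY region;

-- Stand-in for a materialized view: the query result is stored once
CREATE TABLE mv_sales_by_region AS
    SELECT region, SUM(amount) AS total FROM sales GROUP BY region;

-- A new sale arrives after both objects were created
INSERT INTO sales VALUES ('North', 25.0);
""")

view_rows = dict(conn.execute("SELECT region, total FROM v_sales_by_region"))
snapshot_rows = dict(conn.execute("SELECT region, total FROM mv_sales_by_region"))
print(view_rows)      # reflects the new sale
print(snapshot_rows)  # stored result, unchanged until it is refreshed
```

The view always reflects the current table contents, while the stored result answers quickly but must be refreshed to pick up new data.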


Question bank

Two-mark questions:

1. What is Business Intelligence?
2. What are the stages of data warehousing?
3. What is Data Mining?
4. What is OLTP?

Four-mark questions:

1. What is OLAP?
2. What is the difference between OLTP and OLAP?
3. What needs to be done when the database is shut down?
4. What is an execution plan?

Eight-mark questions:

1. What approaches does the optimizer use when building an execution plan?
2. What is the difference between metadata and a data dictionary?


Unit summary

Having completed this unit, you should be able to:

• Understand the concept of decision support science

• Gain knowledge on the concept of data warehouse

• Gain insight into the steps in building a multidimensional model

• Learn about business reports and queries
