
Business Intelligence
Chapter 1 & 2
Business intelligence (BI) is a technology-driven process for
analyzing data and presenting useful information to help
executives, managers and other end users make informed business
decisions.
The potential benefits of using BI tools include accelerating and
improving decision-making, optimizing internal business
processes, increasing operational efficiency, driving new revenues
and gaining competitive advantage over business rivals.
BI systems can also help companies identify market trends and
spot business problems that need to be addressed. In short, BI
technologies allow a business to view its operations, past,
present and future.
BI technologies handle large amounts of data to help identify,
develop and otherwise create new strategic business
opportunities. Identifying new opportunities and implementing an
effective strategy based on insights can provide businesses with
competitive market advantage and long-term profitability.
BI is most effective when it combines data derived from the
market in which a company operates (external data) with data
from company sources internal to the business such as financial
and operations data (internal data).

When combined, external and internal data can provide a
complete picture which, in effect, creates an “intelligence” that
cannot be derived from any single set of data. Business
intelligence tools empower organizations to gain insight into new
markets, to assess demand and suitability of products and services
for different market segments and to gauge the impact of
marketing efforts.
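As a small illustration of why the combination matters, the Python sketch below computes market share, a figure that needs both internal and external data. The region names and all numbers are hypothetical and only serve the example.

```python
# Hypothetical internal data: the company's own sales by region.
internal_sales = {"North": 1200, "South": 800, "West": 500}
# Hypothetical external data: total market size by region from a market report.
market_size = {"North": 6000, "South": 2000, "West": 5000, "East": 4000}

# Market share needs both sources; neither alone can produce it.
market_share = {
    region: internal_sales.get(region, 0) / size
    for region, size in market_size.items()
}

# List regions from weakest to strongest share.
for region, share in sorted(market_share.items(), key=lambda kv: kv[1]):
    print(f"{region:<6}{share:6.1%}")
```

A region with zero share, such as East in this made-up data, is a market the internal data alone would never reveal.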
Figure 1: Inputs to Business Intelligence Systems
WHAT DOES BI DO?
BI assists in strategic and operational decision making. A Gartner survey
ranked the strategic use of BI in the following order:
1. Corporate performance management
2. Optimizing customer relations, monitoring business activity, and
traditional decision support
3. Packaged standalone BI applications for specific operations or
strategies
4. Management reporting of business intelligence
Figure 2: BI Relation to Other Information Systems.
BI converts data into useful information and, through human analysis,
into knowledge. Some of the tasks performed by BI are:

• Creating forecasts based on historical data, past and current
performance, and estimates of the direction in which the future will go.
• “What if” analysis of the impacts of changes and alternative
scenarios.
• Ad hoc access to the data to answer specific, non-routine questions.
• Strategic insight
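The first two tasks in this list can be sketched in a few lines of Python. The sales figures, the three-month moving average used as the forecast, and the price and demand changes in the scenario are all illustrative assumptions, not the method of any particular BI product.

```python
from statistics import mean

# Hypothetical monthly unit sales (illustrative data only).
monthly_units = [120, 135, 128, 140, 150, 158]
unit_price = 20.0


def moving_average_forecast(history, window=3):
    """Forecast the next period as the mean of the last `window` periods."""
    return mean(history[-window:])


def what_if_revenue(units, price, price_change=0.0, demand_change=0.0):
    """Revenue under an assumed price change and an assumed demand response."""
    return units * (1 + demand_change) * price * (1 + price_change)


forecast_units = moving_average_forecast(monthly_units)
baseline = what_if_revenue(forecast_units, unit_price)
# Scenario: raise the price by 10% and assume demand falls by 5%.
scenario = what_if_revenue(forecast_units, unit_price, 0.10, -0.05)

print(f"Forecast units: {forecast_units:.1f}")
print(f"Baseline revenue: {baseline:.2f}")
print(f"Scenario revenue: {scenario:.2f}")
```

Real forecasting models are usually more elaborate, but the structure is the same: a forecast from history, then alternative scenarios evaluated against it.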
Other ways a business can use BI to improve performance include:

• Business Process Management - Performance
metrics and benchmarking inform business leaders of progress
towards business goals. BI tools can help a business boost internal
productivity by focusing its efforts on what is important.

• Decision Making - BI analytics such as data mining and statistical
analysis quantify processes so that a business can make better-informed decisions.
BI can help a business identify where to cut costs or how to allocate
its budget.
• Business Planning - Businesses can use BI data to develop both
short-term goals and long-term strategy. Businesses can gain insight into
their customers and market trends, allowing them to make decisions
about current and future operations, products, goods or services.

• Collaboration - BI can facilitate collaboration both inside and outside
the business by enabling data sharing and electronic data
interchange. Many businesses use BI tools to communicate with
suppliers, reducing lead times and inventory levels. By sharing data
among partners, each business has up-to-the-minute information on
everything from delivery times to price changes.
Introduction to How Businesses Use Information
Traditionally we think about value in business in terms of assets:
property, plants, equipment, inventory and even human resources. The
explosion of technology over the last decade has made us re-think
what is valuable. What many businesses today consider to be
their most valuable asset cannot be held in the hand, because it is
the information generated by the collection of billions of bits of data. In
fact, the data that businesses gather about their customers is, to the
most progressive companies, invaluable!
Companies take that “data” and turn it into useful information.

As the collection of data becomes easier and more cost-effective,
businesses are constantly generating new and better information about
the business environment.
Data vs. Information
Data can be any character, text, word or number and, if not put into
context, means little or nothing to a human.

Information is data formatted in a manner that allows it to be utilized
by human beings in some significant way.
An individual has an almost unlimited amount of data associated with
him or herself.

This data is of little use to business in its raw, unorganized form.

It is not until the data is formatted or compiled into something
meaningful that business has information about the individual.
For example, suppose the department store Big Box is collecting data
about its customers from a loyalty card program and online customer
surveys. It collects the following data about a particular customer:

• Age: 34
• Big Box Account #: 123456
• Gender: Female
• Zip Code: 22322
• Children: 2
• Marital Status: Married
• Last Purchase: Jogging Pants
Later in the year, Customer #123456 makes an online purchase of a pair
of men’s work boots and a men’s heavyweight coat. The data that
comes into Big Box may look like this:

• Customer #123456
• Date: 10/5/2018
• Item #56-9876 Cougar Work Boots, Size 11
• Item #43-2341 Men’s Heavyweight Denim Coat, Size XL
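As a rough sketch of how these raw records might be turned into information, the Python snippet below combines the profile and the purchase shown above. The dictionary keys, the "department" field and the simple rule (a female customer buying men's items may be shopping for someone else in her household) are hypothetical assumptions for illustration, not Big Box's actual logic.

```python
# Raw data from the example above; the dictionary keys are hypothetical.
profile = {
    "account": "123456",
    "age": 34,
    "gender": "Female",
    "zip_code": "22322",
    "children": 2,
    "marital_status": "Married",
}

# The "department" value is an assumed attribute of each catalog item.
purchases = [
    {"account": "123456", "date": "10/5/2018", "item": "56-9876",
     "description": "Cougar Work Boots, Size 11", "department": "Men"},
    {"account": "123456", "date": "10/5/2018", "item": "43-2341",
     "description": "Men's Heavyweight Denim Coat, Size XL", "department": "Men"},
]


def summarize(profile, purchases):
    """Turn raw records into a statement a marketer could act on."""
    own = [p for p in purchases if p["account"] == profile["account"]]
    mens_items = [p for p in own if p["department"] == "Men"]
    buys_for_others = profile["gender"] == "Female" and len(mens_items) > 0
    return {
        "account": profile["account"],
        "items_bought": len(own),
        "likely_buying_for_household": buys_for_others,
    }


info = summarize(profile, purchases)
print(info)
```

Neither the profile nor the purchase says much alone; together they suggest that this customer also shops for other members of her household.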
Business Data
The type of data a business collects is informed by a business’s goals
and objectives.

Computing systems can collect a dizzying array of data about the world
around us. Businesses must decide what type of data they need to
inform their business decisions and where and how that data can be
collected.
The types of data that businesses collect can be broken down into five
broad categories:

• business process data,
• physical-world observations,
• biological data,
• public data, and
• personal data.
Business Process Data
To remain competitive, businesses must find ways to increase
efficiency while maintaining quality standards for their products, goods and
services. To continuously improve their operations, businesses
collect data regarding their business processes. This data can range from
the number of days it takes their customers to pay invoices
to the time it takes to assemble and package a product. To collect
this type of data, many businesses employ enterprise resource planning
(ERP) systems. ERP systems track business resources (cash, raw
materials, production capacity) and the status of business commitments:
orders, purchase orders, and payroll. The applications that make up the
system share data across the various departments (manufacturing, purchasing,
sales, accounting, etc.) that provide the data.
• Another source of process data is Point of Sale (POS) systems.

• When a cashier scans the barcode on an item, that scan collects data
that may be used in inventory management, loyalty programs,
supplier records, bookkeeping, issuing of purchase orders, quotations
and stock transfers, sales reporting and in some cases networking to
distribution centers. The more data a business has about its processes,
the more likely it is to find opportunities to improve or enhance those
processes.
Physical-World Observations
Technology has made it possible for businesses to capture real-time data
about the physical world. This data is collected using devices and technologies
such as radio frequency identification (RFID), wireless remote cameras,
GPS, sensor technology and wireless access points. By inserting
computer chips into almost any object, companies are able to track the
movements of that item and in some cases control the object.
Biological Data
If you have a newer smartphone, then you may be able to unlock your
phone by simply looking at the screen. This is made possible by facial
recognition software. Unlocking your laptop with your fingerprint is
another example of biological data available to businesses. Although
things like voice and face recognition, retinal scans and biometric
signatures are currently used primarily for security purposes, it may be
possible in the future for this type of data to allow for product and
service customization.
Public Data
Businesses have an almost endless source of data available to them
free from public sources. Whenever you log onto the Internet, use
instant messaging, or send emails, an electronic footprint is left behind.
For now this data is considered to be “public” and businesses collect,
share and even sell this type of data every day.
Personal Data
As with “public” data, our use of
technology generates a wealth of personal data that businesses can
use to learn much about our personal preferences, habits, pastimes,
likes and dislikes. For example, Facebook uses information people
provide, such as their age, gender and interests, to target ads to a
specific audience.
The volume of data available to businesses continues to grow
rapidly, and as more data becomes available, collecting,
storing and analyzing that data becomes increasingly complex. This
data explosion has made data warehousing and data mining of greater
importance to businesses.
Data Mining and Warehousing
• Did you ever think about how much data you yourself generate?
Warehousing and Mining Data
• A data warehouse collects data from multiple sources (both internal
and external) and stores the data to later be used in an analysis. The
primary purpose of a data warehouse is to store the data in a way
that it can later be retrieved for use by the business.

• Data Mining is not the process of getting specific pieces of data out
of the data warehouse, but rather the goal of data mining is the
identification of patterns and knowledge from large amounts of data.
Web crawling and scraping tools such as Scrapy, Nutch and Splash can feed
data mining by gathering data from the web, allowing businesses to
learn more about customers and competitors, compare prices and even
find new customers and sales targets. As the quantity of data
businesses can collect continues to grow, having an effective data
warehousing system that can be easily mined has become increasingly
critical to business success.
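The difference between retrieving data and mining it can be sketched in Python. The transactions below are hypothetical, and counting item pairs that are bought together is only one simple example of a pattern-finding technique.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions already stored in a warehouse (illustrative only).
transactions = [
    {"id": 1, "items": {"boots", "socks", "coat"}},
    {"id": 2, "items": {"boots", "socks"}},
    {"id": 3, "items": {"coat", "gloves"}},
    {"id": 4, "items": {"boots", "socks", "gloves"}},
]

# Retrieval: fetch one specific record, as a warehouse query would.
record = next(t for t in transactions if t["id"] == 3)

# Mining: look for a pattern across all records, here the item pairs
# that are most often bought together.
pair_counts = Counter()
for t in transactions:
    for pair in combinations(sorted(t["items"]), 2):
        pair_counts[pair] += 1

top_pair, top_count = pair_counts.most_common(1)[0]
print(record)
print(top_pair, top_count)
```

Retrieval answers a question that was already specific; mining surfaces a relationship (boots and socks sell together) that nobody had to ask about in advance.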
Data Warehouse
What is a data warehouse?

• A data warehouse, or enterprise data warehouse (EDW), is a system
that aggregates data from different sources into a single, central,
consistent data store to support data analysis, data mining, artificial
intelligence (AI), and machine learning. A data warehouse system
enables an organization to run powerful analytics on huge volumes
(petabytes and petabytes) of historical data in ways that a standard
database cannot.
A typical data warehouse often includes the following elements:
• A relational database to store and manage data
• An extraction, loading, and transformation (ELT) solution for
preparing the data for analysis
• Statistical analysis, reporting, and data mining capabilities
• Client analysis tools for visualizing and presenting data to business
users
• Other, more sophisticated analytical applications that generate
actionable information by applying data science and artificial
intelligence (AI) algorithms, or graph and spatial features that enable
more kinds of analysis of data at scale
Benefits of a Data Warehouse
Four characteristics, described by computer scientist William Inmon
(who is considered the father of the data warehouse), underlie the benefits
of a data warehouse. According to this definition, data
warehouses are:
• Subject-oriented. They can analyze data about a particular subject or
functional area (such as sales).
• Integrated. Data warehouses create consistency among different data
types from disparate sources.
• Nonvolatile. Once data is in a data warehouse, it’s stable and doesn’t
change.
• Time-variant. Data warehouse analysis looks at change over time.
Data warehouse architecture
The architecture of a data warehouse is determined by the organization’s specific
needs. Common architectures include
• Simple. All data warehouses share a basic design in which metadata, summary
data, and raw data are stored within the central repository of the warehouse. The
repository is fed by data sources on one end and accessed by end users for
analysis, reporting, and mining on the other end.
• Simple with a staging area. Operational data must be cleaned and processed
before being put in the warehouse. Although this can be done programmatically,
many data warehouses add a staging area for data before it enters the
warehouse, to simplify data preparation.
• Hub and spoke. Adding data marts between the central repository and end users
allows an organization to customize its data warehouse to serve various lines of
business. When the data is ready for use, it is moved to the appropriate data
mart.
• Sandboxes. Sandboxes are private, secure, safe areas that allow companies to
quickly and informally explore new datasets or ways of analyzing data without
having to conform to or comply with the formal rules and protocol of the data
warehouse.
Generally speaking, data warehouses have a three-tier architecture,
which consists of:

• Bottom tier: The bottom tier consists of a data warehouse server,
usually a relational database system, which collects, cleanses, and
transforms data from multiple data sources through a process known
as Extract, Transform, and Load (ETL) or a process known as Extract,
Load, and Transform (ELT).
• Middle tier: The middle tier consists of an OLAP (online analytical
processing) server, which enables fast query speeds. Three types of
OLAP models can be used in this tier: ROLAP (relational), MOLAP
(multidimensional) and HOLAP (hybrid). The type of OLAP model used depends on
the type of database system that exists.

• Top tier: The top tier is represented by some kind of front-end user
interface or reporting tool, which enables end users to conduct
ad hoc data analysis on their business data.
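The three tiers can be sketched end to end in Python, using the built-in sqlite3 module as a stand-in for the warehouse server. The source rows, table name and column names are hypothetical, and the GROUP BY query only stands in for the kind of aggregate an OLAP server would answer.

```python
import sqlite3

# Hypothetical rows extracted from two source systems (illustrative only).
pos_rows = [("2018-10-05", "boots", " 89.99"), ("2018-10-05", "coat", "120.00")]
web_rows = [("2018-10-06", "BOOTS", "89.99"), ("2018-10-06", "gloves", None)]


def transform(rows, channel):
    """Cleanse rows: drop missing amounts, normalize product names and types."""
    for sale_date, product, amount in rows:
        if amount is None:
            continue
        yield (sale_date, product.strip().lower(), float(amount), channel)


# Bottom tier: load cleansed data into the warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales (sale_date TEXT, product TEXT, amount REAL, channel TEXT)"
)
for rows, channel in ((pos_rows, "store"), (web_rows, "web")):
    warehouse.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)", transform(rows, channel)
    )

# Middle tier (stand-in): an aggregate query of the kind an OLAP server answers.
totals = warehouse.execute(
    "SELECT product, ROUND(SUM(amount), 2) FROM sales "
    "GROUP BY product ORDER BY product"
).fetchall()

# Top tier: a minimal "report" for the end user.
for product, total in totals:
    print(f"{product:<8}{total:>10.2f}")
```

Here the transformation happens before loading, which makes this an ETL flow; in an ELT flow the raw rows would be loaded first and cleansed inside the warehouse.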
Data Mart
• A data mart is the portion of the access layer of the data
warehouse that is used by the end user. A data mart is therefore
a subset of the data warehouse. It is usually assigned to a
specific business unit within the enterprise, so data marts in effect slice
the data warehouse by business unit. Typically, ownership of
the data mart is given to that particular business unit or department.

• The primary use of a data mart is business intelligence. A data mart
requires much less investment than a data warehouse and
is therefore well suited to smaller businesses. Setup time for a data mart is
also much shorter, again making it practical for smaller businesses.
The main advantages of a data mart are as follows:

• It provides easy access to frequently used data.
• It improves the decision-making process for end users.
• It is easy to create and maintain.
A data mart is a database focused on addressing the concerns of a
specific problem (e.g., increasing customer retention, improving
product quality) or business unit (e.g., marketing, engineering).

Marts and warehouses may contain huge volumes of data. For
example, a firm may not need to keep large amounts of historical
point-of-sale or transaction data in its operational systems, but it might want
past data in its data mart so that managers can hunt for patterns and
trends that occur over time.
Data Mart vs Data Warehouse
Data marts and data warehouses are both highly structured
repositories where data is stored and managed until it is needed.
However, they differ in the scope of data stored: data warehouses are
built to serve as the central store of data for the entire business,
whereas a data mart fulfills the request of a specific division or
business function.
Because a data warehouse contains data for the entire company, it is
best practice to strictly control who can access it. Additionally,
finding the data you need in a data warehouse can be a
difficult task for business users. Thus, the primary purpose of a data mart
is to isolate, or partition, a smaller set of data from the whole to
provide easier data access for the end consumers.
Types of Data Marts
1. Dependent Data Marts
A dependent data mart is created from an existing enterprise data
warehouse. It is the top-down approach that begins with storing all
business data in one central location, then extracts a clearly defined
portion of the data when needed for analysis.
To form a data mart, a specific set of data is aggregated (formed
into a cluster) from the warehouse, restructured, then loaded to the
data mart where it can be queried. It can be a logical view or physical
subset of the data warehouse:
• Logical view - A virtual table/view that is logically—but not
physically—separated from the data warehouse
• Physical subset - Data extract that is a physically separate database
from the data warehouse
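The two options can be contrasted with a short Python sketch using the built-in sqlite3 module. The table and column names are hypothetical, and for brevity the physical subset is a separate table in the same in-memory database rather than a truly separate database.

```python
import sqlite3

# Stand-in warehouse with a hypothetical enterprise-wide sales table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales (department TEXT, product TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [("marketing", "campaign", 500.0),
     ("finance", "audit", 300.0),
     ("marketing", "survey", 150.0)],
)

# Logical view: no data is copied; the mart is a saved query over the warehouse.
conn.execute(
    "CREATE VIEW marketing_mart_view AS "
    "SELECT product, amount FROM warehouse_sales WHERE department = 'marketing'"
)

# Physical subset: the selected rows are copied into a separate table.
conn.execute(
    "CREATE TABLE marketing_mart_copy AS "
    "SELECT product, amount FROM warehouse_sales WHERE department = 'marketing'"
)

# A later load into the warehouse shows up in the view but not in the copy.
conn.execute("INSERT INTO warehouse_sales VALUES ('marketing', 'webinar', 75.0)")
view_rows = conn.execute("SELECT COUNT(*) FROM marketing_mart_view").fetchone()[0]
copy_rows = conn.execute("SELECT COUNT(*) FROM marketing_mart_copy").fetchone()[0]
print(view_rows, copy_rows)
```

The trade-off is visible in the last lines: the view is always current but queries the warehouse each time, while the copy is isolated from the warehouse and must be refreshed to pick up new data.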
2. Independent Data Marts
An independent data mart is a stand-alone system—created without
the use of a data warehouse—that focuses on one subject area or
business function. Data is extracted from internal or external data
sources (or both), processed, then loaded to the data mart repository
where it is stored until needed for business analytics.

3. Hybrid Data Marts
A hybrid data mart combines data from an existing data warehouse
and other operational source systems. It unites the speed and
end-user focus of a bottom-up approach with the benefits of the
enterprise-level integration of the top-down method.
Data Mining
Data mining is an older term that overlaps with what is now
called data science. In the 1990s, finding patterns in large
datasets was called knowledge discovery or data mining. In the
2010s, with the growing popularity of data science, the term
data science started to replace data mining.

Data science/data mining assists companies in
transforming their raw data into practical knowledge and
optimizing their strategies by predicting outcomes using
machine learning algorithms.
Some of the data sources include:

1. Flat Files: These are data files in text or binary form whose
structure is described by a data dictionary (e.g., a CSV file).

2. Relational Databases: These hold data organized in tables with
rows and columns, typically queried with SQL.
3. Data Warehouse: A warehouse holds integrated data derived
from multiple sources. There are three types of data warehouse:
- An enterprise warehouse is the collection of company business data and
information about its customers.
- A data mart is a smaller subset of the broader data
warehouse with a focus on a business line or department (e.g., finance or
marketing).
- A virtual warehouse is a business database that copies data from multiple
sources throughout a production system to provide a comprehensive
view of assets and materials.

4. Transactional Databases: These hold data organized by date and
time stamps to represent transactions.
5. Multimedia Databases: These databases consist of audio, video,
images and text media.
6. Spatial Databases: These contain geographical information.
7. Time Series Databases: These databases contain information
varying across time. Some examples are stock exchange data and
logged user activity.
8. Online Resources: These are documents and resources (e.g.,
audio, video and text) which are identified by Uniform Resource
Locators (URLs). These are web resources, linked by HTML pages,
and accessible via the Internet. This data can be collected
using web crawlers, which automatically extract it and send it
to users for further manipulation or analysis.
How does data mining help business intelligence?

Data mining is a collection of techniques for generating
insights from data. These techniques can be used to
generate insights from business data, contributing to
business intelligence (BI).

Some of the ways data mining techniques are used in BI
include:
Data preparation

Data mining techniques can be used to clean and prepare the
data for analysis. Much of the data an organization collects (a commonly
cited estimate is around 80%) is unstructured when first
collected, so it requires cleaning and wrangling before being
delivered to the business intelligence team to derive insights from
it. For example, data mining algorithms can convert images or
documents into machine-readable data for analysis by business
intelligence tools.
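A minimal Python sketch of the cleaning step is shown below. The raw rows, the field names and the rules (strip whitespace, parse currency text, drop unusable and duplicate rows) are hypothetical examples of typical preparation work, not a complete pipeline.

```python
# Hypothetical raw survey rows with typical quality problems (illustrative only).
raw_rows = [
    {"customer": " 123456 ", "zip": "22322", "spend": "$45.10"},
    {"customer": "123456", "zip": "22322", "spend": "$45.10"},   # duplicate
    {"customer": "987654", "zip": "", "spend": "n/a"},            # unusable spend
    {"customer": "555555", "zip": "22301", "spend": "1,200.00"},
]


def clean(row):
    """Return a normalized row, or None if the spend value is unusable."""
    customer = row["customer"].strip()
    spend_text = row["spend"].replace("$", "").replace(",", "").strip()
    try:
        spend = float(spend_text)
    except ValueError:
        return None
    return {"customer": customer, "zip": row["zip"] or None, "spend": spend}


seen = set()
cleaned = []
for row in raw_rows:
    result = clean(row)
    if result is None:
        continue
    key = (result["customer"], result["spend"])
    if key in seen:  # skip repeats of the same customer and spend
        continue
    seen.add(key)
    cleaned.append(result)

print(cleaned)
```

Only rows that survive these checks would be passed on for analysis.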
Detecting specific trends

Data mining enables businesses to identify the root cause of a
specific issue or trend, predict outcomes, and identify anomalies. These
inputs help business intelligence teams identify actions to take to
improve the business.
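Anomaly detection, one of the tasks named above, can be sketched with a simple z-score rule in Python. The daily order counts and the threshold of 2 standard deviations are illustrative assumptions; production systems generally use more robust methods.

```python
from statistics import mean, stdev

# Hypothetical daily order counts; the last value is an injected outlier.
daily_orders = [102, 98, 105, 99, 101, 97, 103, 100, 160]


def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)
    return [
        (i, v) for i, v in enumerate(values)
        if sigma > 0 and abs(v - mu) / sigma > threshold
    ]


anomalies = find_anomalies(daily_orders)
print(anomalies)
```

A flagged day is only a starting point: the BI team still has to investigate whether it reflects a promotion, a data error or a real shift in demand.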
What are the benefits of data mining in business
intelligence?

In the domain of business intelligence (BI), data
mining has the applications listed below. These
applications each have specific benefits. Please note
that this is a high-level list; more granular applications
of data mining also exist in BI.
Business analysis:

Organizations’ data includes information about the internal structure
of the company and lines of business (e.g., sales, logistics,
manufacturing). Applying data mining to operations data
reveals which processes can be improved. Understanding
the data and applying strategies to improve processes can
improve the company’s efficiency (reducing costs) and increase its
effectiveness (improving the quality of its products and services).
Customer analysis:

Customer data reveals the preferences, thoughts, needs, demands,
and intents of target prospects and customers. Applying data
mining to customer data:
- provides insights about customer purchase trends and
seasonal needs in order to make predictions that inform decisions, actions
and product launches.
- helps the company prioritize initiatives to respond to customer
needs and demands.
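A simple way to look for seasonal purchase trends is to count purchases per product and month, as in the Python sketch below. The purchase records are hypothetical and the "peak month" rule is deliberately naive.

```python
from collections import defaultdict
from datetime import date

# Hypothetical purchase records (illustrative only).
purchases = [
    (date(2018, 10, 5), "coat", 120.00),
    (date(2018, 11, 12), "coat", 120.00),
    (date(2018, 11, 20), "gloves", 25.00),
    (date(2018, 11, 28), "coat", 120.00),
    (date(2018, 6, 3), "sandals", 40.00),
    (date(2018, 12, 1), "coat", 120.00),
]

# Units sold per (product, month), a simple view of seasonality.
units_by_month = defaultdict(int)
for purchase_date, product, _amount in purchases:
    units_by_month[(product, purchase_date.month)] += 1

# Peak month for each product (the first month wins on ties).
peak_month = {}
for (product, month), units in units_by_month.items():
    best = peak_month.get(product)
    if best is None or units > units_by_month[(product, best)]:
        peak_month[product] = month

print(dict(units_by_month))
print(peak_month)
```

With several years of data, the same grouping could indicate when to schedule promotions or product launches.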
Market analysis:

Constant collection of real-time data about the market and
industry gives businesses data to be used in data mining/data
science to make predictions about the market, competitors, and
customers, and enables companies to discover new business
opportunities.
Data visualization
Data visualization helps people understand the patterns and insights in
data by presenting it in a visual form, such as
pivot tables, line graphs and pie charts.
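To make the idea concrete, the Python sketch below builds a small pivot table and a crude text bar chart from the same records. The sales data is hypothetical, and a real BI tool would render these graphically rather than as text.

```python
from collections import defaultdict

# Hypothetical sales records: (region, quarter, revenue). Illustrative only.
sales = [
    ("East", "Q1", 100), ("East", "Q2", 150),
    ("West", "Q1", 80), ("West", "Q2", 120),
    ("East", "Q1", 50),
]

# Pivot: rows are regions, columns are quarters, cells are summed revenue.
pivot = defaultdict(lambda: defaultdict(int))
for region, quarter, revenue in sales:
    pivot[region][quarter] += revenue

quarters = sorted({q for _, q, _ in sales})
print("Region  " + "".join(f"{q:>6}" for q in quarters))
for region in sorted(pivot):
    print(f"{region:<8}" + "".join(f"{pivot[region][q]:>6}" for q in quarters))

# A crude text bar chart of total revenue per region (one '#' per 25 units).
for region in sorted(pivot):
    total = sum(pivot[region].values())
    print(f"{region:<6}{'#' * (total // 25)} {total}")
```

The pivot table supports exact lookups, while the bar chart makes the gap between regions visible at a glance.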

Business intelligence represents a set of technologies that provide a set of
metrics for business users.

In layman’s terms, BI describes what has happened and breaks down the contributing
factors.
The report generated by a business intelligence solution could be a
visualization.
The main purpose of data visualization is to communicate
information clearly and effectively through graphical means. But
that doesn’t mean that data visualization needs to look boring to
be functional or extremely sophisticated to look beautiful.

The idea is to create both aesthetic and functional data
visualizations in order to provide insights and intuitive ways of
perceiving complex data.
