Business Analytics Anna University 2 Mark Questions


Unit – 3

2 Marks

1. What is Descriptive Analytics?


Descriptive analytics is the most common and fundamental form of analytics
that companies use. Every part of the business can use descriptive analytics to keep
tabs on operational performance and monitor trends. Examples of descriptive
analytics include KPIs such as year-on-year percentage sales growth, revenue per
customer and the average time customers take to pay bills. The products of descriptive
analytics appear in financial statements, other reports, dashboards and presentations.

2. List out the functions of descriptive analytics?


1) Company's Current Performance: Descriptive analytics helps businesses keep
track of critical metrics involving individuals, groups and teams, and the company
as a whole. For example, descriptive analytics can show how a specific sales rep
is doing this quarter or which of the rep's products sells the most.
2) Business's Historical Trends: Descriptive analytics gathers information over
long periods, and that accumulated information can be used to track the company's
progress by comparing the metrics for different periods. For example, the
corporate bean counters can track sales or expenses by comparing the results of
various quarters, calculating revenue growth by percentages, and rendering the
results on easy-to-read charts.
3) Company's Strong and Weak Points: Descriptive analytics gives professionals
the tools to compare the performances of various business groups using metrics
like employee-generated revenue or expenses as a percentage of revenue. It will
also compare these results with known industry averages or published results from
other businesses. These comparisons help companies see where they're doing well
and where they need to improve.
3. Types of Descriptive Analytics?
1) Measures of Frequency
2) Measures of Central Tendency
3) Measures of Dispersion
4) Measures of Position
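As a minimal illustration (the sales figures below are invented for the example), the four groups of measures can be computed in Python with the standard library:

```python
import statistics
from collections import Counter

# Hypothetical daily sales figures (units sold)
sales = [12, 15, 15, 18, 20, 22, 22, 22, 25, 30]

# 1) Measures of frequency: how often each value occurs
print("Frequency:", Counter(sales))

# 2) Measures of central tendency
print("Mean:", statistics.mean(sales))
print("Median:", statistics.median(sales))
print("Mode:", statistics.mode(sales))

# 3) Measures of dispersion
print("Range:", max(sales) - min(sales))
print("Std dev:", statistics.stdev(sales))

# 4) Measures of position: quartile cut points (25th, 50th, 75th percentiles)
print("Quartiles:", statistics.quantiles(sales, n=4))
```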
4. Techniques for Descriptive Analytics?
Data aggregation and data mining are two techniques used in descriptive
analytics to summarise historical data. In data aggregation, data is first collected
and then sorted in order to make the datasets more manageable.
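As a sketch of the aggregation step (the column names and figures are hypothetical, and pandas is just one convenient tool for it):

```python
import pandas as pd

# Hypothetical raw sales records
raw = pd.DataFrame({
    "quarter": ["Q1", "Q1", "Q2", "Q2", "Q2"],
    "rep":     ["Asha", "Ravi", "Asha", "Ravi", "Asha"],
    "revenue": [1200, 900, 1500, 1100, 700],
})

# Collect and sort the data into a more manageable summary:
# total and average revenue per quarter
summary = raw.groupby("quarter")["revenue"].agg(["sum", "mean"]).sort_index()
print(summary)
```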
5. Steps involved in Descriptive Analytics
1) State the Business Metrics
2) Identify the Data Required
3) Extract and Prepare the Data
4) Analyse the Data
5) Present the Data
6. Advantages of Descriptive Analytics

This type of analysis is considered a good method for collecting
information that describes relationships as they naturally occur and presents the
world as it exists. Because the trends it reports are derived from the real-life
behaviour of the data, descriptive analysis stays very close to reality.

7. Data Visualisation

Data visualisation refers to technologies that support visualisation, and
sometimes interpretation, of data and information at several points along the data
processing chain. It includes digital images, GIS, graphical user interfaces, graphs,
virtual reality, dimensional presentations, videos, and animation. Visual tools can help
to identify relationships such as trends. Data visualisation is easier to implement when
the necessary data are in a data warehouse or, better yet, in a multidimensional special
database or server.
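For instance, a simple line chart of a monthly revenue series is often enough to make a trend visible; the sketch below uses invented figures and assumes matplotlib is available:

```python
import matplotlib.pyplot as plt

# Hypothetical monthly revenue series
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
revenue = [100, 110, 125, 120, 140, 155]

plt.plot(months, revenue, marker="o")   # the upward trend is easy to spot visually
plt.title("Monthly Revenue (hypothetical data)")
plt.xlabel("Month")
plt.ylabel("Revenue")
plt.show()
```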

8. What is population?
A population (or universe) is the entire group of units (people, items or events) about which information is required and from which a sample is drawn.
9. What is sampling?
Sampling is a process in which a fixed number of observations is taken
randomly from a larger population.
10. List out the characteristics of good sample design
1) True Representative
2) Free from Bias
3) Accurate
4) Comprehensive
5) Approachable
6) Good Size
7) Feasible
8) Goal Orientation
9) Practical
10) Economical
11. Advantages of Sampling
1) Saves Time, Money and Effort
2) More Effective
3) Faster and Cheaper
4) More Accurate
5) Gives More Comprehensive Information
12. Disadvantages of Sampling
1) Biased Selection
2) Difficulty in Selection
3) Specialised Knowledge Needed
4) Problem of Cooperation
5) Less Accuracy
6) Limited Nature
13. Sampling Methods
Sample designs are basically of two types, viz., probability sampling and
non-probability sampling.
14. Simple Random Sampling
This is the most common and simplest method of sampling, in which each unit of the
population has an equal probability of being included in the sample. If the size of the
population is 'N', from which n units are to be selected at random for a sample,
simple random sampling requires that each of the NCn possible samples has an equal
probability of being selected.
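A minimal sketch, assuming the population can be listed in memory; random.sample draws without replacement, so every possible sample of size n is equally likely:

```python
import random

random.seed(42)                    # fixed seed only to make the illustration reproducible
population = list(range(1, 101))   # hypothetical universe of N = 100 numbered units

sample = random.sample(population, k=10)   # each of the NCn subsets is equally probable
print(sample)
```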
15. Systematic Sampling
After one unit is selected at random from the universe, the other units are
selected systematically at a specified interval. This method is applicable when the
size of the population is finite and the units of the universe are arranged on the basis
of some system, such as alphabetical, numerical, or geographical arrangement.
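A sketch of the same idea in code, assuming the units are already arranged in some systematic order (every k-th unit is taken after a random start):

```python
import random

population = list(range(1, 101))   # units arranged numerically
n = 10
k = len(population) // n           # sampling interval k = N / n = 10

start = random.randrange(k)        # one unit selected at random from the first interval
sample = population[start::k]      # remaining units selected at the specified interval
print(sample)
```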
16. Stratified Random Sampling
In stratified random sampling, the sample is selected from the different
homogeneous strata, or parts, into which a heterogeneous universe is divided.
17. Cluster Sampling
According to this method, the universe is further sub-divided into noticeable
clusters. Simple random sampling is then performed to draw clusters, and the sample
is constituted by all the units belonging to the selected clusters.
For example, if we have to conduct a survey in the city of Mumbai, then the city may
be divided into, say, 40 blocks and out of these 40 blocks, 5 blocks can be picked up
by random sampling and the people in these five blocks are interviewed to give their
opinion on a particular issue. The clusters chosen should be small in size, i.e., more or
less the same number of sample units should be there in each cluster. This method is
used in the collection of data about some common traits of the population.
18. Multi-Stage Sampling
Multi-stage sampling is a modification of cluster sampling. In cluster sampling, the
sample is constituted by all the units of the selected clusters, whereas in multi-stage
sampling the selection of the sample units proceeds in two, three, or four stages.
First, the universe is divided into first-stage sample units, from which a sample is
selected; each selected first-stage unit is then sub-divided into second-stage units,
from which a further sample is drawn, and so on. For example, in an urban survey the
first stage may be the selection of towns, the second stage the selection of wards
within each selected town, and later stages the selection of households and individuals.
19. Area Sampling
Area sampling is a form of multi-stage sampling in which maps, instead of
lists or registers, are used as the 'sampling frame'. It is commonly used in countries
which do not have a proper sampling frame, such as a population list. When the
primary sampling units are clusters of units based on geographical area, the cluster
designs are known as area sampling; for geographic sub-divisions, 'cluster sampling'
is thus another name for 'area sampling'. The positive and negative features of
cluster sampling are also applicable to area sampling.
20. Non-Probability Sampling Methods
Non-probability sampling is the type of sampling procedure that provides no
basis for estimating the probability that each item in the population will be included
in the sample. Non-probability sampling goes by different names, such as deliberate
sampling, purposive sampling and judgement sampling. In this type of sampling, the
researcher deliberately selects items for the sample, and the researcher's judgement
about the items is given greater weight. In other words, under non-probability
sampling the organiser of the inquiry purposively chooses specific units of the
universe to constitute a sample, on the basis that the small portion selected, out of a
huge one, is typical or representative of the whole universe. The various
non-probability sampling designs are:
1) Convenience Sampling
2) Purposive Sampling
3) Panel Sampling
4) Snowball Sampling

21. Convenience Sampling
The choice of sampling units on the basis of the researcher's convenience and
their approachability is known as 'convenience sampling'. Samples that are selected
accidentally are called 'accidental samples'. Because of the selection procedure (units
are selected at their actual place), it is also called a 'sample of the man in the street'.
Sample units are selected because of their accessibility. For example, the potential of
a new product may be tested by placing it in nearby suitable shops and observing the
purchase and sales reports of the product.
22. Purposive Sampling
A non-probability sample which follows certain norms is called purposive
sampling. Purposive sampling is basically of two types
1) Judgement Sampling
The study which is based on the parameters of the population, where the units
are selected by a researcher or some other expert on his/her judgement, is called
'judgement sampling'. This sampling technique is appropriate where the population
under study is difficult to locate, or where some members are comparatively better
suited than others for an interview in terms of knowledge or interest.
2) Quota Sampling
Quota sampling is one of the most commonly used non-probability sample
designs and is used most extensively in consumer surveys. This method also uses
the principle of stratification: as in stratified random sampling, the researcher
begins by building strata. The common bases for stratification in consumer
surveys are demographic, e.g., age, gender, income and so on. Compound
stratification is generally used, e.g., gender-wise age groups.

23. Panel Sampling
In panel sampling, a group of participants is selected initially by a random
sampling method, and the same group is asked for the same information repeatedly
over a period of time. The sample is semi-permanent, with members included
repeatedly for iterative studies. This method makes it easy to select and contact
samples that fit the study and to obtain a high response rate, even by mail.
24. Snowball Sampling
Snowball sampling is a special non-probability method used when the desired
sample characteristic is rare, making respondents difficult or very costly to locate.
The sample is generated by relying on referrals from initial subjects to recruit
additional subjects. Though this technique is biased and cannot guarantee a good
cross-section of the population, it dramatically reduces search costs.
25. Factors Influencing Sample Size
The following points should be taken care of while deciding the sample size:
1) Size of the Universe: It has been observed and statistically proven that the
sample size should be large enough to represent the whole target population. If
the universe is large and heterogeneous, then the researcher should take a large
sample size, and vice versa.
2) Availability of Resources: The researcher needs a lot of resources to complete
the task of research. These resources can be money, time, experts or any other
variable. If the resources are easily available, then the sample size can be large;
otherwise a smaller sample would be appropriate.

The factors influencing sample size are:
1) Size of the Universe
2) Availability of Resources
3) Level of Accuracy Required
4) Homogeneity or Heterogeneity of the Universe
5) Nature of Study
6) Selection of Sampling Technique
7) Attitude of Respondents
8) Degree of Variability

26. Estimation
When a researcher makes inferences about a population, this process is
known as estimation. The inferences are drawn on the basis of information
obtained from the sample. A statistic is any measurable quantity that is calculated
from a data sample (e.g., the average). For a given variable, a statistic is itself a
stochastic variable: in general, it varies from sample to sample.
27. Estimator and Estimate
A sample statistic is used when one makes an estimate of a population
parameter. This sample statistic is the 'estimator'; the particular value the estimator
takes for a given sample is the 'estimate'.
28. Characteristics of Estimation
The specific value of the sample statistic used to estimate a population
parameter is known as a 'point estimate'. Point estimation manages the task of
selecting a particular sample value as an estimate for a population parameter. The
population parameter of interest might be the mean, variance, standard deviation,
proportion or any other characteristic of the population. A random sample collected
to estimate the value of an unknown population parameter typically comprises 'n'
observations of the variable of interest, and the estimator of the population parameter
is a function of these sample observations.
29. Point Estimation
Point estimation uses a single specific value of a sample statistic as the estimate of a
population parameter.
30. Interval Estimation
The fixed interval of scores within which the population's mean or some other
parameter is expected to fall, when that parameter is estimated from the given sample
data, is known as 'interval estimation'.
31. Important terms related to probability distribution
32. Types of Probability Distribution
The various types of probability distributions are:
1) Discrete Probability Distributions:
i) Binomial Distribution
ii) Poisson Distribution
2) Continuous Probability Distributions:
i) Uniform Probability Distribution
ii) Exponential Probability Distribution
iii) Normal Probability Distribution
iv) Student's t Distribution
v) Chi-Square Distribution
vi) F Distribution
33. Discrete Probability Distribution
The probability distribution of a random variable X is called a discrete
probability distribution when X can take only a finite or countable set of values, i.e.,
when the random variable is discrete. The following examples illustrate discrete
probability distributions:
i) A car can have only 0, 1, 2, 3 or 4 flat tyres.
ii) A bookshop has only 0, 1, 2, 3, 4 or 5 copies of a particular title in stock.
iii) The number of employees absent on a given day is 0, 1, 2, 3, etc.
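Taking the flat-tyre example, if each of the 4 tyres were flat independently with some probability p (p = 0.1 is an invented figure), the distribution of X would be binomial, and its probabilities can be computed from first principles:

```python
from math import comb

n, p = 4, 0.1   # hypothetical: 4 tyres, each flat with probability 0.1

# Binomial: P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)
for x in range(n + 1):
    prob = comb(n, x) * p**x * (1 - p)**(n - x)
    print(f"P(X = {x}) = {prob:.4f}")   # the five probabilities sum to 1
```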
34. Continuous Probability Distribution
The probability distribution of a random variable is called a continuous
probability distribution if the given random variable is continuous. Continuous
variables contain a very large (in fact uncountable) number of outcomes, whereas
discrete variables contain a finite number of outcome values. For continuous random
variables, the probability of a single outcome is close to zero, since the total
probability of 1 is divided among a very large number of outcomes; calculating the
probability of a single value is therefore not meaningful, and probabilities are instead
assigned to intervals of values. For example, the number of cars sold in a day is
discrete, whereas a customer's height or the time between arrivals is continuous.
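A small sketch of the interval idea for a continuous variable, using the normal distribution from the standard library (the mean and standard deviation are invented):

```python
from statistics import NormalDist

# Hypothetical: customer heights ~ Normal(mean = 170 cm, sd = 10 cm)
heights = NormalDist(mu=170, sigma=10)

# P(X = 170 exactly) is effectively zero for a continuous variable;
# probability is instead assigned to an interval of values:
p = heights.cdf(180) - heights.cdf(160)   # P(160 <= X <= 180)
print(f"P(160 <= X <= 180) = {p:.4f}")    # about 0.6827 (within one sd of the mean)
```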
35. Analysis of descriptive analytics
Step 1) State the Business Metrics
Step 2) Identify the Data Required
Step 3) Extract and Prepare the Data
Step 4) Analyse the Data
Step 5) Present the Data

Unit – 4
Two Marks
1. Predictive Analytics
Predictive analytics is an area of data mining that deals with extracting
information from data and using it to predict trends and behaviour patterns. Often the
unknown event of interest is in the future, but predictive analytics can be applied
to any type of unknown, whether it be in the past, present or future; for example,
identifying suspects after a crime has been committed, or identifying credit card
fraud as it occurs.
2. Procedure involved in predictive Analytics
The basic steps involved in the predictive analytics process are:
1) Define Project
2) Data Collection
3) Data Analysis
4) Statistics
5) Modelling
6) Deployment

3. Advantages of Predictive Analytics


1) Detecting Fraud
2) Optimising Marketing Campaigns
3) Improving Operations
4) Reducing Risk
4. Application of Predictive Analytics
1) Banking and Financial Services
2) Oil, Gas and Utilities
3) Health Insurance
4) Retail
5) Governments and the Public Sector
6) Manufacturing
5. Principles of Predictive Models
1) Definition and Support Principles
i) Principle of Similarity
ii) Principle of Extensibility
iii) Principle of Robustness
iv) Principle of Fault Tolerance
v) Principle of Ease of Control
vi) Principle of Completeness
2) Interference Principles
i) Lack of Knowledge Principle
ii) Lack of Concern Principle
iii) Lack of Definition Principle
iv) Lack of Engineering Principle
v) Lack of Responsibility Principle
6. Types of Predictive Models
Predictive modelling means developing models that can be used to forecast or
predict future events. In business analytics, models can be developed based on logic
or data.

Types of Predictive Models:
1) Logic-Driven Predictive Models
2) Data-Driven Predictive Models

7. Single-Period Purchase Decisions
Single-period purchase decisions are one-time purchase decisions that often have to
be made in the face of uncertain demand. Many situations arise in which companies
have to make such a one-time decision.
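The standard textbook formulation of such a one-time decision is the newsvendor model; the sketch below uses invented cost figures and an assumed normal demand, and applies the classic critical-ratio rule (which these notes do not state explicitly):

```python
from statistics import NormalDist

# Hypothetical single-period purchase: buy at 40, sell at 70, salvage unsold stock at 21
cost, price, salvage = 40, 70, 21
cu = price - cost      # underage cost: profit lost per unit of unmet demand
co = cost - salvage    # overage cost: loss per unit left over

# Newsvendor rule: choose Q so that P(Demand <= Q) = Cu / (Cu + Co)
critical_ratio = cu / (cu + co)

demand = NormalDist(mu=1000, sigma=150)   # assumed demand distribution
q = demand.inv_cdf(critical_ratio)
print(f"Critical ratio = {critical_ratio:.3f}; order about {q:.0f} units")
```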
8. Multiple Time Period Models
Multiple time period models accommodate multiple, and even infinitely many, periods.
Under these models, several issues have to be assessed:
1) How to define assets in a multi-period model,
2) How to model inter-temporal preferences,
3) What market completeness means in this environment,
4) How the infinite horizon may affect the sensible definition of a budget constraint, and
5) How the infinite horizon may affect pricing.
9. Overbooking Decisions
Overbooking occurs when a firm with constrained capacity sells more units of
inventory than it has available. Overbooking is applicable in industries with the
following characteristics:
1) Capacity (or supply) is constrained and perishable, and bookings are accepted for
future use.
2) Customers are allowed to cancel or no-show.
3) The cost of denying service to a customer with a booking is relatively low.
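A minimal sketch of how an overbooking level might be chosen, assuming each booked customer shows up independently with probability p (all figures invented):

```python
from math import comb

def expected_cost(bookings, capacity, p_show, bump_cost, empty_cost):
    """Expected cost of accepting `bookings` reservations for `capacity` units,
    when each booking shows up independently with probability p_show."""
    total = 0.0
    for shows in range(bookings + 1):
        prob = comb(bookings, shows) * p_show**shows * (1 - p_show)**(bookings - shows)
        if shows > capacity:
            total += prob * bump_cost * (shows - capacity)    # denied-service cost
        else:
            total += prob * empty_cost * (capacity - shows)   # unused-capacity cost
    return total

# Hypothetical: 100 seats, 90% show rate, bumping costs 300, an empty seat costs 100
best = min(range(100, 121), key=lambda b: expected_cost(b, 100, 0.9, 300, 100))
print("Accept about", best, "bookings")
```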
10. Retail Pricing Markdowns Model
Most department stores and fashion retailers clear their seasonal inventory by
reducing prices. The key question they face is: what prices should they set, and when
should they set them, to meet inventory goals and maximise revenue? For example,
suppose that a store has 1,000 summer tees of a certain style that go on sale April 1
and wants to sell all of them by the end of June.
11. Modelling Relationships and Trends in Data
A chain of department stores is introducing a new brand of bathing suit for Rs.70.
The prime selling season is 50 days during the late spring and early summer; after
that, the store has a clearance sale around July 4 and marks down the price by 70% (to
Rs.21.00), typically selling any remaining inventory at the clearance price.
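One simple way to model such a relationship is to assume, purely for illustration, that daily sales fall linearly with price; the coefficients and clearance length below are invented, not fitted from any data in these notes:

```python
# Hypothetical linear demand model: daily_sales = a - b * price
a, b = 60.0, 0.7

def daily_sales(price):
    return max(0.0, a - b * price)

# Compare revenue over the prime season vs. the July clearance markdown
full_price, markdown_price = 70.0, 21.0
season_days, clearance_days = 50, 14   # clearance length is an assumption

revenue = (daily_sales(full_price) * full_price * season_days
           + daily_sales(markdown_price) * markdown_price * clearance_days)
print(f"Projected revenue: Rs.{revenue:,.0f}")
```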
12. Data Mining
In technical terms, the process of identifying trends and correlations among the
numerous fields of a huge relational database is known as data mining.
13. Need for Data Mining
1) Operational
2) Decisional
3) Informational
4) Specific Applications
14. Components of Data Mining
The various components of a typical data mining system are:
1) Database or Data Warehouse Server
2) Knowledge Base
3) Data Mining Engine
4) Pattern Evaluation
5) Graphical User Interface
15. Data Mining methodologies
Data mining is an ideal Predictive analytics tool used in the BA process. Some of the same
tools used in the descriptive analytics step are used in the predictive step but are employed to
establish a model (either based on logical connections or quantitative formulas) that may be
useful in predicting the future. Several computer-based methodologies are explained below:
1) Summarisation
2) Association
3) Classification
4) Clustering
5) Trend Analysis
16. Data Mining Techniques
The various methods of data mining are explained as below:
1) Cluster Analysis
2) Neural Networks
3) Data Visualisation
4) Induction
5) Online Analytical Processing
17. Induction
A database can be seen as a store of huge amounts of data, but its most vital part is
the information that can be retrieved from it. There are primarily two methods of
inference, which are discussed below:
i) The method of concluding information that is a logical outcome of the
information stored in the database is known as deduction. For example, a join
operator applied to two relational tables, where the first table relates employees to
departments and the second relates departments to managers, deduces a
relationship between managers and employees.
ii) The method of inferring generalised information (rules or patterns) from the data
stored in the database is known as induction; this is the method on which data
mining primarily relies.
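The join described in point (i) can be sketched with pandas (table and column names are hypothetical):

```python
import pandas as pd

# First table: employees and their departments
employees = pd.DataFrame({
    "employee":   ["Anu", "Bala", "Chitra"],
    "department": ["Sales", "HR", "Sales"],
})

# Second table: departments and their managers
managers = pd.DataFrame({
    "department": ["Sales", "HR"],
    "manager":    ["Devi", "Esha"],
})

# The join deduces a manager-employee relationship that is a logical
# consequence of the information already stored
print(employees.merge(managers, on="department"))
```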
18. Neural Networks
A neural network can be seen as one of the methods of computing. It mainly deals
with the development of mathematical structures that have the capability to learn.
These methods grew out of academic investigations into modelling how the nervous
system learns. Neural networks accomplish remarkably well the task of drawing
relevant and meaningful information from incomplete or complicated data.
19. Data Visualisation
Through data visualisation, the analyst can obtain a more informative and intuitive
understanding of the data, and it can be applied alongside data mining. Data mining
lets the analyst concentrate on certain trends and patterns, while visualisation supports
detailed exploration of the data. A database may hold a volume of data large enough
to overwhelm visualisation on its own, but combined with data mining, effective data
exploration can still be supported.
20. Data Mining Process
1) Data collection
2) Feature Extraction and Data Cleaning
3) Analytical Processing and Algorithms
21. Advantages of Data Mining
1) Automated Forecasting of Trends and Behaviours
2) Automated Determination of Earlier Unknown Trends
3) Extensive Depth and Breadth of Database
22. Disadvantages of Data Mining
1) Privacy
2) Security
3) Misuse of Information/Inaccurate Information
23. Applications of Data Mining
1) Retail/Marketing
2) Banking
3) Insurance and Health Care
4) Transportation
5) Medicine
24. Challenges of Data Mining
1) Security and Social Challenges
2) User Interface
3) Mining Methodology Challenges
4) Complex Data
5) Performance
Unit – 5
Prescriptive Analytics
1. Prescriptive Analytics
Prescriptive analytics focuses on achieving the best possible outcome by going beyond
the forecast to actually determine the optimal decision to make. For example, it would
help an organisation answer how best to allocate capital, people, and facilities to achieve
business results such as reduced time or cost, or increased return on investment.
Prescriptive analytics leverages optimisation methods to obtain its prescriptive results.
2. Prescriptive Analytics Techniques
1) Simulation Optimisation: Simulation optimisation combines the use of probability and
statistics to model uncertainty with optimisation techniques to find good decisions in
highly complex and highly uncertain settings.
2) Decision Analysis: The techniques of decision analysis can be used to develop an
optimal strategy when a decision-maker is faced with several decision alternatives and
an uncertain set of future events. Decision analysis also employs utility theory, which
assigns values to outcomes based on the decision-maker's attitude toward risk, loss, and
other factors.
3. Applications of Prescriptive Analytics
1) Banking, Financial Services and Insurance (BFSI)
2) Healthcare
3) Online Learning
4) Transportation and Travel
5) Supply Chain and Logistics
6) Manufacturing
7) Marketing and Sales
4. Benefits of Prescriptive Analytics
Following are the benefits of prescriptive analytics:
1) More Proactive
2) Capturing Multiple Data Touchpoints and Formats
3) Real-Time Insights
4) Finding the Right Trade-off
5) Maximum Use of Resources
6) Gross Margin Management
7) Enhanced Market Competition Analysis
8) Removing Bottlenecks

5. Challenges with Prescriptive Analytics
1) Difficult to Define a Fitness Function
2) Human Bias in Models
3) Complex Constraints
6. Types of Prescriptive Modeling
The listing of prescriptive analytic methods and models below is but a small grouping
of many operations research, decision science, and management science methodologies that
are applied in this step of the BA process.

The main types are:
1) Linear Programming
2) Integer Programming
3) Non-Linear Optimisation
4) Decision Analysis
5) Case Studies
6) Simulation
7) Other Methodologies
7. Linear Programming
Linear programming is a general-purpose modelling methodology applied to
multi-constrained, multivariable problems when an optimal solution is sought. It is
ideal for complex and large-scale problems in which limited resources are being
allocated to multiple uses. Examples include allocating advertising budgets to
differing media, allocating human and technology resources to product production,
and optimising ingredient blends to minimise the costs of food products.
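A minimal sketch of the advertising-budget example using scipy's linear programming solver (all figures invented): maximise the audience reached subject to a budget limit and a cap on TV spots.

```python
from scipy.optimize import linprog

# Decision variables: x1 = TV spots, x2 = online ads (hypothetical media)
# Audience per unit: TV 50,000, online 20,000; linprog minimises, so negate
c = [-50_000, -20_000]

# Budget constraint: 5,000*x1 + 1,000*x2 <= 100,000
A_ub = [[5_000, 1_000]]
b_ub = [100_000]
bounds = [(0, 15), (0, None)]   # at most 15 TV spots, online ads non-negative

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print("TV spots:", res.x[0], "| online ads:", res.x[1], "| audience:", -res.fun)
```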
8. Integer Programming
This is the same as LP, but it restricts some or all decision variables to integer values.
Examples include allocating stocks to portfolios, allocating personnel to jobs, and
allocating types of crops to farm lands.
9. Non-Linear Optimisation
A large class of methodologies and algorithms is used to analyse and solve for
optimal or near-optimal solutions when the behaviour of the data is non-linear.
Examples include solving for optimised allocations of human, technology, and
systems resources whose data appears to form a cost or profit function that is
quadratic, cubic, or otherwise non-linear.
10. Decision Analysis
A set of methodologies, models, or principles used to analyse and guide
decision-making when multiple choices face the decision-maker in differing decision
environments (e.g., certainty, risk, and uncertainty). Examples include selecting one
from a set of computer systems, trucks, or site locations for a service facility.
11. Case Studies
A learning aid provides practical experience by offering real or hypothetical
case studies of real-world applications of BA.
For example, case studies can simulate the issues and challenges of an actual problem
setting. This kind of simulation can prepare decision-makers to anticipate and plan for
what has been predicted to occur in the predictive analytics step of the BA process.
For example, a case study discussion on how to cope with organisational growth might
provide a useful decision-making environment for a firm whose analytics have
predicted growth in the near future.
12. Simulation
This methodology can be used in prescriptive analysis in situations where parameters
are probabilistic, non-linear, or just too complex to use with other optimisation
models that require deterministic or linear behaviour.
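A minimal Monte Carlo sketch of this kind of setting, with probabilistic demand and a non-linear cost term (all parameters invented):

```python
import random

random.seed(7)   # fixed seed only to make the illustration reproducible

def one_season_profit(stocked=950):
    # Demand is probabilistic and the cost curve is non-linear, so instead
    # of a closed-form optimisation we simulate one selling season.
    demand = max(0.0, random.gauss(1000, 200))
    sold = min(demand, stocked)
    revenue = 70 * sold
    cost = 40 * stocked + 0.01 * stocked**1.5   # non-linear handling cost
    return revenue - cost

profits = [one_season_profit() for _ in range(10_000)]
print("Expected profit ~", round(sum(profits) / len(profits)))
```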
