SMDM Project Report: Submitted By: Kratika Vijayvergiya
Question 1.1 Use methods of descriptive statistics to summarize data. Which
Region and which Channel seem to spend more? Which Region and which
Channel seem to spend less?
Basic EDA
Question 1.2 There are 6 different varieties of items considered. Do all varieties
show similar behavior across Region and Channel?
Question 1.3 On the basis of the descriptive measure of variability, which item
shows the most inconsistent behavior? Which items show the least
inconsistent behavior?
Standard Deviation of the items
● Fresh items have the highest standard deviation (12647.32), so they show the
most inconsistent behaviour.
● Delicatessen items have the smallest standard deviation (2820.10), so they
show the least inconsistent behaviour.
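As an illustration of the comparison above, the following minimal sketch ranks items by sample standard deviation. The spend figures are hypothetical and are not taken from the report's data set; only the item names come from the report.

```python
from statistics import stdev

# Hypothetical annual spend figures, NOT the report's actual data.
spend = {
    "Fresh": [12000, 3000, 25000, 800, 18000],
    "Delicatessen": [1200, 900, 1500, 1100, 1300],
}

# Sample standard deviation (n - 1 denominator) for each item.
spread = {item: stdev(values) for item, values in spend.items()}

# A larger standard deviation means more inconsistent spending.
most_inconsistent = max(spread, key=spread.get)
least_inconsistent = min(spread, key=spread.get)
print(most_inconsistent, least_inconsistent)
```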
A box plot shows the three quartile values of the distribution along with extreme
values. The “whiskers” extend to points that lie within 1.5 IQRs of the lower and
upper quartile, and observations that fall outside this range are displayed
independently.
Here, all the varieties have outliers, which suggests that their distributions
are skewed.
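The 1.5 IQR rule described above can be sketched as follows. The function name `iqr_outliers` and the sample values are assumptions made for illustration, and plotting libraries may compute quartiles with a slightly different convention than the one used here.

```python
from statistics import quantiles

def iqr_outliers(values):
    """Return the values lying beyond 1.5 IQRs of the lower or upper quartile."""
    # quantiles() sorts the data itself; "inclusive" treats it as the full population.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical spend values with one extreme observation.
sample = [10, 12, 13, 15, 16, 18, 20, 95]
print(iqr_outliers(sample))
```

Points returned by the function correspond to the observations a box plot would draw individually beyond the whiskers.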
Question 1.5 On the basis of this report, what are the recommendations?
Insights Summary:
Spend by Region:
● Max: Other
● Min: Oporto
Spend by Channel:
● Max: Hotel
● Min: Retail
The behaviour of the different varieties was also examined and found to be highly
inconsistent.
1. What is the probability that a randomly selected CMSU student will be male?
The probability that a randomly selected CMSU student will be male is 0.47
The probability that a randomly selected CMSU student will be female is 0.53
2.3.1 Find the conditional probability of different majors among the male
students in CMSU.
2.3.2 Find the conditional probability of different majors among the female
students of CMSU.
Similarly, the conditional probability of different majors among the female students of
CMSU is:
● Accounting: 0.091
● CIS: 0.091
● Economics or Finance: 0.212
● International business: 0.121
● Management: 0.121
● Other: 0.091
● Retailing or Market: 0.273
● Undecided: 0
a) Find the probability that a randomly chosen student is a male and intends
to graduate.
P(Male AND Intends to graduate) = P(M ∩ G) = n(M ∩ G) / Total students,
where n(M ∩ G) is the number of male students who intend to graduate.
The probability that a randomly chosen student is a male and intends to graduate is
0.274.
b) Find the probability that a randomly selected student is a female and does
NOT have a laptop.
P(Female AND Does not have a laptop) = P(F ∩ Lᶜ) = n(F ∩ Lᶜ) / Total students,
where Lᶜ is the event that the student does not have a laptop.
The probability that a randomly selected student is a female and does NOT have a
laptop is 0.065.
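a) Find the probability that a randomly chosen student is either a male or has
full-time employment.
By the addition rule, P(M ∪ E) = P(M) + P(E) − P(M ∩ E), where M is the event
that the student is male and E is the event that the student has full-time
employment.
The probability that a randomly chosen student is either a male or has full-time
employment is 0.516.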
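The addition rule for two events, P(A or B) = P(A) + P(B) − P(A and B), can be sketched from raw counts as below. The helper `prob_union` and the counts passed to it are hypothetical and do not reproduce the CMSU survey figures.

```python
def prob_union(n_a, n_b, n_both, n_total):
    """P(A or B) from counts: subtract the overlap so it is not counted twice."""
    return (n_a + n_b - n_both) / n_total

# Hypothetical counts: 100 students, 40 male, 20 employed full-time,
# 10 who are both male and employed full-time.
p_male_or_full_time = prob_union(n_a=40, n_b=20, n_both=10, n_total=100)
print(round(p_male_or_full_time, 3))
```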
Given that a randomly chosen student is female, the conditional probability that
she is majoring in international business or management is 0.2424.
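Because a student has only one major, the two events are mutually exclusive and their conditional probabilities can simply be added. The sketch below uses the rounded values from the female-student list earlier in this report; the variable names are assumptions. The sum of the rounded values is 0.242, which agrees with the reported 0.2424 up to rounding.

```python
# Conditional probabilities of each major among female students,
# copied from the rounded values listed in this report.
p_major_given_female = {
    "Accounting": 0.091,
    "CIS": 0.091,
    "Economics or Finance": 0.212,
    "International business": 0.121,
    "Management": 0.121,
    "Other": 0.091,
    "Retailing or Market": 0.273,
    "Undecided": 0.0,
}

# Mutually exclusive majors: P(IB or Mgmt | F) = P(IB | F) + P(Mgmt | F).
p_ib_or_mgmt = (
    p_major_given_female["International business"]
    + p_major_given_female["Management"]
)
print(round(p_ib_or_mgmt, 3))
```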
To check whether graduate intention and being female are independent events, the
condition to check is P(F ∩ Yes) = P(F)P(Yes). If the two events are independent,
this equality holds.
Conclusion:
Since P(F ∩ Yes) is not equal to P(F)P(Yes), they are not independent
events.
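The independence condition P(A ∩ B) = P(A)P(B) can be checked from counts as in the sketch below. The function `is_independent` and all counts are hypothetical, chosen only to show one case where the equality holds and one where it fails; exact fractions are used so that the comparison is not affected by floating-point rounding.

```python
from fractions import Fraction

def is_independent(n_both, n_a, n_b, n_total):
    """Return True when P(A and B) equals P(A) * P(B) exactly."""
    p_both = Fraction(n_both, n_total)
    p_a = Fraction(n_a, n_total)
    p_b = Fraction(n_b, n_total)
    return p_both == p_a * p_b

# Hypothetical counts: 100 students, 50 female, 40 intending to graduate.
print(is_independent(n_both=20, n_a=50, n_b=40, n_total=100))  # 0.20 == 0.5 * 0.4
print(is_independent(n_both=30, n_a=50, n_b=40, n_total=100))  # 0.30 != 0.5 * 0.4
```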
Question 2.7 Note that there are four numerical (continuous) variables in the
data set, GPA, Salary, Spending and Text Messages. Answer the following
questions based on the data
3.1) Do you think there is evidence that mean moisture contents in both types
of shingles are within the permissible limits? State your conclusions clearly
showing all steps.
The p-value for B shingles is 0.002, which is less than the 5% level of
significance, so we reject the null hypothesis.
Thus, we conclude that the population mean moisture content for B shingles is
greater than 0.35 pound per 100 square feet, which means it is not within the
permissible limits.
Conclusion:
At the 95% confidence level, there is sufficient evidence to conclude that the
mean moisture content for Shingles A is within the permissible limits, but for
Shingles B it is not within the permissible limits.
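A one-sample t-test compares a sample mean with a reference value, here the 0.35 limit. The sketch below computes only the test statistic; the moisture readings are hypothetical, the function name `one_sample_t` is an assumption, and the p-value would still have to be read from a t distribution with n − 1 degrees of freedom, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t = (sample mean - mu0) / (sample standard deviation / sqrt(n))."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    return (mean(sample) - mu0) / standard_error

# Hypothetical moisture readings, NOT the report's shingles data.
moisture = [0.30, 0.28, 0.33, 0.25, 0.31, 0.27]
t_stat = one_sample_t(moisture, mu0=0.35)
print(t_stat)  # negative when the sample mean lies below 0.35
```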
3.2 Do you think that the population means for shingles A and B are
equal? Form the hypothesis and conduct the test of the hypothesis.
What assumption do you need to check before the test for equality of
means is performed?
In testing whether the population means for shingles A and B are equal, the null
hypothesis states that the population means for both shingles are the same: μA
equals μB. The alternate hypothesis states that the population means for both
shingles are different: μA is not equal to μB.
Conclusion:
At the 95% confidence level, there is sufficient evidence to conclude that the
population means for shingles A and B are different.
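The test statistic for comparing two independent sample means can be sketched as follows. This uses Welch's form, which does not assume equal variances; the report does not state which variant was used, so that choice, the function name `welch_t`, and the sample values are all assumptions for illustration.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """t = (mean_a - mean_b) / sqrt(var_a / n_a + var_b / n_b)."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error

# Hypothetical moisture readings, NOT the report's shingles data.
shingles_a = [0.32, 0.30, 0.35, 0.29, 0.33]
shingles_b = [0.27, 0.25, 0.30, 0.26, 0.28]
t_stat_ab = welch_t(shingles_a, shingles_b)
print(t_stat_ab)  # positive when sample A has the larger mean
```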
THE END