
Chapter 1 Exam Review
MDM4U
Jensen
 
 
Section 1.2 – Displaying Categorical Data

1) A researcher asked 150 high school students what their favourite fast food restaurant was. The results are in the table below:

Restaurant     Number of Students   Relative Frequency
McDonald’s     22
Wendy’s        38
Subway         22
Harvey’s       11
Pizza Pizza    29
A&W            6
KFC            9
Other          13
 
a) What type of variable is ‘favourite fast food restaurant’?


b) Would it be more appropriate to make a histogram or a bar graph to display this data?


c) Complete the relative frequency column.


d) Display the relative frequencies using a bar graph or histogram.
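
A quick way to check the relative frequency column in part c) is sketched below. The counts are copied from the table; the variable names are illustrative only, and each relative frequency is taken to be the count divided by the total number of students.

```python
# Counts copied from the Question 1 table.
counts = {
    "McDonald's": 22,
    "Wendy's": 38,
    "Subway": 22,
    "Harvey's": 11,
    "Pizza Pizza": 29,
    "A&W": 6,
    "KFC": 9,
    "Other": 13,
}

# The total should match the 150 students surveyed.
total = sum(counts.values())

# Relative frequency = count / total.
relative = {name: count / total for name, count in counts.items()}

# Print each restaurant with its count and relative frequency as a percent.
for name, share in relative.items():
    print(f"{name:<12} {counts[name]:>3} {share:6.1%}")
```

The relative frequencies should add up to 1 (100%), which is a useful check before drawing the graph in part d).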
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2) A student is interested in whether there is a relationship between gender and major at her college. She randomly sampled some men and women on campus and asked them if their major was part of the natural sciences (NS), social sciences (SS), or humanities (H). Her results appear in the table below.

                        Major
                 NS    SS    H     Total
Gender   Women   15    22    18
         Men     13    8     4
         Total

 
a) Complete the totals.

b) Determine if major depends on gender by calculating the conditional distribution of major based on gender (row percentages).

c) Use your conditional distribution to describe the relationship between the variables.
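
The totals and row percentages for Question 2 can be checked with a short sketch like the one below. The nested dictionary layout and the names are assumptions made for illustration; the counts come from the table.

```python
# Two-way table from Question 2: rows are gender, columns are major.
table = {
    "Women": {"NS": 15, "SS": 22, "H": 18},
    "Men": {"NS": 13, "SS": 8, "H": 4},
}
majors = ("NS", "SS", "H")

# Part a: row totals, column totals, and the grand total.
row_totals = {gender: sum(row.values()) for gender, row in table.items()}
col_totals = {m: sum(table[gender][m] for gender in table) for m in majors}
grand_total = sum(row_totals.values())

# Part b: conditional distribution of major given gender (row percentages).
conditional = {
    gender: {m: count / row_totals[gender] for m, count in row.items()}
    for gender, row in table.items()
}

for gender, dist in conditional.items():
    shares = ", ".join(f"{m}: {p:.1%}" for m, p in dist.items())
    print(f"{gender} (n = {row_totals[gender]}): {shares}")
```

If the row percentages differ noticeably between women and men, that suggests major is associated with gender in this sample.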
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Section 1.3 – Displaying Quantitative Data

3) The number of goals by Jaromir Jagr in each of his 21 NHL seasons is recorded below:

27, 32, 34, 32, 32, 72, 47, 35, 44, 42, 52, 31, 36, 31, 54, 30, 25, 19, 16, 24, 17

a) Construct a stem-and-leaf plot to display the data
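
One possible way to check a hand-drawn stem-and-leaf plot is sketched below. It assumes the tens digit is the stem and the ones digit is the leaf; the names are illustrative.

```python
from collections import defaultdict

goals = [27, 32, 34, 32, 32, 72, 47, 35, 44, 42, 52,
         31, 36, 31, 54, 30, 25, 19, 16, 24, 17]

# Group the ones digits (leaves) under their tens digit (stem).
stems = defaultdict(list)
for g in sorted(goals):
    stems[g // 10].append(g % 10)

# Print every stem from smallest to largest, including empty ones,
# so that gaps in the data stay visible.
for stem in range(min(stems), max(stems) + 1):
    leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem} | {leaves}")
```

Keeping the empty stem between the 50s and the 70s makes the unusually high season stand out.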


 
Stem   Leaf    
 
     
     
 
   
 
     
 
     
     
 
     
 
 
b) Determine the percent of seasons in which more than 45 goals were scored.




c) Use the chart below to show the five-number summary for Jagr’s goals. Also compute the IQR. [3]

Max
Min
Q1
Q2
Q3
IQR

d) Determine whether there are any outliers in the data. Show your work, including the upper and lower threshold values.
 
 
 
 
 
 
 
 
 
 
e)  Create  a  boxplot  to  display  the  data.    
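
Parts b) to d) can be checked with the sketch below. It assumes the quartiles are found as the medians of the lower and upper halves of the sorted data, leaving the overall median out of both halves when the number of values is odd, and that the outlier thresholds are Q1 - 1.5(IQR) and Q3 + 1.5(IQR). Calculators and software that use a different quartile convention may give slightly different values.

```python
from statistics import median

goals = [27, 32, 34, 32, 32, 72, 47, 35, 44, 42, 52,
         31, 36, 31, 54, 30, 25, 19, 16, 24, 17]
data = sorted(goals)
n = len(data)

# Part b: percent of seasons with more than 45 goals.
pct_over_45 = 100 * sum(g > 45 for g in data) / n

# Part c: five-number summary.
# Assumption: the overall median is excluded from both halves when n is odd.
half = n // 2
lower, upper = data[:half], data[n - half:]
q1, q2, q3 = median(lower), median(data), median(upper)
iqr = q3 - q1

# Part d: thresholds (fences) for outliers.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [g for g in data if g < low_fence or g > high_fence]

print(f"Min {data[0]}, Q1 {q1}, Q2 {q2}, Q3 {q3}, Max {data[-1]}, IQR {iqr}")
print(f"Thresholds: {low_fence} and {high_fence}; outliers: {outliers}")
print(f"Seasons with more than 45 goals: {pct_over_45:.1f}%")
```

In the boxplot for part e), any outlier is usually drawn as a separate point, with the whisker ending at the largest value that is not an outlier.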
 
 
 
 
 
 
 
 
 
 
 
 
4) The heights of the 2013 Toronto Raptors (in centimeters) are listed below:

201, 183, 191, 211, 201, 201, 203, 213, 206, 206, 183, 208, 198, 198, 211
 
a)  Determine  the  range  of  the  data.  
 
 
 
 
 
b)  Determine  an  appropriate  bin  width  that  will  divide  the  data  into  7  intervals.  
 
 
 
 
 
 
 
c)  Create  a  frequency  table  for  the  data  
 
 
  Height  Interval   Frequency  
     
 
     
     
 
     
     
 
   
 
     
 
 
 
 
 
 
d)  Create  a  histogram  of  the  data  
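
The sketch below shows one way to check parts a) to c). Two choices in it are assumptions, not the only valid answers: the bin width is the range divided by 7 and rounded up to a whole number, and the first interval starts at 180 so that each interval includes its lower boundary but not its upper boundary.

```python
import math

heights = [201, 183, 191, 211, 201, 201, 203, 213,
           206, 206, 183, 208, 198, 198, 211]

# Part a: range = maximum - minimum.
data_range = max(heights) - min(heights)

# Part b: divide the range by the number of intervals and round up.
n_bins = 7
bin_width = math.ceil(data_range / n_bins)

# Part c: frequency table.
# Assumption: intervals are [180, 185), [185, 190), ..., [210, 215).
start = 180
frequencies = [0] * n_bins
for h in heights:
    frequencies[(h - start) // bin_width] += 1

# Text version of the histogram: one '#' per player.
for i, freq in enumerate(frequencies):
    low = start + i * bin_width
    print(f"{low}-{low + bin_width}: {'#' * freq}")
```

A different starting value shifts the interval boundaries and can change the frequencies, so the histogram in part d) should state the intervals used.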
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Section 1.5 – Linear Regression Using Technology
 
5) Two variables have a correlation coefficient of r = 0.9. This indicates

a. a strong positive correlation c. a strong negative correlation


b. a weak positive correlation d. a weak negative correlation

6) If two variables have no correlation, their correlation coefficient would have a value of

a. 1 c. 100
b. 1 d. 0

7) Two variables have a coefficient of determination of 0.64. The correlation coefficient could be

a. 0.64 c. 0.8
b. 0.41 d. 0.36
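
For questions that link the coefficient of determination to the correlation coefficient, it may help to recall that the coefficient of determination is the square of r, so r is the positive or negative square root. A minimal sketch, with illustrative names:

```python
import math

r_squared = 0.64  # coefficient of determination from Question 7

# r can be either square root; the sign matches the direction of the correlation.
r_positive = math.sqrt(r_squared)
r_negative = -math.sqrt(r_squared)

print(f"Possible values of r: {r_positive:.2f} or {r_negative:.2f}")
```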

8) A relationship in which all data values lie on the regression line has a correlation coefficient of

a. 1 c. 1
b. 0 d. +1 or -1

9) The regression line shown would have a correlation coefficient closest to

a. 1 c. 1
b. 0.5 d. 0

10) The residuals for a set of data represent the

a. differences between consecutive x-values


b. vertical differences between data points and the line of best fit
c. data points that lie below the line of best fit
d. data points that do not lie on the line of best fit

11) If a set of data has a very strong correlation, the residual values will be

a. very large c. negative


b. positive d. very small

12) A coefficient of determination, r² = 0.75, indicates that

a. 75% of the data lie on the regression line


b. the slope of the regression line is 0.75
c. 75% of the variance in y can be explained by its approximate linear relationship with 𝑥
d. the data have a strong positive correlation

13) Which of the following is an example of a negative correlation?

a. amount of studying and mark on a test


b. temperature and number of kids at a pool
c. a person’s arm length and leg length
d. number of people and slices of pizza per person

14) A set of data having small residual values means that

a. the correlation coefficient is close to 0


b. there is a positive correlation
c. there is a negative correlation
d. there is a strong correlation
 
 
 
15)  A  positive  residual  value  means  that  the  data  point  lies:  
 
a. Close to the line of best fit
b. Above the line of best fit
c. On the line of best fit
d. Far away from the line of best fit
 

16) What type of linear correlation is represented when the correlation coefficient is -0.7?
 
a. Strong negative
b. Moderate negative
c. Weak negative
d. No correlation
 
 
17)  What  type  of  correlation  is  represented  when  the  correlation  coefficient  is  0.41?  
 
a. Strong positive
b. Moderate positive
c. Weak positive
d. No correlation
 
 
18) This table shows the data for the full-time employees of a small company.

Age (years)                    33   25   19   44   50   54   38   29
Annual Income (in thousands)   33   31   18   52   56   60   44   35
Residuals
 
 
a) Construct a scatterplot using your calculator.

b) Find the equation of the regression line and interpret the slope and y-intercept in context.
 
 
 
 
 
 
 
 
 
 
c) Find and interpret the correlation coefficient, r.
 
 
 
 
 
d) Find the coefficient of determination, r². Interpret it in the context of this data.
 
 
 
 
 
 
e) Calculate the residual values, record them in the table, and analyze them with the help of a residual plot. Is a linear model a good fit?


f) Using the linear regression equation, what would you predict the annual income of a 40-year-old to be?
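
Calculator output for Question 18 can be cross-checked with the least-squares formulas, as in the sketch below. It computes the slope, intercept, r, r², residuals, and the prediction for age 40 directly from the data table. The variable and function names are illustrative, and rounding on a calculator may differ slightly.

```python
import math

ages = [33, 25, 19, 44, 50, 54, 38, 29]
incomes = [33, 31, 18, 52, 56, 60, 44, 35]  # in thousands

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(incomes) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in ages)
syy = sum((y - mean_y) ** 2 for y in incomes)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, incomes))

# Part b: least-squares regression line, income = intercept + slope * age.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Parts c and d: correlation coefficient and coefficient of determination.
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2


def predict(age):
    """Predicted annual income (in thousands) for a given age."""
    return intercept + slope * age


# Part e: residual = observed value - predicted value.
residuals = [y - predict(x) for x, y in zip(ages, incomes)]

# Part f: prediction for a 40-year-old.
predicted_40 = predict(40)

print(f"income = {intercept:.3f} + {slope:.3f} * age")
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
print("residuals:", [round(e, 2) for e in residuals])
print(f"predicted income at age 40: {predicted_40:.2f} thousand")
```

For a least-squares line the residuals sum to zero, so a residual plot with points scattered evenly above and below zero and no curved pattern supports a linear model.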
 
 
 
 
 
 
 
