ANSWERS With Marks EC402: Econometrics
Instructions to candidates
This paper contains TWO sections. Answer FOUR questions from SECTION A and THREE
questions from SECTION B. Section A carries 40% of the overall mark, while Section B carries
60% of the overall mark.
1. [10 marks] Consider the classic linear regression model with dependent $S \times 1$ vector $y$ and
regressor $S \times k$ matrix $X$, assumed to satisfy:
A1 : $\mathrm{rank}(X) = k$
A2 : $y = X\beta^{true} + \varepsilon^{true}$ with $E\,\varepsilon^{true} = 0$
The regressors $X$ are assumed to be unrelated to the error term $\varepsilon^{true}$ in one of two alternative
ways:
Strongly exogenous, defined as: A3Rmi : $E(\varepsilon^{true} \mid X) = E\,\varepsilon^{true}$
Weakly exogenous, defined as: A3Rsru : $E(\varepsilon_s^{true} x_{sj}) = 0$ for every regressor variable $j$
converge by a suitable LLN to the corresponding true moment expectations; and whether an
appropriate CLT applies to give the asymptotic normality of expressions like
$$\frac{1}{\sqrt{S}} \sum_{s=1}^{S} x_{sj}\,\varepsilon_s^{true}$$
But the statement that Strong Exogeneity is not useful when we have large samples/asymptotics
is FALSE, because the (b) part of the difference above (Same Row vs. Any Rows) is
critical in many contexts, even then: the unit of observation being s is applicable to whichever
type of data set we might have, be it Cross-Section, Time-Series or Panel/Longitudinal. Hence
$$\beta_2 + \beta_3 + \beta_4 = 1 \quad\text{and}\quad (\beta_5 - \beta_2)^4 + (\beta_6 - \beta_7)^2 = 0$$
The researcher argues that given the relatively small sample size, he prefers to avoid using the
Wald testing approach in favour of the Likelihood Ratio approach. He explains that the Wald
approach would require the nonlinear “Delta Method” to implement, and hence be unreliable
with the rather small sample size. Discuss critically his arguments.
ANSWER:
The researcher’s argument is FALSE for at least two reasons:
(a) Even the LR approach to the two conditions exactly as stated would still involve nonlinear
estimation, since the LR relies on the maximized LLF for the Unrestricted as well as the
Restricted models and compares the two. The “R” model would involve the second (apparently)
non-linear condition as well as the first linear condition, and hence would need asymptotic
arguments to derive the estimator and test properties.
(b) The second condition is not really non-linear, but it corresponds to the two *linear*
conditions $\beta_5 = \beta_2$ and $\beta_6 = \beta_7$. Therefore these conditions amount to 3 purely linear conditions
that can be tested in finite samples exactly by defining
$$H_0 : R\beta^{true} = q$$
$$y^{+} \equiv Wy, \qquad X^{+} \equiv WX$$
$$\tilde{y} \equiv Zy, \qquad \tilde{X} \equiv ZX$$
The Frisch-Waugh-Lovell (FWL) theorem proposes to calculate the Ordinary Least Squares
(OLS) coefficients obtained from regressing $y^{+}$ on $X^{+}$, whereas the Generalized Gauss-Markov
(GGM) theorem proposes to calculate the OLS coefficients from regressing $\tilde{y}$ on $\tilde{X}$.
(a) (5 marks)
Explain the form of the $W$ matrix and what the FWL theorem achieves in this case.
What are the properties of the OLS coefficients from regressing $y^{+}$ on $X^{+}$?
ANSWER:
The FWL theorem partitions the regressor matrix into two parts, $X = [X_A \,\vdots\, X_B]$, where
$X_A$ contains $k_A$ regressors and $X_B$ the other $k - k_A = k_B$ regressors. Then the OLS
vector of coefficients can be correspondingly partitioned into the estimated coefficients for
the A part, $\hat\beta_A$, and the estimated coefficients for the B part, $\hat\beta_B$, through the simple formula:
$$\hat\beta_A = (X^{+\prime} X^{+})^{-1} X^{+\prime} y^{+}$$
where
$$y^{+} \equiv Wy, \qquad X^{+} \equiv WX$$
A4 : $VCov(\varepsilon^{true} \mid X) = c^2\,\Omega$
(obtained either through the Similarity representation using eigenvalues and eigenvectors
or through the LU factorization of Gauss). If the data are transformed by premultiplying
by $Z \equiv \Omega^{-1/2\prime}$ to give the tilde quantities,
$$\tilde{y} \equiv Zy, \qquad \tilde{X} \equiv ZX$$
the true error term of the transformed model will have its distribution “rotated” so that
it will satisfy the A4GM assumption:
$$VCov(Z\varepsilon^{true} \mid X) = Z\,\Omega\,Z' = \Omega^{-1/2\prime}\,\Omega\,\Omega^{-1/2} = I$$
Thus, regressing $\tilde{y}$ on $\tilde{X}$ by OLS will give the BLUE since the regular GM theorem applies.
But of course this OLS estimator corresponds to the IGLS estimator since
$$(\tilde{X}'\tilde{X})^{-1}\tilde{X}'\tilde{y} = (X'Z'ZX)^{-1}X'Z'Zy = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y$$
$$y_s = \alpha + \beta x_s + \gamma\,\Phi(x_s) + \varepsilon_s$$
The sample consists of $s = 1, \ldots, S$ observations with the sample size $S$ being very large. The
sigmoid function $\Phi(\cdot)$ is the cumulative distribution function of a standard $N(0,1)$ random
variable, with $\Phi(-\infty) = 0$, $\Phi(0) = 0.5$, and $\Phi(+\infty) = 1$. The error disturbance
$\varepsilon_s$ is believed to be fully independent from the $x_s$ for any observation $s$, and is distributed
independently and identically over $s$ with mean zero and variance $\sigma^2 < \infty$. A researcher, who
is only interested in parameters $\alpha$ and $\beta$, proposes two estimation strategies:
(a) (4 points)
Apply Ordinary Least Squares (OLS) to a regression of ys on a constant and xs .
(b) (6 points)
Ignore the $\Phi(\cdot)$ term and find an instrumental variable $z_s$ for the $x_s$ regressor. Then apply
Instrumental Variable Estimation (IVE) to a regression of $y_s$ on a constant and $x_s$ only,
using $z_s$ and the constant as instruments.
Explain carefully why both proposed strategies would be inappropriate, and which estimation
method you would propose instead.
ANSWER:
The problem with approaches (a) and (b) is that they correspond to the regression model
$$y_s = \alpha + \beta x_s + u_s \quad\text{where}\quad u_s = \gamma\,\Phi(x_s) + \varepsilon_s$$
The regressor $x_s$ is thus endogenous w.r.t. the composite error $u_s$ and so approach (a) [OLS
of $y_s$ on a constant and $x_s$] will be biased and inconsistent.
The IVE approach of (b) would be consistent *provided* an instrumental variable $z_s$ can be found
that is (i) a *valid* instrument in the sense that it is uncorrelated with $u_s$ and (ii) a *relevant*
instrument in the sense that it is highly correlated with the endogenous regressor $x_s$. Even if the
regressor $x_s$ is exogenous w.r.t. the original error term $\varepsilon_s$, it is practically impossible to think of a
variable $z_s$ that is both valid and relevant. This is because the endogeneity is due to the $\Phi(x_s)$
term inside the composite error $u_s$, which is of course correlated with $x_s$: any $z_s$ correlated with
$x_s$ will in general also be correlated with $\Phi(x_s)$, and hence with $u_s$.
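The inconsistency of approach (a) can be checked numerically. The following is only a minimal sketch under assumptions that are not part of the question: the omitted term is taken to enter as $\gamma\,\Phi(x_s)$, the regressor is drawn as $x_s \sim N(0,1)$, the errors are standard normal, and the values $\alpha = 1$, $\beta = 2$, $\gamma = 2$ are purely illustrative. Under these assumptions the OLS slope converges to $\beta + \gamma/(2\sqrt{\pi})$ rather than to $\beta$.

```python
import math
import random
from statistics import NormalDist

def ols_line(x, y):
    """Return (intercept, slope) from OLS of y on a constant and x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def simulate_short_regression(n=20000, alpha=1.0, beta=2.0, gamma=2.0, seed=0):
    """Simulate y = alpha + beta*x + gamma*Phi(x) + eps, then omit Phi(x) in OLS.

    All parameter values and the N(0,1) design are illustrative assumptions.
    """
    rng = random.Random(seed)
    cdf = NormalDist().cdf  # standard normal CDF, i.e. Phi(.)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [alpha + beta * xi + gamma * cdf(xi) + rng.gauss(0.0, 1.0) for xi in x]
    return ols_line(x, y)

alpha_hat, beta_hat = simulate_short_regression()
# Large-sample slope under the assumed design: beta + gamma * E[phi(x)]
slope_limit = 2.0 + 2.0 / (2.0 * math.sqrt(math.pi))
print("OLS intercept:", round(alpha_hat, 3), "(true alpha = 1.0)")
print("OLS slope:    ", round(beta_hat, 3), "(true beta = 2.0)")
print("slope limit:  ", round(slope_limit, 3))
```

Increasing `n` does not move the estimates towards the true values, which is what inconsistency means here.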
$Y_i = y_i + w_i$
$X_{i2} = x_{i2} + u_i$
$X_{i3} = x_{i3} + v_i$
The measurement errors $w$, $u$, and $v$ are fully independent from the true variables
$y$, $x_2$, $x_3$, and $x_4$. Assume that the true regression function is given by:
A2 : $y_i = \beta_1 + \beta_2 x_{i2} + \beta_3 \ln(x_{i3} + \beta_4 x_{i4}) + \varepsilon_i$
The true regressor variables $x_2$, $x_3$, and $x_4$ are fully independent from the original regression
error term $\varepsilon$. Note that the variables $x_3$ and $x_4$ enter non-linearly in the model.
Given the imperfect observations on y, x2 , and x3 , the researcher is forced to work with the
equation:
A2′ : $Y_i = \beta_1 + \beta_2 X_{i2} + \beta_3 \ln(X_{i3} + \beta_4 x_{i4}) + \nu_i$
(a) (3 points)
Derive the properties of the error term $\nu_i$.
ANSWER:
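Starting from A2 and substituting in the additive measurement error expressions, we
obtain
$$\nu_i = \varepsilon_i + w_i - \beta_2 u_i + \beta_3 \ln(x_{i3} + \beta_4 x_{i4}) - \beta_3 \ln(X_{i3} + \beta_4 x_{i4}) = \varepsilon_i + w_i - \beta_2 u_i + \beta_3 \ln\!\left(\frac{x_{i3} + \beta_4 x_{i4}}{X_{i3} + \beta_4 x_{i4}}\right)$$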
(b) (4 points)
Explain whether or not $\nu_i$ has zero expectation and whether or not it is uncorrelated with
the regressor variables.
ANSWER:
$E\,\nu_i$ is not zero: $E\,\varepsilon_i = E\,w_i = E\,u_i = 0$, but
$E\left\{\ln\!\left(\frac{x_{i3} + \beta_4 x_{i4}}{X_{i3} + \beta_4 x_{i4}}\right)\right\} \neq 0$ because of the
nonlinearity. The fact that $X_{i3} = x_{i3} + v_i$ implies that the ln() term will not have zero
expectation.
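The non-zero expectation of the log-ratio term can be illustrated with a short Monte Carlo check. This is only a sketch under assumed values that do not come from the question: the true index $x_{i3} + \beta_4 x_{i4}$ is held fixed at 10, and the measurement error $v_i$ is drawn as a zero-mean uniform on $[-5, 5]$ so that the logarithm stays defined. Because ln is concave, Jensen's inequality suggests the sample mean should be strictly positive even though $v_i$ itself averages to zero.

```python
import math
import random

def mean_log_ratio(true_index=10.0, half_width=5.0, n=100000, seed=1):
    """Average of ln(true_index / (true_index + v)) for zero-mean uniform v.

    true_index stands in for x_i3 + beta_4 * x_i4; both it and the uniform
    distribution of v are hypothetical choices made for illustration only.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        v = rng.uniform(-half_width, half_width)  # E[v] = 0
        total += math.log(true_index / (true_index + v))
    return total / n

log_ratio_mean = mean_log_ratio()
print("Monte Carlo mean of the log-ratio term:", round(log_ratio_mean, 4))
```

The mean stays away from zero as `n` grows, so the bias in $\nu_i$ does not wash out in large samples.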
The regressor variables in A2′ are: $X_{i2}$, which appears linearly; and $X_{i3}$ and $x_{i4}$, which
both appear nonlinearly.
6. [20 marks] For the classic linear regression model estimated with data $(y, X)$ with
$k$ regressors and $S$ sample points, consider the Gauss-Markov (GM) theorem:
GM theorem: Assume the following four conditions hold:
A1 : $\mathrm{rank}(X) = k < S$
A2 : $y = X\beta^{true} + \varepsilon^{true}$ where $E\,\varepsilon^{true} = 0$
A3Rmi : $E(\varepsilon^{true} \mid X) = E\,\varepsilon^{true}$
Consider the Ordinary Least Squares (OLS) estimator $\hat\beta_{ols} = (X'X)^{-1}X'y$ and an arbitrary
Linear Unbiased (LU) estimator $\hat\beta_{lu} = B_X y$, where the $k \times S$ matrix $B_X$ satisfies
$B_X X = I_k$. Then the OLS estimator is BLUE for $\beta^{true}$ in the sense that their respective
variance-covariance matrices satisfy:
$VCov(\hat\beta_{lu} \mid X) - VCov(\hat\beta_{ols} \mid X) \equiv V_{lu} - V_{ols}$ is a positive semi-definite matrix for any LU
estimator.
(a) (6 points)
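Consider the following five statements:
i. $\hat\beta_{2ols} + \hat\beta_{3ols} + \hat\beta_{4ols}$ is BLUE for $\beta_2^{true} + \beta_3^{true} + \beta_4^{true}$
ii. $\hat\beta_{2ols}$ is BLUE for $\beta_2^{true}$
iii. $\hat\beta_{4ols}\,\hat\beta_{5ols}$ is BLUE for $\beta_4^{true}\,\beta_5^{true}$
iv. $\hat\beta_{2lad} + \hat\beta_{3lad} + \hat\beta_{4lad}$ is BLUE for $\beta_2^{true} + \beta_3^{true} + \beta_4^{true}$
v. $\hat\beta_{4lad}\,\hat\beta_{5lad}$ is BLUE for $\beta_4^{true}\,\beta_5^{true}$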
Explain how the GM theorem result about the positive semi-definiteness of the matrix difference
$V_{lu} - V_{ols}$ above can be readily used to prove the first two statements, but not the last
three statements.
Hint: For (i) and (ii), consider how you would calculate $var(\hat\beta_{2ols} + \hat\beta_{3ols} + \hat\beta_{4ols})$ and
$var(\hat\beta_{2ols})$ from $V_{ols}$, and how you would calculate $var(\hat\beta_{2lu} + \hat\beta_{3lu} + \hat\beta_{4lu})$ and $var(\hat\beta_{2lu})$
from $V_{lu}$.
ANSWER:
Consider a nonzero $k$-dimensional vector $c$ and define the random variable $z_{ols} = c'\hat\beta_{ols}$.
Use the same vector to define $z_{lu} = c'\hat\beta_{lu}$. The variances of the two r.v.s will be
respectively:
$$var(z_{ols}) = c'\,VCov(\hat\beta_{ols} \mid X)\,c \qquad var(z_{lu}) = c'\,VCov(\hat\beta_{lu} \mid X)\,c$$
So $var(z_{lu}) = c'\,VCov(\hat\beta_{lu} \mid X)\,c \;\geq\; var(z_{ols}) = c'\,VCov(\hat\beta_{ols} \mid X)\,c$ if and only if
$c'\,VCov(\hat\beta_{lu} \mid X)\,c - c'\,VCov(\hat\beta_{ols} \mid X)\,c \geq 0$, OR if and only if
$c'\,[VCov(\hat\beta_{lu} \mid X) - VCov(\hat\beta_{ols} \mid X)]\,c \geq 0$. Since c is
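CD : $Y_i = \beta_1 K_i^{\beta_2} L_i^{\beta_3}\,\varepsilon_i$
and
CES : $Y_i = \beta_1 \left[\beta_2 K_i^{\beta_4} + \beta_3 L_i^{\beta_4}\right]^{1/\beta_4} + u_i$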
(a) (4 points)
Show how the researcher can transform the CD production function to yield a linear
regression model. Is a similar approach possible for the CES case?
ANSWER:
Taking logs gives
CD : $\ln Y_i = \ln\beta_1 + \beta_2 \ln K_i + \beta_3 \ln L_i + \ln\varepsilon_i$
which is a linear regression model in the logged inputs with an additive logged error term.
There does not exist a transformation that will similarly operate on the CES function to
render it a linear regression, nor does any other method exist to achieve this. Hence the CES
must be analyzed as an intrinsically nonlinear regression model.
(b) (6 points)
How should the researcher estimate the unknown parameters $\beta_1$, $\beta_2$, and $\beta_3$ for the CD
production function, and the $\beta_1$, $\beta_2$, $\beta_3$, and $\beta_4$ parameters for the CES function?
ANSWER:
The logged CD model can be estimated by OLS to give BLUE estimators for $\ln\beta_1$, $\beta_2$, and
$\beta_3$ under GM conditions. The estimators will also be BUE if the logged disturbance $\ln\varepsilon_i$
satisfies A5Gaussian, meaning that the original $\varepsilon_i$ was lognormally distributed. Note
that we cannot get BLUE or BUE for $\beta_1$, only for $\ln\beta_1$.
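The log-linearisation of the CD function can be tried out on simulated data. The sketch below is illustrative only: the parameter values $\beta_1 = 2$, $\beta_2 = 0.3$, $\beta_3 = 0.6$, the uniform input distributions, and the lognormal multiplicative error are all assumptions, and `solve` and `ols` are small hypothetical helpers written here so that the example needs nothing beyond the standard library.

```python
import math
import random

def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

rng = random.Random(42)
beta1, beta2, beta3 = 2.0, 0.3, 0.6  # assumed "true" CD parameters
n_obs = 5000
K = [rng.uniform(1.0, 10.0) for _ in range(n_obs)]
L = [rng.uniform(1.0, 10.0) for _ in range(n_obs)]
eps = [math.exp(rng.gauss(0.0, 0.1)) for _ in range(n_obs)]  # lognormal error
Y = [beta1 * k ** beta2 * l ** beta3 * e for k, l, e in zip(K, L, eps)]

# Regress ln Y on a constant, ln K and ln L
X = [[1.0, math.log(k), math.log(l)] for k, l in zip(K, L)]
coef = ols(X, [math.log(v) for v in Y])
print("estimate of ln(beta1):", round(coef[0], 3), "vs", round(math.log(beta1), 3))
print("estimate of beta2:    ", round(coef[1], 3))
print("estimate of beta3:    ", round(coef[2], 3))
```

The intercept estimates $\ln\beta_1$, not $\beta_1$; exponentiating it gives a consistent but not unbiased estimate of $\beta_1$.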
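Under A1, A2 : CES, A3Rsru, and A4GM iid, Nonlinear Least Squares (NLLS) will
provide CUAN estimates for the $\beta_1$, $\beta_2$, $\beta_3$, and $\beta_4$ parameters for the CES function, by
solving (with observations indexed by $s = 1, \ldots, S$):
$$\min_{\beta_1, \beta_2, \beta_3, \beta_4} \sum_{s=1}^{S} \left\{ Y_s - \beta_1 \left[\beta_2 K_s^{\beta_4} + \beta_3 L_s^{\beta_4}\right]^{1/\beta_4} \right\}^2$$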
(c) (6 points)
$$y_s = x_s'\beta + \varepsilon_s$$
$$y = X\beta + \varepsilon$$
(a) (7 points)
Assuming that the number of members in each household ($n_s$) is known, explain which
would be the Best Linear Unbiased Estimator (BLUE) for the unknown coefficients $\beta$. Are
any further assumptions necessary to guarantee that the estimator you propose is BLUE?
What if you wished to derive the Best Unbiased Estimator (BUE) for $\beta$?
ANSWER:
Given that the true $\Omega$ in the VCov(.) of the error term is known in this case, the Ideal
Generalized Least Squares estimator, given by
$$(X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y$$
will be the BLUE by the Generalized Gauss Markov theorem, where $\Omega = \mathrm{diag}(1/n_1, \ldots, 1/n_s, \ldots, 1/n_S)$.
If we want the BUE, we would need to assume also that A5Gaussian holds for the true
disturbance, since then the IGLS estimator coincides with MLE, and hence it is BUE
because it is unbiased (under A3Rmi or stronger).
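The equivalence between IGLS and OLS on suitably transformed data can be verified numerically. The sketch below assumes a model with a constant and a single regressor, household sizes drawn between 1 and 6, and error variance $1/n_s$; these choices and the helper names are hypothetical. With $\Omega = \mathrm{diag}(1/n_s)$ the weight matrix is $\Omega^{-1} = \mathrm{diag}(n_s)$, so the transformation multiplies each row by $\sqrt{n_s}$.

```python
import math
import random

def solve_2x2(a11, a12, a22, b1, b2):
    """Solve the symmetric system [[a11, a12], [a12, a22]] x = [b1, b2]."""
    det = a11 * a22 - a12 * a12
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

def igls(x, y, n_members):
    """(X' Omega^{-1} X)^{-1} X' Omega^{-1} y with X = [1, x], Omega^{-1} = diag(n_s)."""
    a11 = sum(n_members)
    a12 = sum(n * xi for n, xi in zip(n_members, x))
    a22 = sum(n * xi * xi for n, xi in zip(n_members, x))
    b1 = sum(n * yi for n, yi in zip(n_members, y))
    b2 = sum(n * xi * yi for n, xi, yi in zip(n_members, x, y))
    return solve_2x2(a11, a12, a22, b1, b2)

def ols_on_transformed(x, y, n_members):
    """OLS after premultiplying the constant, x and y by sqrt(n_s)."""
    z = [math.sqrt(n) for n in n_members]
    const_t = z  # transformed constant column
    x_t = [zi * xi for zi, xi in zip(z, x)]
    y_t = [zi * yi for zi, yi in zip(z, y)]
    a11 = sum(c * c for c in const_t)
    a12 = sum(c * xi for c, xi in zip(const_t, x_t))
    a22 = sum(xi * xi for xi in x_t)
    b1 = sum(c * yi for c, yi in zip(const_t, y_t))
    b2 = sum(xi * yi for xi, yi in zip(x_t, y_t))
    return solve_2x2(a11, a12, a22, b1, b2)

rng = random.Random(3)
n_members = [rng.randint(1, 6) for _ in range(500)]  # assumed household sizes
x = [rng.uniform(0.0, 10.0) for _ in range(500)]
# Household-average error with variance 1/n_s (illustrative assumption)
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) / math.sqrt(n) for xi, n in zip(x, n_members)]

gls_coef = igls(x, y, n_members)
wls_coef = ols_on_transformed(x, y, n_members)
print("IGLS:              ", gls_coef)
print("OLS on transformed:", wls_coef)
```

The two coefficient pairs agree up to floating-point error, which is the identity $(\tilde{X}'\tilde{X})^{-1}\tilde{X}'\tilde{y} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y$ in this special case.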
(b) (8 points)
Now suppose that the number of members in each household ($n_s$) is not known. A
colleague suggests that the variance/covariance of the error be modelled instead by:
$$VCov(\varepsilon \mid X) = \mathrm{diag}(\alpha_1 + \alpha_2 x_{12}^2, \ldots, \alpha_1 + \alpha_2 x_{s2}^2, \ldots, \alpha_1 + \alpha_2 x_{S2}^2)$$
In other words, the variance of error $\varepsilon_s$ is postulated to be a linear function of the square
of that household’s income, where $\alpha_1$ and $\alpha_2$ are unknown parameters. Which estimation
(a) (4 points)
The second regressor is the first lag of the dependent variable, i.e., $x_{s2} = y_{s-1}$, and the
symmetric, positive definite matrix $\Omega$ with all elements being real and finite, in fact equals
the identity matrix of order $S$, i.e., $\Omega = I_S$.
ANSWER:
A3 needs to be modified to the Weak Exogeneity assumption A3Rsru since the fact that
the lag of the dependent variable is a regressor means that we can no longer condition on the
full $X$ matrix. In view of A4GM now holding, no complication arises and so OLS will
be CUAN. (It will *not* be unbiased since A3Rmi no longer holds.)
(b) (4 points)
The second regressor is the first lag of the dependent variable, i.e., $x_{s2} = y_{s-1}$, and the
symmetric, positive definite matrix $\Omega$ with all elements being real and finite, has every
off-diagonal element different from zero.
ANSWER:
Now the lagged-y regressor $x_{s2}$ is *endogenous* w.r.t. the autocorrelated disturbance
since $x_{s2}$ depends on the lagged disturbance, which is correlated with today’s disturbance
because of the autocorrelation. So OLS will be *inconsistent* in this case.
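The contrast between cases (a) and (b) can be seen in a small simulation. This is a sketch under simplifying assumptions not stated in the question: the model has no constant, the only regressor is the lagged dependent variable with coefficient `rho`, and the disturbance follows an AR(1) process with coefficient `phi` (so `phi = 0` corresponds to i.i.d. errors). For this design the OLS coefficient converges to $(\rho + \phi)/(1 + \rho\phi)$, which equals $\rho$ only when $\phi = 0$.

```python
import random

def ols_lag_coefficient(rho, phi, n=50000, seed=7):
    """OLS slope from regressing y_t on y_{t-1} (no constant).

    Data follow y_t = rho * y_{t-1} + u_t with u_t = phi * u_{t-1} + e_t,
    where e_t is i.i.d. N(0, 1). Both rho and phi are illustrative values.
    """
    rng = random.Random(seed)
    y_prev, u_prev = 0.0, 0.0
    sxy = 0.0
    sxx = 0.0
    for _ in range(n):
        u = phi * u_prev + rng.gauss(0.0, 1.0)
        y = rho * y_prev + u
        sxy += y_prev * y
        sxx += y_prev * y_prev
        y_prev, u_prev = y, u
    return sxy / sxx

est_iid = ols_lag_coefficient(rho=0.5, phi=0.0)  # case (a): Omega = I
est_ar = ols_lag_coefficient(rho=0.5, phi=0.5)   # case (b): autocorrelated errors
print("i.i.d. errors:        ", round(est_iid, 3), "(true rho = 0.5)")
print("autocorrelated errors:", round(est_ar, 3), "(true rho = 0.5)")
```

With i.i.d. errors the estimate settles near the true coefficient, while with autocorrelated errors it settles near a different value however large the sample.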
(c) (4 points)
The second regressor is a male dummy, i.e., $x_{s2} = 1$ if observation $s$ is male and $0$ otherwise,
while the third regressor is a female dummy, $x_{s3} = 1$ if observation $s$ is female and $0$ otherwise.
ANSWER:
Given this definition, there is a perfect linear relationship between regressor 2 and regressor
3, since they add up to 1 for any $s$. This is not necessarily a problem *unless* the RHS
also contains a *constant* regressor: that would lead to perfect multicollinearity among
the regressors and hence A1 will be violated. Without A1, the OLS estimator cannot be