ECONOMETRICS
Sponsored by grant TÁMOP-4.1.2-08/2/A/KMR-2009-0041
Course material developed by the Department of Economics, Faculty of Social Sciences, Eötvös Loránd University Budapest (ELTE)
Department of Economics, Eötvös Loránd University Budapest
Institute of Economics, Hungarian Academy of Sciences
Balassi Kiadó, Budapest
Authors: Péter Elek, Anikó Bíró
Supervised by: Péter Elek
June 2010
ELTE Faculty of Social Sciences, Department of Economics
Week 7.
Summary of estimation methods and large sample theory
Péter Elek, Anikó Bíró
Regression model
$y_i = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + u_i, \quad i = 1, \dots, n$
Assumptions
1. $E(u_i) = 0$
2. $u_i$, $u_j$ independent for all $i \neq j$
3. $x_i$, $u_j$ independent for all $i$, $j$ (exogeneity)
4. No perfect collinearity
5. $Var(u_i) = \sigma^2$ for all $i$
6. $u_i$ has a normal distribution
1–5.: Gauss–Markov conditions
1–6.: Conditions of classical linear model
Assumptions differently (for large sample theory – stochastic explanatory variables)
1. Population model: $y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$.
2. $\{(x_{1i}, x_{2i}, \dots, x_{ki}, y_i),\ i = 1, \dots, n\}$ is a random independent sample from the model.
3. None of the regressors is constant, and there is no perfect collinearity among the regressors.
4. Exogeneity: $E(u \mid x_1, \dots, x_k) = 0$
5. Homoscedasticity: $Var(u \mid x_1, \dots, x_k) = \sigma^2$
6. $u$ is independent of the regressors and normally distributed.
1–5.: Gauss–Markov conditions
1–6.: Conditions of classical linear model
Multivariate regression model
Estimation: method of moments or OLS (also ML estimation if error term is normal)
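OLS minimizes the sum of squared residuals:
$$Q = \sum_i \left(y_i - \hat\alpha - \hat\beta_1 x_{1i} - \hat\beta_2 x_{2i} - \dots - \hat\beta_k x_{ki}\right)^2 \to \min_{\hat\alpha, \hat\beta_1, \dots, \hat\beta_k}$$
Matrix form:
$$y = X\beta + u$$
$$\hat\beta = (X'X)^{-1} X'y$$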
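As a minimal sketch of the matrix formula (not part of the original course material), the normal equations $(X'X)\hat\beta = X'y$ can be solved with NumPy. The function name `ols` and the toy data are hypothetical; the data are built without an error term, so the true coefficients are recovered up to rounding.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients for y = X b + u; X must already contain a constant column."""
    # Solve the normal equations (X'X) b = X'y without forming an explicit inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical toy data: y is exactly 1 + 2*x1 - 1*x2 (no error term).
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0])
X = np.column_stack([np.ones_like(x1), x1, x2])  # first column is the constant
y = 1.0 + 2.0 * x1 - 1.0 * x2

beta_hat = ols(X, y)  # order: (alpha, beta_1, beta_2)
```

Solving the linear system rather than inverting $X'X$ is a numerical choice made for this sketch; the estimator is the same.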
Simple regression
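$$\hat\beta = \frac{S_{xy}}{S_{xx}}, \qquad \hat\alpha = \bar y - \hat\beta \bar x$$
$$\hat y_i = \hat\alpha + \hat\beta x_i, \qquad \hat u_i = y_i - \hat y_i = y_i - \hat\alpha - \hat\beta x_i$$
where
$$S_{xx} = \sum_i (x_i - \bar x)^2 = \sum_i x_i^2 - n \bar x^2$$
$$S_{xy} = \sum_i (x_i - \bar x)(y_i - \bar y) = \sum_i x_i y_i - n \bar x \bar y$$
$$S_{yy} = \sum_i (y_i - \bar y)^2 = \sum_i y_i^2 - n \bar y^2$$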
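The simple-regression formulas can be traced by hand on a small data set. The sketch below uses only the standard library; the function name `simple_ols` and the five data points are hypothetical and chosen so that the arithmetic is easy to follow ($S_{xx} = 10$, $S_{xy} = 6$).

```python
def simple_ols(x, y):
    """Return (alpha_hat, beta_hat) for the regression y = alpha + beta * x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Computational forms: S_xx = sum x_i^2 - n*x_bar^2, S_xy = sum x_i*y_i - n*x_bar*y_bar.
    s_xx = sum(xi * xi for xi in x) - n * x_bar ** 2
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

# Hypothetical data: x_bar = 3, y_bar = 4.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
alpha_hat, beta_hat = simple_ols(x, y)  # beta_hat = 6/10, alpha_hat = 4 - 0.6*3
```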
Interpretation of multivariate model
Interpretation of coefficients
Partial effect (“ceteris paribus”): effect of a given regressor on the dependent variable, holding
the other regressors fixed
Coefficient of determination: $R^2$, with $RSS = S_{yy}(1 - R^2)$
Small sample properties of estimation
If assumptions 1–4 hold: OLS is unbiased.
If assumptions 1–5 (Gauss–Markov) hold: the estimator is BLUE, and the usual variance formula is correct:
$$Var(\hat\beta_i) = \frac{\sigma^2}{RSS_i}$$
where $RSS_i$ is the residual sum of squares from regressing $x_i$ on the other regressors.
If assumptions 1–6 (classical linear model) hold:
the t- and F-statistics have t- and F-distributions, respectively (for any sample size).
Multivariate regression, t-test
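Two-sided test, e.g. $H_0: \beta_i = 0$, $H_1: \beta_i \neq 0$
One-sided test, e.g. $H_0: \beta_i = 0$, $H_1: \beta_i > 0$
In case of a normal error term (under $H_0$):
$$\frac{\hat\beta_i}{SE(\hat\beta_i)} \sim t_{n-k-1}$$
$$SE(\hat\beta_i)^2 = \widehat{Var}(\hat\beta_i) = \frac{\hat\sigma^2}{RSS_i}, \qquad \hat\sigma^2 = \frac{RSS}{n-k-1}$$
where $RSS_i$ is the residual sum of squares from regressing $x_i$ on the other regressors and $RSS$ is the residual sum of squares of the full regression.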
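The standard error $\hat\sigma^2 / RSS_i$ can be computed from an auxiliary regression of $x_i$ on the other regressors, and it coincides with the corresponding diagonal element of $\hat\sigma^2 (X'X)^{-1}$. The sketch below checks this numerically with NumPy; the simulated data, the seed and the helper name `aux_rss` are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)          # hypothetical seed
n, k = 50, 2
x = rng.normal(size=(n, k))
x[:, 1] += 0.5 * x[:, 0]                # make the two regressors correlated
y = 1.0 + 2.0 * x[:, 0] + rng.normal(size=n)   # true beta_2 is zero

X = np.column_stack([np.ones(n), x])    # constant, x1, x2
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - k - 1)       # RSS / (n - k - 1)

def aux_rss(X, j):
    """RSS_j: residual sum of squares from regressing column j on all other columns."""
    others = np.delete(X, j, axis=1)
    coef = np.linalg.lstsq(others, X[:, j], rcond=None)[0]
    e = X[:, j] - others @ coef
    return e @ e

# Standard errors of the slope coefficients via sigma2_hat / RSS_j ...
se_aux = np.sqrt([sigma2_hat / aux_rss(X, j) for j in (1, 2)])
# ... and via the diagonal of sigma2_hat * (X'X)^{-1}.
se_matrix = np.sqrt(np.diag(sigma2_hat * np.linalg.inv(X.T @ X)))[1:]

t_stats = beta_hat[1:] / se_aux         # compare with t_{n-k-1} critical values
```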
Simple regression
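Under $H_0: \beta = 0$ and $H_0: \alpha = 0$, respectively:
$$\frac{\hat\beta}{\hat\sigma / \sqrt{S_{xx}}} \sim t_{n-2}, \qquad \frac{\hat\alpha}{\hat\sigma \sqrt{1/n + \bar x^2 / S_{xx}}} \sim t_{n-2}$$
with $\hat\sigma^2 = RSS/(n-2)$.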
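A short sketch of the two simple-regression t-statistics, using only the standard library. The function name `simple_t_stats` and the data are hypothetical; for these five points $S_{xx} = 10$, $RSS = 2.4$ and $\hat\sigma^2 = 0.8$, so the intermediate values can be verified by hand.

```python
import math

def simple_t_stats(x, y):
    """Return (t_alpha, t_beta) for H0: alpha = 0 and H0: beta = 0 in y = alpha + beta*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    rss = sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(rss / (n - 2))          # k = 1, so n - k - 1 = n - 2
    se_beta = sigma_hat / math.sqrt(s_xx)
    se_alpha = sigma_hat * math.sqrt(1.0 / n + x_bar ** 2 / s_xx)
    return alpha_hat / se_alpha, beta_hat / se_beta

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
t_alpha, t_beta = simple_t_stats(x, y)  # both are compared with t_{n-2} = t_3 critical values
```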
Multivariate regression, F-test
Testing nested hypotheses
Testing multiple restrictions ($r$ restrictions; U = unrestricted model, R = restricted model):
$$F = \frac{(RRSS - URSS)/r}{URSS/(n-k-1)} = \frac{(R_U^2 - R_R^2)/r}{(1 - R_U^2)/(n-k-1)} \sim F_{r,\,n-k-1}$$
where $URSS = S_{yy}(1 - R_U^2)$ and $RRSS = S_{yy}(1 - R_R^2)$.
Regression cannot be used (all slope coefficients are zero): $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$
$$F = \frac{R^2/k}{(1 - R^2)/(n-k-1)} \sim F_{k,\,n-k-1}$$
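The two ways of writing the F-statistic (through residual sums of squares and through $R^2$ values) give the same number, which the sketch below illustrates. The function names and all numerical inputs are hypothetical.

```python
def f_stat_rss(rrss, urss, n, k, r):
    """F-statistic for r restrictions from restricted and unrestricted RSS."""
    return ((rrss - urss) / r) / (urss / (n - k - 1))

def f_stat_r2(r2_u, r2_r, n, k, r):
    """The same statistic written with the R^2 of the two models."""
    return ((r2_u - r2_r) / r) / ((1.0 - r2_u) / (n - k - 1))

def f_stat_overall(r2, n, k):
    """H0: all k slope coefficients are zero (restricted model has R^2 = 0, r = k)."""
    return f_stat_r2(r2, 0.0, n, k, k)

# Hypothetical numbers: S_yy = 100, R_U^2 = 0.5, R_R^2 = 0.4, n = 104, k = 3, r = 2.
s_yy, r2_u, r2_r, n, k, r = 100.0, 0.5, 0.4, 104, 3, 2
urss = s_yy * (1.0 - r2_u)   # 50
rrss = s_yy * (1.0 - r2_r)   # 60

f_from_rss = f_stat_rss(rrss, urss, n, k, r)
f_from_r2 = f_stat_r2(r2_u, r2_r, n, k, r)
f_overall = f_stat_overall(r2_u, n, k)
```

The resulting values would be compared with critical values of $F_{r,\,n-k-1}$ and $F_{k,\,n-k-1}$, which are not computed here.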
Analysis of variance
Source of variance | Sum of squares | Degrees of freedom | Mean sum of squares | F
Regression | $R^2 S_{yy}$ | $k$ | $R^2 S_{yy}/k = MS_1$ | $F = MS_1/MS_2$
Residual | $(1 - R^2) S_{yy}$ | $n - k - 1$ | $(1 - R^2) S_{yy}/(n - k - 1) = MS_2$ |
Total | $S_{yy}$ | $n - 1$ | |
Large sample properties I: consistency
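If assumptions 1–4 hold: OLS is consistent.
Proof for simple regression:
$$\operatorname{plim} \hat\beta = \operatorname{plim} \frac{S_{xy}}{S_{xx}} = \frac{Cov(x,y)}{Var(x)} = \frac{Cov(x, \alpha + \beta x + u)}{Var(x)} = \beta + \frac{Cov(x,u)}{Var(x)} = \beta$$
The last equality holds because exogeneity implies $Cov(x,u) = 0$.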
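Consistency can be illustrated by simulation: as $n$ grows, $S_{xy}/S_{xx}$ settles near the true slope. The sketch below uses only the standard library; the true parameters, the seed and the uniform (deliberately non-normal) error distribution are assumptions chosen for illustration.

```python
import random

def ols_slope(x, y):
    """Slope estimate S_xy / S_xx."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / s_xx

def simulate_slope(n, alpha=1.0, beta=2.0, seed=0):
    """Draw one sample of size n from y = alpha + beta*x + u and return the OLS slope."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    u = [rng.uniform(-1.0, 1.0) for _ in range(n)]   # exogenous, mean zero, not normal
    y = [alpha + beta * xi + ui for xi, ui in zip(x, u)]
    return ols_slope(x, y)

# Slope estimates for increasing sample sizes; the true slope is 2.
estimates = {n: simulate_slope(n) for n in (100, 10_000, 100_000)}
```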
Large sample properties II: asymptotic normality
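If assumptions 1–5 (Gauss–Markov) hold: the OLS estimator is asymptotically normal:
$$\sqrt{n}\,(\hat\beta_i - \beta_i) \overset{asympt}{\sim} N(0, c_i^2)$$
$$Var(\hat\beta_i) = \frac{\sigma^2}{TSS_i\,(1 - R_i^2)} \approx \frac{\sigma^2}{n\,\sigma_{x_i}^2\,(1 - R_i^2)} = \frac{c_i^2}{n}$$
where $TSS_i$ is the total sum of squares of $x_i$, $\sigma_{x_i}^2$ is the variance of $x_i$, and $R_i^2$ is the coefficient of determination from regressing $x_i$ on the other regressors.
Thus the standard deviation of $\hat\beta_i$ goes to zero at rate $n^{-1/2}$. The usual estimator of $\sigma^2$ is consistent, therefore the usual t-test is asymptotically valid even if assumption 6 (normality) does not hold.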
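The asymptotic validity of the t-test without normality can be checked by a small Monte Carlo experiment: with uniform errors and a true null hypothesis, the share of samples with $|t| > 1.96$ should be close to 5%. Everything in the sketch below (sample size, number of replications, seed, parameter values, function names) is a hypothetical choice, and the statistic is written as $(\hat\beta - \beta_0)/SE(\hat\beta)$ so that a non-zero null value can be tested.

```python
import math
import random

def t_stat_slope(x, y, beta_null):
    """t-statistic (beta_hat - beta_null) / SE(beta_hat) in a simple regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    rss = sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)
    return (beta_hat - beta_null) / math.sqrt(sigma2_hat / s_xx)

def rejection_rate(n=200, reps=2000, beta=0.5, seed=1):
    """Share of replications in which the true H0: beta = 0.5 is rejected at |t| > 1.96."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Uniform errors: homoscedastic and exogenous, but not normally distributed.
        y = [1.0 + beta * xi + rng.uniform(-1.0, 1.0) for xi in x]
        if abs(t_stat_slope(x, y, beta)) > 1.96:
            rejections += 1
    return rejections / reps

rate = rejection_rate()   # expected to lie near the nominal 5% level
```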
Large sample properties III: F-test and others
If assumptions 1-5 hold (assumption 6 (normality) not needed):
F-test is asymptotically valid.
Other large sample tests (only asymptotically valid):
Wald test: $n(RRSS - URSS)/URSS \sim \chi^2_r$
  regression cannot be used: $nR^2/(1 - R^2) \sim \chi^2_k$
Lagrange multiplier (LM) test: $n(RRSS - URSS)/RRSS \sim \chi^2_r$
  regression cannot be used: $nR^2 \sim \chi^2_k$
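The Wald and LM statistics differ only in the denominator, and for the test that the regression cannot be used (restricted model with $RRSS = S_{yy}$) they reduce to the $R^2$ forms given above. The sketch below checks this on hypothetical numbers; the function names are illustrative.

```python
def wald_stat(rrss, urss, n):
    """Wald statistic n * (RRSS - URSS) / URSS."""
    return n * (rrss - urss) / urss

def lm_stat(rrss, urss, n):
    """LM statistic n * (RRSS - URSS) / RRSS."""
    return n * (rrss - urss) / rrss

# Hypothetical numbers. The restricted model contains only a constant, so RRSS = S_yy.
n, s_yy, urss = 100, 60.0, 50.0
rrss = s_yy
r2 = 1.0 - urss / s_yy            # R^2 of the unrestricted model

w = wald_stat(rrss, urss, n)      # 100 * 10 / 50 = 20
lm = lm_stat(rrss, urss, n)       # 100 * 10 / 60

w_from_r2 = n * r2 / (1.0 - r2)   # n R^2 / (1 - R^2)
lm_from_r2 = n * r2               # n R^2
```

Because $URSS \le RRSS$, the LM statistic never exceeds the Wald statistic computed from the same sums of squares.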
Model selection
Adjusted $R^2$:
$$\bar R^2 = 1 - \frac{n-1}{n-k-1}\,(1 - R^2)$$
Nested hypotheses: t- and F-test
Non-nested hypotheses, same dependent variable: adjusted $R^2$, information criteria (AIC, BIC, both based on the log-likelihood)
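A short sketch of the adjusted $R^2$ formula shows how it penalizes an extra regressor that barely raises $R^2$. The function name and the numbers are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k regressors (plus a constant) and n observations."""
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)

# Hypothetical comparison with n = 21 observations:
base = adjusted_r2(0.5, n=21, k=4)      # 1 - (20/16) * 0.5
bigger = adjusted_r2(0.505, n=21, k=5)  # R^2 rises slightly, but one more regressor is used
```

Here the larger model has the higher $R^2$ but the lower adjusted $R^2$, so the adjusted measure would favor the smaller model.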
Omitting relevant variables
If the omitted variable is correlated with the included regressors: biased estimation (endogeneity)
Simple regression
True model: $y = \beta_1 x_1 + \beta_2 x_2 + u$
Estimated model: $y = \gamma_1 x_1 + u$
Sign of the bias of $\hat\gamma_1$:

              | $Corr(x_1,x_2) > 0$ | $Corr(x_1,x_2) < 0$
$\beta_2 > 0$ |          +          |          −
$\beta_2 < 0$ |          −          |          +
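The sign pattern of the bias follows from the large-sample value of the short-regression slope, $\beta_1 + \beta_2\, Cov(x_1,x_2)/Var(x_1)$. The simulation below is a sketch under assumed values: $x_2 = \delta x_1 + \text{noise}$, so that $Cov(x_1,x_2)/Var(x_1) = \delta$; all variables have mean zero, matching the models above, which have no constant. The names `delta` and `short_regression_slope`, the seed and the parameter values are hypothetical.

```python
import random

def slope_no_intercept(x, y):
    """OLS slope in a regression through the origin: sum(x*y) / sum(x^2)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def short_regression_slope(delta, beta2, n=50_000, beta1=2.0, seed=0):
    """Estimate gamma_1 in y = gamma_1*x1 + u when the true model also contains x2."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [delta * a + rng.gauss(0.0, 1.0) for a in x1]   # Corr(x1, x2) has the sign of delta
    y = [beta1 * a + beta2 * b + rng.gauss(0.0, 1.0) for a, b in zip(x1, x2)]
    return slope_no_intercept(x1, y)

# Estimated gamma_1 minus the true beta_1 = 2 for the four sign combinations.
bias = {
    (delta, beta2): short_regression_slope(delta, beta2) - 2.0
    for delta in (0.5, -0.5)
    for beta2 in (1.0, -1.0)
}
```

In each case the bias should be close to $\beta_2 \delta$, that is, about $+0.5$ when the two signs agree and about $-0.5$ when they differ.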
Including irrelevant variables
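True model: $y = \beta_1 x_1 + \beta_2 x_2 + u$
Estimated model: $y = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u$, where in truth $\beta_3 = 0$
Does not affect unbiasedness (no endogeneity).
Variance increases: adding $x_3$ cannot raise $RSS_i$, the residual sum of squares from regressing $x_i$ on the other regressors, and lowers it whenever $x_3$ is correlated with $x_i$ given the other regressors:
$$Var(\hat\beta_i) = \frac{\sigma^2}{RSS_i}$$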
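The variance effect can be seen directly from $RSS_i$: regressing $x_1$ on more variables can only shrink its residual sum of squares, which inflates $\sigma^2 / RSS_1$. The NumPy sketch below uses simulated regressors without a constant, as in the models above; the seed, coefficients, `sigma2` and the helper name `aux_rss` are hypothetical.

```python
import numpy as np

def aux_rss(target, others):
    """Residual sum of squares from regressing `target` on the columns of `others`."""
    coef = np.linalg.lstsq(others, target, rcond=None)[0]
    resid = target - others @ coef
    return float(resid @ resid)

rng = np.random.default_rng(0)        # hypothetical seed
n = 200
x1 = rng.normal(size=n)
x2 = 0.3 * x1 + rng.normal(size=n)
x3 = 0.8 * x1 + rng.normal(size=n)    # irrelevant for y, but correlated with x1

rss1_small = aux_rss(x1, np.column_stack([x2]))       # model with x1, x2
rss1_large = aux_rss(x1, np.column_stack([x2, x3]))   # model with x1, x2, x3

sigma2 = 1.0                          # hypothetical error variance
var_small = sigma2 / rss1_small       # Var(beta_1_hat) without x3
var_large = sigma2 / rss1_large       # Var(beta_1_hat) with the irrelevant x3
```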