Summary of estimation methods and large sample theory


ECONOMETRICS

Sponsored by a Grant TÁMOP-4.1.2-08/2/A/KMR-2009-0041

Course Material Developed by Department of Economics, Faculty of Social Sciences, Eötvös Loránd University Budapest (ELTE)

Department of Economics, Eötvös Loránd University Budapest

Institute of Economics, Hungarian Academy of Sciences

Balassi Kiadó, Budapest


Authors: Péter Elek, Anikó Bíró

Supervised by Péter Elek

June 2010

Week 7


Regression model

yi = α + β1x1i + β2x2i + … + βkxki + ui, i = 1, …, n

Assumptions

1. E(ui) = 0

2. ui, uj independent for all i ≠ j

3. xi, uj independent for all i, j (exogeneity)

4. No perfect collinearity

5. Var(ui) = σ² for all i

6. ui has normal distribution

1–5.: Gauss–Markov conditions

1–6.: Conditions of classical linear model


Assumptions restated (for large sample theory, with stochastic explanatory variables)

1. Population model: y = α + β1x1 + β2x2 +…+ βkxk + u.

2. {(x1i,x2i,…,xki,yi), i = 1,…,n} is an independent random sample from the population model.

3. None of the regressors is constant, no perfect collinearity among the regressors.

4. Exogeneity: E(u|x1,…,xk) = 0

5. Homoscedasticity: Var(u|x1,…,xk) = σ²

6. u independent of the regressors, normally distributed.

1–5.: Gauss–Markov conditions

1–6.: Conditions of classical linear model

Multivariate regression model

Estimation: method of moments or OLS (also ML estimation if error term is normal)

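OLS objective:

$$ \min_{\hat\alpha, \hat\beta_1, \ldots, \hat\beta_k} Q = \sum_{i=1}^{n} \left( y_i - \hat\alpha - \hat\beta_1 x_{1i} - \hat\beta_2 x_{2i} - \ldots - \hat\beta_k x_{ki} \right)^2 $$

Matrix form:

$$ y = X\beta + u $$

$$ \hat\beta = (X'X)^{-1} X'y $$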
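As an illustration of the matrix formula, the following is a minimal sketch using NumPy. It assumes that X is augmented with a column of ones so that the intercept α is estimated together with the slopes; the function name `ols` and the data are hypothetical. The normal equations (X'X)b = X'y are solved directly, which gives the same result as forming the inverse but is numerically preferable.

```python
import numpy as np


def ols(X, y):
    """Return OLS coefficients (intercept first) and residuals.

    X is an (n, k) array of regressors WITHOUT a constant column;
    the column of ones for alpha is added here (an assumption of this sketch).
    """
    n = X.shape[0]
    Xc = np.column_stack([np.ones(n), X])
    # Solve the normal equations (X'X) b = X'y instead of inverting X'X.
    beta_hat = np.linalg.solve(Xc.T @ Xc, Xc.T @ y)
    resid = y - Xc @ beta_hat
    return beta_hat, resid


# Hypothetical noise-free data: y = 1 + 2*x1 - 3*x2 exactly,
# so OLS should recover the coefficients and leave zero residuals.
X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0], [4.0, 5.0]])
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]
beta_hat, resid = ols(X, y)
print(beta_hat)
```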

Simple regression

$$ \hat\beta = \frac{S_{xy}}{S_{xx}}, \qquad \hat\alpha = \bar y - \hat\beta \bar x $$

$$ \hat y_i = \hat\alpha + \hat\beta x_i, \qquad \hat u_i = y_i - \hat y_i $$

$$ S_{xx} = \sum_i (x_i - \bar x)^2 = \sum_i x_i^2 - n \bar x^2 $$

$$ S_{xy} = \sum_i (x_i - \bar x)(y_i - \bar y) = \sum_i x_i y_i - n \bar x \bar y $$

$$ S_{yy} = \sum_i (y_i - \bar y)^2 = \sum_i y_i^2 - n \bar y^2 $$

Interpretation of multivariate model

Interpretation of coefficients

Partial effect (“ceteris paribus”): effect of a given regressor on the dependent variable, holding the other regressors fixed

Coefficient of determination: R²

RSS = Syy(1 – R²)
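The simple regression formulas and the identity RSS = Syy(1 – R²) can be checked numerically. The sketch below uses plain Python and a small hypothetical data set; the helper name `simple_ols` is an assumption of this example, and R² is computed as Sxy²/(Sxx·Syy), which holds in the simple regression case.

```python
def simple_ols(x, y):
    """Simple regression of y on x via the S_xx, S_xy, S_yy formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum(xi * xi for xi in x) - n * x_bar ** 2
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
    s_yy = sum(yi * yi for yi in y) - n * y_bar ** 2
    beta = s_xy / s_xx
    alpha = y_bar - beta * x_bar
    # Residual sum of squares computed directly from the residuals.
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    # In simple regression R^2 equals the squared sample correlation.
    r2 = s_xy ** 2 / (s_xx * s_yy)
    return {"alpha": alpha, "beta": beta, "rss": rss, "r2": r2, "s_yy": s_yy}


# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
fit = simple_ols(x, y)
print(fit)
```

For these data Sxx = 10, Sxy = 6 and Syy = 6, so the slope is 0.6, the intercept is 2.2, R² = 0.6 and RSS = 6 · 0.4 = 2.4.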

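Small sample properties of estimation

If assumptions 1–4 hold: OLS is unbiased.

If assumptions 1–5 (Gauss–Markov) hold: the estimator is BLUE, and the usual variance formula is correct:

$$ \mathrm{Var}(\hat\beta_i) = \frac{\sigma^2}{RSS_i} $$

where RSS_i is the residual sum of squares from regressing x_i on the other regressors.

If assumptions 1–6 (classical linear model) hold: the t- and F-statistics have exact t- and F-distributions, respectively (for any sample size).

Multivariate regression, t-test

Two-sided test, e.g. H0: βi = 0, H1: βi ≠ 0

One-sided test, e.g. H0: βi = 0, H1: βi > 0

In case of normal error term (under H0):

$$ \frac{\hat\beta_i}{SE(\hat\beta_i)} \sim t_{n-k-1} $$

where SE(β̂i) is the square root of the estimated variance, obtained by replacing σ² with its estimate:

$$ \mathrm{Var}(\hat\beta_i) = \frac{\sigma^2}{RSS_i}, \qquad \hat\sigma^2 = \frac{RSS}{n-k-1} $$

Simple regression

$$ \frac{\hat\alpha}{\hat\sigma \sqrt{1/n + \bar x^2 / S_{xx}}} \sim t_{n-2}, \qquad \frac{\hat\beta}{\hat\sigma / \sqrt{S_{xx}}} \sim t_{n-2} $$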
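The t-statistics of the simple regression can be sketched as follows for the null hypotheses α = 0 and β = 0. The function name `simple_regression_t` and the data are hypothetical; comparing the statistics with critical values of the t distribution with n – 2 degrees of freedom is left to the reader.

```python
import math


def simple_regression_t(x, y):
    """Return (t_alpha, t_beta) for H0: alpha = 0 and H0: beta = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = s_xy / s_xx
    alpha = y_bar - beta * x_bar
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    # sigma_hat^2 = RSS / (n - k - 1), with k = 1 regressor.
    sigma_hat = math.sqrt(rss / (n - 2))
    t_alpha = alpha / (sigma_hat * math.sqrt(1.0 / n + x_bar ** 2 / s_xx))
    t_beta = beta / (sigma_hat / math.sqrt(s_xx))
    return t_alpha, t_beta


# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
t_alpha, t_beta = simple_regression_t(x, y)
print(t_alpha, t_beta)
```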

Multivariate regression, F-test

Testing nested hypotheses, testing multiple restrictions (r restrictions; R: restricted model, U: unrestricted model):

$$ F = \frac{(RRSS - URSS)/r}{URSS/(n-k-1)} = \frac{(R_U^2 - R_R^2)/r}{(1 - R_U^2)/(n-k-1)} \sim F_{r,\, n-k-1} $$

where

$$ RRSS = S_{yy}(1 - R_R^2), \qquad URSS = S_{yy}(1 - R_U^2) $$

Overall significance of the regression (H0: "the regression cannot be used"), H0: β1 = β2 = … = βk = 0:

$$ F = \frac{R^2/k}{(1 - R^2)/(n-k-1)} \sim F_{k,\, n-k-1} $$

Analysis of variance

Source of variance   Sum of squares   Degrees of freedom   Mean sum of squares                 F
Regression           R²·Syy           k                    R²·Syy / k = MS1                    F = MS1 / MS2
Residual             (1 – R²)·Syy     n – k – 1            (1 – R²)·Syy / (n – k – 1) = MS2
Total                Syy              n – 1
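The two forms of the F-statistic (in terms of residual sums of squares and in terms of R²) give the same value, which the following sketch verifies on hypothetical numbers. The function names and the numerical inputs are assumptions made for illustration only.

```python
def f_stat_from_rss(rrss, urss, r, n, k):
    """F = ((RRSS - URSS) / r) / (URSS / (n - k - 1))."""
    return ((rrss - urss) / r) / (urss / (n - k - 1))


def f_stat_from_r2(r2_u, r2_r, r, n, k):
    """Same statistic written with the restricted and unrestricted R^2."""
    return ((r2_u - r2_r) / r) / ((1.0 - r2_u) / (n - k - 1))


def overall_f(r2, n, k):
    """Overall significance: all k slopes are zero, so R_R^2 = 0 and r = k."""
    return f_stat_from_r2(r2, 0.0, k, n, k)


# Hypothetical inputs: Syy = 100, R_U^2 = 0.6, R_R^2 = 0.5, n = 25, k = 4, r = 2.
s_yy, r2_u, r2_r, n, k, r = 100.0, 0.6, 0.5, 25, 4, 2
urss = s_yy * (1.0 - r2_u)  # URSS = Syy (1 - R_U^2)
rrss = s_yy * (1.0 - r2_r)  # RRSS = Syy (1 - R_R^2)
f_rss = f_stat_from_rss(rrss, urss, r, n, k)
f_r2 = f_stat_from_r2(r2_u, r2_r, r, n, k)
f_all = overall_f(r2_u, n, k)
print(f_rss, f_r2, f_all)
```

With these inputs both forms give F = 2.5 for the two restrictions, and the overall F-statistic is 7.5, which equals MS1/MS2 in the analysis of variance table.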

Large sample properties I: consistency

If assumptions 1–4 hold: OLS is consistent.

Proof for simple regression:

$$ \mathrm{plim}\, \hat\beta = \mathrm{plim}\, \frac{S_{xy}}{S_{xx}} = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)} = \frac{\mathrm{Cov}(x, \alpha + \beta x + u)}{\mathrm{Var}(x)} = \beta + \frac{\mathrm{Cov}(x, u)}{\mathrm{Var}(x)} = \beta $$

Large sample properties II: asymptotic normality

If assumptions 1–5 (Gauss–Markov) hold: OLS estimator is asymptotically normal:

$$ \sqrt{n}\,(\hat\beta_i - \beta_i) \overset{asympt.}{\sim} N(0, c_i^2) $$

$$ \mathrm{Var}(\hat\beta_i) = \frac{\sigma^2}{TSS_i (1 - R_i^2)} \approx \frac{\sigma^2}{n\, \sigma_{x_i}^2 (1 - R_i^2)} = \frac{c_i^2}{n} $$

Thus the standard deviation goes to zero at the rate n^(-1/2).

The usual estimator of σ² is consistent, therefore the usual t-test is asymptotically valid, even if assumption 6 (normality) does not hold.
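Consistency and the n^(-1/2) rate can be illustrated with a small Monte Carlo sketch. The design below is an assumption made for illustration: a hypothetical true slope of 2, standard normal x, and uniform (hence non-normal) errors that are independent of x. Quadrupling the sample size should roughly halve the standard deviation of the slope estimator.

```python
import random
import statistics

BETA = 2.0  # hypothetical true slope


def slope_estimate(n, rng):
    """OLS slope S_xy / S_xx on one simulated sample of size n."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Uniform(-1, 1) errors: mean zero, independent of x, NOT normal.
    y = [1.0 + BETA * xi + rng.uniform(-1.0, 1.0) for xi in x]
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / s_xx


def mc_sd(n, reps, rng):
    """Monte Carlo standard deviation of the slope estimator."""
    return statistics.stdev([slope_estimate(n, rng) for _ in range(reps)])


rng = random.Random(0)  # fixed seed so the run is reproducible
sd_100 = mc_sd(100, 300, rng)
sd_400 = mc_sd(400, 300, rng)
ratio = sd_100 / sd_400  # should be close to sqrt(400 / 100) = 2
print(sd_100, sd_400, ratio)
```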

Large sample properties III: F-test and others

If assumptions 1–5 hold (assumption 6 (normality) not needed): the F-test is asymptotically valid.

Other large sample tests (only asymptotically valid):

Wald test: n(RRSS – URSS)/URSS ~ χ²_r

overall significance (H0: "the regression cannot be used"): nR²/(1 – R²) ~ χ²_k

Lagrange multiplier (LM) test: n(RRSS – URSS)/RRSS ~ χ²_r

overall significance (H0: "the regression cannot be used"): nR² ~ χ²_k
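A short sketch of the Wald and LM statistics, showing that for the overall significance test (RRSS = Syy, URSS = Syy(1 – R²)) they reduce to nR²/(1 – R²) and nR². The function names and the inputs are hypothetical; critical values of the χ² distribution are not computed here.

```python
def wald_stat(rrss, urss, n):
    """Wald statistic: n (RRSS - URSS) / URSS."""
    return n * (rrss - urss) / urss


def lm_stat(rrss, urss, n):
    """Lagrange multiplier statistic: n (RRSS - URSS) / RRSS."""
    return n * (rrss - urss) / rrss


# Hypothetical inputs for the overall significance test.
n, s_yy, r2 = 50, 100.0, 0.2
rrss = s_yy               # restricted model: intercept only
urss = s_yy * (1.0 - r2)  # unrestricted model
wald = wald_stat(rrss, urss, n)
lm = lm_stat(rrss, urss, n)
print(wald, lm)
```

Since URSS ≤ RRSS, the Wald statistic is never smaller than the LM statistic in this form.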

Model selection

Adjusted R²:

$$ 1 - \bar R^2 = \frac{n-1}{n-k-1} (1 - R^2) $$

Nested hypotheses: t- and F-test

Non-nested hypotheses, same dependent variable: adjusted R², information criteria (AIC, BIC, both based on the log-likelihood)

Omitting relevant variables

If the omitted variable is correlated with the included regressors: biased estimation (endogeneity)

Simple regression

True model: y = β1x1 + β2x2 + u

Estimated model: y = γ1x1 + u

Sign of the bias of the estimated γ1:

            Corr(x1,x2) > 0    Corr(x1,x2) < 0
β2 > 0            +                  –
β2 < 0            –                  +
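The sign table for the omitted variable bias can be reproduced by simulation. The sketch below is an illustration under assumed values: β1 = 1, x1 standard normal, x2 = ρ·x1 + noise, and standard normal errors. It uses the standard result that the slope of the short regression converges to β1 + β2·Cov(x1,x2)/Var(x1), which equals β1 + β2·ρ in this design.

```python
import random

BETA1 = 1.0  # hypothetical true coefficient of x1


def short_regression_slope(beta2, rho, n=20000, seed=0):
    """Regress y on x1 only (no intercept, as in the model above)."""
    rng = random.Random(seed)
    s_xx = 0.0
    s_xy = 0.0
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rho * x1 + rng.gauss(0.0, 1.0)  # Cov(x1, x2) = rho
        y = BETA1 * x1 + beta2 * x2 + rng.gauss(0.0, 1.0)
        s_xx += x1 * x1
        s_xy += x1 * y
    return s_xy / s_xx


# Print the estimated bias for the four cells of the sign table.
for beta2 in (2.0, -2.0):
    for rho in (0.5, -0.5):
        bias = short_regression_slope(beta2, rho) - BETA1
        print(f"beta2={beta2:+.1f} rho={rho:+.1f} bias={bias:+.3f}")
```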

Including irrelevant variables

True model: y = β1x1 + β2x2 + u

Estimated model: y = β1x1 + β2x2 + β3x3 + u, β3 = 0

Does not affect unbiasedness (no endogeneity)

Variance increases
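Why the variance increases can be seen from the formula Var(β̂i) = σ²/(TSS_i(1 – R_i²)): adding a regressor cannot lower R_i², the R² from regressing x_i on the other regressors. The sketch below evaluates the formula with hypothetical numbers (σ² = 4, TSS_1 = 100, R_1² rising from 0.3 to 0.5 once x3 is included); none of these values come from the course material.

```python
def slope_variance(sigma2, tss_i, r2_i):
    """Var(beta_hat_i) = sigma^2 / (TSS_i * (1 - R_i^2))."""
    return sigma2 / (tss_i * (1.0 - r2_i))


# Hypothetical values for the coefficient of x1.
sigma2, tss_1 = 4.0, 100.0
var_true_model = slope_variance(sigma2, tss_1, 0.3)  # x1 regressed on x2
var_with_x3 = slope_variance(sigma2, tss_1, 0.5)     # x1 regressed on x2 and x3
inflation = var_with_x3 / var_true_model             # (1 - 0.3) / (1 - 0.5)
print(var_true_model, var_with_x3, inflation)
```

If x3 is uncorrelated with x1 given x2, R_1² does not change and the variance is unaffected; the more strongly x3 is correlated with x1, the larger the increase.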
