ECONOMETRICS
Sponsored by Grant TÁMOP-4.1.2-08/2/A/KMR-2009-0041
Course material developed by the Department of Economics, Faculty of Social Sciences, Eötvös Loránd University Budapest (ELTE)
Department of Economics, Eötvös Loránd University Budapest
Institute of Economics, Hungarian Academy of Sciences
Balassi Kiadó, Budapest
Authors: Péter Elek, Anikó Bíró
Supervised by: Péter Elek
June 2010
ELTE Faculty of Social Sciences, Department of Economics
Week 6.
Multivariate regression III
Péter Elek, Anikó Bíró
Content
F-test (cont.), Stability tests
Adjusted R2, model selection
Dummy variables
F-test more generally
Joint test of r constraints in a regression with k explanatory variables
Nested hypotheses: the parameter set of the restricted model (R) is a subset of that of the unrestricted model (U)
Example
U: y = α + β1x1 + β2x2 + β3x3 + u
H0: β2 = 0, β3 = 0
R: y = α + β1x1 + v
F-test, test statistic
Sum of squares decomposition:
TSS = RESS + RRSS = RESS + (RRSS – URSS) + URSS
Degrees of freedom:
n – 1 = (k – r) + (n – k + r – 1) = (k – r) + r + (n – k – 1)
Test statistic:
$$F = \frac{(RRSS - URSS)/r}{URSS/(n-k-1)} \sim F_{r,\,n-k-1}$$
(in large samples approximately $\chi^2_r / r$, the Wald test)
Since RRSS = Syy (1 – RR2) and URSS = Syy (1 – RU2), equivalently:
$$F = \frac{(R_U^2 - R_R^2)/r}{(1 - R_U^2)/(n-k-1)} \sim F_{r,\,n-k-1}$$
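The equivalence of the two forms of the statistic can be checked numerically. The following is a minimal sketch with made-up values for Syy, the two R2 values, n, k and r (none of them come from the course data); the function names are illustrative only.

```python
def f_from_rss(rrss, urss, r, n, k):
    """F statistic from the restricted and unrestricted residual sums of squares."""
    return ((rrss - urss) / r) / (urss / (n - k - 1))


def f_from_r2(r2_u, r2_r, r, n, k):
    """The same statistic written with the two R-squared values."""
    return ((r2_u - r2_r) / r) / ((1.0 - r2_u) / (n - k - 1))


# Hypothetical numbers, chosen only for illustration
s_yy = 100.0              # total sum of squares of y
r2_u, r2_r = 0.50, 0.40   # unrestricted and restricted R-squared
n, k, r = 53, 3, 2        # sample size, regressors in U, number of restrictions

urss = s_yy * (1.0 - r2_u)
rrss = s_yy * (1.0 - r2_r)

f_rss = f_from_rss(rrss, urss, r, n, k)
f_r2 = f_from_r2(r2_u, r2_r, r, n, k)
print(round(f_rss, 4), round(f_r2, 4))  # both forms give about 4.9
```

Syy cancels in the ratio, which is why the R2 form needs no information on the scale of y.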
Testing a linear function of the parameters
Example: Cobb-Douglas production function
logX = α + β1logL + β2logK + u
H0: β1 + β2 = 1
t-test after reparametrisation: θ = β1 + β2, so β2 = θ – β1
logX = α + β1(logL – logK) + θ logK + u, H0: θ = 1
t-test directly on β1 + β2, using that the variance of the sum is:
$$Var(\hat\beta_1 + \hat\beta_2) = Var(\hat\beta_1) + Var(\hat\beta_2) + 2\,Cov(\hat\beta_1, \hat\beta_2)$$
F-test: impose β2 = 1 – β1
R: logX – logK = α + β1(logL – logK) + u
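The direct t-test on β1 + β2 can be sketched as follows. The coefficient estimates, variances and covariance below are hypothetical values standing in for regression output, and the function name is an assumption made for the illustration.

```python
import math


def t_stat_sum(b1, b2, var_b1, var_b2, cov_b12, hypothesised_sum=1.0):
    """t statistic for H0: beta1 + beta2 = hypothesised_sum."""
    # Variance of the sum includes twice the covariance of the two estimates
    se_sum = math.sqrt(var_b1 + var_b2 + 2.0 * cov_b12)
    return (b1 + b2 - hypothesised_sum) / se_sum


# Hypothetical estimates for logL and logK (not from the course data)
t = t_stat_sum(b1=0.65, b2=0.30, var_b1=0.0100, var_b2=0.0064, cov_b12=-0.0050)
print(round(t, 3))  # about -0.625
```

Ignoring the covariance term would give a different standard error, so the full coefficient covariance matrix is needed for this version of the test.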
Stability test: two independent data sets (sometimes referred to as Chow-test)
1. yi = α1 + β11x1i + β21x2i + … + βk1xki + ui, i = 1…n1
2. yi = α2 + β12x1i + β22x2i + … + βk2xki + vi, i = 1…n2
H0: α1 = α2, β11 = β12, …, βk1 = βk2
RRSS: from the merged data set
RSS1, RSS2: from the separate regressions
$$F = \frac{(RRSS - RSS_1 - RSS_2)/(k+1)}{(RSS_1 + RSS_2)/(n_1 + n_2 - 2k - 2)} \sim F_{k+1,\,n_1+n_2-2k-2}$$
Stability test – Chow-test (predictive)
1. yi = α1 + β11x1i + β21x2i + … + βk1xki + ui, i = 1…n1
2. yi = α2 + β12x1i + β22x2i + … + βk2xki + vi, i = n1 + 1…n
n – n1 < k + 1 is possible (in contrast to the previous test)
RSS1: residual sum of squares based on the first n1 observations
RRSS: residual sum of squares based on the model estimated from all (n = n1 + n2) observations
$$F = \frac{(RRSS - RSS_1)/(n - n_1)}{RSS_1/(n_1 - k - 1)} \sim F_{n-n_1,\,n_1-k-1}$$
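Both stability statistics (the two-sample version and the predictive version) only need residual sums of squares and sample sizes. A minimal sketch, with hypothetical RSS values and function names chosen for the illustration:

```python
def chow_stability_f(rrss, rss1, rss2, k, n1, n2):
    """Two-sample stability test: returns (F, numerator df, denominator df)."""
    df_num = k + 1
    df_den = n1 + n2 - 2 * k - 2
    f = ((rrss - rss1 - rss2) / df_num) / ((rss1 + rss2) / df_den)
    return f, df_num, df_den


def chow_predictive_f(rrss, rss1, k, n1, n):
    """Predictive Chow test: returns (F, numerator df, denominator df)."""
    df_num = n - n1
    df_den = n1 - k - 1
    f = ((rrss - rss1) / df_num) / (rss1 / df_den)
    return f, df_num, df_den


# Hypothetical inputs (not from the course data)
stability = chow_stability_f(rrss=120.0, rss1=50.0, rss2=40.0, k=2, n1=30, n2=36)
# Only 2 observations in the second period, fewer than k + 1 = 3
predictive = chow_predictive_f(rrss=120.0, rss1=100.0, k=2, n1=43, n=45)
print(stability, predictive)
```

In the second call a separate regression on the last two observations could not be estimated, which is the situation where the predictive version is useful.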
Adjusted R2
Including new variables: RSS and the degrees of freedom both decrease (the number of normal equations increases)
$$\hat\sigma^2 = \frac{RSS}{df}$$
Adjusted R2:
$$\bar R^2 = 1 - \frac{n-1}{n-k-1}\,(1 - R^2)$$
|t| < 1: omitting the variable increases $\bar R^2$
F < 1: omitting the group of variables increases $\bar R^2$
Different conclusions based on t and F are possible (e.g. under multicollinearity)
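The rule that omitting a variable raises the adjusted R2 exactly when its |t| is below 1 can be illustrated numerically. The sketch below uses hypothetical R2 values and relies on the fact that for a single restriction the squared t statistic equals the F statistic with r = 1.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k explanatory variables."""
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)


def t_squared_of_dropped_variable(r2_u, r2_r, n, k):
    """Squared t statistic of one variable, i.e. the F statistic with r = 1."""
    return (r2_u - r2_r) / ((1.0 - r2_u) / (n - k - 1))


n, k = 50, 3      # hypothetical sample size and number of regressors
r2_u = 0.500      # hypothetical R-squared of the larger model

results = []
for r2_r in (0.495, 0.480):   # R-squared after dropping one variable
    t2 = t_squared_of_dropped_variable(r2_u, r2_r, n, k)
    adj_u = adjusted_r2(r2_u, n, k)
    adj_r = adjusted_r2(r2_r, n, k - 1)
    results.append((t2, adj_r > adj_u))
    print(round(t2, 2), round(adj_u, 4), round(adj_r, 4))
```

In the first case t squared is below 1 and the smaller model has the higher adjusted R2; in the second case the ordering is reversed.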
Model selection
Nested hypotheses: t- and F-test
Non-nested hypotheses with the same dependent variable, e.g.:
R&D = α + β log(revenue) + u
R&D = α + β1 revenue + β2 revenue^2 + u
Selection based on adjusted R2 or information criteria (e.g. AIC)
AIC (Akaike information criterion):
RSS∙exp(2(k + 1)/n); the model with the smaller value is preferred
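A small sketch of how the criterion trades off fit against the number of parameters. The RSS values and sample size are hypothetical and are not estimates from any R&D data set.

```python
import math


def aic(rss, n, k):
    """AIC in the form used above: RSS * exp(2(k + 1) / n)."""
    return rss * math.exp(2.0 * (k + 1) / n)


n = 100                                 # hypothetical sample size
aic_log = aic(rss=52.0, n=n, k=1)       # model with log(revenue)
aic_quad = aic(rss=51.5, n=n, k=2)      # model with revenue and revenue^2

preferred = "log(revenue)" if aic_log < aic_quad else "quadratic"
print(round(aic_log, 3), round(aic_quad, 3), preferred)
```

Here the quadratic model has the smaller RSS, but the gain is too small to offset the penalty for the extra parameter.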
Adjusted R2, example
Wage survey (2003): does the experience or the age explain more in the wage equation?
Logarithmic forms
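Log-log (loglinear) – elasticity
ln(y) = α + β ln(x) + u
$$\%\Delta\hat y \approx \hat\beta\,\%\Delta x$$
Partly logarithmic forms
ln(y) = α + βx + u
$$\%\Delta\hat y \approx 100\,\hat\beta\,\Delta x \qquad \text{(exactly: } 100\,(e^{\hat\beta\,\Delta x} - 1)\text{)}$$
y = α + β ln(x) + u
$$\Delta\hat y \approx \frac{\hat\beta}{100}\,\%\Delta x$$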
Quadratic form
y = α + β1x1 + β2x1^2 + u
$$\frac{\Delta\hat y}{\Delta x_1} \approx \hat\beta_1 + 2\hat\beta_2 x_1$$
Increasing or decreasing partial effect
Example: wage survey (2003), quadratic function of experience, estimated equations:
log(Ker) = 9.83 + 0.135 ISKVEG9 + 0.0082 EXP
log(Ker) = 9.83 + 0.135 ISKVEG9 + 0.022 EXP – 0.00029 EXP^2
positive (but decreasing) partial effect up to 0.022/(2*0.00029) ≈ 38 years of experience
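The partial effect of experience and the point where it changes sign can be computed directly from the two printed coefficients. This is a sketch using the rounded values 0.022 and –0.00029 shown in the estimated equation; with unrounded coefficients the result may differ slightly.

```python
B_EXP = 0.022       # coefficient of EXP in the quadratic wage equation
B_EXP2 = -0.00029   # coefficient of EXP squared


def partial_effect(exp_years):
    """Approximate effect of one more year of experience on log wage."""
    return B_EXP + 2.0 * B_EXP2 * exp_years


def turning_point():
    """Experience level at which the partial effect becomes zero."""
    return -B_EXP / (2.0 * B_EXP2)


print(round(partial_effect(10), 4))   # effect at 10 years of experience
print(round(turning_point(), 1))      # about 37.9 years
```

Below the turning point the effect is positive but shrinking; above it an additional year of experience is associated with a lower predicted wage.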
Interactions
Partial effect depends on other explanatory variables as well:
y = α + β1x1 + β2x1x2 + u
$$\frac{\Delta\hat y}{\Delta x_1} = \hat\beta_1 + \hat\beta_2 x_2$$
Example: the wage and the education premium depend on the profitability of the firm (net sales revenue – material costs)
Log(wage) = 10.304 + 0.139 Educ9 + 0.092 Log(Profit)
Log(wage) = 10.597 + 0.079 Educ9 + 0.043 Log(Profit) + 0.010 (Educ9*Log(Profit))
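In the equation with the interaction term the education premium is no longer a single number but a function of Log(Profit). A minimal sketch using the two printed coefficients; the Log(Profit) values at which it is evaluated are hypothetical.

```python
B_EDUC = 0.079          # coefficient of Educ9 in the interaction model
B_INTERACTION = 0.010   # coefficient of Educ9*Log(Profit)


def education_premium(log_profit):
    """Partial effect of Educ9 on Log(wage) at a given Log(Profit)."""
    return B_EDUC + B_INTERACTION * log_profit


# Hypothetical Log(Profit) levels, chosen only for illustration
for log_profit in (4.0, 6.0, 8.0):
    print(log_profit, round(education_premium(log_profit), 3))
```

At Log(Profit) = 6 the premium is 0.139, the same value as the Educ9 coefficient in the equation without the interaction; at more profitable firms it is higher.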
Dummy variables on the right hand side
So far: mainly continuous variables
(quantitative information) – e.g. wage, consumption, wealth, education (?)
Binary / dummy variables
Qualitative information
Examples: gender, employed, country dummy…
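Different intercepts – 2 groups
Example:
$$\log(wage_i) = \begin{cases} \alpha_1 + \beta\,Educ_i + u_i & \text{if from Budapest} \\ \alpha_2 + \beta\,Educ_i + u_i & \text{otherwise} \end{cases}$$
$$D_i = \begin{cases} 0 & \text{if from Budapest} \\ 1 & \text{otherwise} \end{cases}$$
$$\log(wage_i) = \alpha_1 + (\alpha_2 - \alpha_1)\,D_i + \beta\,Educ_i + u_i$$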
Different intercepts, example
Based on the 2003 wage survey
Log dependent variable
Estimated equation:
$$\log(Wage_i) = 10.93 - 0.16\,Countryside_i + 0.15\,Educ_i$$
Countryside: lower wage by approx. 16% (ceteris paribus)
Exact difference ("log" is the natural logarithm in EViews):
$$\log(Wage_1) - \log(Wage_0) = -0.16 \;\Rightarrow\; \frac{Wage_1 - Wage_0}{Wage_0} = e^{-0.16} - 1$$
% wage difference: 100 (e^{-0.16} – 1) = –14.79
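The gap between the approximate and the exact percentage difference can be reproduced from the dummy coefficient alone. The sketch assumes the countryside coefficient of –0.16 implied by the slide (a 16% approximate wage disadvantage).

```python
import math


def exact_percent_difference(log_coefficient):
    """Exact % difference implied by a dummy coefficient in a log-wage equation."""
    return 100.0 * (math.exp(log_coefficient) - 1.0)


coefficient = -0.16                    # countryside dummy coefficient
approx = 100.0 * coefficient           # approximation: about -16%
exact = exact_percent_difference(coefficient)
print(round(approx, 2), round(exact, 2))  # -16.0 -14.79
```

The approximation worsens as the coefficient grows in absolute value, so the exact formula matters most for large group differences.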
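More than two groups
N groups (e.g. regions instead of Budapest / countryside)
N – 1 dummies in the regression (if there are N groups); the group without a dummy is the benchmark group (Group 1 in the notation below)
$$y_i = \begin{cases} \alpha_1 + \beta x_i + u_i & \text{in Group 1} \\ \alpha_2 + \beta x_i + u_i & \text{in Group 2} \\ \;\;\vdots \\ \alpha_N + \beta x_i + u_i & \text{in Group } N \end{cases}$$
$$D_{2i} = 1 \text{ in Group 2 (0 otherwise)}, \;\ldots,\; D_{Ni} = 1 \text{ in Group } N \text{ (0 otherwise)}$$
$$y_i = \alpha_1 + (\alpha_2 - \alpha_1)\,D_{2i} + \ldots + (\alpha_N - \alpha_1)\,D_{Ni} + \beta x_i + u_i$$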
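Constructing the N – 1 dummies from a qualitative variable can be sketched as follows. The region labels and the helper name `make_dummies` are hypothetical; the benchmark group is passed explicitly and receives no dummy.

```python
def make_dummies(groups, benchmark):
    """Return a dict of 0/1 columns, one per group except the benchmark."""
    levels = sorted(set(groups))
    if benchmark not in levels:
        raise ValueError("benchmark group does not occur in the data")
    kept = [g for g in levels if g != benchmark]
    # One indicator column for every non-benchmark group
    return {g: [1 if obs == g else 0 for obs in groups] for g in kept}


# Hypothetical region labels for five observations
regions = ["Central", "North", "South", "North", "Central"]
dummies = make_dummies(regions, benchmark="Central")
print(dummies)
```

Including a dummy for every group together with the intercept would make the regressors perfectly collinear, which is why one group is left out and serves as the reference for all dummy coefficients.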
Interactions between binary variables
Example: male / female wage gap is not the same in Budapest and in the countryside
Four categories: female / male × Budapest / countryside
Benchmark group: females in Budapest
Two equivalent models:
$$\log(wage_i) = \beta_0 + \beta_1 Male_i + \beta_2 Countryside_i + \beta_3 Male_i \cdot Countryside_i + \delta\,Educ_i + u_i$$
$$\log(wage_i) = \gamma_0 + \gamma_1 Fem\_Countryside_i + \gamma_2 Male\_Bp_i + \gamma_3 Male\_Countryside_i + \delta\,Educ_i + u_i$$
Wage survey estimates (benchmark: females in Bp.):
–0.1026 = –0.1726 + 0.0540 + 0.0165
Interactions, example
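Non-constant slope parameters
In case of two groups
$$y_i = \begin{cases} \alpha + \beta_{11} x_{1i} + \beta_2 x_{2i} + u_i & \text{for Group 1} \\ \alpha + \beta_{12} x_{1i} + \beta_2 x_{2i} + u_i & \text{for Group 2} \end{cases}$$
→ Interaction of a dummy variable and an explanatory variable
$$D_i = \begin{cases} 0 & \text{for Group 1} \\ 1 & \text{for Group 2} \end{cases}$$
$$y_i = \alpha + \beta_{11} x_{1i} + (\beta_{12} - \beta_{11})\,D_i x_{1i} + \beta_2 x_{2i} + u_i$$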
Example
Effect of education gender-dependent but the effect of age is not
$$\log(Wage_i) = \beta_0 + \beta_1 Educ_i + \beta_2 Educ_i \cdot Male_i + \beta_3 Age_i + u_i$$
Examining the stability of the coefficients
Example: is the wage model the same for males and females?
Cross sectional analysis (time series: stability of the coefficient in time)
F-test (also possible for a subset of the restrictions)
Problem if N2 < k → Chow-test (predictive) can be used (see week 5)
$$\log(Wage_i) = \beta_0 + \beta_1 Male_i + \beta_2 Educ_i + \beta_3 Educ_i \cdot Male_i + \beta_4 Age_i + \beta_5 Age_i \cdot Male_i + u_i$$
$$H_0: \beta_1 = 0,\; \beta_3 = 0,\; \beta_5 = 0$$
Estimation results and test statistic
Examining the stability of the coefficients, cont.
Dummy dependent variable
Examples:
Labour market: employed or not
Consumption: real estate owner or not
Finance: bankruptcy of the borrower or not
Binary dependent variable ↔ linear model?
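Linear probability model I.
y binary variable
$$E(y \mid x) = P(y = 1 \mid x) = \alpha + \beta x$$
$$y_i = \alpha + \beta x_i + u_i$$
Estimated model: $\hat y = \hat\alpha + \hat\beta x$, where $\hat y$ is the estimated probability of y = 1
Nonlinear model:
$$P(y = 1 \mid x) = F(\alpha + \beta x)$$
If F is the Gaussian distribution function, the probit model is obtained; if F(z) = e^z/(1 + e^z), the logit model is obtained.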
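The difference between the linear and the nonlinear specification can be seen by evaluating the same linear index under the three choices of F. A sketch with hypothetical values of α and β (the names `probit_link` and `logit_link` are illustrative):

```python
import math


def probit_link(z):
    """Standard normal distribution function, written with math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logit_link(z):
    """Logistic distribution function e^z / (1 + e^z)."""
    return math.exp(z) / (1.0 + math.exp(z))


alpha, beta = 0.2, 0.3   # hypothetical parameters
rows = []
for x in (-2.0, 0.0, 4.0):
    index = alpha + beta * x          # linear probability model uses this directly
    rows.append((index, probit_link(index), logit_link(index)))
    print(round(index, 2), round(rows[-1][1], 3), round(rows[-1][2], 3))
```

For x = -2 and x = 4 the linear index is -0.4 and 1.4, which cannot be probabilities, while both distribution functions map the same index into the (0, 1) interval.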
Linear probability model II.
Problem 1: the estimated probability may lie outside the [0,1] interval
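Linear probability model III.
Problem 2: heteroscedasticity
$$y_i = \alpha + \beta x_i + u_i, \qquad E(u_i \mid x_i) = 0$$
$$u_i = \begin{cases} 1 - (\alpha + \beta x_i) & \text{with prob. } \alpha + \beta x_i \\ -(\alpha + \beta x_i) & \text{with prob. } 1 - (\alpha + \beta x_i) \end{cases}$$
$$Var(u_i \mid x_i) = (1 - (\alpha + \beta x_i))^2 (\alpha + \beta x_i) + (\alpha + \beta x_i)^2 (1 - (\alpha + \beta x_i)) = (\alpha + \beta x_i)(1 - (\alpha + \beta x_i)) \approx \hat y_i (1 - \hat y_i)$$
Solution:
Using robust SE
Weighted LS, based on the estimated error variance $w_i = \hat y_i (1 - \hat y_i)$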
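The variance formula follows from the two-point distribution of the error term and can be verified numerically. The sketch below also computes 1/sqrt(variance) scaling factors for weighted LS; treating fitted values outside (0, 1) as unusable is an assumption of this illustration, not a rule stated in the slides.

```python
import math


def lpm_error_variance(p):
    """Variance of u, which equals 1 - p with prob. p and -p with prob. 1 - p."""
    mean = p * (1.0 - p) + (1.0 - p) * (-p)                  # zero by construction
    second_moment = p * (1.0 - p) ** 2 + (1.0 - p) * p ** 2
    return second_moment - mean ** 2


def wls_scaling(fitted):
    """1 / sqrt(y_hat (1 - y_hat)); None if y_hat is outside (0, 1)."""
    factors = []
    for y_hat in fitted:
        if 0.0 < y_hat < 1.0:
            factors.append(1.0 / math.sqrt(y_hat * (1.0 - y_hat)))
        else:
            factors.append(None)   # variance estimate would be non-positive
    return factors


print(round(lpm_error_variance(0.3), 4))   # equals 0.3 * 0.7
print(wls_scaling([0.5, 1.1]))
```

The second fitted value shows how Problem 1 interferes with the weighted LS solution: a predicted probability above 1 yields a negative variance estimate, so such observations need separate treatment.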
Linear probability model, example
Dependent variable: whether the person has private health insurance or not (SHARE database)
Explanatory variables: wealth, income, age, education, country dummies
[Figure: histogram of the predicted probabilities (series INSF); vertical axis: frequency; horizontal axis from -0.2 to 1.4]
Seminar
Multivariate regression III.
Exercise: estimation of a wage equation based on a small sample from the wage survey
Variables
Educ (years of education)
Exp (experience)
Wage
Typ (type of settlement – qualitative variable)
Bp (Budapest dummy)
Male (male dummy)
Estimation of the wage equation I.
Model 1: modelling log(wage) in the private sector with the educ, exp, exp^2, bp and male variables and with the interactions of educ and exp with male
Does the equation for males differ significantly from the equation for females?
Joint test of Male, Educ*Male and Exp*Male
Experience-profile for Budapest males with 12 years of education
Where is the maximum?
Graphical presentation of the experience-profile with confidence interval
Estimation of the wage equation II.
Model 2: previous model + dummies for “chief town of the county” and for “other town”
Testing the equality of the two new coefficients with three methods
Directly
By a t-test after transformation
By comparing the R2 of the restricted and unrestricted model
Testing heteroscedasticity with the White- and Breusch–Pagan-tests
Calculating robust standard errors and comparing them with the non-robust ones