
ECONOMIC STATISTICS


Academic year: 2022


Sponsored by a Grant TÁMOP-4.1.2-08/2/A/KMR-2009-0041

Course Material Developed by Department of Economics, Faculty of Social Sciences, Eötvös Loránd University Budapest (ELTE)

Department of Economics, Eötvös Loránd University Budapest

Institute of Economics, Hungarian Academy of Sciences

Balassi Kiadó, Budapest


Author: Anikó Bíró

Supervised by Anikó Bíró

June 2010

Week 3

Correlation, simple regression – introduction

Example from week 2

Negative relationship between two variables – point diagram (KSH):

[Figure: point diagram; axes: GDP/capita (th HUF) and unemployment rate (%)]


Correlation

• Measures the relationship between two variables numerically

Notation: correlation between X and Y is $r_{XY}$

• Square of correlation ($r_{XY}^2$): what percentage of Y’s variation is explained by X = what percentage of X’s variation is explained by Y

Supplement: formula of correlation

$$r_{XY} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2 \, \sum_{i=1}^{N} (Y_i - \bar{Y})^2}}$$

Properties of correlation

• Value between -1 and 1

• Positive value – positive relationship. r=0: no correlation between the variables

• Larger positive value – stronger positive relationship

• Correlation between X and Y = correlation between Y and X

• Correlation of a variable with itself = 1

• Correlation with a constant = 0 by convention (a constant has no variation, so the formula itself gives 0/0)
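The properties listed above can be checked numerically. What follows is a minimal Python sketch, not part of the original course material: the function name `pearson_r` and the data are made up for illustration, and returning 0 for a constant series is a convention adopted here, because the correlation formula itself gives 0/0 in that case.

```python
import math

def pearson_r(x, y):
    """Sample correlation from deviations around the means (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        # A constant has zero variation, so the ratio would be 0/0.
        # Assumption: follow the slide's convention and report 0.
        return 0.0
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative data (not the KSH county data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [9.0, 7.5, 8.0, 4.0, 3.5]

r_xy = pearson_r(x, y)               # negative: y tends to fall as x rises
r_yx = pearson_r(y, x)               # same value as r_xy (symmetry)
r_xx = pearson_r(x, x)               # a variable with itself
r_xc = pearson_r(x, [2.0] * len(x))  # a variable with a constant

print(r_xy, r_yx, r_xx, r_xc)
```

Swapping in other series should leave the symmetry and the bounds intact; only the sign and size of `r_xy` depend on the data.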


Example

Correlation between unemployment rate and GDP/capita = -0.62

• Negative relationship

• Higher GDP/capita – lower unemployment

• The variation in county-level GDP/capita explains 38% of the variation in the unemployment rate (0.62 × 0.62 = 0.384)

Causality?

• Does one variable “cause” the other?

• Correlation does not reveal the direction of causality

• There might be no causality at all

• Previous examples? (GDP – unemployment, GDP – number of enterprises)

Correlation between more variables

• M variables – M(M-1)/2 correlations

• Correlation matrix of 3 variables (X, Y, Z):

     X           Y           Z

X    1

Y    $r_{XY}$    1

Z    $r_{XZ}$    $r_{ZY}$    1
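As an illustration of the pair count and the lower-triangular layout, here is a short Python sketch. It is not from the original slides: the three series and the helper `pearson_r` are hypothetical, chosen only so that the code runs on its own.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Sample correlation (hypothetical helper; assumes neither series is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up data for three variables.
data = {
    "X": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "Y": [2.1, 3.9, 6.2, 7.8, 10.1, 12.2],
    "Z": [5.0, 3.0, 6.0, 2.0, 4.0, 1.0],
}
names = list(data)
M = len(names)

# One correlation per unordered pair: M(M-1)/2 of them.
corr = {(a, b): pearson_r(data[a], data[b]) for a, b in combinations(names, 2)}
n_pairs = len(corr)

# Print the lower triangle, with 1 on the diagonal.
for i, row in enumerate(names):
    cells = ["1" if col == row else f"{corr[(col, row)]:.2f}" for col in names[: i + 1]]
    print(row, *cells)
```

Only the lower triangle is stored because the correlation between X and Y equals the correlation between Y and X.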

Strength of relationship graphically

• Point diagram between two variables

• See: textbook graphs

• “How difficult is it to draw a line fitting the points?”

• “How scattered are the points?”

Correlation vs. regression

• Numerical analysis of the relationship between variables:

• Correlation:

• Between 2 variables

• Causality?

• Regression:

• Complex relationships (more variables)

• There might be an underlying economic model – causality

• Examples: wage regression (education, …), inflation regression (interest rate, …)


Simple (univariate) regression

• Y dependent variable, X explanatory variable (regressor)

• Assumption: linear relationship

• Regression line:

$$Y = \alpha + \beta X$$

• Reality: the data do not fit a line

Example: advertisement expenditures

[Figure: point diagram; axes: advertisements (th USD) and sales (th USD)]


Error term

• Linear regression: no perfect fit

• Omitted, unobservable variables

• Nonlinear relationship

• Regression model with error term:

$$Y = \alpha + \beta X + e$$

• Error term (disturbance): distance between the data point and the regression line

• Causality (model)? Generalization of correlation?

Estimation

• Unknown value of coefficients

• Estimated coefficients: coefficients of the best fitting line

• Notation: $\hat{\alpha}, \hat{\beta}$

• Residual ($u$; in general $u \neq e$):

$$Y = \hat{\alpha} + \hat{\beta} X + u$$


OLS estimation

• Best-fitting line: the one that minimizes the sum of squared residuals

$$SSR = \sum_{i=1}^{N} u_i^2$$

• Ordinary least squares (OLS) estimation
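To make the minimization concrete, the sketch below computes the OLS line for a small made-up data set and compares its SSR with that of a few nearby lines. The closed-form expressions used (slope = sum of cross-deviations divided by sum of squared X-deviations, intercept = mean of Y minus slope times mean of X) are the standard textbook result and are not stated in the slides. The function names `ols_fit` and `ssr` and the numbers are assumptions for illustration; this is not the Advert.xls data.

```python
def ols_fit(x, y):
    """Closed-form OLS for one regressor; returns (alpha_hat, beta_hat)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    alpha_hat = mean_y - beta_hat * mean_x
    return alpha_hat, beta_hat

def ssr(alpha, beta, x, y):
    """Sum of squared vertical distances between the data and a given line."""
    return sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))

# Made-up illustrative data.
x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [505.0, 508.0, 507.0, 513.0, 512.0]

alpha_hat, beta_hat = ols_fit(x, y)
best = ssr(alpha_hat, beta_hat, x, y)

# Shifting or tilting the fitted line can only increase the SSR.
nudged = [
    ssr(alpha_hat + d_alpha, beta_hat + d_beta, x, y)
    for d_alpha in (-1.0, 1.0)
    for d_beta in (-0.05, 0.05)
]

print(alpha_hat, beta_hat, best, min(nudged))
```

Excel's trend line and regression tool solve the same minimization, so they should report the same intercept and slope for the same data.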

Advertisement example, cont.

• Estimated coefficients:

• 502.92 – intercept parameter (constant);

• 0.22 – coefficient of advertisements (slope)

• Interpretation?

• Slope:

• Average change in Y if X increases by one unit

• Marginal effect
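One way to read the two estimates is to plug them into the fitted line. The sketch below does that in Python. The coefficient values are the ones reported above; the function name `fitted_sales` and the chosen advertisement levels are hypothetical. It assumes that sales is the dependent variable and that both variables are measured in thousand USD, as the figure's axis labels suggest.

```python
ALPHA_HAT = 502.92  # estimated intercept (th USD of sales)
BETA_HAT = 0.22     # estimated slope (th USD of sales per th USD of advertisements)

def fitted_sales(advertisements):
    """Value on the estimated regression line for a given advertisement level."""
    return ALPHA_HAT + BETA_HAT * advertisements

at_zero = fitted_sales(0.0)    # the intercept: fitted sales with no advertising
base = fitted_sales(50.0)      # an arbitrary level chosen for illustration
plus_one = fitted_sales(51.0)  # one more unit (1 th USD) of advertisements

# The difference is the slope: the average change in Y per unit increase in X.
marginal_effect = plus_one - base

print(at_zero, base, plus_one, marginal_effect)
```

Under these unit assumptions, one additional thousand USD of advertising goes together with sales that are on average 0.22 thousand USD higher. Whether that is a causal effect depends on the underlying model, not on the arithmetic.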

Summary

• Correlation:

• Strength of relationship between two variables

• Properties of correlation

• Interpretation: square of correlation

• Linear regression (univariate):


• Underlying economic model

• Error term

• Residual

• Estimation: OLS

Correlation, simple regression – introduction

Seminar 3

Correlation

• Measures the relationship between two variables numerically

• Square of correlation ($r_{XY}^2$): what percentage of Y’s variation is explained by X = what percentage of X’s variation is explained by Y

• Excel: CORREL() function

Properties of correlation

• Value between -1 and 1

• Positive value – positive relationship. r=0: no correlation between the variables

• Correlation between X and Y = correlation between Y and X

• Correlation of a variable with itself = 1

Examples

Correlation and squared correlation?


• KSH county level data: unemployment and GDP/capita?

• KSH county level data: GDP/capita and number of registered enterprises?

• MNB: HUF/EUR and HUF/USD?

Correlation between more variables - example

• European sample, women aged 50+ (SHARE subsample)

• Education (0–4 scale)

• Whether the respondent ever smoked daily

• Malignant tumor (cancer)

• Qualitative data

• What kind of correlation expected?

Example, cont.

• Immediate causality: smoking – cancer

• Indirect causality: higher education level – cancer


          Educ.   Smoke   Cancer

Educ.     1

Smoke     0.18    1

Cancer    0.01    0.04    1

Simple (univariate) regression

• Y dependent variable, X explanatory variable

• Assumption: linear relationship

• Regression line:

$$Y = \alpha + \beta X$$

• Error term vs. residual
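The difference between the error term and the residual is easiest to see in a simulation, where the true line is known. The Python sketch below is a hypothetical illustration, not part of the seminar material: the true coefficients, the noise level, the seed and the function name `ols_fit` are all assumptions. Errors are measured from the true line and residuals from the estimated line.

```python
import random

def ols_fit(x, y):
    """Closed-form OLS for one regressor; returns (alpha_hat, beta_hat)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    return mean_y - beta_hat * mean_x, beta_hat

# Assumed "true" model, known only because the data are simulated.
ALPHA, BETA, SIGMA = 500.0, 0.2, 5.0
rng = random.Random(0)

x = [float(v) for v in range(0, 100, 10)]
errors = [rng.gauss(0.0, SIGMA) for _ in x]               # e: unobserved in real data
y = [ALPHA + BETA * xi + ei for xi, ei in zip(x, errors)]

alpha_hat, beta_hat = ols_fit(x, y)
residuals = [yi - alpha_hat - beta_hat * xi for xi, yi in zip(x, y)]  # u: observed

sum_sq_errors = sum(e ** 2 for e in errors)
sum_sq_residuals = sum(u ** 2 for u in residuals)

print(alpha_hat, beta_hat)
print(sum(residuals), sum_sq_residuals, sum_sq_errors)
```

With real data only the residuals can be computed. In this simulation the residuals sum to (numerically) zero and their sum of squares is no larger than that of the errors, because OLS picks the line with the smallest SSR for the sample at hand.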

Example: advertisement expenditures

Koop: Advert.xls file

• Correlation?


• Point diagram

• Regression line with Excel: Diagram/Trend line

OLS estimation

• Best-fitting line: the one that minimizes the sum of squared residuals

$$SSR = \sum_{i=1}^{N} u_i^2$$

• Ordinary least squares (OLS) estimation

• Excel: Data analysis/Regression – estimate and interpret the coefficients of the advertisement example

• Sensitivity of the coefficients to scaling (unit of measurement)?
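The scaling question can be explored with a small experiment. The sketch below is a hypothetical Python illustration with made-up numbers (not the Advert.xls data); the function name `ols_fit` and the variable names are assumptions. It re-estimates the same regression after changing the unit of X, and then the unit of Y, from thousand USD to USD.

```python
def ols_fit(x, y):
    """Closed-form OLS for one regressor; returns (alpha_hat, beta_hat)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    return mean_y - beta_hat * mean_x, beta_hat

# Made-up data, both variables in thousand USD.
ads_th = [10.0, 20.0, 30.0, 40.0, 50.0]
sales_th = [505.0, 508.0, 507.0, 513.0, 512.0]

# Same data expressed in USD.
ads_usd = [1000.0 * a for a in ads_th]
sales_usd = [1000.0 * s for s in sales_th]

alpha_base, beta_base = ols_fit(ads_th, sales_th)
alpha_x, beta_x = ols_fit(ads_usd, sales_th)   # only X rescaled
alpha_y, beta_y = ols_fit(ads_th, sales_usd)   # only Y rescaled

print(alpha_base, beta_base)
print(alpha_x, beta_x)   # intercept unchanged, slope divided by 1000
print(alpha_y, beta_y)   # intercept and slope both multiplied by 1000
```

The fitted line describes the same relationship in every case; only the numbers that express it change with the unit of measurement.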


Further examples

• KSH county level data: regression of unemployment rate on the number of registered enterprises

Y: unemployment, X: enterprises

• Forest.xls: effect of population growth on deforestation?

Homework 2 (individual)

Choose 3 variables from a database among which correlation is expected

• What kind of relationship is expected? Explain.

• Descriptive statistics (graphical + numerical)

• Analysis of correlations
