4. Relationships, causal models
4.1. Crosstabs analysis
Goals
This chapter introduces crosstabs analysis. The chapter has been learned successfully if the reader is able to do the following:
- examine the relationship between two categorical variables
- apply the chi-square test on paper and in SPSS, and interpret the results.
Definitions
Crosstabs analysis: a method for measuring the relationship between categorical variables
Expected frequencies: the hypothetical frequency of each class if the two variables are independent
Observed frequencies: the frequency of each class in the sample
Learning activities
To learn the concept, calculation and interpretation of crosstabs analysis:
1. Read Chapter 17.7 of the book (pages 666-679)!
2. Open and explore 4_1_Crosstabs analysis.ppt!
3. Explore and solve the sample tasks!
4. Check your knowledge: solve the chapter exercises in the book!
Sample tasks
1. In a public opinion survey, opinions about an economic TV program were examined. The following table summarizes the results:
Opinion about the TV program by profession
Profession | Good | Fair enough | Bad | Total
Economist | 100 | 200 | 100 | 400
Lawyer | 100 | 60 | 40 | 200
Other diploma | 100 | 60 | 40 | 200
Total | 300 | 320 | 180 | 800
A) Calculate the expected frequencies in the case of independence!
B) Examine the relationship between the profession and the opinion about the TV program (α=0.05)!
C) Calculate and interpret the coefficients measuring the strength of the relationship (if it makes sense to calculate them)!
2. Answer the following questions based on the bank.sav file!
A) Create a combination (contingency) table which contains the employees’ profession and gender!
B) Compute the employees’ distribution by gender!
C) Compute the expected frequencies in the case of independence!
D) Is there a relationship between profession and gender at the 5% significance level? If there is, calculate a coefficient measuring the strength of the relationship!
Sample tasks solutions
1. In a public opinion survey, opinions about an economic TV program were examined. The following table summarizes the results:
Opinion about the TV program by profession
Profession | Good | Fair enough | Bad | Total
Economist | 100 | 200 | 100 | 400
Lawyer | 100 | 60 | 40 | 200
Other diploma | 100 | 60 | 40 | 200
Total | 300 | 320 | 180 | 800
A) Calculate the expected frequencies in the case of independence!
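$f^*_{ij} = \frac{f_{i\cdot} \cdot f_{\cdot j}}{n}$, for example $f^*_{12} = \frac{400 \cdot 320}{800} = 160$
Expected frequencies in the case of independence:
Profession | Good | Fair enough | Bad | Total
Economist | 150 | 160 | 90 | 400
Lawyer | 75 | 80 | 45 | 200
Other diploma | 75 | 80 | 45 | 200
Total | 300 | 320 | 180 | 800
B) Examine the relationship between the profession and the opinion about the TV program (α=0.05)!
H0: There is no relationship between the profession and the opinion about the TV program / the profession and the opinion about the TV program are independent
H1: There is a significant relationship between the profession and the opinion about the TV program / the profession and the opinion about the TV program are dependent on each other
Condition for application: $f^*_{ij} > 5$ (met here: the smallest expected frequency is 45)
$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - f^*_{ij})^2}{f^*_{ij}} = 55.56$, with $(3-1)(3-1) = 4$ degrees of freedom; the critical value at α=0.05 is 9.49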
We reject H0 at the 5% significance level, so there is a significant relationship between the profession and the opinion about the TV program.
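The test decision can be checked numerically. The following is a minimal sketch in plain Python, not the method prescribed by the book: the variable names are illustrative, and the critical value 9.488 is the tabulated chi-square quantile for 4 degrees of freedom at α=0.05, entered by hand rather than computed.

```python
# Observed frequencies from the task table.
# Rows: Economist, Lawyer, Other diploma; columns: Good, Fair enough, Bad.
observed = [
    [100, 200, 100],
    [100, 60, 40],
    [100, 60, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 9.488  # assumption: value read from a chi-square table (df = 4, alpha = 0.05)

print("expected:", expected)
print("chi2 =", round(chi2, 2), "df =", df)
print("reject H0:", chi2 > critical_value)
```

Because the statistic is far above the critical value, the decision does not depend on rounding in the hand calculation.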
C) Calculate and interpret the coefficients measuring the strength of the relationship (if it makes sense to calculate them)!
Cramer's coefficient: $C = \sqrt{\frac{\chi^2}{n \cdot \min(r-1,\, c-1)}} = \sqrt{\frac{55.56}{800 \cdot 2}} = 0.186$
There is a weak relationship between the profession and the opinion about the TV program.
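As a cross-check of the strength measure, the sketch below computes Cramer's coefficient directly from the observed table. It assumes the usual definition, the square root of chi-square divided by n times the smaller of (rows - 1) and (columns - 1); the function names are hypothetical helpers, not part of SPSS or the book.

```python
import math


def chi_square(observed):
    """Pearson chi-square statistic of a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (o - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(observed, row_totals)
        for o, c in zip(row, col_totals)
    )


def cramers_coefficient(observed):
    """Cramer's coefficient: sqrt(chi2 / (n * min(rows - 1, cols - 1)))."""
    n = sum(sum(row) for row in observed)
    k = min(len(observed), len(observed[0])) - 1
    return math.sqrt(chi_square(observed) / (n * k))


# Rows: Economist, Lawyer, Other diploma; columns: Good, Fair enough, Bad.
tv_table = [[100, 200, 100], [100, 60, 40], [100, 60, 40]]
strength = cramers_coefficient(tv_table)
print(round(strength, 3))
```

The coefficient ranges from 0 (independence) to 1 (deterministic relationship), so a value below 0.2 is consistent with the "weak" interpretation.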
2. Answer the following questions based on the bank.sav file!
A) Create a combination (contingency) table which contains the employees’ profession and gender!
Profession * Gender Crosstabulation
Count
Profession | Gender: male | Gender: female | Total
employee with primary school qualification | 27 | 0 | 27
employee with secondary school qualification | 157 | 206 | 363
employee with tertiary school qualification | 64 | 9 | 73
head of division with secondary school qualification | 4 | 1 | 5
head of division with tertiary school qualification | 6 | 0 | 6
Total | 258 | 216 | 474
B) Compute the employees’ distribution by gender!
Profession * Gender Crosstabulation
Profession | Statistic | Gender: male | Gender: female | Total
employee with primary school qualification | Count | 27 | 0 | 27
 | % within Gender | 10,5% | 0,0% | 5,7%
employee with secondary school qualification | Count | 157 | 206 | 363
 | % within Gender | 60,9% | 95,4% | 76,6%
employee with tertiary school qualification | Count | 64 | 9 | 73
 | % within Gender | 24,8% | 4,2% | 15,4%
head of division with secondary school qualification | Count | 4 | 1 | 5
 | % within Gender | 1,6% | 0,5% | 1,1%
head of division with tertiary school qualification | Count | 6 | 0 | 6
 | % within Gender | 2,3% | 0,0% | 1,3%
Total | Count | 258 | 216 | 474
 | % within Gender | 100,0% | 100,0% | 100,0%
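The "% within Gender" rows are column percentages: each count divided by the total of its gender column. A minimal sketch in plain Python, using the counts from the crosstabulation (the variable names are illustrative only):

```python
# (profession, male count, female count) taken from the crosstabulation.
counts = [
    ("employee with primary school qualification", 27, 0),
    ("employee with secondary school qualification", 157, 206),
    ("employee with tertiary school qualification", 64, 9),
    ("head of division with secondary school qualification", 4, 1),
    ("head of division with tertiary school qualification", 6, 0),
]

male_total = sum(male for _, male, _ in counts)
female_total = sum(female for _, _, female in counts)
n = male_total + female_total

# Percentage of each profession within males, within females and overall.
pct = []
for name, male, female in counts:
    row = (
        100 * male / male_total,
        100 * female / female_total,
        100 * (male + female) / n,
    )
    pct.append(row)
    print(f"{name}: male {row[0]:.1f}%, female {row[1]:.1f}%, total {row[2]:.1f}%")
```

Comparing a within-gender percentage with the "total" percentage of the same row shows in which gender a profession is over- or under-represented.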
C) Compute the expected frequencies in the case of independence!
Profession * Gender Crosstabulation
Profession | Statistic | Gender: male | Gender: female | Total
employee with primary school qualification | Count | 27 | 0 | 27
 | Expected Count | 14,7 | 12,3 | 27,0
 | % within Gender | 10,5% | 0,0% | 5,7%
employee with secondary school qualification | Count | 157 | 206 | 363
 | Expected Count | 197,6 | 165,4 | 363,0
 | % within Gender | 60,9% | 95,4% | 76,6%
employee with tertiary school qualification | Count | 64 | 9 | 73
 | Expected Count | 39,7 | 33,3 | 73,0
 | % within Gender | 24,8% | 4,2% | 15,4%
head of division with secondary school qualification | Count | 4 | 1 | 5
 | Expected Count | 2,7 | 2,3 | 5,0
 | % within Gender | 1,6% | 0,5% | 1,1%
head of division with tertiary school qualification | Count | 6 | 0 | 6
 | Expected Count | 3,3 | 2,7 | 6,0
 | % within Gender | 2,3% | 0,0% | 1,3%
Total | Count | 258 | 216 | 474
 | Expected Count | 258,0 | 216,0 | 474,0
 | % within Gender | 100,0% | 100,0% | 100,0%
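The expected counts in the table follow from the same rule as in the first task (row total times column total, divided by n). The sketch below recomputes them and also counts the cells whose expected count is below 5, the threshold used as the application condition in this chapter. It is an illustration in plain Python with hypothetical variable names, not SPSS output.

```python
# Observed counts; rows are the five profession categories, columns are male, female.
observed = [[27, 0], [157, 206], [64, 9], [4, 1], [6, 0]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence for every cell.
expected = [[r * c / n for c in col_totals] for r in row_totals]
for row in expected:
    print([round(e, 1) for e in row])

# Check the application condition: how many cells have an expected count below 5?
cells = [e for row in expected for e in row]
small_cells = sum(e < 5 for e in cells)
share_small = 100 * small_cells / len(cells)
min_expected = min(cells)
print(f"{small_cells} cells ({share_small:.1f}%) below 5, minimum {min_expected:.2f}")
```

All cells below the threshold belong to the two head of division categories, which are the smallest rows of the table.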
D) Is there a relationship between profession and gender at the 5% significance level? If there is, calculate a coefficient measuring the strength of the relationship!
Chi-Square Tests
Statistic | Value | df | Asymptotic Significance (2-sided)
Pearson Chi-Square | 79,757 (a) | 4 | ,000
Likelihood Ratio | 97,263 | 4 | ,000
Linear-by-Linear Association | 12,786 | 1 | ,000
N of Valid Cases | 474 | |
a. 4 cells (40,0%) have expected count less than 5. The minimum expected count is 2,28.
Based on the note below the table, the categories need to be recoded: the two head of division categories are merged. After recoding:
Profession * Gender Crosstabulation
Profession | Statistic | Gender: male | Gender: female | Total
employee with primary school qualification | Count | 27 | 0 | 27
 | Expected Count | 14,7 | 12,3 | 27,0
 | % within Gender | 10,5% | 0,0% | 5,7%
employee with secondary school qualification | Count | 157 | 206 | 363
 | Expected Count | 197,6 | 165,4 | 363,0
 | % within Gender | 60,9% | 95,4% | 76,6%
employee with tertiary school qualification | Count | 64 | 9 | 73
 | Expected Count | 39,7 | 33,3 | 73,0
 | % within Gender | 24,8% | 4,2% | 15,4%
head of division | Count | 10 | 1 | 11
 | Expected Count | 6,0 | 5,0 | 11,0
 | % within Gender | 3,9% | 0,5% | 2,3%
Total | Count | 258 | 216 | 474
 | Expected Count | 258,0 | 216,0 | 474,0
 | % within Gender | 100,0% | 100,0% | 100,0%
Chi-Square Tests
Statistic | Value | df | Asymptotic Significance (2-sided)
Pearson Chi-Square | 79,318 (a) | 3 | ,000
Likelihood Ratio | 95,565 | 3 | ,000
Linear-by-Linear Association | 11,972 | 1 | ,001
N of Valid Cases | 474 | |
a. 0 cells (0,0%) have expected count less than 5. The minimum expected count is 5,01.
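The effect of the recoding can be reproduced outside SPSS. The sketch below merges the two head of division categories, rechecks the smallest expected count and recomputes the Pearson chi-square statistic. It is a plain Python illustration: the dictionary keys and variable names are chosen for this example, and the critical value 7.815 is the tabulated chi-square quantile for 3 degrees of freedom at α=0.05, entered by hand.

```python
# Observed counts (male, female) for the original five categories.
counts = {
    "employee with primary school qualification": [27, 0],
    "employee with secondary school qualification": [157, 206],
    "employee with tertiary school qualification": [64, 9],
    "head of division with secondary school qualification": [4, 1],
    "head of division with tertiary school qualification": [6, 0],
}

# Recode: merge both sparse "head of division" categories into one row.
recoded = {}
for label, row in counts.items():
    key = "head of division" if label.startswith("head of division") else label
    previous = recoded.get(key, [0, 0])
    recoded[key] = [a + b for a, b in zip(previous, row)]

table = list(recoded.values())
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

expected = [[r * c / n for c in col_totals] for r in row_totals]
min_expected = min(e for row in expected for e in row)

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(table) - 1) * (len(table[0]) - 1)
critical_value = 7.815  # assumption: value read from a chi-square table (df = 3, alpha = 0.05)

print("minimum expected count:", round(min_expected, 2))
print("chi2 =", round(chi2, 3), "df =", df, "reject H0:", chi2 > critical_value)
```

Merging costs one degree of freedom but barely changes the statistic, because the merged rows contributed little to it.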
Symmetric Measures
Measure | Value | Approximate Significance
Nominal by Nominal: Phi | ,409 | ,000
Nominal by Nominal: Cramer's V | ,409 | ,000
N of Valid Cases | 474 |
The null hypothesis of the test is that there is no relationship between profession and gender.
We examine the relationship between two categorical variables, so crosstabs analysis can be applied to answer this question.
Based on the note below the first Chi-Square Tests table, the application condition was not met in more than 20% of the cells (4 cells, 40.0%), so the categories were recoded. After recoding, the application condition is met (the minimum expected count is 5.01).
At the 5% significance level we reject the null hypothesis (Pearson chi-square sig. < 0.05), so there is a significant relationship between profession and gender. The relationship between the examined variables is of medium strength (Cramer's V: C=0.409).
Among male employees, 24.8% have a tertiary school qualification, compared with 4.2% of female employees. Among female employees, 95.4% have a secondary school qualification, compared with 60.9% of male employees.
Employees with a tertiary school qualification are therefore over-represented among males, and employees with a secondary school qualification among females.
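The strength measure quoted above can be recovered from the reported Pearson chi-square value. The short sketch below takes the chi-square value and the sample size from the SPSS output as given and assumes the usual definitions of Phi and Cramer's V; the variable names are illustrative.

```python
import math

chi2 = 79.318      # Pearson chi-square after recoding, taken from the SPSS output
n = 474            # number of valid cases
rows, cols = 4, 2  # table dimensions after recoding

# Phi: sqrt(chi2 / n); Cramer's V additionally divides by min(rows - 1, cols - 1).
phi = math.sqrt(chi2 / n)
cramers_v = math.sqrt(chi2 / (n * min(rows - 1, cols - 1)))

print(round(phi, 3), round(cramers_v, 3))
```

The two measures coincide here because one of the variables (gender) has only two categories, so min(rows - 1, cols - 1) equals 1.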