Syllabus
Course title Nonparametric Econometrics
Instructor Robert Lieli, Professor
Email lielir@ceu.edu
Office hours by appointment
Credits 2 US credits (4 ECTS credits)
Module
Term Spring 2023
Course level MA/Ph.D.
Prerequisites Mathematical Statistics, Econometrics 1-2, programming skills.
Advanced Econometrics 1-2 or Econometric Theory recommended.
Course drop per university/department policy
1. COURSE DESCRIPTION
Content
This econometrics field course gives a brief introduction to the statistical theory of nonparametric density and regression function estimation with cross-sectional (i.i.d.) data. The course also covers some basic ideas in machine learning. I also discuss several recent applications from the econometrics literature.
Relevance
We are in the midst of a data revolution. Familiarity with various statistical methods and the ability to analyze data have become critical skills, not only in academic research but also in a host of other career paths, including policy analysis and many private sector positions.
2. LEARNING OUTCOMES
Key outcomes. By successfully completing the course students will be able to:
• Estimate a probability density function nonparametrically using a kernel density estimator.
• Estimate a regression function nonparametrically using various methods (kernel, local linear, lasso).
• Understand the basic principles underpinning ‘machine learning’ methods and the lasso in particular.
• Understand the bias-variance tradeoff fundamental to all nonparametric methods and what ‘regularization’ means.
• Apply these nonparametric estimation methods in various settings.
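To make the first outcome concrete, here is a minimal Python sketch of a kernel density estimator. It is an illustration only, not course material: the function names `silverman_bandwidth` and `kernel_density`, the Gaussian kernel, the rule-of-thumb bandwidth, and the simulated sample are all assumptions chosen for the example.

```python
import numpy as np


def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    data = np.asarray(data, dtype=float)
    n = data.size
    sd = data.std(ddof=1)
    q75, q25 = np.percentile(data, [75, 25])
    spread = min(sd, (q75 - q25) / 1.34)
    return 0.9 * spread * n ** (-0.2)


def kernel_density(x_eval, data, h):
    """Gaussian kernel density estimate evaluated at the points x_eval."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    data = np.asarray(data, dtype=float)
    # u[i, j] = (x_eval[i] - data[j]) / h
    u = (x_eval[:, None] - data[None, :]) / h
    kernel_values = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    # Average the kernel over the sample and rescale by the bandwidth.
    return kernel_values.mean(axis=1) / h


# Simulated i.i.d. sample (hypothetical data for illustration).
rng = np.random.default_rng(0)
sample = rng.normal(size=500)

h = silverman_bandwidth(sample)
grid = np.linspace(-4.0, 4.0, 201)
f_hat = kernel_density(grid, sample, h)
```

A smaller bandwidth `h` makes the estimate wigglier (less bias, more variance), and a larger one makes it smoother (more bias, less variance); this is the bias-variance tradeoff named in the outcomes above.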
Other outcomes. The course will also help develop skills in the following areas.
Learning Area | Learning Outcome
Critical Thinking | Ability to assess the appropriateness of various statistical methods
Quantitative Reasoning | Assessing the properties of statistical estimators through bias/variance calculations
Technology Skills | Improved programming skills (Matlab)
Interpersonal Communication Skills | The ability to use and present data as evidence is a critical communication skill
Management Knowledge and Skills | Not applicable
Cultural Sensitivity and Diversity | Not applicable
Ethics and Social Responsibility | Not applicable
3. READING LIST
Recommended texts:
• [S] Silverman, B.W. (1986): Density Estimation for Statistics and Data Analysis. Chapman and Hall. This book is a classic. Obviously, it does not have the most cutting-edge results and applications, but it is still one of the best, easy-to-read, and intuitive introductions to kernel density estimation.
• [PU] Pagan, A. and A. Ullah (1999): Nonparametric Econometrics. Cambridge University Press. A very well-known text that covers many topics. Maybe a bit dated now in terms of applications and breadth of coverage.
• [Su] Su, L. (2011): Brief Introduction to Nonparametric Econometrics. Lecture notes, Singapore Management University. A more technical but also more modern treatment of many different topics by a former classmate. I will make this available as a reader.
• [ISL] James, G., D. Witten, T. Hastie and R. Tibshirani (2017): An Introduction to Statistical Learning. Springer. A very gentle introduction to modern nonparametric (aka ‘machine learning’) methods. The book has a big brother titled ‘The Elements of Statistical Learning’ by Hastie, Tibshirani and Friedman.
Articles:
These are required articles that I will discuss mostly during the last two weeks of the course.
• [article 1] Mullainathan, S. and J. Spiess (2017): “Machine Learning: An Applied Econometric Approach,” Journal of Economic Perspectives, 31, pp. 87-106.
• [article 2] Abrevaya, J., Y-C. Hsu and R.P. Lieli (2015): “Estimating Conditional Average Treatment Effects,” Journal of Business and Economic Statistics, 33, pp. 485-505.
• [article 3] Calonico, S., M.D. Cattaneo and R. Titiunik (2014): “Robust data-driven inference in the regression discontinuity design,” The Stata Journal, 14, pp. 909-946.
• [article 4] Belloni, A., V. Chernozhukov and C. Hansen (2014): “High-Dimensional Methods and Inference on Structural and Treatment Effects,” Journal of Economic Perspectives, 28, pp. 29-50.
There are many more recent and advanced treatments by these authors and others. I particularly recommend “Double/debiased machine learning for treatment and structural parameters” in the Econometrics Journal (2018).
4. TEACHING METHOD AND LEARNING ACTIVITIES
This is an econometrics field course designed to give a fairly high-level introduction to nonparametric methods. There are 12 lectures designed to deliver the theoretical part of the material. The lectures are accompanied by three problem sets. These form an integral part of the course; their role is to deepen students’ understanding of the theory as well as to illustrate how to apply the theory in addressing practical problems.
5. ASSESSMENT
Three problem sets, the last of which is a take-home final exam (33%, 33%, 34%). Grading will be based on the total score out of 100, in line with the official CEU grading guidelines.
However, I sometimes deviate from this scale in a direction that favors students in order to compensate for the difficulty of the course.
6. TECHNICAL REQUIREMENTS
I will post assignments, course materials and announcements on Moodle: https://ceulearning.ceu.edu. It is your responsibility to check this site regularly. The Zoom link for following the lectures online will also be posted on Moodle along with the recordings of the lectures.
The problems require programming skills in Matlab, Stata, R, Python, or some other suitable programming language. I'll provide simple pre-written routines in Matlab to help you with this part of the course (e.g., I have code that implements the kernel density estimator or does local linear regression). These will be directly useful for solving homework problems, but you can also learn general programming tricks from them. I won't provide assistance with any other software or programming language. The bottom line is that solid programming skills are required for the course.
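For orientation only, the sketch below shows in Python what a local linear regression routine of this kind might look like. It is not the Matlab code distributed in the course; the function name `local_linear`, the Gaussian kernel, the bandwidth value, and the simulated data are assumptions made for illustration.

```python
import numpy as np


def local_linear(x0, x, y, h):
    """Local linear estimate of E[y | x = x0] using a Gaussian kernel."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Kernel weights: observations close to x0 count more.
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    # Regressors: a constant and the deviation from x0.
    X = np.column_stack([np.ones_like(x), x - x0])
    XtW = X.T * w  # each column of X.T is scaled by its weight
    # Weighted least squares; the intercept is the fitted value at x0.
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    return beta[0]


# Simulated data (hypothetical): a smooth regression function plus noise.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=300)
y = np.sin(2.0 * np.pi * x) + 0.3 * rng.normal(size=300)

grid = np.linspace(0.05, 0.95, 19)
m_hat = np.array([local_linear(x0, x, y, h=0.08) for x0 in grid])
```

Setting the slope regressor aside (that is, regressing on the constant only) would give the Nadaraya-Watson kernel estimator instead; the local linear version is less affected by bias near the boundary of the data.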
7. TOPIC OUTLINE AND SCHEDULE
Note: the suggested readings may go beyond the lectures. Starred items are required.
Session | Topics | Readings
1 | Parametric vs. nonparametric statistical models. The histogram and the kernel density estimator. | S 1, 2.1-2.4, PU 2.2.1-2.2.3, Su 1.1.1-1.1.2
2 | Statistical properties of the kernel density estimator. Bias-variance tradeoff. Boundary bias. | S 3.1-3.3, PU 2.5, Su 1.1.3
3 | Further statistical properties. Bandwidth choice and kernel choice. The multivariate kernel density estimator. | S 3.4.1-3.4.2, 4.1-4.4, PU 2.4.2-2.4.3, Su 1.1.7
4 | Curse of dimensionality. Some direct statistical applications of kernel density estimation. | S 4.5, PU 2.8-2.9, Su 1.2.1-1.2.4, 1.3
5 | Nonparametric regression: kernel estimator (Nadaraya-Watson) and local linear regression estimator. Basic statistical properties. | PU 3.1-3.2, Su 1.4-1.5; PU 3.3.1-3.3.2, 3.4.1
6 | Cross validation (for bandwidth choice and in general). Some direct statistical applications of nonparametric regression. | ISL 5.1, S 3.4.3, Su 1.1.4, 1.4.5; PU 5.1-5.2, Su 1.7
7 | Global nonparametric methods and basic ideas in machine learning I. | ISL 2.1-2.2, article 1*
8 | Basic ideas in machine learning II. The lasso. | ISL 6.2, 6.4
9 | Application: estimating conditional average treatment effects. | article 2*
10 | Application: estimating regression discontinuity models. | article 3*
11 | Application: using the lasso for causal inference. | article 4*
12 | Continuing with applications; catching up if there is a delay. |
8. SHORT BIO OF THE INSTRUCTOR
Robert Lieli earned his Ph.D. in economics in 2004 at the University of California, San Diego. Before joining CEU full time, he worked at the University of Texas at Austin and the National Bank of Hungary. His research area is econometrics with particular emphasis on binary prediction, forecasting, and treatment effects. He has published in top economics journals such as the Journal of Econometrics, the Journal of Business and Economic Statistics and the Journal of the European Economic Association.