Omitted Variable Bias
ThoughtCo (reference below) defines omitted variable bias (or omitted variables bias) as “bias that appears in an estimate of a parameter if the regression run does not have the appropriate form and data for other parameters.”
“For example, many regressions that have wage or income as the dependent variable suffer from omitted variables bias because there is often no practical way to add in a worker’s innate ability or motivation as an explanatory variable.
“As a result, the estimated coefficients on variables such as education are likely to be biased because of the correlation between educational attainment and unobserved ability. If the correlation between education and unobserved ability is positive, omitted variables bias will occur in an upward direction. Conversely, if the correlation between an explanatory variable and an unobserved relevant variable is negative, omitted variables bias will occur in a downward direction.”
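The direction of the bias described above can be made explicit with the standard textbook decomposition. This is a sketch under stated assumptions rather than part of the quoted source: the symbols β2, δ0, δ1, u and v are introduced here for illustration, and ability is assumed to enter the true salary equation linearly.

```latex
\begin{align}
\text{Salary}_i &= \beta_0 + \beta_1\,\text{education}_i + \beta_2\,\text{ability}_i + u_i
  && \text{(assumed true model)} \\
\text{ability}_i &= \delta_0 + \delta_1\,\text{education}_i + v_i
  && \text{(auxiliary relationship)} \\
\text{Salary}_i &= (\beta_0 + \beta_2\delta_0) + (\beta_1 + \beta_2\delta_1)\,\text{education}_i + (u_i + \beta_2 v_i)
  && \text{(substituting)} \\
E[\hat{\beta}_1] &= \beta_1 + \beta_2\,\delta_1
  && \text{(education-only regression)}
\end{align}
```

Under these assumptions the bias is the product β2·δ1. The quoted rule (positive correlation gives upward bias, negative correlation gives downward bias) therefore holds when the omitted variable itself raises the dependent variable (β2 > 0), as ability is presumed to do for salary; if β2 were negative, the directions would reverse.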
This is elaborated in a handout from PAD705 at Rockefeller College (reference below):
“Omitted variable bias (OVB) is one of the most common and vexing problems in ordinary least squares regression. OVB occurs when a variable that is correlated with both the dependent and one or more included independent variables is omitted from a regression equation. Let’s think about salary and education; our regression equation is:
$\text{Salary}_i = \beta_0 + \beta_1\,\text{education}_i + \varepsilon_i$
“In this case, our included independent variable is “education.” However, salary is also likely to be related to innate ‘ability’, which has been excluded (possibly because there is no good way to measure it). Ability, we would expect, is also related to the amount of education a person chooses to get – those with greater ability seek more education.
“If we omit a variable that is correlated with both an included independent and the dependent variable … our regressions will “fit” less well. The “amount” that the error term gets larger depends on the relationship between the omitted variable and both the included independent variable and the dependent variable. Now, recall that this omitted variable and included independent variable are correlated with one another – as the omitted variable gets bigger, the included independent gets bigger, or if the omitted variable gets smaller then the included independent variable gets smaller (assuming positive correlation). … Hence, there will be correlation between the included independent variable and the error term, creating bias.
“In practical terms, the requirement that we include all variables that are correlated to both our independent variables and our dependent variable places a heavy burden on our data collection methods. If we wish to know about the relationship between salary and education, for instance, we must be sure to include all variables that could be correlated with both education and salary.”
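The mechanism in the handout can be illustrated with a small simulation. The following is a minimal sketch, not taken from either source: the function name `simulate_ovb`, the coefficient values and the data-generating process are all hypothetical choices made for illustration. Ability raises both education and salary, and salary is then regressed on education with and without ability.

```python
import numpy as np


def simulate_ovb(n=100_000, beta1=2.0, beta2=3.0, seed=0):
    """Return the education slope from the short and long regressions.

    Hypothetical data-generating process (an assumption for illustration):
      ability   ~ N(0, 1)
      education = ability + noise           (greater ability, more education)
      salary    = 1 + beta1*education + beta2*ability + noise
    """
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=n)
    education = ability + rng.normal(size=n)
    salary = 1.0 + beta1 * education + beta2 * ability + rng.normal(size=n)

    # Short regression: ability is omitted and ends up in the error term.
    x_short = np.column_stack([np.ones(n), education])
    short_coef, *_ = np.linalg.lstsq(x_short, salary, rcond=None)

    # Long regression: ability is included as a regressor.
    x_long = np.column_stack([np.ones(n), education, ability])
    long_coef, *_ = np.linalg.lstsq(x_long, salary, rcond=None)

    return short_coef[1], long_coef[1]


short_slope, long_slope = simulate_ovb()
print(f"education slope, ability omitted:  {short_slope:.3f}")
print(f"education slope, ability included: {long_slope:.3f}")
```

With these assumed values the slope from regressing ability on education is 0.5, so the education-only estimate should land near 2 + 3 × 0.5 = 3.5, while the regression that includes ability should recover a value near the true coefficient of 2. The gap between the two is the omitted variable bias.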
ThoughtCo, Defining Omitted Variables Bias, at https://www.thoughtco.com/defining-omitted-variables-bias-1146179, accessed 12 May 2018.
Rockefeller College, University at Albany, PAD705 Handout: Omitted Variable Bias, at https://www.albany.edu/faculty/kretheme/PAD705/SupportMat/OVB.pdf, accessed 12 May 2018.
Page created by: Alec Wreford and Ian Clark, last modified 12 May 2018.
Image: Bazyli, SlideServe at https://www.slideserve.com/bazyli/3-3-omitted-variable-bias, accessed 12 May 2018.