Instrumental Variable

… a core concept in Quantitative Methods and Atlas104

Concept description

Statistics How To (reference below) defines an instrumental variable (sometimes called an instrument variable) as “a third variable, Z, used in regression analysis when you have endogenous variables – variables that are influenced by other variables in the model” and notes:

“In other words, you use it to account for unexpected behavior between variables. Using an instrumental variable to identify the hidden (unobserved) correlation allows you to see the true correlation between the explanatory variable and response variable, Y.”

Statistics How To provides the example:

“Let’s say you had two correlated variables that you wanted to regress: X and Y. Their correlation might be described by a third variable Z, which is associated with X in some way. Z is also associated with Y but only through Y’s direct association with X. For example, let’s say you wanted to investigate the link between depression (X) and smoking (Y). Lack of job opportunities (Z) could lead to depression, but it is only associated with smoking through its association with depression (i.e. there isn’t a direct correlation between lack of job opportunities and smoking). This third variable, Z (lack of job opportunities), can generally be used as an instrumental variable if it can be measured and its behavior can be accounted for.”
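The logic of this example can be illustrated with a small simulation. The sketch below is not taken from Statistics How To. It assumes a hypothetical data-generating process in which an unobserved factor U drives both X and Y, while Z affects Y only through X. The coefficients (a true effect of 2.0 and a confounding strength of 3.0), the sample size and the function names are all illustrative assumptions.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Slope from an ordinary regression of y on x (ignores confounding)."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Simple instrumental-variable (ratio) estimate using one instrument z."""
    return covariance(z, y) / covariance(z, x)


def simulate(n=50_000, true_effect=2.0, seed=0):
    """Hypothetical data: U is unobserved and affects both X and Y."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)                      # instrument
        ui = rng.gauss(0, 1)                      # unobserved confounder
        xi = zi + ui + rng.gauss(0, 1)            # Z and U both move X
        yi = true_effect * xi + 3.0 * ui + rng.gauss(0, 1)  # Z enters only via X
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
ols_estimate = ols_slope(x, y)
iv_estimate = iv_slope(z, x, y)
print(f"OLS slope: {ols_estimate:.2f}")
print(f"IV slope:  {iv_estimate:.2f}")
```

Under these assumptions the ordinary slope settles near 3 rather than the true 2, because the part of X driven by U is credited to X. The instrument-based ratio uses only the variation in X that comes from Z, so it stays close to 2.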

The University of Manitoba’s Faculty of Health Sciences (reference below) has this helpful description of instrumental variables as they apply to the field of medicine:

“Instrumental variables (IVs) are used to control for confounding and measurement error in observational studies. They allow for the possibility of making causal inferences with observational data. Like propensity scores, IVs can adjust for both observed and unobserved confounding effects. Other methods of adjusting for confounding effects, which include stratification, matching and multiple regression methods, can only adjust for observed confounders. IVs have primarily been used in economics research, but have recently begun to appear in epidemiological studies. Observational studies are often implemented as a substitute for or complement to clinical trials, although clinical trials are the gold standard for making causal inference. The main concern with using observational data to make causal inferences is that an individual may be more likely to receive a treatment because that individual has one or more co-morbid conditions. The outcome may be influenced by the fact that some individuals received the treatment because of their personal or health characteristics.

“Let Z denote a randomization assignment indicator variable in this regression model, such that Z = 1 when a treatment is received and Z = 0 when the control or placebo is received, and let X1 be the treatment. Z is referred to as the instrumental variable because it satisfies the following conditions:

  1. Z has a causal effect on X
  2. Z affects the outcome variable Y only through X (Z does not have a direct influence on Y which is referred to as the exclusion restriction)
  3. There is no confounding for the effect of Z on Y.

“There are two main criteria for defining an IV:

  1. It causes variation in the treatment variable
  2. It does not have a direct effect on the outcome variable, only indirectly through the treatment variable.”
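The conditions above can be sketched as a two-stage least squares calculation. This is a minimal illustration under assumed values, not the handout's own procedure. Z is a simulated random assignment, X is the treatment actually received, and U is a hypothetical unobserved health factor that makes people both more likely to be treated and worse off on the outcome. The true effect of 1.5 and all other coefficients are assumptions chosen for the example.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(z, x, y):
    """Stage 1: regress X on Z. Stage 2: regress Y on the fitted X."""
    a1, b1 = simple_regression(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    _, effect = simple_regression(x_hat, y)
    return b1, effect


TRUE_EFFECT = 1.5  # assumed treatment effect for this illustration
rng = random.Random(1)
z, x, y = [], [], []
for _ in range(40_000):
    zi = rng.randint(0, 1)            # random assignment (the instrument)
    ui = rng.gauss(0, 1)              # unobserved confounder
    # Treatment is more likely when assigned and when U is high.
    xi = 1 if 2.0 * zi + 0.5 * ui + rng.gauss(0, 1) > 1.0 else 0
    # Z does not appear here: it affects Y only through X.
    yi = TRUE_EFFECT * xi - 1.0 * ui + rng.gauss(0, 1)
    z.append(zi)
    x.append(xi)
    y.append(yi)

first_stage_slope, iv_effect = two_stage_least_squares(z, x, y)
_, naive_effect = simple_regression(x, y)
print(f"First-stage slope (Z on X): {first_stage_slope:.2f}")
print(f"Naive regression of Y on X: {naive_effect:.2f}")
print(f"Two-stage estimate:         {iv_effect:.2f}")
```

The first-stage slope checks the first criterion (Z moves the treatment). The second criterion cannot be tested from the data in this sketch; it holds only because the simulated outcome was built without a direct Z term. Note that the second-stage regression gives the right point estimate, but its default standard errors would not be valid for inference.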

“A reliable implementation of an IV must satisfy these two criteria and utilize a sufficient sample size to allow for reasonable estimation of the treatment effect. If the first assumption is not satisfied, implying that the IV is associated with the outcome, then estimation of the IV effect may be biased. If the second assumption is not satisfied, implying that the IV does not affect the treatment variable then the random error will tend to have the same effect as the treatment. When selecting an IV, one must ensure that it only affects whether or not the treatment is received and is not associated with the outcome variable.

“Although IVs can control for confounding and measurement error in observational studies they have some limitations. We must be careful when dealing with many confounders and also if the correlation between the IV and the exposure variables is small. Both weak instruments and confounders produce large standard error which results in imprecise and biased results. Even when the two key assumptions are satisfied and the sample size is large, IVs cannot be used as a substitute for the use of clinical trials to make causal inference, although they are often useful in answering questions that an observational study cannot. In general, instrumental variables are most suitable for studies in which there are only moderate to small confounding effects. They are least useful when there are strong confounding effects.

“In economics, IVs are used to determine which factors influence demand without affecting cost, or the factors that influence cost without affecting demand. In the discipline of epidemiology, primarily with respect to natural experiments, IVs are used (1) to counteract issues with measurement error in explanatory variables which result from a lack of accurate information available for analysis and (2) to overcome the issue of omitted variables in order to make causal inference in observational studies when randomization is infeasible or unethical. The fewer the number of instruments incorporated into the model, the smaller the bias. If the number of instruments is equivalent to the number of treatment or endogenous variables then the bias is approximately zero.”
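The handout's caution about a small correlation between the IV and the exposure can be made concrete with a repeated simulation. The sketch below is an illustration under assumed values only. It draws many hypothetical samples of 500 observations and compares the spread of the single-instrument estimate when the instrument is strong (first-stage coefficient 1.0) and when it is weak (0.05). The interquartile range is used as the spread measure because weak-instrument estimates can take extreme values.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


def iv_estimate(n, strength, rng, true_effect=2.0):
    """One simulated sample; returns cov(Z, Y) / cov(Z, X)."""
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        ui = rng.gauss(0, 1)                           # unobserved confounder
        xi = strength * zi + ui + rng.gauss(0, 1)      # strength = first-stage effect
        yi = true_effect * xi + 3.0 * ui + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return covariance(z, y) / covariance(z, x)


def interquartile_range(values):
    """Approximate IQR from the sorted values (robust to extreme estimates)."""
    ordered = sorted(values)
    return ordered[(3 * len(ordered)) // 4] - ordered[len(ordered) // 4]


rng = random.Random(2)
strong = [iv_estimate(500, 1.0, rng) for _ in range(200)]
weak = [iv_estimate(500, 0.05, rng) for _ in range(200)]

strong_spread = interquartile_range(strong)
weak_spread = interquartile_range(weak)
print(f"IQR of estimates, strong instrument: {strong_spread:.2f}")
print(f"IQR of estimates, weak instrument:   {weak_spread:.2f}")
```

With the weak instrument the denominator cov(Z, X) is close to zero and dominated by sampling noise, so the estimates scatter far more widely around the assumed true effect of 2.0 than they do with the strong instrument.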


Statistics How To, Instrumental Variable – Definition & Overview, at, accessed 12 May 2018.

University of Manitoba Faculty of Health Sciences handout, An Introduction to Instrument Variables, at, accessed 12 May 2018.

Topic, subject and Atlas course

Research Design (core topic) in Quantitative Methods and Atlas104.

Page created by: Alec Wreford and Ian Clark, last modified 12 May 2018.

Image: Bazyli, SlideServe at, accessed 12 May 2018.