Instrumental variables: the key to analyzing "natural experiments"?

Submitted by drupaladmin on 22 February 2012.

Inferring causality is hard, especially in a world where lots of factors, some of them unknown, causally affect the response variable of interest (and each other), and where there are causal feedbacks (mutual causation) between variables. It's even harder when, for whatever reason, you can't do a properly controlled, replicated experiment. What do you do then?

One standard answer is to rely on what Jared Diamond (and probably others) has called "natural experiments". The basic idea is as follows. If you think that variation in variable A causes variation in variable B, compare the level of B across systems that vary in their level of A. So instead of manipulating A yourself, you're relying on the "manipulations" (variations) in the level of A that nature happens to provide.

Unfortunately, natural experiments are infamously unreliable, not just compared to "real" experiments but in an absolute sense. As my PhD supervisor Peter Morin liked to say, "The problem with natural experiments is that there's no such thing as a natural control." That is, systems that vary in their level of A often vary in lots of other ways as well, some of which probably also affect the level of B. You can of course try to address this by statistically controlling for the levels of those other variables, assuming you can identify them. And you can try to simply collect lots of data from a large range of systems in the hopes that surely some of the among-system variation in variable A will be independent of all confounding variables. And you can try to get rid of any causal feedbacks from B to A by praying to the god of your choice...

Or maybe there's a better way. Economists have to deal with all the same challenges in inferring causality that ecologists do. If anything, economists have it even worse because doing relevant experiments is often harder in economics than it is in ecology. In response, economists have come up with an interesting and potentially powerful approach to inferring causality from natural experiments, the method of "instrumental variables" (IV).

Here's the basic idea (for details, see the very good Wikipedia page on IV). An instrumental variable, call it X, is a variable that causally affects B only via its effect on A, and that is not itself causally affected (directly or indirectly) by B or A, or by unmeasured factors that also affect B. Economists summarize the latter assumption by saying that X is "exogenous". So you can estimate the causal effect of A on B by using, not just any natural variation in A, but only that natural variation in A that can be attributed to natural variation in X. Changes in X are perturbations that propagate to B via only one causal path, that running from A to B, so variation in the instrumental variable X allows you to estimate the strength of that causal path. The approach can be generalized to multiple causal paths, as long as you have multiple instrumental variables.
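To make that logic concrete, here's a minimal simulation sketch. Everything in it is made up for illustration: the linear causal structure, the coefficients, the unobserved confounder U, and the function names are all assumptions, not anything from a real dataset. It compares the naive regression slope of B on A with the simplest IV estimate, the ratio cov(X, B) / cov(X, A).

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def simulate(n, true_effect=2.0, seed=0):
    """Hypothetical system: X -> A -> B, with unobserved U affecting both A and B."""
    rng = random.Random(seed)
    X, A, B = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)              # instrument: affects B only through A
        u = rng.gauss(0, 1)              # unobserved confounder (assumed)
        a = x + u + rng.gauss(0, 1)      # A is driven by both X and U
        b = true_effect * a + 3.0 * u + rng.gauss(0, 1)
        X.append(x)
        A.append(a)
        B.append(b)
    return X, A, B

def naive_slope(A, B):
    """Ordinary regression slope of B on A; confounded by U."""
    return cov(A, B) / cov(A, A)

def iv_slope(X, A, B):
    """Simple IV estimate: uses only the variation in A attributable to X."""
    return cov(X, B) / cov(X, A)

X, A, B = simulate(20000)
print("true effect:", 2.0)
print("naive slope:", round(naive_slope(A, B), 2))
print("IV estimate:", round(iv_slope(X, A, B), 2))
```

In this toy system the naive slope should land well above the true effect of 2.0, because U pushes A and B in the same direction, while the IV estimate should land close to 2.0. That only works because X was built to satisfy the assumptions; nothing in the calculation itself checks them.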

One thing I find interesting about IV is that it highlights how "more data" is not always helpful. Tempting as it is to think that, if only you had enough data on A from enough different systems, you could reliably infer the causal effect of A on B, it's not true. What you need is not more data on the variability of A but the right sort of data on the variability of A (namely, that generated by an instrumental variable). Indeed, more of the wrong sort of data on variability in A can actually be harmful to inferring the effect of A on B.
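A quick way to see why sample size alone doesn't rescue you is to rerun a confounded toy system at increasing sample sizes. The sketch below is hypothetical throughout (the linear structure, the coefficients, the confounder U, and the names are all assumptions). The true effect of A on B is set to 2.0, but with these particular coefficients the naive slope has a large-sample limit of 3.0, so more data just gives a more precise estimate of the wrong number.

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def naive_slope_at(n, true_effect=2.0, seed=0):
    """Regression slope of B on A in a hypothetical confounded system of size n."""
    rng = random.Random(seed)
    A, B = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)              # exogenous source of variation in A
        u = rng.gauss(0, 1)              # unobserved confounder (assumed)
        a = x + u + rng.gauss(0, 1)
        b = true_effect * a + 3.0 * u + rng.gauss(0, 1)
        A.append(a)
        B.append(b)
    return cov(A, B) / cov(A, A)

# The estimate settles down as n grows, but not on the true effect of 2.0.
estimates = {n: naive_slope_at(n) for n in (100, 1000, 10000, 100000)}
for n, slope in estimates.items():
    print(n, round(slope, 2))
```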

The nice thing about the IV method is that it doesn't require you to know anything about the rest of the system, such as other variables that might affect B while also covarying with A. All you have to know (and this is the hard part) is that X is what economists call a "good instrument"--that it satisfies the assumptions that make it an instrumental variable.

Which may limit the applicability of IV in ecology. In economics, IV are often policy changes. For instance, an increase in cigarette taxes should affect health only via its effect on how much people smoke. So you can use changes in cigarette taxes to estimate the effect of smoking on health, thereby getting around the fact that lots of factors may affect both health and smoking, and that people's health may affect their inclination to smoke. Weather events like droughts also tend to make good instruments in economics.

I'm unsure whether ecologists will often have good instruments available to them. Weather is exogenous to ecological systems as well as to economic systems. But the problem is that weather changes typically affect any variable of interest via multiple causal pathways. And many policy changes certainly have ecological as well as economic effects. But many policy changes affecting ecological variables aren't exogenous--they're made in response to observed changes in the very variable that the policy is intended to affect. So if ecologists want to use policy changes as instrumental variables, they may want to focus on policies with unintended ecological consequences. And even there you still might have the problem of unintended consequences propagated via multiple causal paths. But we won't know if IV can be useful in ecology if we don't try them out.
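The multiple-pathways worry can be illustrated with another toy simulation. As before, this is only a sketch under invented assumptions: a linear system, made-up coefficients, and a hypothetical `direct_effect` parameter standing in for a second causal path from the would-be instrument (say, weather) to B that bypasses A.

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def iv_estimate(n, direct_effect, true_effect=2.0, seed=1):
    """IV estimate when X may also affect B directly (hypothetical system)."""
    rng = random.Random(seed)
    X, A, B = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)              # candidate instrument, e.g. weather
        u = rng.gauss(0, 1)              # unobserved confounder (assumed)
        a = x + u + rng.gauss(0, 1)
        # direct_effect * x is the second pathway, from X to B without passing through A
        b = true_effect * a + direct_effect * x + 3.0 * u + rng.gauss(0, 1)
        X.append(x)
        A.append(a)
        B.append(b)
    return cov(X, B) / cov(X, A)

clean = iv_estimate(20000, direct_effect=0.0)   # X reaches B only via A
leaky = iv_estimate(20000, direct_effect=1.0)   # X also reaches B directly
print("true effect:", 2.0)
print("IV estimate, single pathway:", round(clean, 2))
print("IV estimate, extra pathway: ", round(leaky, 2))
```

With a single pathway the IV estimate should sit near the true effect of 2.0. With the extra pathway switched on, the same calculation attributes the direct effect of X to A, so in this setup the estimate drifts toward 3.0.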

And if you do try out IV and get them to work, I hope you'll submit the paper to Oikos. ;-)

