
Saturday 19 April 2014

RCTs: Really Crap Things or Really Cool Things?

Credit: http://cplanicka.blogspot.com.es/
I’ve spent the last couple of months enrolled on two MOOCs (massive open online courses), ‘The Challenges of Global Poverty’ and ‘Evaluating Social Programs’, both effectively run by J-PAL, a research centre at MIT. I know what you’re thinking: “Woah there Charlie, this year abroad’s taking you off the RAILS”. Well screw you. Anyway, this has meant three things: 1) I’ve been exposed to a LOT of chat about Randomised Controlled Trials (RCTs); 2) I find the meme on the right much funnier than it actually is; and 3) I kind of wish I were an economist.

I want to talk about the first of those three things. Specifically: what are the downsides of RCTs?

First, what they are (and I’m going to try to do this in my own layman’s words): a randomised controlled trial is a particular method of impact evaluation for an intervention (e.g. in development). It gets together a load of people, randomly assigns half of the group (the treatment group) to receive the particular intervention (e.g. access to microfinance) and half (the control group) not to receive it, then at the end of the program measures the difference between the treatment and control groups on an appropriate indicator (e.g. income) to get the ‘impact’ of the intervention. Basically, because the two groups were randomly chosen, they start out statistically identical on average, so any difference in outcomes between them at the end must be caused by the intervention. Phew.
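If that’s still abstract, here’s a minimal Python sketch of the difference-in-means logic (my own toy illustration, nothing to do with J-PAL’s actual code; the sample size, ‘income’ numbers and effect size are all invented):

```python
# Toy RCT simulation: invented numbers, purely to illustrate the logic.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000                                # hypothetical sample size

# Random assignment: a 50/50 coin flip per person, so the two groups
# are statistically identical (on average) before the intervention.
treated = rng.random(n) < 0.5

true_effect = 5.0                         # assumed effect of the intervention
baseline = rng.normal(100, 15, n)         # everyone's pre-programme income
noise = rng.normal(0, 10, n)              # everything else that moves incomes
endline = baseline + treated * true_effect + noise

# Because assignment was random, the difference in group means at the
# end is an unbiased estimate of the intervention's impact.
impact = endline[treated].mean() - endline[~treated].mean()
print(f"estimated impact: {impact:.2f} (true effect: {true_effect})")
```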

They’re argued to be pretty good at demonstrating the effectiveness (or not) of programs because they allow you to get to the heart of causality, getting us closer to the ‘impact seeker’s Nirvana’.

Now, after over two months of Esther Duflo and her ‘randomista’ (proponents of randomisation) babes enlightening me about all the cool things they’ve found out using their economist ‘gold standard’ tricks (like that if you bribe mothers with dal, they’re more likely to get their kids vaccinated; or that providing teenage girls with school uniforms can reduce rates of teen pregnancy more than sex-ed programs), I was ready to read some damning critique. But although RCTs do take a lot of flak, many of the arguments against them seem weirdly bad. There’s a bit of a non-debate going on in many quarters.

1. Here’s one: it’s wrong to play God and experiment on people.
Reply: this one’s crap. As Howard White says, “all new policies are essentially experiments”. Now if we knew all the answers in development, then doing RCTs would be, at worst, expensive and useless. We don’t, so trialling and evaluating programmes as best we can to find out what works seems like the opposite of wrong.

2. And another, slightly better: RCTs are immoral because you’re rolling a dice to see who gets the bednet and who doesn’t.
Reply: you always have to choose who receives the intervention and who doesn’t; doing it explicitly and by chance is better than doing it implicitly and by virtue of the village’s distance from the nearest four-star hotel. RCTs never lead to “fewer people getting a service than they would if we hadn’t been working on the evaluation”, as Rachel Glennerster, a randomista, puts it. Also, when you do an evaluation you don’t know whether the intervention works, so you might be better off not receiving it. Which is a bit of a backhanded benefit, but anyway.

Wait, so if the arguments against are rubbish, is this the answer for finding out what works in development? Hah no, ’cause development is BORING, as thinkers ranging from Francisco Toro to my younger brother have argued (with slightly different meanings), and so there’s never one exciting answer, always lots of prosaic partial answers. As Lant Pritchett expresses it, “RCTs are one hammer in the development toolkit and previously protruding nails were ignored for lack of a hammer, but not every development problem is a nail”. Some more thorny problems:

3. External validity: you can’t generalise from your RCT because context matters. Owen Barder: “[in] the obvious example of the de-worming program, it clearly makes sense in communities that suffer from that kind of worm. But you clearly couldn’t generalize to communities where there aren’t worms”. ‘Solutions are slippery’ and you have to acknowledge the ‘hyper-dimensionality of the design space’. Fuck. Sounds difficult and is. In fact, so difficult that they do RCTs to see if the RCTs can be generalised.

4. Too much ‘what’, not enough ‘why’: learning the impact of an intervention doesn’t tell you why it works (or doesn’t). Also true.

5. Ironically, there is little rigorous evidence to suggest that rigorous evidence is used in policy. Lant Pritchett, chief development troller, describes RCT-use as a ‘faith-based activity’. Meanwhile Philipp Krause makes the obvious but valid point that “today’s rich countries didn’t get rich by using evidence systematically”.

6. Leaving out the big questions: you can’t run an RCT on whether aid works. You can’t use an RCT to determine what stimulates economic growth. You can’t use an RCT to test the effects of fixed vs. flexible exchange rates. Randomistas would accept this, and perhaps suggest that big ‘macro’ questions are made up of little ‘micro’ questions which they can answer. Debatable. But even WORSE, RCTs can’t answer many ‘micro’ questions either. Ben Ramalingam argues that many of the thorniest problems facing the world today are problems of ‘organised complexity’. With these, the randomista assumptions (that the intervention is the only difference between the treatment and control groups, that cause and effect are linear, etc.) just don’t hold, as the sketch below illustrates.
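To make that concrete, here’s a toy extension of the earlier simulation (again mine, with invented numbers): if the intervention ‘spills over’ onto the control group (say, deworming half a village leaves fewer worms around for everyone), then the simple treatment-minus-control comparison no longer recovers the true effect:

```python
# Toy spillover example: invented numbers, purely to illustrate the point.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
treated = rng.random(n) < 0.5

true_effect = 5.0   # direct benefit to the treated (invented)
spillover = 3.0     # indirect benefit leaking to the controls (invented)

# Controls are no longer untouched: the 'only difference is the
# intervention' assumption breaks down.
outcome = rng.normal(100, 10, n) + treated * true_effect + (~treated) * spillover

naive = outcome[treated].mean() - outcome[~treated].mean()
print(f"naive estimate: {naive:.2f} vs true direct effect: {true_effect}")
# Prints roughly 2.0: the spillover swallows most of the measured 'impact'.
```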

Basically, RCTs are good – so definitely read Poor Economics – but not that good. According to the hype cycle, they are currently sunning themselves at the Peak of Inflated Expectations, soon to tumble down into the Trough of Disillusionment. Development is boring and there are no quick answers. Hooray!

The hype cycle, credit Gartner
DISCLAIMER: Having said all this, the above issues are slightly more nuanced than a short, deliberately flippant blogpost allows. Here are some extra resources (pick a couple) in the unlikely event you want to find out more.

A nice summary piece on RCTs in Slate
An in-depth document explaining RCTs by International Initiative for Impact Evaluation 'Big Cheese' Howard White
Some criticism of RCTs in the New York Times and a response from Jessica Goldberg
A Development Drums podcast on ‘Randomized Evaluations’
Some great blogposts questioning the relevance and impact of RCTs/raining on the parade
A more detailed look at the ethics of RCTs on the World Bank blog. Plus responses #1 #2 and #3
