Randomised controlled trials (RCTs) are the best way of determining whether a policy is working. They are now used extensively in international development, medicine, and business to identify which policy, drug or sales method is most effective. They are also at the heart of the Behavioural Insights Team's methodology. However, RCTs are not routinely used to test the effectiveness of public policy interventions in the UK. We think that they should be.
(HT: Tim Harford's twitter feed)
15 June 2025
A quantitative history of RCTs
08 May 2025
OMG Millennium Villages Increase Poverty ROFL!!
For 14 of 18 outcomes, changes occurred in the predicted direction. No significant differences were recorded when comparing poverty ...
So, mention the direction of the effect when it is the direction you want (but statistically insignificant from zero), and neglect to mention the direction of the effect when it is the direct opposite of what you want (but also insignificant).
Now THAT, folks, is science. (Here's the Lancet link, HT: Maham).
12 April 2025
The Impact of IPA
and the IPA website;
30 March 2025
Yawn.... more RCT debates
Martin's biggest score is the "where the hell is China?" line. Some of the other criticisms are a bit weaker.
Another likely bias in the learning process is that J-PAL’s researchers have evidently worked far more with nongovernmental organizations (NGOs) than governments.
Which is a bit of a cheap shot, and a bit inaccurate. Researchers have worked with whoever will let them experiment, which yes initially was NGOs but is increasingly governments - see Peru's Quipu commission, Chile's Compass commission, the teaching assistant initiative in Ghana, working with the planning Ministry in South Africa, experimenting with police service reform in Rajasthan, even Britain's Behavioural Insights Team.
Then
how confident can we really be that poor people all over the world will radically change their health-seeking behaviors with a modest subsidy, based on an experiment in one town in Rajasthan, which establishes that lower prices for vaccination result in higher demand?
Ummmm... well that's why J-PAL's policy recommendation for health pricing is based on 6 different studies....

Mark scores his biggest hit in the final footnote on the last page of his article;
Also absent is a discussion of the standard but major problem in the implementation of any programs or transfers targeted to the poor and that do not really spur development—moral hazard.
"Moral hazard" works at both the individual and national government level. If you get aid, you are probably less likely to work hard. The critical question is the magnitude of this effect. I think that on balance the positive value of effective aid outweighs the moral hazard, but that is more of a feeling than an evidence-based proposition. This is also one of the key points made by aid critics Bauer/Easterly/Moyo. Not necessarily that aid doesn't work, as Banerjee/Duflo would like to present their argument, but that even if aid does work, the negative moral hazard effect might outweigh the positive. I haven't seen this argument really addressed at all.
The other serious and neglected criticism for me is on general equilibrium, raised by Daron Acemoglu in the Journal of Economic Perspectives. What if you measure a positive impact of a program on earnings, but those are coming at the expense of others? A training program that increases earnings might just be equipping some individuals to out-compete others in the market, rather than necessarily increasing aggregate productivity, in which case scaling the program ain't gonna work.
So maybe I've missed them - but has anyone seen a convincing rebuttal to the moral hazard and general equilibrium critiques of micro aid project impact evaluation?
-----
Update: A couple of things I missed in my haste - Abhi points out that Rosenzweig makes good points on the sometimes tiny effect sizes lauded in Poor Economics (e.g. where "15% increase" translates to something like 2 weeks schooling or 50 cents), and that RCTs can focus our attention away from the big (important?) questions, but I felt this criticism is pretty well rehearsed.
Update 2: Also Ravallion loses points for his cliched title: "Fighting Poverty One Experiment at a Time". "x one y at a time" is a boring, tired, tired, catchphrase.
Update 3: Ravallion gains points for coining "regressionistas."
07 March 2025
The RCT Bubble
For example, while there has been a substantial growth in impact evaluations of the World Bank development projects, only 8.8% of World Bank investment loans in 2009/10 had an impact evaluation. In 1999/00 the proportion was 2.4%. ----- Martin Ravallion, World Bank Research Director
13 February 2025
Why don't microenterprises grow?
If you start trying to value the opportunity cost of that labor and you calculate it at some sort of market wage rate quickly you’ll find that many of these businesses look unprofitable ...
why it is that microfinance can get a woman to run somewhat profitable businesses with chickens and things but they never get those businesses to grow into something greater. And the whole thing is that the women in these Asian countries have no other options. When their time value is 0 they can do this but as soon as they have to hire somebody at market wage it becomes unprofitable to expand.
if you look at the Banerjee and Duflo Spandana paper, for instance, there’s a huge amount of noise in their profits data. I did some calculations and I think it came out that they would need 2 million people to find an increase of 10% in profits given the take-up of microfinance and the noise in profit measurements
Finally, I am really looking forward to Tim's new book, Experimental Conversations, based on interviews with a variety of economists conducting field experiments on poverty interventions. One of the best books I ever read on macroeconomics was a book of interviews. There is a tremendous amount that goes unwritten in journal articles. Blogs have increased access to this kind of informal chat, but there is probably still an undersupply of good ideas communicated well. In Tyler Cowen's words;
why do not more economists blog? I believe it is because they can’t, at least not without embarrassing themselves rather quickly, even if they are smart and very good economists. It’s simply a different set of skills.
As Freakonomics demonstrated ably (at least the original book), researcher-journalist collaborations can be a decent way of filling the gap. David McKenzie is one of my favourite economists, doing lots of fascinating research, and he blogs. And yet I hadn't heard either of these two fairly substantial points before.
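The sample-size point in the quote above can be made concrete with a rough sketch of the textbook two-arm power calculation, where low take-up dilutes the effect you can actually measure. The function name and every input below are illustrative assumptions, not figures from the Spandana paper, and the sketch does not try to reproduce the 2 million estimate.

```python
import math
from statistics import NormalDist


def required_sample_size(effect_on_takers, sd, takeup_gap, alpha=0.05, power=0.8):
    """Total sample size (two equal arms) for a two-sided test of a difference in means.

    effect_on_takers: effect on those who actually take up the loan (hypothetical input)
    sd: standard deviation of the outcome, in the same units as the effect
    takeup_gap: difference in take-up rates between treatment and control arms
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    # Only takers are affected, so the average effect across the arm is diluted.
    diluted_effect = effect_on_takers * takeup_gap
    per_arm = 2 * ((z_alpha + z_power) * sd / diluted_effect) ** 2
    return 2 * math.ceil(per_arm)


# Made-up inputs: a 10% profit gain, with noise equal to mean profits (sd = 1.0).
full_takeup = required_sample_size(0.10, 1.0, takeup_gap=1.0)
low_takeup = required_sample_size(0.10, 1.0, takeup_gap=0.1)
print(full_takeup, low_takeup)
```

Because the required sample scales with one over the square of the take-up gap, cutting the gap from 100% to 10% multiplies the sample needed roughly 100-fold, and noisier profit data (a larger `sd`) pushes it up further still.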
09 February 2025
Evaluating TOMS shoes, child sponsorship, cow donations, and fair trade coffee
James Choi quotes him writing in Christianity Today on Fair Trade Coffee:
Fair-trade coffee isn't a scam, but it is hard to find a development program that has attracted so much attention while having so little real impact. The most recent rigorous academic study, carried out by a group of researchers at the University of California, finds zero average impact on coffee grower incomes over 13 years of participation in a fair-trade coffee network.
So I looked up his page and found this on child sponsorship:
Although international child sponsorship may be the most widespread form of personal contact between households in wealthy countries with the poor in developing countries, to date there are no published studies that have analyzed whether the beneficiaries of these programs have experienced changes in their life outcomes.
We find large and statistically significant impacts of the [Compassion International] child sponsorship program on most of our outcome variables.
And then there are rigorous impact evaluations to come on Heifer International and TOMS shoes - exciting stuff!
03 January 2025
When Cash isn't best
We randomly gave cash and in-kind grants to male- and female-owned microenterprises in urban Ghana. Our findings cast doubt on the ability of capital alone to stimulate the growth of female microenterprises. First, while the average treatment effects of the in-kind grants are large and positive for both males and females, the gain in profits is almost zero for women with initial profits below the median, suggesting that capital alone is not enough to grow subsistence enterprises owned by women. Second, for women we strongly reject equality of the cash and in-kind grants; only in-kind grants lead to growth in business profits. The results for men also suggest a lower impact of cash, but differences between cash and in-kind grants are less robust.
Marcel Fafchamps, David McKenzie, Simon Quinn, Christopher Woodruff
29 December 2024
So, You Want to Be a Scientist?
From the website, here is Brian Cox on what makes a good experiment;
"experiments are the most important part of science"
01 September 2024
01 August 2025
Bad Teacher
Matt Damon just gave a moving speech to a teachers' rally in DC:
My teachers were EMPOWERED to teach me. Their time wasn’t taken up with a bunch of test prep — this silly drill and kill nonsense that any serious person knows doesn’t promote real learning. No, my teachers were free to approach me and every other kid in that classroom like an individual puzzle. They took so much care in figuring out who we were and how to best make the lessons resonate with each of us. They were empowered to unlock our potential. They were allowed to be teachers.
All of which I have great sympathy for. Like Matt, I was raised by a teacher, and the best ones I had in school made an incredible difference. And I am glad to have attended the local state school. But for every great teacher, there was the bad one who managed to put me off an entire subject.
And then there is the evidence, reported in the Guardian, that:
Black children are being systematically marked down by their teachers who are unconsciously stereotyping them, it has been revealed.
Academics looked at the marks given to thousands of children at age 11. They compared their results in Sats, nationally set tests marked remotely, with the assessments made by teachers in the classroom and in internal tests. The findings suggest that low expectations are damaging children's prospects.
Clearly the correct balance to be struck between freedom for teachers and external accountability is a tough one. We don't know how to apply the correct incentives. Which is why the answer can only be testing - testing the tests - which brings me to New York Mayor Bloomberg's use of RCTs to test out 2 promising education initiatives in the city - conditional cash transfers and cash incentives for teachers. The programs cost $50 million each. Neither worked, and so both schemes were promptly scrapped. Bloomberg deserves a medal (the Tim Harford award for experiments in social policy?). He just saved $50 million a year (or $100 million, if both schemes were to be run concurrently), on a program that makes plenty of intuitive sense, but that without proper testing could have gone on for years, with nobody knowing that it wasn't having any effect at all. How many other pieces of the education system are doing nothing?
14 April 2025
More Than Good Intentions
I'm about 95% certain that I would be able to tell you I love the book even if I wasn't being paid to promote it. It's like Freakonomics only about global development.
If everyone would just read this book then I would probably be out of a job because you would all be totally convinced of the need for smart evidence-based aid and know all about the fantastic research that IPA is involved with. And I still want you to read it.
So go on, make me unemployed, I dare you.
You can read Chapter 1 here.
09 April 2025
Hipsters without Borders
01 December 2024
The Lottery of Life
Save the Children have what I think is a fantastic new ad campaign highlighting the importance of luck in determining life chances. Being born in the UK almost automatically guarantees you a position as one of the richest 15% of people on the planet (that is at the basic rate of unemployment benefit for 18 year olds, excluding additional benefits).
the policy-induced portion of the place premium in wages represents one of the largest remaining price distortions in any global market; is much larger than wage discrimination in spatially integrated markets; and makes labor mobility capable of reducing households’ poverty at the margin by much more than any known in situ intervention (Clemens, Montenegro and Pritchett).
People worry about the ethical implications of randomly allocating treatments in small research projects. Yet when people are randomly born in hopeless economies with tyrannical rulers, we do everything we can to prevent them escaping.
Spin the wheel for yourself and see where you could have ended up.
HT: @viewfromthecave @laurenist
24 September 2024
22 September 2024
Transparent Impacts?
New post up on the IPA blog.
21 August 2025
The future of development economics is random
This was first posted on the IPA blog.
Chris Blattman notes that this Summer’s edition of the Journal of Economic Perspectives is focused on development economics. What he doesn’t note is that the articles are heavily focused upon the role of randomized controlled trials within development economics, taking perspectives that are both positive and constructively critical.
Banerjee and Duflo make the case that it is advances in empirical testing that have revolutionized the entire field.
After a period of relative marginalization, development economics has now reemerged into the mainstream of most economics departments, attracting some of the brightest talents in the field … We believe that one of the reasons for the field’s vitality is the opportunity it offers to integrate theoretical thinking and empirical testing, and the rich dialogue that can potentially take place between the two … In the last few years, field experiments have emerged as an attractive new tool in this effort to elaborate our understanding of economic issues relevant to poor countries and poor people … Much of this paper illustrates the power of this interplay between experimental and theoretical thinking.
Angus Deaton, one of the elder statesmen of micro-econometrics, and randomista-critic, argues that experimental and quasi-experimental methods answer the what question but not the how or the why.
Instrumental variables and randomized trials can play a role in uncovering the mechanisms of development. Randomized trials have a powerful ability to isolate one mechanism from another; in particular, an experiment will often allow us to short circuit the often difficult process of developing theoretical mechanisms to the point where they can be convincingly tested on nonexperimental data. At the same time, the routine use of instrumental variable methods and of randomized controlled trials for project evaluation is often uninformative about why the results are what they are, and in such cases, nothing is learned about mechanisms that can be applied elsewhere.
Daron Acemoglu raises an important concern for scale-up: a project tested on a small scale may have quite different effects at a larger scale. He advocates the careful use of economic theory to help alleviate these concerns.
General equilibrium and political economy issues often create challenges for this type of external validity…General equilibrium and political economy considerations are important because partial equilibrium estimates that ignore responses from both sources will not give the appropriate answer to counterfactual exercises.
How do we convince others and ourselves that our estimates have external validity and can be used for policy analysis or for testing theories? This is where economic theory becomes particularly useful.
And finally Dani Rodrik makes the case for his particular brand of theory, the diagnostic approach, as a tool to be used in conjunction with randomized experiments to help overcome the problem of external validity and decide which interventions are likely to be most powerful in which contexts.
Ideally, diagnostics and randomized experiments should be complementary; in particular, diagnostics should guide the choice of which random experiments are worth undertaking. Any developmental failure has hundreds of potential causes. If the intervention that is evaluated is not a candidate for remedying the most important of these causes, it does not pass a simple test of relevance. Yet the tools of diagnostics remain surprisingly underresearched.
07 August 2025
Reintegrating rebels
The results found that while the economic assistance targeted at these individuals did improve their economic position, it did little to affect their political and social reintegration.
I’m looking forward to seeing the results of Chris Blattman’s project with IPA doing a full-on randomisation of a reintegration project in Liberia.
HT: TH
06 August 2025
The Impact of Evaluation
Note: This was first posted on the IPA blog.
Alanna Shaikh started a bit of a debate last week on the limitations of impact evaluations. She cites Andrew Natsios (a former USAID administrator):
USAID has begun to favor health programs over democracy strengthening or governance programs because health programs can be more easily measured for impact. Rule of law efforts, on the other hand, are vital to development but hard to measure and therefore get less funding.
Lots of things are vital for development, but something being vital doesn’t mean that aid funding is necessarily an effective way of supplying it. Not only that, but something being difficult to measure does not make it impossible. And sure enough, J-PAL and IPA have conducted a number of evaluations of governance projects, such as working with the police in Rajasthan, on peace education and ex-combatant reintegration projects in Liberia, and evaluating anti-corruption strategies in Indonesia.
Randomised impact evaluations give the strongest evidence available on a project’s effectiveness. If USAID is beginning to favor projects with evidence of impact that is a good thing. The challenge for governance and rule of law advocates is to prove their impact.
Dennis Whittle of Globalgiving.org adds another limitation:
Formal evaluations, including the gold standard of randomized controlled trials, are not scalable. We simply do not have the time and resources to do centralized, in-depth evaluations of everything.
This argument is like not bothering with lifeboats if they can’t fit everyone in. Evaluations are crucial if we are going to learn whether or not we are wasting our money. And who knows, we might not be able to evaluate every single project, but if we keep coming up with compelling theories of change and keep replicating our findings in different settings, we could certainly try to evaluate every single intervention.