11 July 2019

Rwanda Could Thrive Faster by Putting (Even) More Women in Charge

Rwanda has the largest share of female parliamentarians of any country in the world. But things in Rwanda seldom sit still, and it is time to start going faster and further. 
Rwanda’s impressive female representation in national politics is not yet replicated at the local level. Only eight of the thirty district mayors in Rwanda (twenty-seven percent) are women, and just thirteen percent of village leaders are women.
Me in The Kigalian

21 May 2019

Cassette tapes "cause" conflict

Nathan Nunn and Nancy Qian published a very worrying paper in 2014 showing that US food aid causes conflict in recipient countries. Their research design used total US wheat production as a source of quasi-experimental variation in the amount of food aid countries received, to show causality rather than just correlation.

A new paper by Paul Christian and Christopher Barrett apparently debunks the study, showing that the "causal" correlation is spurious. Replace "US wheat production" with "US tape cassette sales" and you can almost exactly replicate the results.

Which reminds me of an earlier paper showing that "average male organ length" is a strong predictor of GDP growth. We only have about 200 countries, which is not a lot of observations to power a robust statistical analysis, so you should take most cross-country empirical analyses with a pinch of salt. These "male organ" and "cassette sales" papers are helpfully colourful reminders.

HT: Jeffrey Bloem

07 May 2019

The Global Education Architecture is Broken

So says Nicholas Burnett in a (sadly, gated) essay for the International Journal of Educational Development.

Nicholas has some authority on the topic, as chair of the Board of UNESCO’s International Institute for Educational Planning, and previously Director of the UNESCO Education for All Global Monitoring Report.

In his opening paragraph he writes:
"The international architecture for education is failing the world. There is little leadership; global priorities are obscure; the major debates are increasingly irrelevant and divorced from reality on the ground; the number of children out-of-school has stagnated for a decade; little progress has been made in tackling the global learning crisis; knowledge about what works in education is surprisingly limited; global public goods are massively underfunded; huge global financing requirements show little prospect of being met; and the neediest low-income countries, mainly in sub-Saharan Africa, do not receive the external financial and technical support necessary if they are to develop their education systems."
Tell us what you really think, Nicholas.

He criticises:
"the work of the Right to Education Initiative lobby and its recent Abidjan Principles, which would have governments place severe restrictions on private education."
He points to the decline of UNESCO as a major problem:
"In the past, UNESCO could have been counted on to be the central voice advocating for education, including education as a human right, in international fora. UNESCO has become so weakened, however, by its internal politicization and inadequate budget, that it is no longer the respected international voice on education, rather just one of many rather weak voices. These problems preceded the withdrawal first of United States' financial support and, more recently, of US membership but these steps now mean that UNESCO cannot function effectively as it has insufficient resources. UNESCO’s total regular budget for education is now only $51 million per year."
With a few other choice quotes:
"It is a real paradox that those working in international education increasingly (and rightly) call for systems-wide approaches but fail to study their own non-functional international architecture system. 
It is astonishing both how little we know about what works in education and how poorly we disseminate what we do know. 
If the situation is bad regarding generating knowledge, it is even worse regarding promoting innovation in education. 
There is thus no systematic regular review of how the international architecture is performing."
Well written and well worth reading in full.

17 April 2019

Coaching is better than training, but there is still a question mark on scalability

"So should governments switch to frequent coaching sessions? Possibly, but the next step should first be to try this type of intervention at scale. 
Finding three highly skilled coaches is one thing, but you might need hundreds or thousands of them if you were to run a similar programme across an entire country. 
One potential route to scale is through new uses of technology. A study in Brazil found positive impacts of a virtual-coaching programme run via Skype, for example. 
But perhaps the most straightforward type of technology to go for is scripts, which this paper suggests have positive effects on learning both when presented through centralised training and intensive coaching."

12 March 2019

"Maybe one of the most cost-effective interventions ever studied"

In this month's TES column (I'm calling it a column, it sounds better than a blog), I call parent-teacher meetings in Bangladesh "maybe one of the most cost-effective interventions ever studied". Here's the maths behind that claim. 

First, the intervention found a 0.377 standard deviation effect on Grade 5 scores and a 0.141 standard deviation (not statistically significant) effect on Grade 3 scores. If we take the average of those, that is 0.259. That's equivalent to around 1.7 extra years of school (based on Evans & Yuan's estimate that 1 standard deviation ~ 6.5 years of school).

The cost was $3 per student over the two years. The author Asad Islam does the conversion using only the 0.377 effect size for Grade 5, writing "Thus, the cost per average 0.1 SD increase in test scores per student is $0.66 or $1.58 for the full program over 2 years."

J-PAL put together a list of the cost-effectiveness of different interventions on their website, now gone, but replicated by Romero, Sandholtz, & Sandefur in the Partnership Schools for Liberia paper (copied below). Islam's $1.58 per 0.1 SD increase is equivalent to 6.3 standard deviations per $100. If we use the more conservative estimate of 0.259 SD (averaging across Grade 5 and Grade 3 results) that still works out at 4.3 SD per $100 spent. That lower estimate still puts this intervention in third place in the ranking, so there you go: "maybe one of the most cost-effective interventions ever studied".
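For anyone who wants to check the sums, here is a minimal Python sketch of the arithmetic above. The variable and function names are my own illustrative choices, and the "conservative" figure assumes it comes from scaling Islam's estimate by the ratio of the averaged effect to the Grade 5 effect, which is one plausible reading of the calculation rather than a confirmed method.

```python
# Figures quoted in the post; names are illustrative, not from the paper.
GRADE5_EFFECT_SD = 0.377   # effect on Grade 5 scores
GRADE3_EFFECT_SD = 0.141   # effect on Grade 3 scores (not statistically significant)
YEARS_PER_SD = 6.5         # Evans & Yuan: 1 SD ~ 6.5 years of school
COST_PER_TENTH_SD = 1.58   # Islam: dollars per 0.1 SD for the full 2-year programme


def sd_per_100_dollars(cost_per_tenth_sd):
    """Convert a cost per 0.1 SD gain into SD gained per $100 spent."""
    return 0.1 / cost_per_tenth_sd * 100


# Simple average of the two grade-level effects.
average_effect = (GRADE5_EFFECT_SD + GRADE3_EFFECT_SD) / 2

# Equivalent years of schooling for the averaged effect.
years_equivalent = average_effect * YEARS_PER_SD

# Headline cost-effectiveness using the Grade 5 effect only.
headline = sd_per_100_dollars(COST_PER_TENTH_SD)

# Assumption: the conservative figure rescales the headline by the ratio of
# the averaged effect to the Grade 5 effect.
conservative = headline * average_effect / GRADE5_EFFECT_SD

print(f"Average effect: {average_effect:.3f} SD")
print(f"Equivalent years of school: {years_equivalent:.2f}")
print(f"Headline: {headline:.2f} SD per $100")
print(f"Conservative: {conservative:.2f} SD per $100")
```

Swapping in a different years-per-SD conversion or a different cost figure only means changing the constants at the top.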

11 March 2019

The Latest Economics Research on Global Education

Last week I was at the Society for Research on Educational Effectiveness (SREE) conference. Alex Eble made a big and apparently successful push to increase representation by researchers focused on developing countries. In time-honoured Dave Evans style, here's my one-sentence roundup of 22 idiosyncratically selected studies presented at the conference. You can see the full programme here.


Public-private partnerships

A subsidy for private schools in Haiti led to higher enrolment (Adelman, Holland, and Heidelk) #Haiti

Chile has a universal school voucher and a higher voucher targeted at low-income students. The universal voucher is better for aggregate efficiency but worse for equity (Sanchez) #Chile #StructuralModel

Giving out vouchers to attend 5 years of low-cost private primary school in Delhi led to worse Hindi scores and no change in English or Maths (Crawfurd, Patel, and Sandefur) #India

Contracting out management of public schools to NGOs in Liberia led to a 60% increase in learning (Romero, Sandholtz, and Sandefur) #Liberia

School management

A mobile-phone based support programme for school councils in Pakistan led to no improvement for students (Asim) #Pakistan #Diff-in-Diff

A major school inspection reform in Madhya Pradesh led to no improvement in schools (Muralidharan and Singh) #India

Independent monitoring of teachers led to better student performance (Kim, Yang, Inayat) #Pakistan #Diff-in-Diff


Mindfulness interventions reduced sadness and aggression of children in Niger (Kim, Brown, De Oca, Annan, Aber), improved concentration and prosocial behaviour in Sierra Leone (Brown, Kim, Annan, Aber), and increased prosocial behaviour amongst Syrian refugees (Keim and Kim) #Niger #SierraLeone #Syria

Information for parents

Giving parents information about their child’s performance led to some temporary improvements (Barrera-Osorio, Gonzalez, Lagos, Deming) #Colombia

Incentives for teachers 

The theoretically optimal “Pay for Percentile” incentive scheme works to increase effort, which is complementary to inputs (Gilligan, Karachiwalla, Kasirye, Lucas, Neal) #Uganda

BUT a simpler “threshold” incentive scheme can be as effective as the theoretically optimal “Pay for Percentile” (at least in the short-run) (Mbiti, Romero, Schipper) #Tanzania


Studies commissioned by the developer of an intervention find effect sizes 80% larger than studies commissioned independently (Wolf, Morrison, Slavin, Risman) #USA #MetaAnalysis #EvaluatorIndependence

Tests designed specifically for evaluations produce effect sizes 63% larger than generic tests (Pellegrini, Inns, Lake, Slavin) #USA #MetaAnalysis #TestDesign

External validity bias (non-random selection of schools into trials) is twice as big as internal validity bias (from using observational not experimental methods) (White, Hansen, Lycurgus, Rowan) #USA #ExternalValidity


The One Laptop Per Child programme in Peru had zero effect on learning (Cristia, Ibarrarán, Cueto, Santiago and Severín) #Peru

In addition, providing internet had no effect on student learning (Malamud, Cueto, Cristia, Beuermann) #Peru

Peer effects

Being the weakest student in a better (selective) school can be worse than being the strongest student in a worse school (Fabregas) #RDD #Mexico


Temporary subsidies can have permanent effects on enrolment (Nakajima) #Indonesia #Diff-in-Diff

Merit-based scholarships have bigger effects than need-based scholarships (Barrera-Osorio, de Barros, Filmer) #Cambodia


Each 1 degree Fahrenheit of school year temperature reduces learning by 1 percent. Air conditioning entirely offsets this. (Goodman, Hurwitz, Park, Smith) #FE #USA  

18 February 2019

Is testing good for education?

This post was first published on the Centre for Education Economics website. 
I blogged recently about a new RISE working paper by Annika Bergbauer, Eric Hanushek, and Ludger Woessmann, which finds that:
“standardized external comparisons, both school-based and student-based, is associated with improvements in student achievement.”
William Smith pointed me to his rebuttal blog written with Manos Antoninis, which argues that there are “multiple weaknesses in their analysis that undermine their conclusions”.

This blog is my attempt to make sense of the disagreement. The main issue appears to me to be a misunderstanding by Antoninis & Smith (“AS” from here on) of the mechanism proposed by Bergbauer, Hanushek, and Woessmann (BHW). AS presume that the main mechanism through which testing is hypothesised to improve outcomes is through school choice (allowing parents to shift their students to schools with better test scores) or through punitive government accountability for teachers and schools. But BHW make it clear that their main focus is on the principal-agent relationship between parents as the principal and both students and teachers as their agents. Parents can’t observe the effort made by students and teachers, but standardized testing can provide them with a proxy indicator for effort. This should induce greater effort from both students and teachers. This proposed mechanism has nothing to do with school choice or accountability from government.

First AS argue that
“Our review of the evidence found that evaluative policies promoting school choice exacerbated disparities by further advantaging more privileged children (pp. 49-52).”
This review of the evidence in pp 49-52 of the UNESCO Global Monitoring Report focuses on policies designed to promote school choice. But that is not at all the focus of the BHW analysis, which is on policies that allow for the comparison of schools and students with the purpose of incentivising greater effort. School choice doesn’t need to have anything to do with it. As BHW write:
“That is the focus of this paper: By creating outcome information, student assessments provide a mechanism for developing better incentives to elicit increased effort by teachers and students, thereby ultimately raising student achievement levels to better approximate the desires of the parents”
Second, AS argue that
“punitive systems had unclear achievement effects but troublesome negative consequences, including removing low-performing students from the testing pool and explicit cheating (pp. 52-56).”
As mentioned above, the proposed mechanism in BHW does not at all require a punitive system. BHW write
“accountability systems that use standardized tests to compare outcomes across schools and students produce greater student outcomes. These systems tend [my emphasis] to have consequential implications and produce higher student achievement than those that simply report the results of standardized tests.”
Having said that, there are some flaws in the literature review cited by AS. This section first cites studies on four individual countries (US, Brazil, Chile, South Korea), without noting that there are significantly positive results from two of them. One of the two papers they cite on Brazil (IDados 2017) concludes that there was “a large, continuous improvement in all those years in both absolute and relative terms when compared to other municipalities in the Northeastern region and in Brazil as a whole ” and “it is very likely that [the reform] is at least partially responsible for the changes.” On Chile, a paper not cited as it was published in 2017 just after the review was completed (Murnane et al) found that “On average, student test scores increased markedly and income-based gaps in those scores declined by one-third in the five years after the passage of [the reform]”.

Next the review cites two papers (Yi 2015; Gándara and Randall 2015) that present correlational analysis with no attempt to address any potential bias from omitted variables or reverse causality. The latter study is based on a small sub-sample of the fuller data used by BHW.

Next AS take issue with the way that BHW construct their 4 categories of test usage. For ease of reference I first reproduce below the 4 categories, along with the wording of the questions that go into constructing each category.

1. Standardized External Comparison
  • “In your school, are assessments of 15-year old students used to compare the school to district or national performance?” (PISA)
  • existence of national/central examinations at the lower secondary level (OECD, EAG)
  • National exams (primary) (Eurydice (EACEA))
  • Central exit exams end secondary (Leschnig, Schwerdt, and Zigova (2017))

2. Standardized Monitoring
  • “Generally, in your school, how often are 15-year-old students assessed using standardized tests?” (PISA)
  • “During the last year, have [tests or assessments of student achievement] been used to monitor the practice of teachers at your school?” (PISA)
  • “In your school, are achievement data … tracked over time by an administrative authority[?]”

3. Internal testing
  • whether assessments are used “to inform parents about their child’s progress.”
  • use of assessments “to monitor the school’s progress from year to year.”
  • “achievement data are posted publicly (e.g. in the media).” (vaguely phrased and is likely to be understood by school principals to include such practices as posting the school mean of the grade point average of a graduating cohort, derived from teacher-defined grades rather than any standardized test, at the school’s blackboard.)

4. Internal teaching monitoring
  • whether assessments are used “to make judgements about teachers’ effectiveness.”
  • practice of teachers is monitored through “principal or senior staff observations of lessons.”
  • “observation of classes by inspectors or other persons external to the school” are used to monitor the practice of teachers.

First, AS argue that question 3c should really fall under category 1. The effect of this question on outcomes is primarily statistically insignificant, though for Maths and Science the direction of the coefficients in the interacted model is the same as for the other variables in category 1 (positive in the base model, with a negative coefficient on the interaction with initial score). Would adding this one variable to the 4 variables already in the category make the results statistically insignificant overall? I think probably not, but can’t say for sure without looking at the raw data.

Second, AS claim that question 4a should really fall under category 1 or 2. This claim seems debatable. The theoretical mechanism that BHW put forward is that providing credible information to parents induces greater effort from teachers. This use of testing is clearly internal to the school, and could clearly mean internal school assessments rather than necessarily standardized assessments that allow for external comparison with teachers at other schools.

Third, AS criticise the inclusion of high stakes student assessments as indicators, as by placing the stakes on students and not schools they do not relate to accountability from government. But this is not what BHW claimed was driving the effect.

Fourth, AS suggest the use of standardized testing at age 15 may be effectively “teaching to the test”. This seems odd to me - they clearly aren’t literally teaching to the test because it is a different test. BHW are looking at the effect of introducing high-stakes national standardized testing on student results in a totally separate, low-stakes sample-based test (PISA). AS then don’t really address the argument that “teaching to the test” can also be a positive thing if the test is well-designed and includes a good sample of the things that students are expected to have learnt.

Finally, AS focus only on those results that are statistically significant in the baseline model (estimating the average effect across all countries). However, they miss a really important conclusion from the paper, which is about heterogeneity. The effects of testing are largest for the weakest performing systems. This is clear in Figure 3.

Looking at the interacted model (Table A5), both of the other 2 questions in category 2 (2b and 2c) are statistically significant.

To sum up, there are weaknesses in the interpretation by AS of BHW which undermine their criticism. BHW focus on the role that testing can play in increasing the effort of students and teachers, with or without government accountability systems. In addition, the review of government accountability systems presented in the UNESCO Global Monitoring Report also has weaknesses, and presents an unduly negative picture. My prior remains that standardized testing plays a positive role, particularly in weak systems.

Thanks to Gabriel Heller-Sahlgren, William Smith, and Manos Antoninis for comments on a draft of this post. This acknowledgement clearly does not imply that Smith and Antoninis agree with this post - they don’t!

06 February 2019

CfEE Blogging: Giving students information on future wages improves school outcomes

As of this January and following last year's Annual Research Digest from the Centre for Education Economics, I'll be co-editing the Monthly Digest, along with Gabriel Heller-Sahlgren.

This is basically an excuse and commitment device to get me actually blogging again on at least a monthly basis. Each issue will include commentary on new papers, plus a selection of abstracts from recent publications (lightly edited for jargon).

My first comment is on a new paper by Ciro Avitabile and Rafael de Hoyos:
Did you know what career you wanted to do when you were in secondary school? I didn’t. Most pupils make critically important choices that will affect their lives throughout their educational career, often on the basis of poor information about what those choices will mean for their future. In most countries, there is little transparency on the costs and benefits of pursuing education and information on the various career paths available. 
In this paper, Ciro Avitabile and Rafael de Hoyos study whether providing pupils with better information about the earnings returns to education and the options available leads to greater effort and learning. Several studies have previously shown that providing information about the wage gains from schooling leads pupils to stay in school a bit longer, and affects their educational choices, but there is limited evidence that such information can affect learning per se, at least in a slightly longer-term perspective.

16 January 2019

PubhD Kigali

For any readers in (or visiting) Kigali (presumably a niche audience), I've started a monthly research talk event, using the PubhD format that is now running in around 20 European cities.

3 speakers get 10 minutes each to present their research, followed by Q&A. It's a great way to learn a bit about some random subjects you might not have considered much before, for the speakers to practise their extended elevator pitch, and a pretty low-effort way of organising some kind of regular academic vaguely seminar-like discussion for me.

The next one is this Thursday at 7.30pm, see here for more details, and get in touch if you'd like to speak sometime.

Does temporary migration from rich to poor countries cause commitment to development?

Never mind that none of the journals I've sent it to so far are interested, my new working paper got picked up by Marginal Revolution, the king of economics blogs, which is probably way better anyway, right?
Public support in rich countries for global development is critical for sustaining effective government and individual action. But the causes of public support are not well understood. Temporary migration to developing countries might play a role in generating individual commitment to development, but finding exogenous variation in travel with which to identify causal effects is rare. In this paper I address this question using a natural experiment – the assignment of Mormon missionaries to two year missions in different world regions – and test whether the attitudes and activities of returned missionaries differ. Data comes from a unique survey gathered on Facebook. Missionaries assigned to treatment regions (Africa, Asia, Latin America) are balanced with those assigned to the control region (Europe) on high school test scores and prior language and travel experience. Those assigned to the treatment regions report greater interest in global development and poverty, but no difference in support for government aid or higher immigration, and no difference in personal international donations, volunteering, or other involvement.
Here's the link to the paper and the Twitter discussion.

15 January 2019

Testing, testing: the 123's of testing

Here's my summary of the new Annika Bergbauer, Eric Hanushek, and Ludger Woessmann working paper for CfEE.
"teachers tend to oppose standardised tests, partly because they perceive them to narrow the curriculum and crowd out wider learning. However, it is intuitive that the effects of testing could vary dramatically by context. Indeed, the impact may very well follow a so-called “Laffer curve”. At low levels of testing, an increase may lead to better performance as it provides relevant information and incentives to actors in the education system. Yet if there are already high levels of testing, further increases may very well decrease performance, due to stress, for example, or the effects of an overly-narrowed curriculum. If so, we should expect the impact of testing to follow an inverted U-curve – or at the very least display diminishing returns. Furthermore, the impact of tests is also likely to depend on exactly how they are used in the education system. 
This paper provides perhaps the first systematic evidence on these issues"
Read the rest here.

27 September 2018

The best thing about cash benchmarking is it highlights just how small most aid is

The best take on the new cash benchmarking study from Rwanda was this one by Michael Kevane:
"the main takeaway is that neither intervention (when evaluated at the low Gikuriro cost of $141 per household) improved child outcomes." Yikes. I guess though if household size = 7 that is only $20 per person.
Should we really be surprised that giving someone $20 doesn't improve any measurable outcomes? Maybe Kevane's maths is wrong and household size is smaller, but $20 is so small we could double it and still not expect to see anything.

That's one big advantage of cold hard individual cash transfers: they make explicit what the actual amount per person is, which is not so obvious when it's one big community project costing loadsamoney and affecting loadsapeople. John Quiggin made this point years ago.

Last year, DFID gave £2.6 BILLION to countries in Africa. That's so much money! How are they still so poor when we give them BILLIONS of pounds every year? Well, hold on, there are 1.2 billion PEOPLE in Africa. So that works out at just over £2 each for the whole YEAR. Of course it doesn't go to everyone, let's say the money is perfectly targeted on the poorest 10% of people. So they get about £20 each.
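The back-of-the-envelope sums here are easy to reproduce. A minimal sketch in Python, using only the figures quoted above (the names are illustrative, and "perfectly targeted" is taken to mean the whole budget is split evenly among the poorest 10%):

```python
# Figures from the paragraph above; names are illustrative.
DFID_AFRICA_GBP = 2.6e9     # annual DFID spend in Africa, in pounds
AFRICA_POPULATION = 1.2e9   # people in Africa
TARGETED_SHARE = 0.10       # assumption: aid reaches only the poorest 10%


def per_person(total, people):
    """Split a total budget evenly across a number of people."""
    return total / people


# Spread across everyone on the continent.
per_capita = per_person(DFID_AFRICA_GBP, AFRICA_POPULATION)

# Spread across only the poorest 10%.
targeted = per_person(DFID_AFRICA_GBP, AFRICA_POPULATION * TARGETED_SHARE)

print(f"Per person, everyone: £{per_capita:.2f} a year")
print(f"Per person, poorest 10%: £{targeted:.2f} a year")
```

The same `per_person` helper works for any pooled project: divide the headline budget by the number of people it is meant to reach before judging what it could plausibly achieve.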

There is a lot of magical thinking that pooling money together somehow automatically gives it totally outsized impacts. Of course it's possible that smart investment in research or better governance can have truly outsized impact if it can nudge a country toward a slightly higher growth rate, but that isn't what most aid is even trying to do, and even when it is, this stuff is wicked hard and we can expect most attempts to fail.

Aid is great but less hubris please. And fewer ridiculous implicit expectations from the sceptics about what aid could plausibly achieve, too.

Don't Buy Local

This summer I finally got around to doing my first Park Run, joining the now millions of people around the world who turn up on Saturday mornings for a free 5km timed run around their local park (I managed a not too shabby 27 minute time, roughly average for my age). 

I also work on global development, so was pretty disappointed to receive an email announcing a new clothing line from the founder of Park Run that will be manufactured exclusively in Europe. Paul Sinton-Hewitt CBE was concerned with the “horrendously exploitative … factories in the Far East employing questionable practices, paying the lowest wages and exposing their workers to dangerous conditions”. 

Paul is right to be concerned about the wellbeing of East Asian factory workers, but moving those jobs to Europe is not the solution. Moving manufacturing jobs from places where jobs are scarce, to Europe where jobs are not scarce, is not a good thing. 

My point is not a new one; Paul Krugman, the Nobel Prize-winning left-wing economist, wrote more than ten years ago in praise of sweatshops. But the point still stands, and is still apparently missed. The factories in which most of our sports gear is made have poor working conditions compared with jobs in rich countries, but they are usually preferable to the actual alternatives facing people unfortunate enough to be stuck in poor countries.

In 1980, before it became the workshop of the world, extreme poverty in China was close to 90 per cent. Today it is less than 1 per cent. That change would not have been possible without the manufacturing industry. It’s hard to think of anything else that has had a bigger positive impact on human wellbeing than the transformation of China’s economy.

And it’s not just China that has benefited. Now that wages are starting to rise in China, manufacturers are moving on to other lower-income countries, such as neighbouring Vietnam and Cambodia, but also further afield, to Ethiopia and Nigeria. 

Researchers Rachel Heath and Mushfiq Mobarak looked at what happened in Bangladesh when garment factories started opening. As factories rewarded basic literacy and numeracy, girls who lived in villages near to new factories chose to stay in school longer. The effect on education was bigger than a government social programme explicitly designed to increase schooling. In Ethiopia, Chris Blattman and Stefan Dercon found that people use factory jobs as a safety net. Pay and conditions may be poor in the new factory jobs, but they are always there, even when other informal means of getting by fall through. 

So what is a concerned Park Runner to do? Ultimately pay and conditions in poor countries will only improve when workers get better outside options. Thanks to the hard work of poor Chinese in the 1980s and 1990s, China’s economy has grown, wages have risen, and Chinese workers today can afford to ask for more and turn down the worst offers. The best way to encourage that trend is to continue to buy things made in poor countries. Other ways to give poor people more options are to just give them cash directly (you can do literally that at givedirectly.org), to go on holiday to poor countries and spend money on services from poor people, or if you’re a citizen of a rich country to encourage your government to make it easier for people from poor countries to come and work in your country. 

I have nothing but admiration for Paul Sinton-Hewitt’s founding of Park Run, and for his desire to create an inclusive and ethical line of sportswear. I just hope he reconsiders his decision not to support jobs where they are needed most.