Update for Potential Researchers

Since graduating from the MPA program, I have been on a programming bonanza. I have found it is crucial to my job as a researcher. Forget Stata, the real party is elsewhere. Check out my sister blog that looks at Python and its associated libraries, and includes some data visualisations. It's at an early stage, but is developing nicely. Don't waste a minute more, seriously it is a skill you will not regret acquiring!


Happy panda




Notes from Lecture and Various Papers 

Instrumental Variables

Instrumental variables are used when OLS estimates are biased by endogeneity or measurement error. The process is based upon identifying exogenous variation in the key independent variable.


I’m not going to go into how the IV estimator is constructed as it is well documented in EC406 notes, or see e.g. Stock and Watson.


If the regression is overidentified (i.e. there are more instruments than endogenous regressors) then a Hansen-Sargan test can be used to test the exclusion restriction – although the instruments will pass the test if they are all equally endogenous, i.e. it is a weak test. In general the F-stat should be > 10 in the first stage, and there should be strong theoretical reasoning behind the instrument (such that the “compliers” are meaningfully identified).
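The two-stage logic behind the IV estimator (first-stage relevance with F > 10, then a second stage on the fitted values) can be sketched with simulated data. This is a minimal illustration, not any paper's actual specification; every variable name and coefficient here is made up:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated setting: x is endogenous because it shares the unobserved
# component u with the outcome; z is a relevant, excluded instrument.
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.8 * z + 0.5 * u + rng.normal(size=n)
y = 2.0 * x + u                     # true causal effect of x on y is 2.0

# OLS is biased upward because cov(x, u) > 0
X = np.column_stack([np.ones(n), x])
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0][1]

# First stage: regress x on z and keep the fitted values
Z = np.column_stack([np.ones(n), z])
gamma = np.linalg.lstsq(Z, x, rcond=None)[0]
x_hat = Z @ gamma

# First-stage F-statistic on the excluded instrument (rule of thumb: F > 10)
resid = x - x_hat
s2 = resid @ resid / (n - 2)
F = gamma[1] ** 2 / (s2 * np.linalg.inv(Z.T @ Z)[1, 1])

# Second stage: regress y on the first-stage fitted values
beta_iv = np.linalg.lstsq(np.column_stack([np.ones(n), x_hat]), y, rcond=None)[0][1]
# beta_iv recovers roughly 2.0 while beta_ols sits above it
```

In practice one would use a canned 2SLS routine (the manual second stage gives wrong standard errors), but the mechanics are exactly these.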


In the spatial context, spatially lagged X variables have been used as instruments for the spatial lag of Y. However, as we have already seen, this method is not without its complications (correctly specifying the functional form; violation of the exogeneity restriction). Thus, the literature has begun to move toward the quasi-experimental method, searching for instruments based on policy changes, boundaries, geological features and other similar events.


Some examples


Hoxby, Does Competition among Public Schools Benefit Students and Taxpayers? (2000)

This is surely one of the most famous examples of a spatial IV. The paper examines whether increased school competition in the form of a greater number of school districts within a municipality has benefits for the population studied. OLS estimates are biased because the supply of school districts is in part a response to the demand for school districts, which is probably driven by wealth, ability, parental involvement, and other unobservable characteristics which codetermine student outcomes and cannot be readily controlled for. Thus, Hoxby uses an instrument to attempt to isolate exogenous variation in the supply of school districts, to get a consistent estimate of the effect competition has on student outcomes. The instrument is based on the number of streams and rivers within a municipality. The logic is that in the 19th century when school districts were being drawn up, geological features such as streams presented barriers to movement such that districts were often drawn up with the streams forming natural boundaries. Thus, a municipality with more streams would have more school districts, hence the instrument is relevant. Over time, the importance of streams in terms of determining outcomes has diminished, and hence the presence of more streams has no effect on educational outcomes other than through its effect on determining school districts in the 19th century, and hence the exclusion restriction is satisfied.


There are problems with the strategy. Specifically, rivers may still have an economic effect today, and this could feed back into educational outcomes. Additionally, the way the instrument was constructed has been criticized, as it involved considerable subjective judgement.


Luechinger, Valuing Air Quality Using the Life Satisfaction Approach (2009)

This paper is trying to gauge how important air quality is for affected populations. The hedonic method of valuation (which seeks to determine the unobserved price of a public good by using prices embedded in private goods) tends to underestimate the value of air quality as migration is costly, and private goods prices are based on perceived rather than objective risk. Any residual effect that air pollution has on life satisfaction is an indication that compensation has not been fully capitalized in house prices, for the reasons just stated.


However, an OLS estimate of air quality on life satisfaction would be biased, as cleaner air is the product not only of exogenous policy change (even assuming it is exogenous), but also of local industrial decline and economic downturn. These simultaneous developments can have a countervailing effect on life satisfaction and housing rents. Thus he uses an instrument for SO2 levels: the mandated installation of scrubbers at power plants.


The construction of the instrument is somewhat convoluted as it relies upon a difference-in-differences estimation. Desulphurization pursuant to the retroactive fitting of scrubbers at power plants is the treatment, with a county being downwind or upwind of the power plant determining assignment to the treatment and control group respectively. Yet, as being in treatment/control is a question of degree rather than kind, the treatment variable is a frequency measure of how often in the period of study the county in question is downwind of the plant. This is further weighted by a distance decay function, and the pre-desulphurization emission levels of the plant in question are controlled for.


The main finding is that SO2 concentration does negatively affect life satisfaction, with estimates being much larger for the OLS specification indicating that reductions in sulphur levels are indeed accompanied by factors that have a countervailing effect on satisfaction.


Gibbons et al. Choice, Competition and Pupil Achievement (2008)

This paper uses a boundary discontinuity in order to construct an instrument for primary school competition in the UK which gets around the endogeneity concern in OLS estimates, namely that motivated parents may move closer to popular schools. The boundaries in question are the Local Education Authority boundaries. Whilst families are allowed to apply to schools outside of their LEA, cross-LEA attendance is extremely uncommon.


They construct indices for choice: for each school they define a travel-to-school zone that a) encompasses all residential addresses within the same LEA and b) is contained within a circle whose radius is the median travel-to-school distance for the pupils at that school. Pupil choice is thus the number of travel-to-school zones in which the student lives, and the school competition measure is the average of this value across students actually attending a given school (i.e. the number of alternatives available to students of a particular school). If families sort spatially near to high performing schools this will tend to decrease apparent competitiveness.


They then exploit the fact that families living near boundaries face longer journeys to school than those in the interior, and as such they are more likely to attend their local school. This is because the catchment area is bounded and hence shrinks. Thus the distance between a pupil’s home and the LEA boundary is an instrument for school choice, and the distance between a school and the boundary is an instrument for competitiveness. They do not find evidence that school competition increases pupil achievement.


Differencing Methods

Often there will be spatial sorting and heterogeneity, i.e. differences between places that lead to biased estimates. This sorting will often be on observable characteristics, but just as frequently on unobserved characteristics.


One method for dealing with this is the fixed effects model. This can be estimated with panel or cross sectional data using area dummies, or by making the within groups transformation (de-meaning) and then estimating with OLS. This removes the area specific time invariant determinants of the dependent variable.
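The within-groups transformation can be illustrated with a short simulation. This is a sketch on hypothetical data, with the region effect deliberately correlated with x so that pooled OLS is biased while the de-meaned (fixed effects) estimate is not:

```python
import numpy as np

rng = np.random.default_rng(1)
n_regions, n_periods = 50, 8
region = np.repeat(np.arange(n_regions), n_periods)

# Time-invariant region effect, correlated with x: pooled OLS is biased upward
alpha = rng.normal(size=n_regions)
x = alpha[region] + rng.normal(size=region.size)
y = 1.5 * x + 2.0 * alpha[region] + rng.normal(scale=0.5, size=region.size)

def demean(v, g):
    """Within-groups transformation: subtract each group's mean from its members."""
    means = np.bincount(g, weights=v) / np.bincount(g)
    return v - means[g]

x_w, y_w = demean(x, region), demean(y, region)
beta_fe = (x_w @ y_w) / (x_w @ x_w)            # fixed-effects (within) slope
beta_pooled = np.linalg.lstsq(
    np.column_stack([np.ones(region.size), x]), y, rcond=None)[0][1]
# beta_fe lands near the true 1.5; beta_pooled absorbs the region effect and is higher
```

De-meaning and including a full set of area dummies give numerically identical slope estimates; the within transformation is just cheaper when there are many areas.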


With panel data the model can be estimated in time differences, which has the same effect. Time dummies can also be included to strip out variation common across regions due to time trends. The remaining variation is time variant region specific variation, and as such for the estimates to be unbiased the regressors must be uncorrelated with region specific time variant shocks. For example, there could be no sudden shock to the educational system in a given area that induced people to sort spatially into that area.


The difference in difference method is usually applied to evaluating policy interventions where a treatment and control can be created. I am not going to go into the mechanics here as it is well documented elsewhere.


Some Examples


Manchin et al. Resources and Standards in Urban Schools (2007)

The paper is concerned with whether additional resources can be used to improve the outcomes of hard to reach pupils, specifically evaluating the Excellence in Cities programme that gave extra funding to schools based upon their level of disadvantage as measured by the proportion of pupils eligible for a free lunch. They use a DID strategy comparing the outcomes in EiC schools with a comparison group. A direct comparison between EiC and non-EiC schools would not be valid as there is no reason to assume that the parallel trends assumption holds. Mindful of this, the authors use propensity scores based on a host of school and pupil level characteristics to create a subset of non-EiC schools which are statistically similar to the pre-treatment EiC schools, and they use this subset as the control group. They do not make a hugely convincing argument for this method, and indeed there are statistically significant differences in the outcome measures in the pre-treatment periods, indicating that there is only limited reason to suspect that the key identifying assumption holds.


They find that the policy was effective in raising pupil attainment in the treatment schools but that the benefits were restricted to the students best able to take advantage of the policy (i.e. the most gifted).


Duranton et al. Assessing the Effects of Local Taxation Using Microgeographic Data (2011)

This is an interesting paper that seeks to identify the effect of local property taxation on the growth of firms. Estimating this has been difficult as site characteristics are heterogeneous, and many characteristics will be correlated with unobservable determinants. Secondly, firms are heterogeneous, and these differences are often largely unobservable, yet they cause firms to sort spatially. Lastly, tax systems may be endogenous to the location decisions of firms.


Using panel data they estimate a model which includes firm specific observable characteristics which removes firm specific time varying observable variation. They include a firm fixed effect to remove the time invariant firm specific unobservable variation. They also include higher level fixed effects (site, and region). They then difference the data in the usual way which implements the fixed effect strategy as noted above.


They then take a spatial difference: the difference (of the already-differenced data) between each establishment and every other establishment located at a distance less than d from it. If there is a term αzt for each site z in time t, and this is not controlled for, then any local shock to firms that also affects tax rates will bias the panel estimates above. However, if we are able to assume that for small changes in d, Δαzt ≈ 0 (i.e. local shocks are smooth over small amounts of space), then by spatially differencing the alpha term falls away, and the time varying local shocks are effectively controlled for.


They then combine this with an instrumentation strategy that instruments tax rates using political variables.  




Notes from Lecture 


Firms and individuals have choices over discrete alternatives such as which mode of transport to take, or where to locate their businesses. These choices are modeled using the random utility model in order to aid in the economic interpretation of those choices.

Random Utility Model

This was developed by Daniel McFadden and underlies the discrete choice model. This model holds that preferences over alternatives are a function of biological taste templates, experiences and other personal characteristics some of which are observable, others of which are not (cultural tastes etc.), and the function is heterogeneous within a given population. This indicates that an individual/firm’s utility from choice j can be decomposed into two components:

Uij = Vij – εij 

where V is an element common to everyone given the same characteristics and constraints. This might include representative tastes of the population such as the effects of time and cost on travel mode choices. ε is a random error that reflects the idiosyncratic tastes of the individual concerned as well as the unobserved attributes of the choice j.

V is observable based on consumer/firm choice characteristics such that:

Vij = αtij + βpij + δzij

where t is time and p is price and z is other observable characteristics.

In a setting where there are two choices (e.g. car or bus to work) we observe whether an individual chooses car (yi = 0 ) or bus (yi = 1). Assuming that individuals maximize their utility, they will choose bus if this exceeds the utility from going by car Ui1 > Ui0 which means that Vi1– εi1 > Vi0 – εi0 which indicates that εi1 – εi0 < Vi1 – Vi0. Therefore the probability that we see an individual choose to go by bus is:

P(εi1 – εi0 < Vi1 – Vi0)

which is equal to P(εi1 – εi0 < α(Ti1 – Ti0) + β(Pi1 – Pi0))

If we are willing to assume that the probability depends linearly on the observed characteristics then this can be estimated by running the following OLS regression:

Yi = α(Ti1 – Ti0) + β(Pi1 – Pi0) + ui

At this point further observable characteristics can be added, z.

However, as is well known, the OLS model is not bounded by 0 and 1, whereas probability functions are. This means that this estimation may return results outside the possible range of probabilities. In order to counter this problem we can estimate a probability function using probit or logit estimators, which are calculated using the maximum likelihood method [of which I am not going to write anything – assuming it will not be examined in detail].
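The boundedness problem is easy to see in a small simulation. Here a hypothetical regressor stands in for the time difference, the true choice probability is logistic, and the OLS (linear probability) fitted values stray outside [0, 1]:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2000
t_diff = rng.normal(size=n)                 # stands in for (Ti1 - Ti0); hypothetical
p_true = 1.0 / (1.0 + np.exp(-(0.5 + 2.0 * t_diff)))   # true logistic probability
y = (rng.uniform(size=n) < p_true).astype(float)        # 1 = bus, 0 = car

# Linear probability model by OLS
X = np.column_stack([np.ones(n), t_diff])
b = np.linalg.lstsq(X, y, rcond=None)[0]
fitted = X @ b
# Some fitted "probabilities" escape [0, 1] -- exactly the problem
# that the bounded logit/probit link functions fix.
```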

The McFadden paper deals with car versus bus commuting in the SF Bay area.

Multiple Choices

Often we want to think about more than one choice, which requires us to extend this model. We can extend the random utility model to many choices Uij = Vij + εij. Now an actor will choose alternative k if the utility derived from this choice is higher than for all other choices:

Vik + εik > Vij + εij for all j≠k 

If we assume the errors follow an extreme value distribution then the solution for the choice probability is given by P(yi = k) = exp(Vik) / ∑ exp(Vij). This is a generalization of the logit model with many alternatives, hence the name “multinomial logit”. The model compares choices to some predetermined base case.
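The choice-probability formula can be computed directly. A minimal sketch with hypothetical utilities for three modes:

```python
import numpy as np

def choice_probs(v):
    """Multinomial logit: P(y = k) = exp(V_k) / sum_j exp(V_j)."""
    e = np.exp(v - v.max())       # subtract the max for numerical stability
    return e / e.sum()

# Hypothetical deterministic utilities V_ij for one individual over three modes
V = np.array([1.2, 0.4, 0.4])     # e.g. car, bus A, bus B
p = choice_probs(V)
# p sums to one, and the highest-utility alternative gets the highest probability
```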

Independence of Irrelevant Alternatives (IIA)

One drawback of the multinomial logit method is the IIA problem. This is driven by the assumption underlying the model that if one choice is eliminated in time t=1, the ratio of individuals choosing the remaining options must remain constant from the pre-elimination period t=0. For example, suppose in t=0 40 people take bus A, 12 people take bus B and 20 people drive, and then in t=1 the bus B company goes bust. In t=0, the ratio of people taking bus A relative to those driving is 2:1. This must remain constant in t=1, so the model assumes that 24 people will drive and 48 will take bus A. This might not be a valid assumption if bus seats are not supplied elastically, or if bus A and bus B were not substitutes.

It is simple to see why this is the case: the underlying assumption of the model is that P(yi = k) = exp(Vik) / ∑ exp(Vij), so the ratio of any two choice probabilities depends only on those two alternatives, and clearly cannot change simply because one of the other alternatives has been eliminated.
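A quick numerical check of the IIA property (utilities hypothetical): eliminating one alternative rescales the remaining probabilities, but leaves their ratio untouched.

```python
import numpy as np

def choice_probs(v):
    """Multinomial logit choice probabilities."""
    e = np.exp(v - v.max())
    return e / e.sum()

V = np.array([1.0, 0.5, 0.2])          # three alternatives (hypothetical utilities)
p_full = choice_probs(V)
p_drop = choice_probs(V[:2])           # the third alternative is eliminated

# IIA: the odds between the surviving alternatives are unchanged
ratio_before = p_full[0] / p_full[1]
ratio_after = p_drop[0] / p_drop[1]
```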

This can be solved using the nested logit model. Conceptually this decomposes the choices into two separate stages. In the first stage the individual chooses whether to take his car or public transport. If he decides on public transport then he must decide between bus A and bus B. This choice structure is estimated using sequential logits whereby the value placed on the alternatives in the second stage enters the choice probabilities in the first stage.

Aggregate Choice Models

Aggregate choice models are useful when individual data are not available, and also when computing power is an issue (due to many fewer observations). All of the above models have aggregate equivalents. In fact, using the Poisson model with a max likelihood estimation method, aggregated data give exactly the same coefficient estimates as the conditional logit model when the only data available are the choice characteristics (i.e. how many people chose what). Multinomial logit will be better when there are accompanying individual/group-level characteristics.

Gravity Models

Choices can also be modeled as flows between origins and destinations. This is widely applied in the fields of trade, migration and commuting.  A flow from place j to k can be modeled as:

Ln(njk) = βXjk + αj + αk + εjk

where the alphas represent characteristics of the source and destination such as population, wages etc., a cost of moving measure can also be included. This literature has found strong distance decay effects, which are puzzling in many cases (e.g. trade) as the cost of moving goods further is now fairly marginal.
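The gravity equation above can be estimated by OLS with origin and destination dummies. A sketch on simulated flows (all magnitudes hypothetical, with a true distance elasticity of –1 built in):

```python
import numpy as np

rng = np.random.default_rng(4)
J = 20                                   # number of places (hypothetical)
size = rng.uniform(1.0, 2.0, size=J)     # stands in for alpha_j (population, wages...)
dist = rng.uniform(1.0, 5.0, size=(J, J))

pairs = [(j, k) for j in range(J) for k in range(J) if j != k]
log_d = np.array([np.log(dist[j, k]) for j, k in pairs])
# true model: ln(n_jk) = alpha_j + alpha_k - 1.0 * ln(dist_jk) + noise
log_n = np.array([size[j] + size[k] for j, k in pairs]) - 1.0 * log_d
log_n += rng.normal(scale=0.1, size=len(pairs))

# Design matrix: origin dummies, destination dummies, log distance
X = np.zeros((len(pairs), 2 * J + 1))
for r, (j, k) in enumerate(pairs):
    X[r, j] = 1.0          # origin fixed effect
    X[r, J + k] = 1.0      # destination fixed effect
    X[r, -1] = log_d[r]
beta = np.linalg.lstsq(X, log_n, rcond=None)[0]
# beta[-1] recovers the distance decay elasticity (about -1 here)
```

The dummy columns are collinear (origin and destination dummies each sum to one), but `lstsq` handles the rank deficiency and the distance coefficient is still identified.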

Discrete v Aggregate: discrete choice models have the advantage that firm level characteristics can be incorporated, and there is a strong theoretical model underlying the estimations. Aggregate flows on the other hand are easier to compute and there is no need to make assumptions about the functional form that are necessary for the non-linear maximum likelihood estimators. One disadvantage is that no separation of the individual/aggregate factors is possible.



Notes from lecture and various articles 


Generally there is very little reason to suppose that a process will be generated randomly over space. Spatial statistics help us to gauge to what extent the values that data take are related to other observations in the vicinity. 

Spatial statistics broadly fall into two categories:

1)     Global – these allow us to evaluate if there are spatial patterns in the data (clusters)

2)     Local – these allow us to evaluate where these spatial patterns are generated

Differences between these two statistics can be summarized thus:

Global                            Local
Single values                     Multi-valued
Assumed invariant over space      Variant over space
Non-mappable                      Mappable
Used to search for regularities   Used to search for irregularities
Aspatial                          Spatial

Generally these statistics are based upon:

  1. Local means – see spatial weighting sections above (smoothing techniques such as kernel regression and interpolation).
  2. Covariance methods – comparing the covariances of neighbourhood variables (Moran’s I, and LISA)
  3. Density methods – the closeness of data points (Ripley’s K, Duranton & Overman’s K-density).

Moran’s I

This is one of the most frequently encountered measures of global association. It is based on the covariance between deviations from the global mean between a data point and its neighbours (howsoever defined – e.g. queen’s/rook’s contiguity at the first/second order etc.).

It is computed in the following way:

I = [n / ∑i∑j Wij] × [∑i∑j Wij (Yi – Yg)(Yj – Yg)] / [∑i (Yi – Yg)²]

where there are n data values, Y is the outcome variable at location i or its neighbour j, the global mean is Yg, and the proximity between locations i and j is given by the weights Wij.

A z statistic can be calculated in order to assess the significance of the Moran's I estimate (compared in the usual way to a critical value, e.g. 1.96 for 5% significance).

Problems with this measure are that it assumes constant variation over space. This may mask a significant amount of heterogeneity in spatial patterns, and it does not allow for local instability of variation. Thus a focus on local patterns of spatial association may be more appropriate. This could involve a decomposition of this type of global indicator into the contributions of each individual observation. One further issue is that the problems associated with MAUP (see above summaries) are built into the Moran statistic.
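Moran's I is straightforward to compute by hand. A sketch on a toy one-dimensional "map" with simple contiguity weights (all values hypothetical): values that rise smoothly along the line are spatially clustered, so I comes out strongly positive.

```python
import numpy as np

def morans_i(y, W):
    """Global Moran's I for values y under spatial weights W (n x n, zero diagonal)."""
    z = y - y.mean()
    num = (W * np.outer(z, z)).sum()          # sum_ij W_ij * z_i * z_j
    return (len(y) / W.sum()) * num / (z @ z)

# Toy 1-D "map": each location's neighbours are the adjacent cells
y = np.array([1.0, 2.0, 2.5, 3.0, 8.0, 9.0, 9.5, 10.0])
n = len(y)
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0           # rook-style contiguity on a line

I = morans_i(y, W)    # positive: similar values sit next to each other
```

An alternating high/low pattern on the same weights gives a negative I (dispersion), which is how the sign of the statistic is usually read.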

Local Moran

The Local Moran is a Local Indicator of Spatial Association (LISA) as defined by Anselin (1995). He posits two requirements for a statistic to be considered a LISA:

  1. The LISA for each observation gives an indication of the extent of spatial clustering of similar values around that observation.
  2. The sum of the LISAs for all observations is proportional to a global indicator of spatial association.

The local Moran statistic allows us to identify locations where clustering is significant. It may turn out to be similar to the global statistic, but it is equally possible that the local pattern is an aberration in which case the global statistic would not have identified it.

It is calculated like this:

Ii = Zi ∑j≠i Wij Zj


where z are the deviations of observation i or j from the global mean, and w is the weighting system. If I is positive then the location in question has similarly high (low) values as its neighbours, thus forming a cluster.

This statistic can be plotted on the y axis, with the individual observation on the x axis, to investigate outliers, and see whether there is dispersion or clustering.


There are problems with this measure. Firstly, the local Moran will be correlated between two locations when they share common elements (neighbours). Due to this, the usual interpretation of significance will be flawed, hence the need for a Bonferroni correction to the significance values (thus reducing the probability of a type I error – wrongly rejecting the null of no clustering). MAUP is similarly an issue, as above.

Point Pattern Analysis

This type of analysis looks for patterns in the location of events. It is related to the above techniques, which, however, are based on aggregated data of which points are the underlying observations. As the analysis here is based on disaggregated points, there is no concern about MAUP driving the results.

Ripley’s K

This method counts a firm or other observation's number of neighbours within a given distance and calculates the average number of neighbours of every firm at every distance – thus a single statistic is calculated for each specified distance. The benchmark test is to look for CSR (complete spatial randomness), which states that observations are located in any place with the same constant probability, and they are so located independently of the location of other observations. This implies a homogeneous expected density of points in every part of the territory under examination.

Essentially a circle of given distance (bandwidth) is centred on an observation, and the K statistic is calculated based on all other points that are located within that circle using the following formula:

K(d) = (α/n²) ∑i ∑j≠i I{distanceij < d}

where α is the area of the study zone, and I is an indicator function counting the pairs of points that satisfy the Euclidean distance restriction. If there is an average density of points µ, then the expected number of points in a circle of radius r is µπr². As the K statistic is the average number of neighbours divided by the density µ, CSR implies K(r) = πr².
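A minimal implementation of the K statistic, checked against the CSR benchmark K(d) = πd² on simulated uniform points (edge effects are ignored here, so the estimate is biased slightly downward, as discussed below):

```python
import numpy as np

def ripley_k(points, d, area):
    """K(d) = (area / n^2) * number of ordered pairs (i, j), i != j, with dist < d."""
    n = len(points)
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    close = (dist < d).sum() - n       # remove the n zero self-distances
    return area / n ** 2 * close

rng = np.random.default_rng(2)
pts = rng.uniform(0.0, 1.0, size=(1000, 2))   # CSR on the unit square (area = 1)

d = 0.05
K = ripley_k(pts, d, area=1.0)
# Under CSR, K(d) should be close to pi * d^2
```

For clustered data K(d) rises above πd²; for dispersed (regular) data it falls below.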

Again, the returned density by distance can be plotted against the uniform distribution to see whether observations are clustered or dispersed relative to CSR.

Marcon and Puech (2003) outline some issues with this measure. Firstly, since the distribution of K is unknown, the variance cannot be evaluated, which necessitates using the Monte Carlo simulation method for constructing confidence intervals. Secondly there are issues at the boundaries of the area studied, as part of the circle will fall outside the boundary (and hence be empty) which may lead to an underestimation at that point. This can be partially corrected for by using only the part of the circle’s area that is under study.

Additionally, CSR is not always a particularly useful null hypothesis; other benchmarks may be preferable.

Kernel Density

These measures yield local estimates of intensity at a specified point in the study area. The most basic form centres a circle on the data point, calculates the number of points that fall within it, and divides by the area of the circle, i.e.:

δ(s) = N(C(s, r)) / πr²

where s is the individual observation, N is the number of points within a circle of radius r. The problem with this estimate is that the r is arbitrary, but more seriously, small movements of the circle will cause data points to jump in and out of the estimate which can create discontinuities. One way to improve on this therefore is to specify some weighting scheme where points closer to the centroid contribute more to the calculation than those further away. This type of estimation is called the kernel intensity estimate:

δ(s) = ∑i=1..n (1/h²) k((s – si) / h)


where h is the bandwidth (a wider bandwidth smooths the estimate, reducing variance but introducing bias) and k is the kernel weighting function.
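A sketch of a kernel intensity estimate using a Gaussian kernel (one common choice of weighting function; the data and bandwidth here are hypothetical). Points near the evaluation location contribute most, so the estimate is high at the centre of a simulated cluster and low in an empty corner:

```python
import numpy as np

def kernel_intensity(s, points, h):
    """Gaussian-kernel intensity estimate at location s (2-D points, bandwidth h)."""
    d2 = ((points - s) ** 2).sum(axis=1)
    k = np.exp(-d2 / (2.0 * h ** 2)) / (2.0 * np.pi * h ** 2)  # 2-D Gaussian kernel
    return k.sum()

rng = np.random.default_rng(3)
cluster = rng.normal(loc=[0.5, 0.5], scale=0.05, size=(300, 2))
background = rng.uniform(0.0, 1.0, size=(100, 2))
pts = np.vstack([cluster, background])

centre = kernel_intensity(np.array([0.5, 0.5]), pts, h=0.1)
corner = kernel_intensity(np.array([0.95, 0.05]), pts, h=0.1)
# Estimated intensity is far higher at the cluster centre than in the corner
```

Because the Gaussian kernel decays smoothly, small movements of the evaluation point no longer cause the discontinuous jumps of the naive circle-counting estimator.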





There is macro evidence that property rights are important for economic outcomes such as income and growth. However, macro studies find it hard to conclusively prove that causality runs from rights to economic outcomes. Instrumental variable analysis is seductive; however, in the case of settler mortality in the work of AJR, the instrument might just be capturing levels of human capital, and in general there may be other correlated unobservables (such as other institutions). Additionally there are issues relating to the comparability of institutions across countries. Micro evidence aims to explore exogenous variation in rights within countries or regions in order to offer stronger evidence for the flow of causality. The downside to micro studies is of course that they are heavily context dependent and the results may not be generalizable to other settings.

There are several channels through which property rights might affect incomes and growth.

  1. Investment – Individuals do not invest if the fruits of their investment can be stolen or expropriated. Expropriation acts like a random tax on investment and so individuals will underinvest.
  2. Collateral – If property rights enable land to be used as collateral for access to formal credit, then land rights may increase investment in physical/human capital by lowering the marginal cost of capital. People invest until the marginal return to investment equals the marginal cost (the interest rate), and if being able to provide collateral lowers the interest rate, people will invest more.
  3. Less time defending land – if people with insecure title are compelled to spend time at home in order to protect themselves from eviction or theft then this can reduce the amount of labour supplied to the market, decreasing aggregate output and personal income.
  4. Incorporates people into the citizenry – having property rights might change people’s beliefs about society and hence encourage them to participate more directly with formal labour markets, public goods, political processes etc.

Could they be bad? Yes, as Besley notes if individuals care equally about every member of the community, and the land is due to revert to the community after death/transfer, then this may have no effect on the incentive to invest. Similarly, if consumption is at community level then this need not disincentivize investment. However, in this case if there are significant externalities from investment then having property rights might actually reduce efficiency (e.g. if irrigation takes water away from other farmers in the community), if those externalities cannot be internalized. The problem comes when there is a lack of harmony between the formal system of land holding, and the decisions individuals make, for example in Africa where consumption is at the individual level, but land is held at community level.

Additionally, if property rights are provided by the central government they may crowd out local institutions.



Entitled to Work: Urban Property Rights and Labour Supply in Peru E. Field (Quarterly Journal of Economics 2007)


In a Nutshell

Land titling may affect investment incentives and credit access. This is well documented. However, the alternative (complementary) channel examined here is the effect that property rights can have on labour supply by transferring the role of property protection from the individual/community to the state. The idea is that as a consequence of titling, individuals have time freed up which they previously devoted to solidifying informal claims, as there is increased ownership security. The paper uses data from Peru, which saw a huge titling programme whereby 1.2m households were given title where they previously had none. Using the fact that the programme was staggered allows for a comparison of households in neighborhoods that had already been targeted with those that had not, in a difference in difference estimation.

In Peru there were wide reports of community organizations that protected property rights in urban squatter settlements. Participation in these organizations could have substantially hindered labour market opportunities. Assuming that there is incomplete substitution between the individual protecting his own home and hiring someone else to do it (due to income constraints, and lack of social capital – trust), this implies that strengthening formal rights decreases the need for households to spend time on home protection, thus decreasing the amount of work undertaken at home, and increasing the amount of labour supplied to the market. This effect will be decreasing in the level of informal rights the household has (as measured by the length of tenure) and the size of the household (as chances are that in larger households someone will be home irrespective of the need to provide property protection). Additionally it could affect child labour if children act as a substitute for adult workers and go out to work (as they are unable to protect the home).

The results indicate that households with no title spend 13.4 hours per week maintaining informal tenure reflecting a 14% reduction in total household work hours for the typical squatter family. Household members are 40% more likely to work inside their own home. The effect of the titling programme was that 16 extra hours were worked per week for those reached by the programme, and they were half as likely to be found working at home. This effect is decreasing in informal tenure and family size, as predicted.

There are a couple of caveats here. Firstly, it is not possible to state conclusively that the mechanism is the freeing up of time previously devoted to protection. Whilst this seems likely, it is also possible that the marginal utility of labour increased as people felt more secure in investing in their domestic infrastructure. Either way the labour supply increased though. Secondly, this is a study of the urban environment. Given that in rural settings most people are working their own land (agriculture) there is presumably a much lesser degree of tradeoff between labour and protection. Thus we would not expect to see the same results (for the same reason at least) in rural settings. Lastly, Peru is quite a specific context, it may be that in other regions there is less community policing, or less threat of eviction. Additionally, if informal land holders increase their informal tenure by investing in the land then there may be regions where this past investment now acts as de facto property rights, and a formal title might actually make little difference.



Property Rights and Finance Johnson et al. (The American Economic Review Vol. 92, No. 5 2002 pp. 1335-56)


In a Nutshell

This paper builds on the Besley paper summarized elsewhere this week, which finds a significant link between property rights and investment. This paper asks whether, in addition to secure property rights, the availability of external finance is necessary for entrepreneurs to invest. Looking at Eastern European countries that share similar institutional environments but different levels of property rights, they survey firms about their perceived property rights and use this to evaluate investment decisions, measured as how much of a firm’s profits are reinvested.

They find a robust correlation between the amount a firm chooses to invest and their measure of property rights, regardless of the ability of those firms to access external financing. This indicates that at low levels of development property rights are a necessary and sufficient condition for investment, and might suggest that financial development need only come after the securing of property rights.

There are lots of issues with this paper. Firstly, they only survey existing firms, so the results say nothing about the interaction between property rights and access to finance as it applies to entrepreneurs. Secondly, the measure of property rights is quite bizarre, and is more related to corruption than property rights; whilst the two might be related, they are by no means synonymous. Additionally, the firms in question typically had very high levels of retained earnings, and as such did not need to rely on external financing, which might explain why financing showed no effect upon reinvestment.



The Formation of Beliefs: Evidence from the Allocation of Land Titles to Squatters Di Tella et al. (QJE 2007)


This paper uses evidence from Argentina where, by a quirk of the law, some squatters in a community were given land titles and others were not. In subsequent surveys the authors found that the beliefs of those with title were much more aligned with what might be called market principles of individualism etc. Given the close proximity and shared history of the squatters and the exogenous change in land title, the authors ascribe this change in beliefs to the holding of property rights. In other words, there may be some psychological benefits from land rights that inspire more interaction with the economy.



D. Almond

Journal of Political Economy, Vol. 114, No. 4 (2006) pp. 672-712

A Short Summary 

In a Nutshell

According to the fetal origins hypothesis, many health conditions that occur in an individual’s lifetime can be traced back to the course of fetal development. This could have serious economic consequences, and hence indicate policies that might be used to combat poor pre-birth conditions in order to improve economic outcomes and aggregate health. In order to evaluate these claims a unique natural experiment is analyzed: the 1918 Spanish flu pandemic in the US.

This pandemic struck in Oct 1918 and was over by the beginning of 1919, implying that cohorts born just months apart experienced very different in utero conditions, and different states were affected to different degrees. Exploiting this variation in an RDD and DID design (comparing cohorts by birthdate, and comparing cohorts by state) using census data, the study finds that virtually all examined socio-economic outcomes were affected. Children of exposed mothers were 15% less likely to graduate from high school, wages were 5-9% lower for men, and the likelihood of being poor rose 15%.
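The state-by-cohort comparison is a standard difference-in-differences. As a minimal sketch with simulated data (all variable names and magnitudes are invented for illustration, not Almond's estimates), the DID estimate can be read straight off four group means:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Simulated cohorts: exposure = born in a high-flu state AND in utero during the pandemic
high_flu_state = rng.integers(0, 2, n)   # cross-state variation in 1918 flu intensity
born_1919 = rng.integers(0, 2, n)        # cohort in utero during Oct 1918 - early 1919
exposed = high_flu_state * born_1919

# Outcome: years of schooling with state and cohort effects plus a true exposure effect of -0.5
schooling = (12.0 + 0.3 * high_flu_state - 0.2 * born_1919
             - 0.5 * exposed + rng.normal(0, 1, n))

def did(y, group, post):
    """Difference-in-differences from the four group means."""
    m = lambda g, p: y[(group == g) & (post == p)].mean()
    return (m(1, 1) - m(1, 0)) - (m(0, 1) - m(0, 0))

est = did(schooling, high_flu_state, born_1919)  # close to the true effect of -0.5
```

The state and cohort main effects difference out, so only the interaction (in utero exposure) survives.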

The responsiveness of labour market outcomes to fetal health has significant implications for health economics – i.e. policies to improve fetal health may have a multiplier effect to any policies that seek to improve the educational system.





Large aggregate differentials in health outcomes mask significant within-country variation across different groups (men/women, different ethnicities). The key question is whether improving access to health facilities for certain (otherwise excluded) sections of society could significantly improve aggregate health outcomes. If so, then this could be significantly cheaper than alternative policies that seek to expand health service provision, as encouraging use does not necessarily involve spending on new health infrastructure.

A key part of this problem is understanding why certain groups do not have access to health facilities. Is there active discrimination, in which case some role for anti-discrimination policy could be of use, or is the discrimination more passive (i.e. based on cultural norms/preferences) in which case policy may have to play a more indirect role in encouraging participation?


Civil Rights, the War on Poverty, and Black-White Convergence in Infant Mortality in the Rural South and Mississippi D. Almond et al (MIT Working Paper 07-04)

In a Nutshell

This paper provides a good example of the role that policy can play in encouraging participation with health services through decreasing active discrimination, and also the health benefits of including an otherwise excluded section of society as beneficiaries of health facilities.

In the 20th Century there was a marked improvement in the infant mortality rates of black infants in the rural South. The paper argues that this was driven by the federally mandated desegregation of hospital facilities, which had the effect of increasing access to hospital care for black babies. The policies, which were part of the Civil Rights Act, effectively opened up what had previously been white-only hospitals. They present quantitative evidence to support this assertion. Firstly, the reduction in black infant mortality began immediately after integration and was most pronounced in the rural South, where access to hospitals was most constrained for black families. Secondly, the decline was driven by declines in post-neonatal rather than neonatal deaths (post-neonatal deaths being more preventable than neonatal deaths at that time). They also use Mississippi as a testing ground, as there was significant variation in when hospitals desegregated – they find strong mortality reductions once a hospital in the county was certified as desegregated, relative to counties where hospitals were not yet desegregated. This evidence counters alternative hypotheses based upon general improvements in medical care.

They estimate that over 25,000 deaths were prevented with a welfare contribution of c.$7bn. In other words, through a simple mechanism of anti-discrimination, a large section of society was now able to use the health service and this had significant health impacts without the need to invest heavily in new infrastructure.


Missing Women: Age and Disease S. Anderson & D. Ray (Review of Economic Studies, 2010)

In a Nutshell

In many parts of the world (China/India especially) it has been noted that the ratio of women to men is suspiciously low. Amartya Sen calculated that had the ratio been the same as in the world as a whole, there would be millions more women – hence the “missing women”. This has often been attributed to selective abortion (due to a preference for males) and a systematically lesser degree of care for girls relative to boys.

This paper performs an accounting exercise to see at what point in the age distribution these women are missing in different regions of the world, and investigates what could be causing this. It does so by comparing death rates in the specific country to the death rates observed in the developed world, whilst controlling for the sex ratio at birth (as different ethnicities have different natural sex ratios at birth) and for different disease compositions (that may differentially affect the sexes).
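A stripped-down version of this accounting exercise can be sketched as follows. All numbers below are illustrative placeholders, not Anderson & Ray's data: excess ("missing") women in an age group are actual female deaths minus the deaths that would occur if the female-to-male death-rate ratio matched a developed-world reference.

```python
# Illustrative inputs (made up): female population, male/female death rates,
# and a reference female/male death-rate ratio from the developed world
age_groups = ["0-4", "5-14", "15-44", "45+"]
pop_f   = [10e6, 18e6, 40e6, 25e6]      # female population by age group
dr_m    = [0.020, 0.002, 0.003, 0.020]  # male death rates
dr_f    = [0.022, 0.003, 0.004, 0.024]  # observed female death rates
rel_ref = [0.85, 0.90, 0.95, 0.95]      # reference female/male death-rate ratio

missing = {}
for age, p, dm, df, r in zip(age_groups, pop_f, dr_m, dr_f, rel_ref):
    counterfactual = p * dm * r          # female deaths if the ratio matched the reference
    missing[age] = p * df - counterfactual  # excess female deaths in this age group

total_missing = sum(missing.values())
```

Decomposing the total by age group is exactly what lets the paper say where in the life cycle the women go missing in each region.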

Their findings indicate that whilst India and China have similar overall imbalances in the sex ratio, there are distinct age profiles of these missing women. Sub-Saharan Africa has exhibited sex ratios at birth very similar to the developed world; however, when the natural sex ratio at birth is controlled for, Africa has relatively more missing women than either India or China. SSA’s missing women are not, however, missing at birth, which is confirmed when the age clusters of missing women are analyzed. India on the other hand has 11% of its missing females at birth, and China has c.40%, which indicates that selective abortion practices are occurring in China and, to a lesser degree, India.

They find that the changing composition of the disease profile explains very little of the variation in missing women. In India preventable disease explains missing girls in childhood, maternal mortality and injuries kill women of reproductive age, and cardiovascular mortality explains deaths at older ages (which are actually the largest component of the missing women in India). In SSA the dominant source of missing women is HIV/AIDS, which may reflect differential treatment received by women, or the prevalence of sexual violence, among other possible explanations. In China, other than the prenatal missing women, women over 45 seem to be missing too.

This analysis cannot disentangle whether active or passive discrimination is occurring, or whether it is another mechanism, but it helps in knowing where to look for effects. For example how the elderly receive care in India and China is clearly an issue, as are termination practices in China.

To the extent that active discrimination is occurring, the Almond et al. paper shows that there could be a positive role for anti-discrimination policies, and that such policies can lead to better aggregate health outcomes.


Why do Mothers Breastfeed Girls Less than Boys? Evidence and Implications for Child Health in India S. Jayachandran & I. Kuziemko (Quarterly Journal of Economics)

In a Nutshell

This paper is about passive discrimination. It is also concerned with differential health effects for girls as opposed to boys, but proposes that the mechanism at work is that girls receive less time breastfeeding than boys, and that the reason for this is a cultural preference for boys. Since breastfeeding reduces fertility, mothers are likely to breastfeed their female children less if they still desire a male child. To the extent that there is a “stop-after-a-son” fertility pattern, when a daughter is born the parents will likely want to try again (and hence the mother will stop breastfeeding), indicating that girls will be weaned earlier than boys. Given the large health benefits of breastfeeding, particularly in the presence of widely contaminated water and food, such behaviour can lead to disparities in child health between the sexes. Note, this is not active preferential treatment of boys over girls, but rather a prediction that girls will be breastfed less even when parents value equally the health of all their existing children.

The predictions are borne out by the data. Breastfeeding duration increases with birth order as the demands for the contraceptive element of breastfeeding increases. Overall girls are breastfed less than boys. Children with older brothers are breastfed more. The gender effect is largest as the family size approaches the (self-reported) target family size.

Back-of-the-envelope calculations show that breastfeeding could account for between 8,000 and 25,000 missing girls per year in India.

Policy-wise this is more difficult than the active discrimination case. Any breastfeeding awareness campaign could be offset by the preference for boys, a norm which will itself be hard to change using public policy. So some indirect options are available:

Firstly, contraception could be promoted. However, this has ambiguous effects. Contraception may crowd out breastfeeding inasmuch as mothers rely on breastfeeding more (for its contraceptive properties) when other contraception is unavailable – thus promoting contraception may decrease breastfeeding. Alternatively, if access to modern contraception better allows for family planning, and particularly the timing of births, then this may encourage breastfeeding.

Secondly, water and sanitation should be improved such that when children of any sex are weaned off breastmilk they have a better chance of survival due to clean water etc.



Missing Women and the Price of Tea in China: The Effect of Sex-Specific Earnings on Sex Imbalance N. Qian (Quarterly Journal of Economics 2008)


In a Nutshell

Amartya Sen once theorized that the reason the sex ratio was much more balanced in Africa was that women were more integrated into the labour force, and thus the value of a female life was greater than in parts of the world where women were excluded from the labour force. This paper does not quote him directly, but it investigates this mechanism in China. In other words, it investigates whether changes in relative female income (as a share of total household income) affect life outcomes for boys and girls. Previous studies suffered endogeneity problems, as areas where the female component of the workforce earned more money may have been areas where women had higher status already. In order to get around this problem, the paper uses quasi-experimental data based upon two reforms in post-Mao China which increased the prices of cash crops including tea and orchard fruits. Women have a comparative advantage in tea (due to the delicate nature of the work and the low-lying bushes) and men have the advantage in orchards. This meant that areas that cultivated tea experienced an increase in female income, and orchard areas an increase in male income, so a difference-in-differences strategy can be used to identify the effect of rising income on survival. The setting is advantageous as migration was strictly controlled, there was little technological change in the period, and sex-revealing pre-birth technologies were not widely available (thus ruling out certain confounding elements).

She compares the sex imbalance for cohorts born before and after the reforms, with counties that plant sex-specific crops as the treatment and counties that do not as the control. Firstly she compares the sex ratio in counties that plant tea to counties that do not, between cohorts born before and after the reform (thus effectively holding male income constant), and does the same for orchards (holding female income constant). She repeats this analysis for educational attainment. The results indicate that increasing female income by 10% increases the fraction of surviving girls by 1% and educational attainment for boys and girls by 0.5 years. Increasing male income by the same amount decreased survival rates for boys and girls and had no effect on educational attainment for boys.

This is a special kind of difference in differences (the notes are a bit sketchy on the details). Comparing sex imbalance within counties between cohorts removes time-invariant community characteristics (fixed effects), whereas comparing sex imbalance within cohorts between tea-planting and non-tea-planting counties removes changes over time that affect the regions similarly. As she has to use 1997 agricultural data on which crops were planted, this introduces measurement error and attenuation bias. Similarly, there could be endogeneity if families that prefer girls switch to tea planting after the reform. To counter these issues she also uses an IV strategy in which slope is used as an instrument for planting tea.

The identifying assumption is the usual DID one. It is not reliant on the fact that only women pick tea: as tea is a proxy for female income, if men or children pick tea then the proxy would actually exceed real female income, so the strategy would underestimate the true effect of female income on the sex ratio. She provides graphical evidence that there is a trend break around the time of the reform, and that pre-reform trends were parallel.
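The two comparisons above amount to a two-way fixed-effects DID. On a balanced county-by-cohort panel, double demeaning removes the county and cohort effects exactly, leaving only the coefficient on the tea-county x post-reform interaction. A sketch with simulated data (county counts, magnitudes, and the 0.01 "true effect" are all invented):

```python
import numpy as np

rng = np.random.default_rng(1)
n_counties, n_cohorts = 60, 20
tea = rng.integers(0, 2, n_counties)               # county plants tea (the female-income crop)
post = (np.arange(n_cohorts) >= 10).astype(float)  # cohorts born after the price reform

county_fe = rng.normal(0, 0.005, n_counties)       # time-invariant county characteristics
cohort_fe = np.linspace(0, 0.003, n_cohorts)       # common changes across cohorts
true_beta = 0.01                                   # assumed effect on the fraction of girls

# Fraction of surviving girls in each county-cohort cell
frac_girls = (0.48 + county_fe[:, None] + cohort_fe[None, :]
              + true_beta * tea[:, None] * post[None, :]
              + rng.normal(0, 0.002, (n_counties, n_cohorts)))

def two_way_demean(x):
    """Remove county and cohort means (balanced-panel two-way fixed effects)."""
    return x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + x.mean()

y = two_way_demean(frac_girls).ravel()
d = two_way_demean(tea[:, None] * post[None, :]).ravel()
beta_hat = (d @ y) / (d @ d)   # DID coefficient on tea x post-reform
```

The county means absorb time-invariant characteristics and the cohort means absorb common trends, mirroring the two comparisons described in the notes.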

How might increasing female income increase survival rates of girls?

  1. Increase parental perceptions of the future earnings potential of girls and hence increase their relative desirability
  2. Increase in total household income may increase the desirability of girls relative to boys if for some reason daughters are luxury goods relative to sons
  3. Increasing female specific income can increase female bargaining power, and this will increase survival of girls if mothers prefer girls more than fathers
  4. Increasing the value of adult female labour can raise the cost of sex selection since pregnancies must be carried to term before the sex of the child is revealed.

Which mechanism is at work is partly a function of how the household behaves. If the household is unitary (income-pooling), it makes no difference whose income is raised; it will have an equal effect on household consumption. However, this can be ruled out as there are differences between the effects of raising tea income as opposed to orchard income. This points to a model of intra-household bargaining where mothers value education more than fathers and face higher costs of neglecting children of either sex, which leads to equal treatment of boys and girls – hence the increased education for both, and the improved survival rates for girls.

Policy implications are pretty clear. One way to increase female survival rates and educational attainment for all is to increase the income of women.



E. Miguel, S. Satyanath & E. Sergenti

Journal of Political Economy Vol. 112, No. 4 (2004) pp. 725-53

Principal Research Question and Key Result Do economic shocks increase the incidence of civil conflict in sub-Saharan Africa? A 5% drop in economic growth in the previous year is associated with a 12 percentage point increase in the probability of a civil conflict (at least 25 dead) the following year, which is a more than one-half increase in likelihood.
Theory Collier and Hoeffler claim that low incomes matter for civil conflict because they shrink the gap between the returns to economic activity and the returns to taking up arms. Others argue that low incomes mean that military and transport infrastructure is poor, so that governments are less able to repress insurgents, and that this is the mechanism at work.


Motivation To address the endogeneity problems in much of the recent research linking economic conditions and civil conflict by using instrumental variables. Previous research was aware of endogeneity concerns and tried to solve them by using lagged right-hand-side variables. However, this approach assumed that economic actors did not anticipate the incidence of civil conflict and adjust their behavior accordingly, which is a very strong assumption. This paper makes the first attempt to find a decent instrument. A further benefit of the approach is that it deals with the measurement error in the reported national income figures that come out of Africa.


Data Armed conflict data, based on a minimum of 25 deaths per year from a confrontation involving armed force between the government and other parties. They also use an alternative measure requiring 1000 deaths, from another data source.

Rainfall data are monthly estimates for various points within the country taken from the Global Precipitation Climatology Project. The principal measure of a rainfall shock is the proportional change in rainfall from the previous year.

Strategy They instrument per capita economic growth in the first stage with current and lagged rainfall growth along with other country characteristics.

In the second stage they include country fixed effects in some specifications.

Weather shocks are plausible instruments for economic outcomes in economies that are agriculture dependent and are largely not irrigated, as in the case of SSA.

In the second stage they use instrumented values of growth in the current period and the previous period.
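The two stages, and the way measurement error in national accounts attenuates OLS but not IV, can be sketched with simulated data (all coefficients invented; this is a single-instrument caricature of the paper's design, not its dataset):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

rain = rng.normal(0, 1, n)                           # proportional change in rainfall (instrument)
true_growth = 0.5 * rain + rng.normal(0, 1, n)       # first stage: rainfall shifts growth
conflict = -0.3 * true_growth + rng.normal(0, 1, n)  # assumed true effect of growth: -0.3
obs_growth = true_growth + rng.normal(0, 1, n)       # noisily measured national income data

def ols_slope(x, y):
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

beta_ols = ols_slope(obs_growth, conflict)   # attenuated toward zero by measurement error

# 2SLS by hand: regress observed growth on rainfall, then conflict on the fitted values
Z = np.column_stack([np.ones(n), rain])
g_hat = Z @ np.linalg.lstsq(Z, obs_growth, rcond=None)[0]
beta_iv = ols_slope(g_hat, conflict)         # recovers roughly the true -0.3
```

Because the instrument is uncorrelated with the measurement noise, the IV estimate is not attenuated, which is one reason the paper's 2SLS estimates exceed its OLS estimates.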


Results In the reduced form higher rainfall is associated significantly with less conflict (for both 25 and 1000 deaths).


The 2SLS estimate with controls for ethnolinguistic and religious fractionalization, oil exports, population, etc. is significant and negative for lagged growth, but not for current growth (although they are jointly significant at the 90% level). The other controls suggested by the recent conflict literature are small and insignificant, indicating that the incidence of civil war is influenced by economic shocks rather than other political-style determinants. In fact a 1% decline in GDP growth is associated with a 2% rise in the chance of conflict. That the results are so much bigger than the OLS results indicates the problems of measurement error associated with using African national income data. These results hold for the 25 and 1000 death definitions (although they are smaller for 1000 deaths – and current growth is more important than lagged growth for the 1000 death data). NB this is about growth, not absolute levels of GDP: the level measure does not come out significant.


They interact economic growth with democracy and other potential determinants and find no significant relationships. In other words, there is little heterogeneity of effects across SSA: countries are not differentially affected based on their institutions, ethnic makeup, oil-exporting status, etc. This would seem to indicate that economic concerns trump other factors in determining the incidence of civil war – social and institutional factors seem to be of little importance (however, this could also be driven by limited variation in those other variables).


They restrict the sample to cases of conflict where there had been no conflict the year previously to see how growth affects the onset of conflict and get similar results.

Robustness Robust to dropping one country at a time.

They use alternative measures of rainfall.

They investigate potential violations of the exclusion restriction. 1. Rainfall affects government budgets and spending through taxation – not the case, as there is no association in the data between rainfall and tax revenues. 2. High rainfall may destroy roads etc., making it more costly for the government to repress insurgents – this runs the wrong way, as the results show that more rainfall is associated with less conflict. Perhaps this same mechanism makes it harder for people to engage in conflict, but there is no association between rainfall and the extent of usable road in the country.

  • The paper focusses on short term triggers not long term determinants.
  • External validity is low, and the method is not applicable in regions that are less agriculturally dependent.
  • The paper cannot identify the mechanism at work. Whilst they claim that the results are consistent both with weak states (as background conditions) and with opportunity costs (as the trigger of conflict), they cannot disentangle the effects. For that we need the Colombia paper summarized above. They do not have reliable data on inequality, which is another potential mechanism (heightened tensions within nations), so they cannot rule this mechanism out either, although they do test with proxies for inequality and do not find any compelling associations. 
  • There is no sub-national level data, and as rainfall is measured at specific locations and has to be aggregated up to the national level, there could be spurious correlation: rainfall could be falling in one region with the conflict going on in a totally different region unaffected by the weather/income shock. This is unfortunate.
  • Much violence in SSA does not involve the state, but other parties, and this will not be captured by the Armed Conflict Data.
  • The rainfall instrument is surprisingly weak, with an F-stat of only 4.5. This can cause problems for efficiency, but also for consistency if there is any measurement error. They do a falsification exercise whereby they use future rainfall in the first stage and find no relationship at all, which is encouraging.
Implications Economic variables are more important determinants of civil war than measures of objective political grievances. This could indicate that a way to reduce the incidence of conflict is to better enable individuals to smooth away weather-related income shocks. This may be possible using informal institutions at the village level (Townsend), but if the weather shock is aggregate, chances are that insurance will be lacking, as there is little evidence of across-village insurance, and even if it existed the shock may be so aggregate that the insurance mechanism does not function. In that event formal state-sponsored insurance, or income transfers, should be made available conditional upon remaining in agriculture, such that the opportunity cost of working in agriculture does not get so high that people are incentivized to take up arms.






C. Criscuolo et al.

A Short Summary 

In a Nutshell

Most governments have industrial policies that claim to foster productivity and employment, but it is not possible to tell whether they are just financing activities that would have been undertaken regardless of state funding. Comparison of outcomes with other firms does not provide a counterfactual, as sorting into the group of firms that receive funding is far from random. This paper uses data from a quasi-experiment in which changes in the eligibility criteria for assistance under the UK Regional Selective Assistance (RSA) programme changed the areas that were eligible to receive funds. Using RSA data they find a large effect on the treated for employment, investment and probability of exit, and these effects are seriously underestimated if endogeneity is ignored. They cannot however rule out the possibility that there are negative aggregate productivity effects from protecting inefficient incumbents.

Further Details

The EU changed the eligibility rules such that areas were subject to different constraints in terms of how much funding they could receive from the government. The scheme generally applied to manufacturing firms that needed funds for capex to create jobs in a viable project. The applicant had to demonstrate need, and to be meeting the other expenses himself or otherwise from the private sector.


Yjt = αDjt + βXjt + ηj + τt + vjt  (4)

Due to data limitations, they aggregate across all plants in the same firm and run the above regression at the firm level. Note however that they also use plant-level data in order to later analyse the area-level impact of industrial policy (thus capturing general equilibrium effects).

Yjt is the outcome of interest for firm j at time t. The authors consider three outcome variables: employment, investment and productivity. Xjt are covariates used as controls that vary depending on the outcome of interest.

Djt is the participation dummy – the authors mainly use a binary indicator reflecting whether the firm received any treatment.

They instrument Djt with Zjt, the level of the maximum investment subsidy (Net Grant Equivalent/NGE) available in the area where the firm’s oldest plant is located (the oldest, since its location decision cannot have been made because of the changes to the EU assistance map). Baseline results use mutually exclusive dummies for each of the different rates.

They instrument for participation in the programme using the changes to the system imposed by the EU.
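Because struggling firms may select into applying, OLS on equation (4) can be badly biased downward, which is what the paper finds; the policy-rule instrument corrects this. A toy sketch of the logic (a single binary eligibility dummy standing in for the NGE-rate dummies, all magnitudes invented), using the simple Wald/IV estimator:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000

quality = rng.normal(0, 1, n)     # unobserved firm health
z = rng.integers(0, 2, n)         # area eligible for a high subsidy rate (EU map change)

# Selection into treatment: eligible areas participate more, weaker firms apply more
d = (0.9 * z - 0.8 * quality + rng.normal(0, 1, n) > 0.5).astype(float)

# Employment outcome: assumed true treatment effect 0.3, but quality confounds OLS
employment = 0.3 * d + 1.0 * quality + rng.normal(0, 1, n)

# Naive OLS slope of employment on participation (contaminated by negative selection)
beta_ols = np.cov(d, employment)[0, 1] / np.var(d, ddof=1)

# Wald/IV estimator: reduced form (outcome by eligibility) over first stage (take-up by eligibility)
wald = ((employment[z == 1].mean() - employment[z == 0].mean())
        / (d[z == 1].mean() - d[z == 0].mean()))
```

Since weak firms are over-represented among participants, the OLS slope here is far below the true 0.3 (even negative), while the eligibility rule, being independent of firm quality, recovers it.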

The data set is a panel combining admin data on the RSA participants, matched to a firm-level database that gives employment, investment and entry/exit information. This means they can track firms before and after participation and compare them to a control group that did not participate.


OLS results indicate a 37% increase in employment, rising sensibly with the level of subsidy. The IV result is much larger, suggesting serious downward bias in the OLS estimates. It is reduced to a much more sensible level when firm-level dummies are included to control for heterogeneity of response to the treatment, but the IV estimates remain much larger than the OLS estimates with the same dummies included.

They do the same for labour productivity (measured as the ratio of gross output to employment) but the effects are small and insignificant.

They find a positive effect on employment at the area level (based on travel-to-work areas), which indicates that there are spillovers from the RSA, and there is also a rise in the number of plants in an area. However, the employment effects come more through incumbents than through new entry, and the RSA dampens reallocation (due to less exit). As there is no productivity effect from receiving RSA, it would appear that the scheme supports less productive firms, which could dampen aggregate productivity (especially since the summary stats show that firms receiving RSA tend to be larger than the control firms).


Positive effect of the programme on investment and employment, but not on productivity. As the RSA helps firms to expand without raising their productivity, this could have a negative impact on aggregate productivity; it seems to be more characteristic of a welfare payment.




M. Greenstone, R. Hornbeck & E. Moretti


Principal Research Question and Key Result Are there economic spillovers that accrue to incumbent plants from agglomeration, and through what mechanisms do such benefits arise? The paper finds positive spillover effects on total factor productivity in incumbent firms, reaching 12% five years after the location of a “million dollar plant” (MDP) in the vicinity. This productivity gain seems to occur only for plants that are close in economic distance, meaning that they share worker flows and employ similar technologies. This is consistent with labour market and knowledge spillovers. There is no evidence for input/output-based spillovers (see theory). 
Theory The entry of a new firm creates spillovers, and this leads to the entry of further firms that want to benefit from these spillovers. This leads to competition for inputs, and hence labour/land/other local input prices rise. This continues until the value of the increased output equals the increased cost of production (as firms are assumed to be price takers and so cannot raise prices). This simple model yields four testable hypotheses:

  1. The opening of a new plant increases TFP of incumbent plants
  2. The increase may be larger for firms that are economically closer to the new plant
  3. Density will increase as new firms move in to take advantage of the spillovers created by the new plant (if they are large enough)
  4. The price of locally supplied factors of production will increase.

What are the possible channels for these spillover effects?

  1. Labour market – the labour market is “thicker” thanks to agglomeration. This can reduce search frictions and improve the match between worker and firm (this implies increased productivity). Alternatively/complementarily, the availability of a larger local workforce reduces the possibility that positions go unfilled (this does not necessarily imply improved productivity, only that posts will be filled).
  2. Transportation costs – transportation costs for local suppliers of intermediate inputs and services will be lower if there is agglomeration and reduces production costs in dense areas.
  3. Knowledge spillovers – the sharing of knowledge and skills through formal and informal interactions may generate production externalities across workers. This may be important particularly in hi-tech areas (think Silicon Valley). This implies increased productivity. However, it is not clear who will gain from this productivity, as the knowledge spillovers may lead to increased investment in new technologies in which case the benefit will accrue to capital, or it may increase worker productivity in which case labour wages will gain.
  4. Amenities – local amenities are valued differently between different types of worker and firms need to be located such as to attract the right kind of worker. This implies no productivity differences between high/low density areas once type of worker is controlled for.
  5. Natural advantages – oil companies near oil fields, wine makers in the Loire. Since most natural advantages are fixed over time this is not relevant for empirical analysis which looks at changes in agglomeration over time.


Motivation Increasingly, local governments compete for big plants to locate in their region by offering generous subsidies. The main economic rationale for doing so is that these plants create agglomeration spillovers which benefit the local economy. Yet we lack any rigorous testing of these effects, and if they are in fact found to be small, then this calls into question the use of taxpayers’ money to finance such subsidies. 
Data They identify 47 usable MDP openings. They combine this with information on incumbent plants in the winning/losing county (of which there had to be at least one pair of incumbents for the MDP to qualify) including capital stocks, materials, value of shipments, etc. from annual manufacturing surveys. The focus on existing plants eliminates problems of endogenous openings of new plants. In order to investigate mechanisms they code variables relating to the % of output sold to manufacturers, % of inputs from the same three-digit industry, labour market transitions between industries, % of patents in each three-digit industry, and R&D expenditure.


Strategy Firms do not locate randomly, so a simple comparison of regions suffers from obvious endogeneity. The authors therefore use an industry property source to compare counties that won an MDP with the counties that narrowly lost out on that same plant, on the assumption that the two are sufficiently similar for the “loser” to serve as a counterfactual for the “winner” – formally, the assumption is that but for the location of the MDP, TFP in the winning and losing counties would have evolved identically. The summary statistics show that observables are generally balanced between winning and losing counties relative to the rest of the USA, and even more balanced between firms in winning and losing counties, which makes the identifying assumption plausible. The authors state that even if it is not perfect, this is still a better method than a simple comparison. The strategy is pretty extensive. The dependent variable is TFP, with output measured as the total value of shipments adjusted for changes in inventories. The right-hand side includes time trend dummies, a dummy equal to 1 if the plant is in a winning county and 0 otherwise, and a dummy that turns on when the MDP is open. Then there is:

α[(Winner)*(Open)] which is the difference-in-differences estimator, with alpha the effect of being in a winning county at a time when an MDP has opened.

β[(Winner)*(Open)*(Trend)] where Trend is the set of time dummies. Beta captures the differential trend in winning counties in the years after the plant opens (i.e. beyond the mean shift picked up when the Open dummy turns on).
They actually do two estimations: a simple DID (which sets the Winner*Open*Trend coefficient to 0), and a full estimation where it is allowed to vary. I.e. the specification allows for both a mean shift and a trend break.

They include plant and industry and region fixed effects.
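The two specifications (mean shift only vs. mean shift plus trend break) can be sketched on synthetic data as follows. This is a hypothetical illustration, not the authors' code: the panel structure, variable names and the effect sizes used in the simulation (a 5% shift and a 1%/year break) are all assumptions.

```python
# Sketch of the paper's two DID specifications on a synthetic plant panel.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
years = np.arange(-5, 6)                            # event time around the MDP opening
df = pd.DataFrame([(p, t) for p in range(200) for t in years],
                  columns=["plant", "t"])
df["winner"] = (df["plant"] < 100).astype(int)      # 100 plants in winning counties
df["open"] = (df["t"] >= 0).astype(int)             # MDP has opened
# Simulated log TFP: a 5% mean shift plus a 1%/year trend break for winners
df["ltfp"] = (0.05 * df["winner"] * df["open"]
              + 0.01 * df["winner"] * df["open"] * df["t"]
              + rng.normal(0, 0.02, len(df)))

# Specification 1: simple DID (Winner*Open*Trend constrained to zero)
did = smf.ols("ltfp ~ winner + winner:open + C(t)", data=df).fit()
# Specification 2: mean shift plus trend break
trend = smf.ols("ltfp ~ winner + winner:open + winner:open:t + C(t)", data=df).fit()

print(did.params["winner:open"])     # simple DID folds part of the trend into the shift
print(trend.params["winner:open"], trend.params["winner:open:t"])
```

The simple DID absorbs part of the growing post-opening gap into its mean shift, which is why the trend-break specification matters when effects strengthen over time, as the notes describe.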


Results Differences in TFP between winning and losing counties in the years before the opening of an MDP are all small and insignificant. In the years following the opening, the coefficients on the year-specific Winner*Open dummies become significant at the 5% level. This reveals a sharp upward break in the difference between the TFP of the counties (in fact a decline in the losing counties against a flattening out in the winning counties – showing the importance of the losing-county counterfactual: without it, negligible results would have been found). The coefficients suggest a mean shift of 4.8% in TFP in winning counties relative to losing counties. Moreover, when the second model with separate time dummies is used, the effect appears to strengthen over time: having an MDP is associated with a 12% increase in TFP in winning counties by year 5 after opening, which confirms the importance of including the trend break. These numbers translate into increases in total output of approximately $170m in year 1 and $430m in year 5. The coefficient on the pre-opening trend is not significant, which lends support to the identifying assumption.

There is significant heterogeneity, and in some specific cases there are negative effects of introducing a MDP.

The increase in TFP in incumbent plants seems to come from incumbent plants producing more with less after the MDP opening.

They investigate the mechanisms at work. When they compare results for incumbents in the same two-digit industry as the MDP the effects are much greater, and no significant effects are found for other two-digit industries. They construct measures of PROXIMITY that capture worker flows, technological proximity and input-output flows, and re-estimate the model with an OPEN*WINNER*PROXIMITY interaction. All coefficients are significant except the flow of goods/services. Thus there is evidence that intellectual or technological linkages and the sharing of workers increase the spillover – i.e. intellectual spillovers.

If these spillovers are big enough there should be new entrants, and indeed a DID with log(plants) as depvar is positive and significant – i.e. new economic activity was attracted. They also find that wages increase as demand for labour increases.


Robustness There could be unobserved productivity shocks coincidental to the opening of the MDP. They do a variety of specification checks; they allow inputs to be endogenous. Also, since MDPs are associated with public good investment etc., this could be what is driving the results, but they find no meaningful relationship between government expenditure in the area and the plant opening. There could have been differential attrition; however, 72% of plants in winning counties remained at the end of the period, against 68% in losing counties – i.e. very similar.


Problems The theoretically correct depvar is quantity, but this is not comparable across plants, so they have to use value measures. In that sense, the results could be reflecting some change in price rather than a change in actual productivity. Do the results hold for smaller plants and plants outside the manufacturing industry?


Implications There do seem to be externalities associated with the location of MDPs. However, it is not clear that this justifies offering generous subsidies. In particular, it may be the case that the plant will locate domestically no matter what, in which case subsidies are wasteful from a national perspective. Even at the local level it is probable that all the gains will be bargained away, and hence they will be zero sum games. However, given that there is significant heterogeneity of effects there could still be a place for subsidies. For example, as spillovers are greatest where there is economic proximity, MDPs should be encouraged to locate where the spillovers will be largest. The MDP may locate only where its profits will be highest (as it does not itself benefit from the spillovers it creates). In this case, some subsidy should be offered to internalize the benefit it provides and in so doing encourage it to locate where the spillovers will be greatest.



Teacher Performance Pay: Experimental Evidence from India – K. Muralidharan & V. Sundararaman

NBER Working Paper No. 15323 (2009) 

Principal Research Question and Key Result Does performance based pay for teachers improve student performance? In an experiment in India, students whose teachers were subject to performance incentives performed between 0.16 and 0.28 standard deviations better than those in comparison schools.


Theory It is not clear that monetary incentives will always align the preferences of the principal and the agent. In some cases they may crowd out intrinsic motivation leading to inferior outcomes. Psychological literature indicates that if incentives are perceived by workers as a means of exercising control they will tend to reduce motivation, whereas if they are seen as reinforcing the norms of professional behaviour then this can enhance intrinsic motivation.

Additionally whether incentives are at a class or school level will be of importance. This is because in the school results model (how schools perform on aggregate) there will be incentives to free ride. This is not the case if incentives operate at the individual teacher level. The problem may be reduced in small schools where teachers are better able to monitor each other’s efforts at a relatively low cost.


Motivation There are generally two lines of thought regarding how to improve school quality. The first argues that increased inputs are needed. This might include textbooks, extra teachers, better facilities etc. The other option is to implement incentive based policies to improve use of existing infrastructure, and perhaps improve individual selection into the teaching sector.


Experiment/Data The experiment took place in Andhra Pradesh which has been part of the Education for All campaign in India, but sees absence rates of around 25% and low student level outcomes. There were 100 control schools, 100 group bonus schools (all teachers received same bonus based on average performance of the school), and 100 individual bonus schools (incentive based on performance of students of a particular teacher). Focussing on average scores ensures that teachers do not just focus on getting those kids near the threshold up, thus excluding less able children. No student is likely to be wholly excluded given the focus on averages. Additionally, there was no incentive to cheat, as children that took the baseline test, but not the end of year test were assigned a grade of 0 which would reduce the average of the class.

A test was administered at the start of the programme/school year which covered material from the previous school year. Then at the end of the programme a similar test was given, with similar content, and then a further test which examined the material from the current school year (that they have just completed). The same procedure was done at the end of the second year. Having overlap in the exams means that day specific measurement error is reduced. The tests included mechanical and conceptual questions.



T_ijkm(Y_n) = α + β·T_ijkm(Y_0) + δ·(Incentives) + γ·Z_m + ε_k + ε_jk + ε_ijk 

T is the test score, where i j k m indicate student, grade, school, and mandal (region) respectively. Y0 indicates baseline tests, and Yn indicates the end of year tests. The baseline results are included to improve efficiency by controlling for autocorrelation between the test scores across multiple years. Zm is a vector of mandal dummies (fixed effects) and standard errors are clustered at the school level.  Delta is the coefficient of interest.
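A minimal sketch of this specification on synthetic data, assuming a treatment effect of roughly 0.2 sd and a school-level common shock (which is why clustering at the school level matters). All names and magnitudes here are hypothetical illustrations, not the paper's data:

```python
# Sketch: end-of-year score on baseline score, incentive dummy and mandal
# fixed effects, with standard errors clustered at the school level.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
rows = []
for s in range(300):                          # 300 schools
    mandal = s % 10                           # hypothetical mandal (region) id
    incentive = int(s < 200)                  # 200 incentive, 100 control schools
    school_shock = rng.normal(0, 0.3)         # common within-school shock
    for _ in range(30):                       # 30 tested students per school
        y0 = rng.normal(0, 1)                                 # baseline score
        y1 = (0.5 * y0 + 0.2 * incentive                      # assumed ~0.2 sd effect
              + 0.1 * mandal / 10 + school_shock + rng.normal(0, 1))
        rows.append((s, mandal, incentive, y0, y1))
df = pd.DataFrame(rows, columns=["school", "mandal", "incentive", "y0", "y1"])

m = smf.ols("y1 ~ y0 + incentive + C(mandal)", data=df).fit(
        cov_type="cluster", cov_kwds={"groups": df["school"]})
print(m.params["incentive"], m.bse["incentive"])
```

Controlling for the baseline score soaks up persistent student-level variation (the efficiency gain the notes mention), while clustering acknowledges that treatment and shocks operate at the school level.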


Results Students in incentive schools scored 0.15 standard deviations higher than the comparison schools at the end of the first year and 0.22 at the end of the second. This averages across maths and languages (disaggregated, the effect for maths was higher). NB whilst the year 1 vs year 0 and year 2 vs year 0 comparisons are valid, technically the comparison of year 2 to year 1 (column three of table II) is not an experimental estimate, as year 1 results are themselves post-treatment outcomes.

They examine heterogeneous treatment effects by interacting relevant variables with the INCENTIVE dummy, and find that none of them (no. of students/school proximity/school infrastructure/parental literacy/caste/sex) sees differential effects from the programme, indicating that the benefits are widely based and not conditional on a set of predetermined characteristics. The only interaction for which there is a small effect is household affluence. These then are broad-based gains. As the variance of test scores within individual schools went up, teachers may have responded differently; but there appear to have been no barriers preventing all types of children and schools from benefiting from the programme (no heterogeneous effects).

When they include teacher characteristics such as education and training, they see no significant effect, but when they interact these measures with the INCENTIVES dummy the interactions are positive and significant, indicating that high-quality teachers alone may not be sufficient if they are not incentivized to use their skills to maximum effect.

Teachers that were paid more responded less, presumably because they are more experienced (and so less open to change) and because the bonus represented a smaller fraction of their total income.

Happily the results were similar for both the conceptual and mechanical questions, indicating that real learning is taking place, rather than just rote reproduction. Additionally students in incentive schools performed better in non-incentive subjects like science. NB it is possible that teachers diverted energy from teaching non-incentive subjects to teaching incentive subjects for obvious reasons. This result does not disprove that, but it says that in the context studied improvement in teaching in certain subjects can have spillovers into other subjects.

Both group and individual incentives were effective. However, school size was typically between 3 and 5 teachers, probably too small to separate the two effects. Group incentives may not work in larger schools.

Interestingly there was no increase in teacher attendance. In interviews after the experiment teachers said they gave extra classes, and were more likely to have set and graded homework.

  • They tested the equality of observable characteristics across the control/treatment groups and could not reject the null that they were equal indicating that randomization was successful. Additionally, all schools (including control) were given the same information and monitoring, to ensure that differences in the treatment were not merely due to the Hawthorne effect.
  • There was no significant difference in attrition, and the average teacher turnover was the same across schools indicating that there was no sorting of teachers into the incentive schools.
  • They control for school and household characteristics which does not change the estimated value of delta, thus confirming the randomization.
  • A parallel study provided schools with money to purchase extra inputs, and the incentive levels were set such that they came to a similar amount of funds as the input schools. The input schools did see a positive effect, but to a much lesser degree. Additionally, the incentive programme actually ended up costing much less.


Interpretation Programme design is extremely important. In particular how the teachers feel about incentives may affect performance, and the size of schools may mean that benefits from group incentives are not seen due to the ability of teachers to freeride on the back of their colleagues.

Given that the study was compared with an input study in the same region and found improved results, it would seem that funding should be allocated to incentive schemes rather than input schemes. In addition, rather than raising pay by 3% each year, that 3% could be allocated to the bonus scheme, and thus it would actually cost virtually nothing to run (other than administering the tests etc.). However, a mix of policies is probably a good idea, especially since the incentive scheme did not improve absence rates. As other literature has shown, improving infrastructure etc. can lead teachers to be present more, so this could be one option for the input schemes.




Market failures associated with public goods mean that education/health etc. is generally provided by the state. Yet public sector workers need to be incentivized to do good work. There are lingering questions about how to prevent absenteeism, inspire effort etc. The danger is that without correctly aligning incentives investment in infrastructure may be useless. Essentially it is a principal-agent problem; the effort of government workers is only imperfectly observed by proxy. Incentives therefore need to be based upon what the agent cares about in order to align with the principal (i.e. getting money, or avoiding censure).


Missing in Action: Teacher and Health Worker Absence in Developing Countries N. Chaudhury, J. Hammer, M. Kremer, K. Muralidharan & F.H. Rogers (Journal of Economic Perspectives 2006)


In a Nutshell

This paper formulates the problem nicely. In a cross-country survey they find 19% of teachers and 35% of health workers are absent, and these rates tend to be higher in poorer areas. Higher-ranking workers are more likely to be absent. Absence is not strongly affected by wages but is affected by physical infrastructure, which indicates that workers are unlikely to be fired for absence, but that their decisions whether to attend are affected by the physical conditions under which they work. Additionally, the survey reported that absence is not driven by the same individuals always being absent, indicating that this is not a case of bad apples but a system-wide problem.

There are certain structural issues with service provision in developing countries. Firstly, the system is often highly centralized, which does not allow for much local monitoring. Salaries are determined by seniority, which leaves little scope for performance based pay. Wages are not typically responsive to local labour market conditions, and are compressed relative to the private sector. Disciplinary actions for absence are often missing. Additionally, a variety of informal service providers have arisen, often operated by the same government workers – e.g. teachers offering tutoring, and health workers running private practices.

Correlation analysis across countries gives some indication of what is driving absence. In particular, status and poor infrastructure seem to be correlated with higher absence. The literacy rate of parents is associated with lower absence (perhaps through better monitoring, demand etc.). Having been inspected recently leads to lower absence.  Higher salaries decrease absence.

This is suggestive of the following policy priorities: increase local control, improve civil service sector, upgrade facilities, performance related pay. Not all of these are politically viable as the civil service tends to be a well-organized interest group. Additionally, as the poor and those receiving services are a disparate group they may suffer collective action problems in achieving better provision.



Addressing Absence A. Banerjee & E. Duflo

In a Nutshell

This paper looks at evidence from randomized control trials that seek to provide incentives to service providers with a variety of mechanisms:

  1. External Control: external control is when someone who has no stake in the performance of the service being delivered has the job of monitoring performance and basing reward/punishment incentives on the monitored performance. This could be a direct measure (such as recording presence/absence) or a more indirect measure such as test scores. In Duflo & Hanna treatment schools were given cameras and teachers had to take photos at the beginning and end of the day, with the date imprinted on each photo. They were rewarded for being present more than 21 days in a month, and penalized if not. This resulted in increased teacher attendance – absence dropped from 36% to 18% in the treatment schools. This did not necessarily indicate that the teachers were actually teaching. The benefit of this programme is that monitoring was impersonal; there was no scope for head teachers etc. to cheat the system. In the long run, however, in a non-experimental setting, even with impersonal monitoring such as this, head teachers have to be willing to apply the reward/punishment structure, which is not a given.
  2. Rewards for performance rather than presence: Prizes were awarded for good exam results. The treatment group saw an increase in results, but no effect on absenteeism. Rather teachers held more preparation sessions. This indicates that such programmes will not be effective to increase attendance, although they may be useful in conjunction with other measures as there was an effect on the outcome of interest.
  3. Beneficiary control over Service Providers: give greater control to the beneficiaries. This is based on the view that recipients should be at the centre of service provision. There needs to be a demand for the service, and a mechanism by which beneficiaries can really affect performance – this is rarely the case as they do not generally have the power to hire/fire nor set salaries. Experiments in this area have yielded disappointing results. An experiment that asked a local to monitor presence of a health worker did not improve attendance, and a school committees experiment had similarly lackluster results. It is suggested that in many settings beneficiaries are not actually upset about the state of service provision – they have low expectations and as a result have little desire to invest time and energy into making better services. [See paper below for rebuttal]. This indicates that increasing demand for quality service may be a key way to get better outcomes.
  4. Demand side interventions: an incentives-to-learn initiative in Africa, whereby the best performing students were given a scholarship for the following two years, increased the presence of both teachers and pupils in the treatment group. Why there was an effect on teachers is not clear; they may have been inspired by the increased attention of the students, they may have enjoyed higher status when one of their students won the scholarship, parents may have become more serious about education when there was a financial incentive to do so, etc. Interestingly, the effect was also present for boys, who were not eligible to win the scholarships.

All of this suggests that some combination of programmes may be effective. Raising demand can be a good way to increase outcomes, and also to generate an environment in which local monitoring will be effective – a virtuous circle. Also, incentivizing teachers’ presence and performance might be a good way to increase attendance, and effort exerted in delivering the service.


Power to the People: Evidence from a Randomized Control Field Experiment on Community-Based Monitoring in Uganda M. Bjorkman and J. Svensson


In a Nutshell

Under the right conditions community based monitoring can be effective. Community based monitoring groups were set up to monitor local health providers. NGOs assisted in forming the groups and facilitating a discussion about what the people wanted from their health service, and drawing up a plan to improve them. This was then discussed with the health provider and a sort of contractual plan was drawn up. Under these circumstances they find a significant relationship between the degree of community monitoring and health utilization and health outcomes. The reason they theorize they found results where others have failed (see above) is that there is a lack of relevant information that prevents benefits from general community monitoring. Thus, as the community group was given access to a large amount of information, including local health data outcomes, and information about what might be expected from health providers, they were better able to come to an agreement on what services should look like, and hence more able to effectively monitor. In sum, a lack of information and failure to agree on expectations of what it is reasonable to demand from a provider was holding back individual and group based enforcement.

Treatment times fell, child mortality fell nearly 50%, and people were more likely to use the health facilities.

This paper effectively increases demand for better service provision, and also provides a mechanism for a community to achieve that level of service.



S. Jayachandran (2008)

Principal Research Question and Key Result Does the ability of teachers to offer paid tuition outside of school alter their incentives to deliver the in-school service? The results indicate that tutoring has a negative effect on test scores, which suggests that being able to offer tutoring gives teachers perverse incentives during the school day.  
Theory There are two theoretical links from tutoring to achievement. Student achievement is a function s(m, t), i.e. a function of the material taught in school (m) and tutoring (t):

  1. If tutoring and school are substitutes then ∂²s/∂m∂t = s_mt < 0: the value of tutoring increases when less material is taught during the school day. This implies that a teacher can raise demand for tutoring by decreasing the amount of teaching during the school day.
  2. If tutoring and school are complements then s_mt > 0: this would hold if there were some threshold level of achievement students were trying to reach, so that students just shy of the threshold benefit from tutoring. This would incentivize the teacher to teach more material, so that more students are able to get close to the threshold level.
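The sign of the cross-partial can be checked symbolically with toy achievement functions. Both functional forms below are illustrative assumptions, not from the paper:

```python
# Toy check of the substitutes/complements condition on s(m, t).
import sympy as sp

m, t = sp.symbols("m t", positive=True)

# Substitutes example: achievement depends only on total material covered,
# so extra schooling lowers the marginal value of tutoring (s_mt < 0).
s_sub = sp.sqrt(m + t)
print(sp.simplify(sp.diff(s_sub, m, t)))     # negative for all m, t > 0

# Complements example: schooling and tutoring reinforce each other (s_mt > 0).
s_comp = sp.sqrt(m * t)
print(sp.simplify(sp.diff(s_comp, m, t)))    # positive for all m, t > 0
```

In the substitutes case a teacher can raise the marginal value of tutoring by cutting m; in the complements case cutting m would lower it, so the perverse incentive disappears.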

The utility to the teacher depends both on his profit and on the cost of raising/lowering the amount taught. This implies a tradeoff between the cost of changing m and the benefit of the higher profits induced by changing m.

Given that the results indicate that tutoring and schooling are substitutes, policies that restrict the provision of t may increase the amount of m; increasing the cost of lowering m (for example through stricter supervision) could also be welfare creating.

Another possibility is to increase the number of third-party tutors. If such tutors offer a higher-valued service (for example through smaller tutoring groups), then the teacher will be incentivized to teach more during the school day, since if less is taught some students will be diverted to the higher-quality third-party tutor. The increased competition will reduce the cost of tutors, more people will take up tutoring, and everyone enjoys the benefit of being taught more during the school day. The reason this holds is that the teacher has less incentive to manipulate m when only some of the students thereby induced to purchase tutoring will buy it from him.


Motivation In the developing world many students attend outside tutoring sessions and it is common for the student’s own teacher to also serve as the tutor. This is not common in the developed world. This could be because there is a lower opportunity cost of time due to income effects in the developing world. It could be that there is smaller supply of educated non-teachers who can serve as tutors. Also, less effective means of monitoring teachers by supervisors and parents may increase the ability to rent seek by teachers thus increasing their interest in providing tutoring – this might incentivize teachers to avoid teaching the curriculum in schools in order to generate demand for their fee-generating tutoring classes. If this is the case then all students are made worse off (by less formal education), but those who are hit the most are those who are unable to afford (or otherwise do not demand) tutoring. As such, rather than making the education sector more efficient (by improving access to education for weaker students/those who demand more education) it may actually create inefficiencies. In this case banning teachers from tutoring, or reducing the barriers of entry for third party tutors could be welfare increasing for all students even for those who do not take up outside tutoring. 
Experiment/ Data Data are from a large nationwide survey of students, schools, teachers and families conducted in Nepal. There are 3850 public schools and 890 private schools in Nepal. Students who have completed year 10 and taken the national exam for which the results are recorded are the focus. A random sample of schools is chosen. There are demographic details of the schools, as well as data on whether the student took tutoring, and subjective measures of the quality of school teaching. 

ExamScore_ijk = β·Offers_jk + θ·Takes_ijk + λ_i + ρ_k + ε_ijk                (1) 

This is a fixed effects model: i indexes the individual, j the school and k the subject. Offers is a dummy that equals 1 if the school offers tutoring in that subject. The effect is thus identified by comparing subjects within a school. Making the estimation within-school reduces endogeneity concerns, such as schools with more resources or brighter students providing more (or less) tutoring.

To test the notion that there is negative selection into Offers, a regression with Offers as the dependent variable is run on prior exam scores, with coefficient σ on PriorExamScore_ijk (2). If sigma is negative this implies that selection into offering tutoring is negative, as higher prior achievement is associated with a lower probability of tutoring being offered.

A DID specification is also estimated:

ExamScore_ijk = β·Offers_jk + θ·Takes_ijk + τ·(Public_j * Offers_jk) + λ_i + ρ_k + ε_ijk                  (3) 

Where tau is the differential effect that offering tutoring has in a public school (as opposed to a private school). The assumption is that the unobservable elements that encourage selection into OFFERS are the same across private/public schools. An interaction between Public*Takes is also included.
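Specifications (1) and (3) can be sketched on synthetic data as follows. This is a hedged illustration: the data-generating process, variable names and effect sizes are all assumptions, not the paper's data.

```python
# Sketch: within-school identification with student and subject fixed effects,
# then the DID version adding the Public*Offers interaction.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
rows = []
for school in range(120):
    public = int(school < 80)                     # 80 public, 40 private schools
    # Whether tutoring is offered varies at the school-subject level
    offers_k = {k: int(rng.random() < 0.5) for k in range(4)}
    for student in range(8):
        sid = school * 8 + student
        ability = rng.normal(0, 1)                # absorbed by the student FE
        for k in range(4):
            offers = offers_k[k]
            takes = offers * int(rng.random() < 0.4)
            score = (ability + 0.1 * k            # subject difficulty (subject FE)
                     - 0.15 * offers * public     # assumed harm in public schools
                     - 0.10 * takes
                     + rng.normal(0, 0.5))
            rows.append((sid, school, k, public, offers, takes, score))
df = pd.DataFrame(rows, columns=["student", "school", "subject",
                                 "public", "offers", "takes", "score"])

# Specification (1): student FE absorb ability, subject FE absorb difficulty
fe = smf.ols("score ~ offers + takes + C(student) + C(subject)", data=df).fit(
        cov_type="cluster", cov_kwds={"groups": df["school"]})

# Specification (3): adding the Public interactions (the public main effect is
# constant within student, so it is absorbed by the student fixed effects)
did = smf.ols("score ~ offers + takes + offers:public + takes:public"
              " + C(student) + C(subject)", data=df).fit(
        cov_type="cluster", cov_kwds={"groups": df["school"]})
print(fe.params["offers"], did.params["offers:public"])
```

Because Offers varies across subjects within a school, the effect is identified off within-student, cross-subject comparisons, which is exactly the within-school logic the notes describe.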


Results The results from (1) are negative for Offers, but not really significant. Takes is negative and significant. This indicates that worse students may be selecting into tutoring classes. Whilst some endogeneity is removed by looking within school, it is still possible that whether the school offers tutoring in a specific subject is driven by individual student/teacher ability in that particular subject. Thus, the negative coefficient on Offers could just reflect negative spillovers from the school having to offer tutoring in the first place (due to low-quality students). This is partially rebuffed by the results of the Offers regression (2), which shows no relationship between offering tutoring and past achievement, although this is analyzed on a non-random subsample of the data.

The results of (3) are that tau is negative and significant, indicating that when tutoring is offered students in public schools are differentially more likely to fail the exam (presumably because private school teachers are less able to vary the amount of material taught during the school day, due to better monitoring/financial incentives). The Takes*Public interaction is positive, indicating that selection is negative (I don’t get this bit). The effect is larger when the sample is restricted to small towns, as the school is more likely to behave like a monopolist with control over both schooling and tutoring. In urban areas there is likely to be more competition.

Using whether the teacher completed the in school curriculum as the dependent variable, it is shown that the coefficient on OFFERS is negative, indicating that offering tutoring may be incentivizing teachers to teach less.

  • Uses different samples.
  • Test alternative hypotheses: could be mechanical fatigue, but the relationship between teacher effort and offers is only marginally significant and negative.


  • If preventing teachers from tutoring decreases wages in the educational sector sufficiently, this may have the effect of dissuading talented teachers from entering the profession, and in the long run this could damage the education sector and be welfare reducing for all students.
  • The subjective measures were based on post exam reflections which could indicate recall bias/be affected by personal feelings toward the teacher.
  • The DID estimation may estimate the differential effect of tutoring in public schools but it cannot speak to the direction of causation. It is still possible that results are driven by negative spillovers from tutoring provision i.e. negative selection.


Implications One reason for poor educational outcomes in developing countries could be that teachers lack strong performance incentives. This could indicate that a partial ban on teachers’ tutoring, or encouraging third-party entrants, could be welfare improving, although this will depend on how people sort into those professions. Additionally, there may be political constraints that prevent this course of action, as civil service teachers tend to be well unionized and a politically visible component of society. That private schools perform better could be an indication that performance pay, or increased monitoring by parents due to a financial stake in the education provided, could be useful for increasing test scores. The results could have implications for other sectors. In particular, health workers with a private practice on the side may face very similar incentive structures. In fact, the incentives may be even stronger, as only one patient observes the outcome of their effort, whereas in a school potentially many students/parents observe the outcome. This could mean that the cost of varying m for health workers is much lower (as detection is harder), and hence they are more likely to do so in order to increase extractable rents.



What is the cost of Formality? Experimentally estimating the demand for formalization, S. de Mel, D. McKenzie & C. Woodruff (Journal of Economic Literature)

In a Nutshell

One of the major constraints on the ability of developing nations to raise tax revenues is the large share of the economy that is informal and hence outside of the tax system. There are two broad theories as to why firms are informal. The first, associated with De Soto, claims that firms are informal because of overly burdensome entry regulations to becoming formal. The second argues that entrepreneurs weigh the costs of formality (registration costs, taxation) against its benefits (access to banks, courts, and other public goods) and make a rational choice. This implies that as firms grow, they become more likely to benefit from formal institutions and hence more likely to become formal. The question is important because governments want to encourage formality in order to increase tax receipts, so understanding what encourages formality is key.

In order to provide evidence for the debate the authors conduct an RCT in Sri Lanka, whereby one group of firms was given information about the benefits of formality and an offer of a refund of the registration fee. Three other groups were given the same information plus progressively larger rewards for registering within one month.

They found no effect of the information-only treatment relative to the control, and progressively more firms registered as the reward increased. By comparing firms in two regions where registration differs in ease (in terms of time) and cost, they are able to show that there was more formalization where the process was easy, but that the difference in costs played no part. In the absence of a monetary incentive to formalize, firms chose to remain informal. This is most consistent with the rational-choice model of informality rather than the De Soto view. Additional evidence for this is that larger firms (for whom formality would be most expensive) were also much less likely to have formalized at all experimental incentive levels. The fact that more firms formalize when the reward increases indicates that some financial gain is needed in order to offset the cost of being formal.
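The rational-choice view can be sketched as a simple threshold rule. The numbers below are entirely hypothetical (the benefit level, tax rate, and firm profits are invented for illustration, not taken from the paper), but the sketch reproduces the two qualitative findings: more firms formalize as the reward rises, and the largest firms (for whom formality is most costly) formalize last, if at all.

```python
# Illustrative sketch of the rational-choice model of formality.
# A firm formalizes when benefits of formality plus the experimental
# reward exceed the cost of formality. All numbers are hypothetical.

def formalizes(profit: float, reward: float) -> bool:
    """Benefits of formality (bank access, courts) are assumed roughly
    fixed; costs (tax liability) are assumed to scale with profit."""
    benefits = 20.0 + reward      # hypothetical fixed benefit + reward
    costs = 0.3 * profit          # hypothetical tax burden of formality
    return benefits > costs

profits = [50.0, 100.0, 200.0, 400.0]   # hypothetical firm sizes
for reward in (0.0, 30.0, 100.0):
    n = sum(formalizes(p, reward) for p in profits)
    print(f"reward={reward:.0f}: {n} of {len(profits)} firms formalize")
```

Under these invented parameters the count of formalizing firms rises with the reward, while the largest firm never formalizes at any offered reward, mirroring the pattern in the experimental data.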

Whilst this is not an experiment that naturally lends itself to policy recommendations, it suggests that in order to increase formality the following steps might be taken:

  • Reduce the time burden of registering, and streamline the process
  • Reduce the costs of formality – as the article below makes clear, corporation tax is high in the developing world, making formality disproportionately expensive.
  • Improve land rights – in the follow up study the researchers asked why certain firms had not registered even though they wanted to, and it was stated that they did not have land rights to their place of business (as it was on public/church land etc.), and this meant that they were unable to register the business.



Tax Structures in Developing Countries: Many puzzles and a possible explanation, R. Gordon & W. Li (Journal of Public Economics)

In a Nutshell

Tax structures in the developing world are systematically different from those in the developed world. Developing countries collect less overall revenue and less income tax, and rely more on corporation tax as well as consumption and production taxes. They also have higher tariff levels, and inflation is higher (a tax on savings). In other words, they tend to tax most heavily in the most distortionary areas.

There are several plausible reasons for this:

  • People in developing countries do not value public goods in the same way as in the developed world, so taxes are of lesser importance – this hardly seems likely given the public infrastructure needs in these regions.
  • They have different public attitudes to redistribution – this is possible, although unlikely (and uninteresting).
  • They face constraints on their ability to collect taxes effectively – this is the best candidate.

The mechanism theorized in this paper is that they face constraints due to the large informal economy. The government relies on access to bank information in order to correctly tax activity, so firms are only subject to taxes when they choose to make use of the financial sector. When taxes are high enough, many firms will opt for informality. This mechanism has little effect in the developed world, where the benefits of using the financial system are high, but it may be of great importance in the developing world, where underdeveloped financial systems offer the entrepreneur little benefit to offset the income lost through the taxation that follows from use of the financial system.

This explains why there are differential VAT rates on firms that find it difficult to be informal (as they have a greater tolerance of tax), why tariffs are often used (imports can be easily identified, whereas VAT payments on the finished product are easily obscured), and why consumption taxes are high.

The key take away is that it is important to understand why tax systems are so different in the developing world, before making concrete recommendations about how to reform them.


Income Inequality and Progressive Income Taxation in China and India, 1986-2015, T. Piketty and N. Qian (American Economic Journal: Applied Economics, 2009)

In a Nutshell

Income taxation can increase revenues and is less distortionary and regressive than taxes on consumption and production. In China, virtually no one was initially subject to income tax due to high exemption levels. As income per capita has risen, the exemption levels have remained fairly constant, which has meant that many more people have been drawn into the income tax net as the nation’s wealth has grown. The taxable population has risen from 0.1% to around 20%, and income tax now accounts for around 2.5% of GDP.

The situation in India is very different. Due to frequent updating of the exemption thresholds, the taxable population has stagnated at around 2-3%, and income tax represents a tiny 0.5% of GDP. However, one driver of this may be that the proportion of formal wage earners in India is very low (so leaving the exemption thresholds unchanged would increasingly penalize a relatively small group of formal wage earners).

Moving from an elite income tax to a more broad-based and progressive system is exactly the type of fiscal modernization followed by Western countries in the early 20th century. This implies that development assistance could be directed at improving fiscal systems.