Notes from Lecture and Various Papers 

Instrumental Variables

Instrumental variables are used when OLS estimates are biased by endogeneity or measurement error. The process is based upon identifying exogenous variation in the key independent variable.


I’m not going to go into how the IV estimator is constructed as it is well documented in EC406 notes, or see e.g. Stock and Watson.


If the regression is overidentified (i.e. there are more instruments than endogenous regressors) then a Hansen-Sargan test can be used to test the overidentifying restrictions, and hence the exogeneity of the instruments, although the instruments will pass the test if they are all equally endogenous, i.e. it is a weak test. As a rule of thumb the first-stage F-statistic on the excluded instruments should be greater than 10, and there should be strong theoretical reasoning behind the instrument (such that the “compliers” are meaningfully identified).
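As a rough illustration of these diagnostics, here is a minimal sketch on simulated data, assuming one endogenous regressor and two valid instruments. All variable names and parameter values are made up for the example; it is hand-rolled 2SLS with numpy rather than any particular package's routine.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Simulated data (all values are illustrative assumptions).
z1 = rng.normal(size=n)                      # instrument 1
z2 = rng.normal(size=n)                      # instrument 2
u = rng.normal(size=n)                       # unobserved confounder
x = 0.6 * z1 + 0.6 * z2 + u + rng.normal(size=n)   # endogenous regressor
y = 1.0 * x + 2.0 * u + rng.normal(size=n)         # true effect of x is 1.0


def ols(X, y):
    """Return the OLS coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


const = np.ones(n)
X = np.column_stack([const, x])
Z = np.column_stack([const, z1, z2])

# OLS is biased because x is correlated with u.
beta_ols = ols(X, y)[1]

# First stage: regress x on the instruments and compute the F-statistic
# for the joint significance of the excluded instruments.
x_hat = Z @ ols(Z, x)
rss_unrestricted = np.sum((x - x_hat) ** 2)
rss_restricted = np.sum((x - x.mean()) ** 2)   # constant-only model
n_excluded = 2
f_stat = ((rss_restricted - rss_unrestricted) / n_excluded) / (
    rss_unrestricted / (n - Z.shape[1])
)

# Second stage: regress y on the fitted values of x.
coef_iv = ols(np.column_stack([const, x_hat]), y)
beta_iv = coef_iv[1]

# Sargan statistic: n times the uncentred R-squared from regressing the
# 2SLS residuals (computed with the actual x, not x_hat) on the instruments.
resid_iv = y - X @ coef_iv
fitted = Z @ ols(Z, resid_iv)
sargan = n * (fitted @ fitted) / (resid_iv @ resid_iv)

print(f"OLS: {beta_ols:.3f}  IV: {beta_iv:.3f}")
print(f"first-stage F: {f_stat:.1f}  Sargan: {sargan:.3f}")
```

With one overidentifying restriction the Sargan statistic is compared against a chi-squared distribution with one degree of freedom (5% critical value of about 3.84). Note that the second-stage standard errors from this two-step shortcut would be wrong, which is one reason to use a dedicated IV routine in practice.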


In the spatial context, spatially lagged X variables have often been used as instruments for the spatial lag of Y. However, as we have already seen, this method is not without its complications (the functional form must be correctly specified, and the exogeneity restriction may be violated). Thus, the literature has begun to move toward the quasi-experimental approach of searching for instruments based on policy changes, boundaries, geological features and similar events.


Some examples


Hoxby, Does Competition among Public Schools Benefit Students and Taxpayers? (2000)

This is surely one of the most famous examples of a spatial IV. The paper examines whether increased school competition, in the form of a greater number of school districts within a metropolitan area, has benefits for the population studied. OLS estimates are biased because the supply of school districts is in part a response to the demand for school districts, which is probably driven by wealth, ability, parental involvement, and other unobservable characteristics that codetermine student outcomes and cannot be readily controlled for. Thus, Hoxby uses an instrument to attempt to isolate exogenous variation in the supply of school districts, to get a consistent estimate of the effect competition has on student outcomes. The instrument is based on the number of streams and rivers within a metropolitan area. The logic is that in the 19th century, when school districts were being drawn up, natural features such as streams presented barriers to movement, such that districts were often drawn up with the streams forming natural boundaries. Thus, a metropolitan area with more streams would have more school districts, hence the instrument is relevant. Over time, the importance of streams in terms of determining outcomes has diminished, so the presence of more streams has no effect on educational outcomes other than through its effect on determining school districts in the 19th century, and hence the exclusion restriction is satisfied.


There are problems with the strategy. Specifically, rivers may still have an economic effect today, and this could feed back into educational outcomes. Additionally, the way the instrument was constructed has been criticized, as it was subject to much subjective judgement.


Luechinger, Valuing Air Quality Using the Life Satisfaction Approach (2009)

This paper tries to gauge how important air quality is for affected populations. The hedonic method of valuation (which seeks to determine the unobserved price of a public good by using prices embedded in private goods) tends to underestimate the value of air quality, as migration is costly and private goods prices are based on perceived rather than objective risk. Any residual effect that air pollution has on life satisfaction is an indication that compensation has not been fully capitalized in house prices, for the reasons just stated.


However, an OLS estimate of the effect of air quality on life satisfaction would be biased, as cleaner air is the product not only of policy change (even assuming that it is exogenous), but also of local industrial decline and economic downturn. These simultaneous developments can have a countervailing effect on life satisfaction and housing rents. Thus he uses an instrument for SO2 levels: the mandated installation of scrubbers at power plants.


The construction of the instrument is somewhat convoluted, as it relies upon a difference in difference estimation. Desulphurization pursuant to the retrofitting of scrubbers at power plants is the treatment, with a county being downwind or upwind of the power plant determining assignment to the treatment and control group respectively. Yet, as being in treatment/control is a question of degree rather than kind, the treatment group variable is a frequency measure of how often in the period of study the county in question is downwind of the plant. This is then multiplied by a distance decay function, and the pre-desulphurization emission levels of the plant in question are controlled for.
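To make the construction concrete, here is a minimal sketch of a continuous treatment-intensity variable of this kind. The exponential half-distance decay, the function names and all parameter values are my own assumptions for illustration; the paper's actual functional form is not reproduced here.

```python
def treatment_intensity(downwind_share, distance_km, half_distance_km=50.0):
    """Hypothetical exposure of a county to a plant.

    downwind_share: fraction of the study period the county is downwind (0 to 1).
    distance_km: distance between county and plant.
    half_distance_km: assumed distance at which exposure halves.
    """
    decay = 0.5 ** (distance_km / half_distance_km)
    return downwind_share * decay


def instrument(downwind_share, distance_km, scrubber_installed):
    """Difference-in-difference style instrument: exposure switched on
    only once the plant has been retrofitted with a scrubber."""
    post = 1.0 if scrubber_installed else 0.0
    return treatment_intensity(downwind_share, distance_km) * post


# A county downwind half of the time, 50 km from the plant.
before = instrument(0.5, 50.0, scrubber_installed=False)
after = instrument(0.5, 50.0, scrubber_installed=True)
print(before, after)
```

The point of the sketch is only that the instrument varies both across counties (through wind frequency and distance) and over time (through the retrofit date), which is what gives the first stage its difference in difference flavour.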


The main finding is that SO2 concentration does negatively affect life satisfaction, with estimates being much larger for the IV specification than for OLS, indicating that reductions in sulphur levels are indeed accompanied by factors that have a countervailing effect on satisfaction.


Gibbons et al. Choice, Competition and Pupil Achievement (2008)

This paper uses a boundary discontinuity in order to construct an instrument for primary school competition in the UK which gets around the endogeneity concern in OLS estimates, namely that motivated parents may move closer to popular schools. The boundaries in question are the Local Education Authority boundaries. Whilst families are allowed to make application to schools outside of their LEA, cross-LEA attendance is extremely uncommon.


They construct indices for choice and competition. For each school they define a travel-to-school zone that encompasses all residential addresses that a) are within the same LEA as the school and b) lie within a circle around the school whose radius is the median travel-to-school distance for the pupils at that school. Pupil choice is thus the number of travel-to-school zones in which the pupil lives, and the school competition measure is the average of this value across the pupils actually attending a given school (i.e. the number of alternatives available to the pupils of a particular school). If families sort spatially near to high performing schools this will tend to decrease apparent competitiveness.
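The two indices can be sketched as follows on a toy dataset. The coordinates, LEA codes and radii are invented, and each school's radius (the median travel distance of its pupils) is taken as given rather than computed, so this is an illustration of the definitions rather than the authors' procedure.

```python
from math import hypot
from statistics import mean

# Toy data: all values are made up for illustration.
schools = {
    "A": {"xy": (0.0, 0.0), "lea": 1, "radius": 2.0},
    "B": {"xy": (3.0, 0.0), "lea": 1, "radius": 2.0},
    "C": {"xy": (1.5, 0.5), "lea": 2, "radius": 5.0},
}
pupils = [
    {"home": (1.5, 0.0), "lea": 1, "school": "A"},
    {"home": (-1.0, 0.0), "lea": 1, "school": "A"},
    {"home": (4.0, 0.0), "lea": 1, "school": "B"},
]


def in_zone(pupil, school):
    """A home is in a school's zone if it is in the same LEA and within
    the school's median travel distance."""
    dx = pupil["home"][0] - school["xy"][0]
    dy = pupil["home"][1] - school["xy"][1]
    return pupil["lea"] == school["lea"] and hypot(dx, dy) <= school["radius"]


def choice_index(pupil):
    """Number of travel-to-school zones that contain the pupil's home."""
    return sum(in_zone(pupil, s) for s in schools.values())


def competition_index(school_name):
    """Average choice index of the pupils actually attending the school."""
    return mean(choice_index(p) for p in pupils if p["school"] == school_name)


choices = [choice_index(p) for p in pupils]
print(choices)                     # one value per pupil
print(competition_index("A"), competition_index("B"))
```

School C never counts towards any pupil's choice set here because it sits in a different LEA, even though its circle covers all three homes; that is the boundary effect the identification strategy relies on.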


They then exploit the fact that families living near boundaries face longer journeys to school than those in the interior, and as such they are more likely to attend their local school. This is because the catchment area is bounded and hence shrinks. Thus the distance between a pupil’s home and the LEA boundary is an instrument for school choice, and the distance between a school and the boundary is an instrument for competitiveness. They do not find evidence that school competition increases pupil achievement.


Differencing Methods

Often there will be spatial sorting and heterogeneity, i.e. differences between places that lead to biased estimates. This sorting will often be on observable characteristics, but just as frequently on unobserved characteristics.


One method for dealing with this is the fixed effects model. This can be estimated with panel or cross sectional data using area dummies, or by making the within groups transformation (de-meaning) and then estimating with OLS. This removes the area specific time invariant determinants of the dependent variable.
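The equivalence between area dummies and the within-groups transformation can be checked on simulated data. This is a minimal sketch with made-up parameter values, in which the regressor is correlated with an unobserved area effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n_areas, n_periods = 200, 5

# Simulated panel (all values are illustrative assumptions).
area = np.repeat(np.arange(n_areas), n_periods)
alpha = rng.normal(size=n_areas)                   # unobserved area effect
x = alpha[area] + rng.normal(size=area.size)       # x correlated with alpha
y = 2.0 * x + 3.0 * alpha[area] + rng.normal(size=area.size)  # true slope 2.0


def slope(x, y):
    """Slope from a bivariate OLS regression of y on x with a constant."""
    x_c = x - x.mean()
    return (x_c @ (y - y.mean())) / (x_c @ x_c)


def demean_by(v, groups):
    """Within-groups transformation: subtract each group's mean."""
    group_means = np.bincount(groups, weights=v) / np.bincount(groups)
    return v - group_means[groups]


# Pooled OLS ignores the area effect and is biased.
beta_pooled = slope(x, y)

# Within estimator: OLS on de-meaned data.
beta_within = slope(demean_by(x, area), demean_by(y, area))

# Least squares dummy variables: x plus one dummy per area.
dummies = (area[:, None] == np.arange(n_areas)).astype(float)
beta_lsdv = np.linalg.lstsq(np.column_stack([x, dummies]), y, rcond=None)[0][0]

print(f"pooled: {beta_pooled:.3f}  within: {beta_within:.3f}  LSDV: {beta_lsdv:.3f}")
```

The within and dummy-variable estimates coincide up to numerical precision, while the pooled estimate picks up the area effect.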


With panel data the observations can instead be differenced over time, which has the same effect. Time dummies can also be included to strip out variation common across regions due to time trends. The remaining variation is time-varying and region specific, so for the estimates to be unbiased the regressors of interest must be uncorrelated with any region-specific time-varying shocks left in the error term. For example, there could be no sudden shock to the educational system in a given area that induced people to sort spatially into that area.
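In symbols, a stylized version of the argument runs as follows. The notation is my own: a_i is the region effect, \delta_t the common time effect and \varepsilon_{it} the remaining error.

```latex
\begin{align*}
y_{it} &= \beta x_{it} + a_i + \delta_t + \varepsilon_{it} \\
y_{it} - y_{i,t-1} &= \beta\,(x_{it} - x_{i,t-1}) + (\delta_t - \delta_{t-1})
                      + (\varepsilon_{it} - \varepsilon_{i,t-1})
\end{align*}
```

The region effect a_i drops out of the second line, the term in \delta is absorbed by time dummies, and \beta is consistently estimated only if the change in x is uncorrelated with the change in \varepsilon.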


The difference in difference method is usually applied to evaluating policy interventions where a treatment and control can be created. I am not going to go into the mechanics here as it is well documented elsewhere.
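For reference, the simplest two-group, two-period case reduces to a comparison of four means. The numbers below are invented purely to show the arithmetic.

```python
from statistics import mean

# Outcomes by (group, period); all values are made up.
outcomes = {
    ("treated", "pre"): [50.0, 52.0, 54.0],
    ("treated", "post"): [58.0, 60.0, 62.0],
    ("control", "pre"): [45.0, 47.0, 49.0],
    ("control", "post"): [48.0, 50.0, 52.0],
}


def diff_in_diff(cells):
    """Change in the treated group minus change in the control group."""
    m = {key: mean(values) for key, values in cells.items()}
    change_treated = m[("treated", "post")] - m[("treated", "pre")]
    change_control = m[("control", "post")] - m[("control", "pre")]
    return change_treated - change_control


effect = diff_in_diff(outcomes)
print(effect)
```

The control group's change stands in for what would have happened to the treated group without the policy, which is only valid under the parallel trends assumption.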


Some Examples


Machin et al. Resources and Standards in Urban Schools (2007)

The paper is concerned with whether additional resources can be used to improve the outcomes of hard to reach pupils, specifically evaluating the Excellence in Cities (EiC) programme, which gave extra funding to schools based upon their level of disadvantage as measured by the proportion of pupils eligible for free school meals. They use a DID strategy comparing the outcomes in EiC schools with a comparison group. A direct comparison between EiC and non-EiC schools would not be valid, as there is no reason to assume that the parallel trends assumption holds. Mindful of this, the authors use propensity scores based on a host of school and pupil level characteristics to create a subset of non-EiC schools which are statistically similar to the pre-treatment EiC schools, and they use this subset as the control group. They do not make a hugely convincing argument for this method, and indeed there are statistically significant differences in the outcome measures in the pre-treatment periods, indicating that there is only limited reason to believe that the key identifying assumption holds.


They find that the policy was effective in raising pupil attainment in the treatment schools but that the benefits were restricted to the students best able to take advantage of the policy (i.e. the most gifted).


Duranton et al. Assessing the Effects of Local Taxation Using Microgeographic Data (2011)

This is an interesting paper that seeks to identify the effect of local property taxation on the growth of firms. Estimating this has been difficult for three reasons. Firstly, site characteristics are heterogeneous, and many characteristics will be correlated with unobservable determinants. Secondly, firms are heterogeneous, and these differences are often largely unobservable, yet they cause firms to sort spatially. Lastly, tax systems may be endogenous to the location decisions of firms.


Using panel data they estimate a model which includes firm specific observable characteristics which removes firm specific time varying observable variation. They include a firm fixed effect to remove the time invariant firm specific unobservable variation. They also include higher level fixed effects (site, and region). They then difference the data in the usual way which implements the fixed effect strategy as noted above.


They then do a spatial difference. This takes the difference between the (time-differenced) observation for each establishment and that for any other establishment located at a distance less than d from it. If there is a term αzt for each site z in time t, and this is not controlled for, then any local shock to firms that also affects tax rates will bias the panel estimates above. However, if we are able to assume that for small d, Δαzt ≈ 0 (i.e. local shocks are smooth over small amounts of space), then by spatially differencing the alpha term falls away, and the time varying local shocks are effectively controlled for.
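A stylized version of the two differencing steps, in my own simplified notation rather than the paper's full specification: let y_{izt} be the outcome of establishment i at site z, \tau_{zt} the tax rate it faces, a_i the firm effect, \Delta_t a time difference and \Delta_d a difference between two establishments less than d apart.

```latex
\begin{align*}
y_{izt} &= \beta \tau_{zt} + a_i + \alpha_{zt} + \varepsilon_{izt} \\
\Delta_t y_{izt} &= \beta\,\Delta_t \tau_{zt} + \Delta_t \alpha_{zt}
                    + \Delta_t \varepsilon_{izt} \\
\Delta_d \Delta_t y_{izt} &= \beta\,\Delta_d \Delta_t \tau_{zt}
                    + \underbrace{\Delta_d \Delta_t \alpha_{zt}}_{\approx\, 0}
                    + \Delta_d \Delta_t \varepsilon_{izt}
\end{align*}
```

The time difference removes a_i, and the spatial difference removes the local shock provided it is smooth over space. Note that \beta is only identified from pairs of neighbouring establishments whose tax changes differ, since otherwise the tax term is differenced out as well.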


They then combine this with an instrumentation strategy that instruments tax rates using political variables.  


