MOSTLY POINTLESS SPATIAL ECONOMETRICS?

S. Gibbons & H. Overman

A summary, with some additions from the lecture and from "Under the Hood: Issues in the Specification and Interpretation of Spatial Regression Models", L. Anselin, Agricultural Economics 27 (2002)

**Spatial Models and Their Motivation**

The inclusion of spatial effects is typically motivated on the theoretical grounds that some spatial or social interaction is affecting economic outcomes. Evidence of this will be spatial interdependence. Models are therefore constructed that seek to answer how interaction between agents can lead to emergent collective behaviour and aggregate patterns. These interactions might also be termed neighbourhood effects.

To start with a basic linear regression:

$$y_i = \mathbf{x}_i'\beta + \mu_i$$

where $\mathbf{x}_i$ is a vector of explanatory variables, *β* is a vector of parameters and $\mu_i$ is, as ever, the error term. This basic form assumes that each observation is independent of the others. That is generally too strong an assumption in a spatial context, as events in one place often affect events in another, particularly if the places are close to each other. A simple way of capturing the effects that nearby observations have on each other is to define, for each observation *i*, a weights vector $\mathbf{w}_i$ which reflects how strongly every other observation affects it (for example through distance weighting). Multiplying this weights vector by the vector of outcomes $\mathbf{y}$ gives the scalar $\mathbf{w}_i'\mathbf{y}$, which for observation *i* is a linear combination of all the *y* with which it is connected. If the weights sum to 1 then this is a weighted average of the outcomes of the neighbours of *i*.
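As a rough illustration of the weighting idea, the sketch below builds inverse-distance weights within a cutoff and scales each row to sum to 1, so that the product with *y* is a weighted average of neighbouring outcomes. The function name `row_normalised_weights`, the cutoff of 4 and the four example points are all assumptions made for illustration; they do not come from the paper.

```python
import numpy as np

def row_normalised_weights(coords, cutoff):
    """Inverse-distance weights within `cutoff`, with each row scaled to sum to 1."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Neighbours are other points within the cutoff; no point is its own neighbour.
    is_neighbour = (dist > 0) & (dist <= cutoff)
    w = np.zeros_like(dist)
    w[is_neighbour] = 1.0 / dist[is_neighbour]
    row_sums = w.sum(axis=1, keepdims=True)
    # Rows with no neighbours stay as zeros instead of being divided by zero.
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

# Hypothetical data: four locations on a line and one outcome per location.
coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
y = np.array([10.0, 20.0, 40.0, 5.0])

W = row_normalised_weights(coords, cutoff=4.0)
spatial_lag = W @ y   # element i is w_i'y, the weighted average of i's neighbours
print(spatial_lag)
```

The last location has no neighbour within the cutoff, so its row of weights is all zeros and its spatial lag is zero; how to treat such isolated observations is a modelling choice.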

**Spatial Autoregressive Model**

This weighted average can then be used to construct the spatial autoregressive model (SAR), also known as the “spatial *y*” model, in which the term $\mathbf{w}_i'\mathbf{y}$ is referred to as a spatial lag. This model attempts to uncover the spatial reaction function, or spillover effect. The model looks like this:

$$y_i = \rho\,\mathbf{w}_i'\mathbf{y} + \mathbf{x}_i'\beta + \mu_i$$

The idea is that an individual's outcome is affected both by their own characteristics and by the recent outcomes of other nearby agents who are capable of influencing their behaviour. For example, when deciding the price at which to sell a house, individual characteristics such as the number of bedrooms are taken into account, as well as the prices achieved by other properties in the vicinity. In this case *β* captures the effects of the individual characteristics and *ρ* captures the causal effect of neighbourhood outcomes.
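To make the SAR structure concrete, the following sketch generates data from it. Because every *y* appears on both sides of the equation, the outcomes have to be solved for jointly. The ring-shaped neighbourhood (each observation's neighbours are the two adjacent points on a circle), the helper names and the parameter values are assumptions chosen purely for illustration.

```python
import numpy as np

def ring_weights(n):
    """Row-normalised W: each point's neighbours are the two adjacent points on a circle."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

def simulate_sar(W, x, rho, beta, mu):
    """Solve y = rho*W y + x*beta + mu for y, i.e. (I - rho*W) y = x*beta + mu."""
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - rho * W, x * beta + mu)

rng = np.random.default_rng(0)
n, rho, beta = 200, 0.5, 1.0      # illustrative values only
W = ring_weights(n)
x = rng.normal(size=n)            # one own characteristic per observation
mu = rng.normal(size=n)           # idiosyncratic error
y = simulate_sar(W, x, rho, beta, mu)

# Each outcome is the spillover from neighbours plus the own-characteristic effect plus noise.
spillover = rho * (W @ y)
own_effect = x * beta
```

In the house price example, `own_effect` would correspond to the bedrooms-type characteristics and `spillover` to the prices achieved nearby.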

**Spatial X Model**

Alternatively we may drop the assumption that $y_i$ is affected by neighbouring *y* outcomes, and instead assume that it is affected by spatial lags of the observable characteristics. This is then a spatial *x* model (SLX):

$$y_i = \mathbf{x}_i'\beta + \mathbf{w}_i'X\gamma + \mu_i$$

This assumes that the observable neighbourhood characteristics are determinants of $y_i$. In the example above, this could be the characteristics of neighbouring housing, such as appearance or size, influencing individual price decisions. *β* is as above, and *γ* is the causal effect of neighbourhood characteristics.
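A minimal sketch of how an SLX model might be estimated: the spatial lag of *x* is just another regressor, so ordinary least squares can be applied directly. The simulated data, the ring-shaped weights matrix and the parameter values are assumptions for illustration, and the error term is drawn independently of *x* and its spatial lag, which is exactly the condition that spatial sorting would break.

```python
import numpy as np

def ring_weights(n):
    """Row-normalised W: each point's neighbours are the two adjacent points on a circle."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

rng = np.random.default_rng(1)
n, beta, gamma = 2000, 1.0, 0.5   # illustrative values only
W = ring_weights(n)
x = rng.normal(size=n)
mu = rng.normal(size=n)           # independent of x and W x (no sorting)
y = x * beta + (W @ x) * gamma + mu

# OLS of y on a constant, x and the spatial lag of x.
Z = np.column_stack([np.ones(n), x, W @ x])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
beta_hat, gamma_hat = coef[1], coef[2]
print(f"beta_hat = {beta_hat:.3f}, gamma_hat = {gamma_hat:.3f}")
```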

**Spatial Durbin Model**

The spatial Durbin model (SD) combines SAR and SLX:

$$y_i = \rho\,\mathbf{w}_i'\mathbf{y} + \mathbf{x}_i'\beta + \mathbf{w}_i'X\gamma + \mu_i$$

The interpretation of *ρ*, *β* and *γ* is as indicated above.

**Spatial Error Model**

This model drops the assumption that outcomes are explained by lags of the explanatory variables, and instead assumes a SAR-type autocorrelation in the error process. This yields:

$$y_i = \mathbf{x}_i'\beta + \mu_i; \qquad \mu_i = \rho\,\mathbf{w}_i'\boldsymbol{\mu} + v_i$$

This model assumes that outcomes are dependent upon the unobservable characteristics of the neighbours.
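The sketch below simulates a spatial error process to show what this assumption implies for a simple regression. The ring-shaped weights matrix and the parameter values are illustrative assumptions. In this setup the regressor is independent of the error, so the OLS slope stays close to the true value, but the residuals remain correlated with the average residual of their neighbours.

```python
import numpy as np

def ring_weights(n):
    """Row-normalised W: each point's neighbours are the two adjacent points on a circle."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

rng = np.random.default_rng(2)
n, beta, rho = 1000, 1.0, 0.6     # illustrative values only
W = ring_weights(n)
x = rng.normal(size=n)
v = rng.normal(size=n)
# mu = rho*W mu + v  =>  mu = (I - rho*W)^{-1} v
mu = np.linalg.solve(np.eye(n) - rho * W, v)
y = x * beta + mu

Z = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
beta_hat = coef[1]
resid = y - Z @ coef
# Correlation between each residual and the average residual of its neighbours.
resid_lag_corr = np.corrcoef(resid, W @ resid)[0, 1]
print(f"beta_hat = {beta_hat:.3f}, residual/neighbour correlation = {resid_lag_corr:.3f}")
```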

**Problems**

OLS with a lagged *y* variable (SD and SAR) yields inconsistent estimates unless *ρ* equals 0. This is because $\mathbf{w}_i'\mathbf{y}$ is correlated with the error term. The average neighbouring dependent variable includes the neighbour’s error term, the neighbour’s neighbour’s error term and so on, such that the outcome of any observation *i* depends to some extent on the error terms of all the other observations to which it is connected, directly or indirectly. The intuition behind this problem is that you are your neighbour’s neighbour, so restricting the weights to the nearest neighbour does not remove it whenever the neighbour relation is mutual. In the simple *i-j* case the following occurs:

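$$y_i = \rho y_j + x_i\beta + \varepsilon_i \qquad (1)$$

$$y_j = \rho y_i + x_j\beta + \varepsilon_j \qquad (2)$$

Substituting (2) into (1) we get:

$$y_i = \rho(\rho y_i + x_j\beta + \varepsilon_j) + x_i\beta + \varepsilon_i$$

which shows that $y_i$ depends in part upon itself. Making the same substitution in the other direction, $y_j$ depends on $\varepsilon_i$, so the regressor $y_j$ in (1) is correlated with the error term $\varepsilon_i$ whenever $\rho \neq 0$.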
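The feedback described here can be checked with a small Monte Carlo experiment. The sketch below repeatedly simulates SAR data with a known *ρ*, estimates the model by OLS, and averages the estimates. The ring-shaped weights matrix, sample size, number of replications and parameter values are all assumptions for illustration; the size (and in other designs possibly the direction) of the bias depends on them.

```python
import numpy as np

def ring_weights(n):
    """Row-normalised W: each point's neighbours are the two adjacent points on a circle."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

def mean_ols_rho(W, rho, beta, n_reps, rng):
    """Average OLS estimate of rho across simulated SAR samples."""
    n = W.shape[0]
    A_inv = np.linalg.inv(np.eye(n) - rho * W)   # y = A_inv (x*beta + eps)
    estimates = []
    for _ in range(n_reps):
        x = rng.normal(size=n)
        eps = rng.normal(size=n)
        y = A_inv @ (x * beta + eps)
        # OLS of y on a constant, the spatial lag W y and x.
        Z = np.column_stack([np.ones(n), W @ y, x])
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        estimates.append(coef[1])
    return float(np.mean(estimates))

rng = np.random.default_rng(3)
true_rho = 0.5
mean_rho_hat = mean_ols_rho(ring_weights(400), true_rho, beta=1.0, n_reps=100, rng=rng)
print(f"true rho = {true_rho}, mean OLS estimate = {mean_rho_hat:.3f}")
```

In this particular design the spatial lag is positively correlated with the error term, so the average OLS estimate lies above the true value, and averaging over more replications does not close the gap.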

Using OLS for the SLX model can also be problematic, as the assumption underlying OLS is that the error term is not correlated with the regressors. For the SLX model this means that $E(\mu_i \mid \mathbf{x}_i) = 0$ and $E(\mu_i \mid \mathbf{w}_i'X) = 0$. However, if there is spatial sorting, for example when motivated parents locate themselves near good schools, then this assumption is violated as $E(\mu_i \mid \mathbf{w}_i'X) \neq 0$.

The SE model may generate consistent coefficient estimates under OLS, as the assumption that the error is not correlated with the regressors holds. However, the standard errors will be inconsistent because, by definition, the model has autocorrelated error terms. This can lead to mistaken inferences.

Standard errors are inconsistently estimated for all models.

Additionally, the different types of model are difficult to distinguish without assuming prior knowledge of the data generating process which in practice we do not have.

**Maximum Likelihood**

These problems can be circumvented using Maximum Likelihood (ML) estimation, which will provide consistent estimators. The likelihood function is essentially the probability of observing the data *y* given values of the parameters *ρ* and *β*, together with an assumed distribution for the error term (typically normal). A computer uses iterative numerical maximization techniques to find the parameter values that maximize the likelihood function.
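As a rough sketch of what the computer is doing for the SAR model, the code below assumes normal errors, concentrates *β* and the error variance out of the log-likelihood, and then searches a grid of candidate values of *ρ* for the one with the highest likelihood. The grid search stands in for the iterative optimiser that real software would use, and the function name `sar_ml`, the ring-shaped weights matrix and the simulated data are all assumptions for illustration.

```python
import numpy as np

def ring_weights(n):
    """Row-normalised W: each point's neighbours are the two adjacent points on a circle."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

def sar_ml(y, X, W, grid):
    """Grid-search ML for y = rho*W y + X beta + mu, assuming normal errors."""
    n = len(y)
    eigs = np.linalg.eigvals(W)
    Wy = W @ y
    # Regress y and W y on X once; for a given rho the residual is e_y - rho * e_Wy.
    b_y, *_ = np.linalg.lstsq(X, y, rcond=None)
    b_Wy, *_ = np.linalg.lstsq(X, Wy, rcond=None)
    e_y, e_Wy = y - X @ b_y, Wy - X @ b_Wy
    best_rho, best_ll = None, -np.inf
    for rho in grid:
        e = e_y - rho * e_Wy
        sigma2 = e @ e / n
        # log|I - rho*W| equals the sum of log(1 - rho*eigenvalue) over the eigenvalues of W.
        log_det = np.sum(np.log(1 - rho * eigs)).real
        ll = log_det - 0.5 * n * np.log(sigma2)   # concentrated log-likelihood, up to a constant
        if ll > best_ll:
            best_rho, best_ll = rho, ll
    beta_hat = b_y - best_rho * b_Wy
    return best_rho, beta_hat

rng = np.random.default_rng(4)
n, true_rho, true_beta = 600, 0.5, 1.0    # illustrative values only
W = ring_weights(n)
x = rng.normal(size=n)
y = np.linalg.solve(np.eye(n) - true_rho * W, x * true_beta + rng.normal(size=n))
X = np.column_stack([np.ones(n), x])

rho_hat, beta_hat = sar_ml(y, X, W, grid=np.linspace(-0.95, 0.95, 191))
print(f"rho_hat = {rho_hat:.2f}, slope = {beta_hat[1]:.3f}")
```

The log-determinant term is what distinguishes this from least squares: it accounts for the fact that *y* appears on both sides of the equation. The estimates are only reliable here because the data really were generated by the SAR model being fitted.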

The issue with this approach is that it assumes that the spatial econometric model being estimated is the true data generating process. This is an incredibly strong assumption that is unlikely to hold in practice.

**Instrumental Variables**

In theory a second order spatial lag of the explanatory variables, $W^2X$ (or even third or fourth order lags), can be used as an instrument for $\mathbf{w}_i'\mathbf{y}$, and this “exogenous” variation in the neighbourhood outcome can then be used to identify its effect on $y_i$, under the assumption that the instruments are correlated with $W\mathbf{y}$ but do not affect $y_i$ directly. The first stage would look like this:

$$W\mathbf{y} = WX\beta + \rho W^2X\beta + \rho^2 W^3X\beta + \dots$$

and then the predicted values of $W\mathbf{y}$ would be used in the second stage regression with $y_i$ as the dependent variable.

There are also problems with this technique. Firstly, it is unlikely that the true form of *W* is known, yet its correct specification is crucial to the model. For example, the X variables may have an effect over a 5km distance while the weighting system incorrectly restricts the analysis to 2km. Secondly, the higher order lags of the X variables could still have a direct effect upon $y_i$, in which case the exclusion restriction is violated and the 2SLS results are biased. Lastly, the different spatial lags are likely to be highly correlated with one another, so there will be little independent variation, which is essentially a weak instruments problem. Weak instruments can severely bias the second stage coefficients, which will additionally be estimated imprecisely.
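The mechanics of the two-stage procedure can be sketched as follows for a SAR model, using the first and second order spatial lags of *x* as instruments for the spatial lag of *y*. The simulation is deliberately the best case: the data are generated by a SAR model with a known weights matrix and the lags of *x* have no direct effect on *y*, which are precisely the conditions questioned above. The helper names, the ring-shaped weights matrix and the parameter values are assumptions for illustration.

```python
import numpy as np

def ring_weights(n):
    """Row-normalised W: each point's neighbours are the two adjacent points on a circle."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

def two_stage_least_squares(y, Z, H):
    """2SLS: project the regressors Z on the instruments H, then regress y on the projection."""
    first_stage, *_ = np.linalg.lstsq(H, Z, rcond=None)
    Z_hat = H @ first_stage
    coef, *_ = np.linalg.lstsq(Z_hat, y, rcond=None)
    return coef

rng = np.random.default_rng(5)
n, true_rho, true_beta = 1000, 0.5, 1.0   # illustrative values only
W = ring_weights(n)
x = rng.normal(size=n)
y = np.linalg.solve(np.eye(n) - true_rho * W, x * true_beta + rng.normal(size=n))

ones = np.ones(n)
Wx = W @ x
Z = np.column_stack([ones, W @ y, x])        # regressors; W y is endogenous
H = np.column_stack([ones, x, Wx, W @ Wx])   # instruments: x, W x and W^2 x
rho_iv = two_stage_least_squares(y, Z, H)[1]
print(f"true rho = {true_rho}, 2SLS estimate = {rho_iv:.3f}")
```

If the true model also contained a direct effect of the lagged *x* variables, as in the SLX or SD models, the instruments used here would no longer be excluded from the outcome equation and the estimate would not be trustworthy.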

**The Way Forward**

- Panel data can allow for differencing over time to control for fixed effects. However, the same problems as above remain, now in the context of the differenced data.
- In terms of the IV strategies, genuinely exogenous instruments should be found such as changes to institutional rules [see later tax paper summary].
- Gibbons and Overman argue that the SAR model should be dropped and that, if neighbourhood effects cannot be identified using genuine instruments, a reduced form of the SLX model should be used.
- Natural experimental techniques from other economic literatures should also be borrowed, e.g. difference-in-differences (DID) and matching. These techniques may help us to find causal effects, but the tradeoff is that the estimates are only relevant to some sub-set of the population (as with the Local Average Treatment Effect for IVs).