3rd May, 2021
Gareth Simons

A sidenote on machine-learning accuracies

Prediction with deep neural networks is almost trivial when the datasets are rich and of high quality and when some form of relationship can be recovered from the given variables. However, predictive accuracies are easily misconstrued if not viewed in context, and three points are worth considering.

Firstly, it is easy to inflate the apparent relevance of a model by reporting accuracies at larger distance thresholds. Due to the Modifiable Areal Unit Problem, correlations and predictive accuracies naturally rise at larger distance thresholds, yet they tell us less about conditions specific to a local scenario. The challenge is therefore to recover as much accuracy as possible while working at as small a distance threshold as feasible.

Secondly, validation and test sets have to be partitioned on a spatial basis rather than by a purely randomised selection of points; for example, by using a grid to delineate a city and then setting aside all points within grid cells at a specified interval for the test set. This prevents the model from overfitting by siphoning off information between adjacent points, which would otherwise inflate test-set accuracies. Visualisation can be a powerful tool for finding hints of overfitting within a spatial context.

Thirdly, whereas straightforward prediction of one variable or another with deep neural networks is interesting and useful in its own right, it can be even more useful to identify locations where the observed intensities diverge from the predicted intensities. This can trigger observations in the spirit of Jane Jacobs’ recommendation to look for clues in ‘unaverages’ [1]: the local trends or oddities that otherwise seem to defy the model and normative patterns. These peculiarities offer glimpses into localised factors that may otherwise affect the expression of the real-life observations, and they offer a guidepost to other considerations that may be beneficial if added to the model, or else fodder for speculation and discussion around topics that in many cases will remain beyond the realm of the model’s predictive power.
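
The grid-based partitioning described above can be sketched in a few lines of Python. This is a minimal illustration rather than the method used for the results in this post: the function name `spatial_grid_split`, the 1,000 m cell size, and the choice of holding out every third cell along both axes are assumptions made for the example, and the coordinates are assumed to be in a projected system measured in metres.

```python
import numpy as np


def spatial_grid_split(x, y, cell_size=1000.0, test_interval=3):
    """Return a boolean mask that is True for points held out for testing.

    x, y          : projected point coordinates in metres (assumed).
    cell_size     : edge length of a square grid cell (hypothetical default).
    test_interval : every n-th cell along both axes is set aside for testing
                    (hypothetical default).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Assign each point to a grid cell, counting from the south-west corner.
    col = np.floor((x - x.min()) / cell_size).astype(int)
    row = np.floor((y - y.min()) / cell_size).astype(int)
    # A cell is a test cell if both its column and row index fall on the
    # interval, so whole cells are held out, never individual points.
    return (col % test_interval == 0) & (row % test_interval == 0)


# Usage with synthetic points scattered over a 10 km x 10 km extent.
rng = np.random.default_rng(0)
xs = rng.uniform(0, 10_000, size=5_000)
ys = rng.uniform(0, 10_000, size=5_000)
test_mask = spatial_grid_split(xs, ys)
train_idx = np.flatnonzero(~test_mask)
test_idx = np.flatnonzero(test_mask)
print(f"train: {train_idx.size} points, test: {test_idx.size} points")
```

Because all points in a held-out cell move to the test set together, a test point's immediate neighbours are mostly other test points rather than training points, which is the property a purely random split lacks.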

Observed, Predicted, and Differenced intensity of eating establishments at 400m walking tolerances.

By way of example, the figure above shows the observed, predicted, and differenced number of local eating establishments for Greater London, using multi-scalar network centralities and population densities as input variables. The differenced plot shows that eating establishments around historical high street locations are slightly under-predicted: it could be surmised that this is due to a lack of information about historic village centres and the related availability of commercial building stock, which may otherwise distinguish certain areas of higher betweenness and closeness centralities from others. As another example, areas such as Soho, Seven Dials, and Angel are over-predicted: here it could be theorised that there is latent demand for additional locations, currently unsatisfied due to spatial constraints on the number of viable locations or due to crowding out by other land uses such as retail.
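
The differencing step behind such a plot can be sketched as follows. The sign convention (observed minus predicted, so that positive values mean under-prediction), the function names, and the two-standard-deviation cut-off for flagging candidate 'unaverages' are all assumptions chosen for illustration; the post does not state how the differenced plot was computed.

```python
import numpy as np


def difference_intensities(observed, predicted):
    """Observed minus predicted intensity per location (assumed convention).

    Positive values: the model under-predicts the location.
    Negative values: the model over-predicts the location.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return observed - predicted


def flag_unaverages(diff, n_std=2.0):
    """Indices of locations whose difference lies more than n_std standard
    deviations from the mean difference (hypothetical threshold)."""
    diff = np.asarray(diff, dtype=float)
    std = diff.std()
    if std == 0:
        # Every location diverges by the same amount, so nothing stands out.
        return np.array([], dtype=int)
    z = (diff - diff.mean()) / std
    return np.flatnonzero(np.abs(z) > n_std)


# Usage with made-up counts: nine well-predicted locations and one outlier.
observed = [10, 10, 10, 10, 10, 10, 10, 10, 10, 30]
predicted = [10] * 10
diff = difference_intensities(observed, predicted)
flagged = flag_unaverages(diff)
print("flagged location indices:", flagged.tolist())
```

The flagged indices are only a starting point: they indicate where to look on the map, while the interpretation of each divergence still depends on local knowledge.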


1. Jacobs J. The Death and Life of Great American Cities. Vintage Bo. New York: Random House; 1961.