1st May, 2021
Gareth Simons

Prediction of ‘artificial’ urban archetypes through a synthesis of domain expertise with machine learning methods


The vitality of urban spaces has been steadily undermined by the pervasive adoption of car-centric forms of urban development characterised by lower densities, street networks offering poor connectivity for pedestrians, and a lack of accessible landuses; yet, even if these issues have been well understood for some time, the problem persists in new forms of planning. It is here posited that a synthesis of domain knowledge and machine learning methods may allow for the creation of robust toolsets against which newly proposed developments can be benchmarked in a more rigorous and scalable manner in the interest of greater accountability and better evidenced decision-making.

A worked example is provided showing how machine learning models capable of distinguishing ‘artificial’ towns from the more walkable and mixed-use ‘historical’ equivalents can be developed from a large-scale urban morphological dataset for 931 towns and cities in Great Britain. The data is developed from a variety of centrality, landuse accessibility, mixed-use, and population density measures which have been computed for pedestrian walking thresholds ranging from 100m to 1600m. The spatial aggregations and morphological measures are computed at a 20m network resolution using the cityseer-api Python package, which employs a local windowing methodology: distances are computed directly over the network and aggregations are performed dynamically and with respect to the direction of approach, thus preserving the relationships between the variables and retaining contextual precision.
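The idea of aggregating over the network at several walking thresholds can be illustrated with a minimal sketch. This is not the cityseer-api interface: the toy network, the landuse counts, the function names, and the intermediate thresholds between 100m and 1600m are all assumptions made for illustration, and the sketch ignores refinements such as the direction of approach.

```python
import heapq

# Hypothetical toy street network: (node, node, segment length in metres).
EDGES = [
    ("a", "b", 80.0),
    ("b", "c", 150.0),
    ("c", "f", 100.0),
    ("c", "d", 300.0),
    ("d", "e", 900.0),
]
# Hypothetical number of landuses (e.g. shops) located at each node.
LANDUSES = {"a": 1, "b": 2, "c": 0, "d": 3, "e": 5, "f": 4}
# The text gives a range of 100m to 1600m; the intermediate steps are assumed.
THRESHOLDS = [100, 200, 400, 800, 1600]


def build_graph(edges):
    """Build an undirected adjacency list from an edge list."""
    graph = {}
    for u, v, length in edges:
        graph.setdefault(u, []).append((v, length))
        graph.setdefault(v, []).append((u, length))
    return graph


def network_distances(graph, source, max_dist):
    """Dijkstra shortest-path distances from source, cut off at max_dist."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, length in graph[node]:
            nd = d + length
            if nd <= max_dist and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(queue, (nd, nbr))
    return dist


def landuse_counts(graph, landuses, source, thresholds):
    """Count landuses reachable within each network distance threshold."""
    dist = network_distances(graph, source, max(thresholds))
    return {
        t: sum(n for node, n in landuses.items() if dist.get(node, float("inf")) <= t)
        for t in thresholds
    }


graph = build_graph(EDGES)
counts = landuse_counts(graph, LANDUSES, "a", THRESHOLDS)
for threshold, count in counts.items():
    print(f"{threshold}m: {count} landuses reachable")
```

Because distances are measured along the street segments rather than as the crow flies, a location that is close in Euclidean terms but poorly connected contributes only at the larger thresholds.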

Using officially designated ‘New Towns’ as a departure point, a series of clues is developed: through a process of iterative feedback, a supervised classifier (Extra-Trees) is cultivated from which 185 ‘artificial’ locations are identified based on data aggregated to respective town or city boundaries; this information is then used to train supervised and semi-supervised (M2) deep neural network classifiers against the full resolution dataset where locations are assessed at a 20m network resolution using only pedestrian-scale information available to each point of analysis. The models broadly align with intuitions expressed by urbanists and show strong potential for further development through the use of curated exemplar locations fed to semi-supervised machine learning methods.

Link to preprint paper pending.

Introduction: Prediction of urban archetypes with deep neural networks

In 2012, Krizhevsky, Sutskever, and Hinton1 introduced a revolutionary machine learning model: it combined massive datasets, convolutional layers, and deep learning to attain best-in-class classification accuracies on the ImageNet database, consisting of more than 15 million images in over 22,000 categories. The ideas and techniques were not necessarily new, but the authors noted that the depth of their neural network, in this case five convolutional layers and three fully connected layers, was pivotal to the performance of the model. The subsequent upsurge in the use of ever-larger datasets combined with ever-deeper neural networks has led to machine learning models with human, or better than human, levels of performance. Deep learning has attained a near-mythical status and is a prevalent feature in the rapid development of AI, with Silver et al.2 going on to claim that AlphaGo Zero, an AI underpinned by deep learning, had learned ‘superhuman proficiency’ in the game of Go from scratch without the aid of human knowledge. Yet, on closer scrutiny, claims that such systems are truly capable of developing human-like intelligence, especially from a ‘tabula rasa’, tend to be overstated, and it is likely that further breakthroughs will be required before truly generalisable artificial intelligence can become a reality3. Marcus4 argues that AlphaGo Zero’s intelligence is not developed through truly ‘innate’ processes: human knowledge has entered the system in the form of a Monte Carlo tree search algorithm, thus empowering the system with techniques necessary to learn solutions specific to the challenge at hand. Further, the model’s intelligence is not generalisable: it has to be retrained to learn other games and lacks the ability to solve broader classes of problems, some of which are trivially solved by humans.

The tremendous volumes of data required and the great difficulty in generalising deep neural nets to other problems stand in sharp contrast to the human mind5, which is equipped with innate structures that appear to facilitate rapid and powerful abstractions that generalise well to varied forms of problem-solving. This underscores an important and oft-understated reality: deep learning is a tremendously powerful, but also fickle, tool6. It is brittle by nature: data hungry, narrowly focused, and easily fooled. Neural networks may learn patterns, but can’t ‘see the forest for the trees’. If representative patterns are not present in the data or go undetected by a model’s structure or loss function then the model ‘does not know what it does not know’. Such models may consequently behave contrary to best intentions by being needlessly complex7, biased or ignorant of unrepresented or unfairly represented classes within the data7,8, and are generally difficult to develop or reproduce9.

The proverbial notion that machine learning is a self-sufficient technology that can magically conjure meaning out of meaningless jumbles of data, and that deep-learning-infused AI and robotics technologies will soon usher in a utopian future, must therefore be taken with a degree of scepticism. However, it is also important to note that nascent machine learning methods remain amongst the most powerful and useful tools currently at the disposal of the scientific community, and that many of the perceived shortcomings may be attributed to a disconnect between the hype associated with the models and an otherwise more realistic understanding of their nature and limitations. The contributions of humans to model development tend to be understated4: for these models to be meaningful and trustworthy they require large amounts of domain-specific information imparted at various stages of the development process. In this sense, ML can be likened to a powerful sidekick, but one that is potentially prone to naive assumptions or misbehaviour if left entirely to its own devices. The models require interaction and oversight in a process that can be likened to a ‘dance with data’: datasets have to be selected and prepared in a manner that accurately represents the nature of the data that we want the algorithms to learn; targets and loss functions are chosen to coerce models in the right direction; and regularisation methods and testing procedures are necessary to ensure that models are capable of generalisation to unseen samples in a manner that is realistic and fair for the task at hand.

Urban scientists consequently need to be aware of how datasets, data science methods, and machine learning models may ultimately affect day-to-day decisions and policies10, and how misinformed models may end up being used to justify courses of action affecting citizens and the urban environment for the worse. The danger of chasing misguided accuracy metrics or ‘buzz-friendly’ marketing pitches must therefore be emphasised: models can be accurate, but meaningless. A simple and not uncommonly encountered example is the application of simple error or accuracy rates to unbalanced datasets. Class imbalances are regularly faced in real-world data analysis when labels for one class substantially outnumber those of another. Credit card fraud data provides an extreme example, where the minority class (fraudulent transactions) may be vanishingly small relative to the majority class. When a classifier is trained and gauged against an unbalanced dataset using simple accuracy rates, the algorithm may simply opt to completely ignore the minority class (e.g. inferring that all credit card transactions are not fraudulent) while claiming an accuracy approaching 100%. Various strategies exist for the mitigation of class imbalance problems, including undersampling the majority class, oversampling the minority class, or adjusting the costs associated with losses from respective classes11; use of more nuanced accuracy metrics such as Receiver Operating Characteristic curves (the true positive rate plotted against the false positive rate) or F1 scores (the harmonic mean of precision and recall)12; and calibration techniques for correcting the distributions of probabilistic classifications13. Yet, the application of such techniques requires intervention by an informed data scientist who, in turn, needs to be aware of the potential presence of such imbalances and how overlooking these may have far-reaching ramifications.

This example reflects the broader issue: the development of predictive machine learning models may require a substantial degree of nurturing, testing, and oversight to understand how the model ‘thinks’ and ‘reacts’ to the data and to guard against unintended forms of behaviour. Visualisation methods can be important for this purpose because they facilitate comprehension of how the models are working while allowing domain experts, who may not have direct knowledge of the development of such models, an opportunity to provide feedback on suspicious forms of predictive behaviour.
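The credit card example above can be made concrete with a small sketch. The numbers are invented for illustration (1,000 transactions of which 10 are fraudulent), as are the two hypothetical classifiers and the function names; the point is only that simple accuracy rewards a classifier that ignores the minority class, whereas precision, recall, and F1 do not.

```python
def confusion(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 is the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def scores(y_true, y_pred):
    """Accuracy, precision, recall and F1 (harmonic mean of precision and recall)."""
    tp, fp, tn, fn = confusion(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented data: 10 fraudulent transactions (1) followed by 990 legitimate ones (0).
y_true = [1] * 10 + [0] * 990

# Hypothetical classifier A: labels every transaction as legitimate.
always_legit = [0] * 1000

# Hypothetical classifier B: catches 8 of the 10 frauds but raises 20 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 970

for name, y_pred in [("always legitimate", always_legit), ("detector", detector)]:
    s = scores(y_true, y_pred)
    print(name, {k: round(v, 3) for k, v in s.items()})
```

In this constructed case the classifier that never flags fraud has the higher accuracy (0.99 against 0.978) while its recall and F1 are zero, which is why the choice of metric needs to be made by someone aware of the imbalance.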

Whereas data science methods can be used towards any variety of problematic workflows or end-purposes, they also hold the potential for scalable and rigorous forms of sensible analysis if used with sufficient safeguards and rigorous oversight from those with detailed knowledge of the domain of interest. Conversely, it bears emphasis that a litany of architects, urban designers, planners, engineers, civic officials, and NIMBYs have, in turn, been directly responsible for a trail of ill-conceived urban interventions, and this can’t be blamed on statistics or models so much as a human proclivity towards reductionism and self-interest. Although humans are better than machines at generalising problems, they can also be susceptible to wistful narratives and are easily waylaid by idealistic pursuits or profit-driven motives. Further, even where skilled and perceptive urban designers and planners are well aware of implicit biases underpinning problematic planning proposals, they may be at a loss to bolster better-informed decision-making against hearsay or political pressures. It is against this backdrop that an interesting question can be posed: Can we connect the strong suits of domain experts, who may intuitively understand the issues at hand, to the strong suits of algorithms, capable of exhaustively exploring and laying bare the solution space in a robust and scalable manner? How might tools that synthesise qualitative knowledge with quantitative approaches be used to build an accountable evidence base within the context of politically wrangled decision-making processes?


  1. Krizhevsky A, Sutskever I, Hinton GE. ImageNet Classification with Deep Convolutional Neural Networks [Internet]. 2012. Available from: http://code.google.com/p/cuda-convnet/
  2. Silver D, Schrittwieser J, Simonyan K, Antonoglou I, Huang A, Guez A, et al. Mastering the game of Go without human knowledge. Nature. 2017 Oct;550(7676):354–9.
  3. Mitchell M. Why AI is Harder Than We Think. 2021 Apr; Available from: http://arxiv.org/abs/2104.12871
  4. Marcus G. Innateness, AlphaZero, and Artificial Intelligence. 2018 Jan; Available from: http://arxiv.org/abs/1801.05667
  5. Sinz FH, Pitkow X, Reimer J, Bethge M, Tolias AS. Engineering a Less Artificial Intelligence. Vol. 103, Neuron. Cell Press; 2019.
  6. Marcus G. Deep Learning: A Critical Appraisal. 2018 Jan; Available from: https://arxiv.org/abs/1801.00631
  7. Rudin C, Radin J. Why Are We Using Black Box Models in AI When We Don’t Need To? A Lesson From An Explainable AI Competition. Harvard Data Science Review. 2019 Nov;1(2).
  8. Corbett-Davies S, Goel S. The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning. 2018 Jul; Available from: http://arxiv.org/abs/1808.00023
  9. Henderson P, Islam R, Bachman P, Pineau J, Precup D, Meger D. Deep Reinforcement Learning that Matters. 2017 Sep; Available from: http://arxiv.org/abs/1709.06560
  10. Duarte F. Data Science and Cities: A Critical Approach. Harvard Data Science Review. 2020.
  11. Chawla NV, Japkowicz N, Kotcz A. Editorial: Special Issue on Learning from Imbalanced Data Sets. SIGKDD Explor Newsl [Internet]. 2004 Jun;6(1):1–6. Available from: http://doi.acm.org/10.1145/1007730.1007733
  12. García V, Mollineda RA, Sánchez JS. Index of Balanced Accuracy: A Performance Measure for Skewed Class Distributions. In: Araujo H, Mendonça AM, Pinho AJ, Torres MI, editors. Pattern Recognition and Image Analysis. Berlin, Heidelberg: Springer Berlin Heidelberg; 2009. p. 441–8.
  13. Pozzolo AD, Caelen O, Johnson RA, Bontempi G. Calibrating Probability with Undersampling for Unbalanced Classification. In: 2015 IEEE Symposium Series on Computational Intelligence. 2015. p. 159–66.