19th Apr, 2021
Gareth Simons

cityseer-api, I suppose like many software packages, emerged somewhat unintentionally. In 2013, obsessed with complex systems, I checked out of a decade in architecture and set off for London, where I enrolled in the MRes in Advanced Spatial Analysis and Visualisation, the precursor to the current line of course offerings at the Bartlett Centre for Advanced Spatial Analysis (CASA). My MRes dissertation, A Computational Implementation of Jane Jacobs’ Generators of Diversity, was my first stab at using Python to calculate network centralities and mixed-uses en masse for towns and cities in the UK. With the MRes completed, I went to work for CASA, where I further embraced Python while creating an asynchronous pipeline linking PostGIS (Postgres) databases to an interactive Unity3D interface (2014-2016). This later overlapped with a stint working for Woods Bagot’s Superspace team (2016-2017), where we developed workflows consisting of Python, PostGIS, and web dashboards for urban analysis: in essence, pulling in various sources of open data, storing these in a PostGIS database where we could run some analysis, and then figuring out how to display this information to designers and clients.

By this point, I’d developed a deep affinity for Python and PostGIS and their tremendous versatility in the realm of all things geospatial and data-sciency. Other than the adoption of PostGIS, my toolset was still essentially the same as what I’d used for the MRes: an off-the-shelf network analysis package in the form of the performant graph-tool, which wraps the heavily optimised C++ Boost Graph Library under the hood, combined with numba JIT compilation for when I needed to implement customised loops by hand while retaining some measure of speed.

This generally worked as intended and I was able to muddle along, but while working on my PhD (2016-2021, also at CASA) I started to realise, as I looked into the arcane specifics of mixed-use analysis measures, that all was not what it seemed… The mixed-use methods I had favoured at the time didn’t behave in a mathematically consistent manner on windowed graphs and therefore couldn’t be reliably compared from location to location. To make things worse, the more I looked into centralities, the more it sank in that conventional formulations of closeness centrality also behaved counter-intuitively on windowed graphs. This added to a growing list of network analysis questions, none of which seemed to have clear-cut answers: Which centrality measures to use? Whether to use the primal or dual network? Shortest or simplest-path heuristics? And at which distance thresholds? This was further compounded by a list of issues relating to algorithmic implementations of network analysis specific to urban systems: Do you window the graph using network distances or crow-flies distances [1]? How do you stop the algorithm from short-cutting sharp corners when using simplest-path methods [2]? How do you assign land-uses to the network, or overload network algorithms to calculate custom network centralities, landuse accessibilities, mixed-use measures, and spatially aggregated statistical measures? How do you reduce topological distortions and abstract the handling of network geometries and the ensuing calculation of simplest-path and shortest-path distances?

The PhD provided an opportunity to work through these issues in a methodical manner, and inevitably meant that I reached a point where everything was being done by hand and from first principles in numba. Piggy-backing on an off-the-shelf network analysis package at this level of customisation had become an overly difficult strategy, and any speed advantages were inevitably sacrificed as Python logic encroached on the realm of C++.

The heart and soul of what would become the cityseer-api package emerged from experiments with a Dijkstra [3] shortest-path tree search algorithm. Because this was implemented by hand, it became possible to take any number of customised considerations and data structures into account: for example, to terminate a graph search when encountering a distance threshold; to add a parameter triggering checks that prevent the side-stepping of sharp angular turns; to calculate angular centralities on the fly, whether on the primal or the dual graph; or to extend these workflows to the calculation of landuse or statistical measures. While this allowed me to start dealing with these issues in a rigorous manner and to compare the resultant measures on a like-for-like basis, I inevitably started running into the next problem: an ever-growing level of code-base complexity which had begun to paralyse further progress.

At this point, the data structures were still being created by hand, a process involving loading Ordnance Survey data from databases, abstracting information from roadway geometries, and converting this information into data structures that could be processed by the underlying methods. These workflows became increasingly difficult to maintain, and it was hard to touch anything without breaking something somewhere else, or having to copy-and-paste bits here and there only to realise that something else needed tweaking and that days’ worth of analysis had to be rerun. Inevitably, and after much rumination, this prompted the formalisation of the code-base into a package, which became cityseer-api. At that point, NetworkX-based workflows were also developed to automate the preparation of graphs and their conversion into numba data structures.

What I only fully appreciated later on is that the exercise of abstracting and formalising the logic from first principles not only allowed me to explore algorithmic specifics in a very controlled manner, but also made my life far easier in the long run. The code was now documented and tested, so I could continue to experiment and develop it with a higher level of confidence that I wasn’t breaking things along the way. I could also go back and read the documentation whenever I inevitably forgot what the parameters and logic for various workflows and functions had been.
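
To make the idea of a distance-thresholded shortest-path tree concrete, here is a minimal pure-Python sketch of a Dijkstra search that refuses to expand beyond a maximum distance. It is an illustration only and not the cityseer-api implementation (which, as noted above, is written for numba); the function name `shortest_path_tree`, the `max_dist` parameter, and the toy `streets` adjacency structure are all hypothetical.

```python
import heapq


def shortest_path_tree(adjacency, source, max_dist):
    """Dijkstra search from `source` that never expands beyond `max_dist`.

    `adjacency` (hypothetical structure) maps each node to a list of
    (neighbour, edge_length) pairs. Returns (distances, predecessors),
    containing only the nodes reachable within the threshold.
    """
    distances = {source: 0.0}
    predecessors = {source: None}
    queue = [(0.0, source)]
    visited = set()
    while queue:
        dist, node = heapq.heappop(queue)
        if node in visited:
            continue  # stale queue entry left over from an earlier, longer route
        visited.add(node)
        for neighbour, length in adjacency.get(node, []):
            new_dist = dist + length
            # the distance threshold: skip anything farther than max_dist
            if new_dist > max_dist:
                continue
            if new_dist < distances.get(neighbour, float("inf")):
                distances[neighbour] = new_dist
                predecessors[neighbour] = node
                heapq.heappush(queue, (new_dist, neighbour))
    return distances, predecessors


# Toy street network: node -> [(neighbour, length in metres)]
streets = {
    "a": [("b", 100.0), ("c", 250.0)],
    "b": [("a", 100.0), ("c", 100.0), ("d", 300.0)],
    "c": [("a", 250.0), ("b", 100.0), ("d", 150.0)],
    "d": [("b", 300.0), ("c", 150.0)],
}

distances, predecessors = shortest_path_tree(streets, "a", max_dist=300.0)
```

With a 300m threshold, node `d` (350m away by the shortest route) falls outside the window and never appears in the results. Because the loop is written by hand, this is also the place where further checks, such as angular cost or turn restrictions, could be slotted in.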

So, with that somewhat long-winded introduction out of the way, I’ll point directly to the documentation where you can learn more about the package and how it can be used. Questions can be asked in the repo’s Discussions area and the code can be found at cityseer-api. The broader background to some of the theoretical considerations can be found in the associated paper: The cityseer Python package for pedestrian-scale network-based urban analysis.

  1. Cooper CHV. Spatial localization of closeness and betweenness measures: a self-contradictory but useful form of network analysis. International Journal of Geographical Information Science [Internet]. 2015;29(8):1293–309. Available from: https://www.tandfonline.com/doi/pdf/10.1080/13658816.2015.1018834?needAccess=true
  2. Turner A. From axial to road-centre lines: a new representation for space syntax and a new model of route choice for transport network analysis. Environment and Planning B: Planning and Design [Internet]. 2007;34:539–55. Available from: http://journals.sagepub.com/doi/pdf/10.1068/b32067
  3. Dijkstra E. A Note on Two Problems in Connection with Graphs. Numerische Mathematik [Internet]. 1959;1:269–71. Available from: http://eudml.org/doc/131436