CycleStreets will shortly be hiring a full-time core developer to work on our routing engine and website. The aim is to enable us to implement improvements to the routing and implement lots of feature requests a lot more quickly.
This is possible thanks to a series of contracts and grants we've recently obtained, some of which we've recently reported about on this blog.
Our core aim with CycleStreets is to create "routing that thinks like a cyclist". This means including some things, such as turn delays and complex rules that would seem to be challenging to implement with off-the-shelf engines.
We're therefore wanting to take someone on who is able to work on what are interesting computer science problems (as distinct from purely "web development"). However, we'd hope that a suitable person can help work on the website (e.g. backend/frontend integration, general web development. refactoring) as well.
Background info: about our routing engine
The CycleStreets routing system finds, quietest, fastest and balanced cycling routes between two points in the British Isles. It does this by building a route planning version of the map (known as a graph) for each of these types of route.
Each graph consists of vertices and edges that correspond to junctions and streets. On each graph the length of an edge corresponds to the cost for that route type of travelling along a street. So, roads and wide cycle tracks will have relatively short links on the graph for fastest routes, but the links that represent footpaths and bridleways where the cycling speed is slower will be longer. By contrast, on the graph for quietest routes, cycleways and park paths will be short relative to busy roads.
The optimal route of a certain type will then be the shortest path on the graph for that type.
So there are two main parts to the CycleStreets routing system: (1) Building the graphs (2) Finding the shortest path. The quality and performance of CycleStreets is governed by how well we do 1 and 2 respectively.
The current OpenStreetMap (OSM) map of UK & Ireland contains 2 million streets that pass through 20 million points. Streets are usually represented by their centre line, with nodes at junctions. The characteristics of each street, including for example the type of road, whether it is hilly, has a cycle lane, or is a one-way street are used to measure the suitability for cycling. In graph theory speak, they translate into the cost of an edge.
These costs are measured during our daily translation of OSM's UK & Ireland map into a graph by application of a series of rules.
That system is currently rather more complex and opaque than we'd like. Choosing the right values for all of those numbers has been an iterative process as we've tuned the routing engine, often in response to our user's feedback.
We need to:
- Review the ruleset, re-examining the assumptions they are based on and the order they are applied
- Make it much clearer to users how the rules have influenced the chosen route (this will confer greater confidence in the generated advice)
- Include additional sources of information (more information to be published soon on this)
- Handle junctions better, which is of particular interest currently.
Here in more detail is discussion on some of these:
Junctions – reducing wiggly routes
Simple graphs constructed from maps of street centre-lines lack are limited in their capacity to express how easy it is to cycle through a road junction. This series of photos introduces the problem.
In essence, the simple graph cannot represent banned turns, and nor can it indicate that a right turn across a busy road might incur quite a wait until it is safe to cross. However, some of the turn delay modelling can be derived automatically, knowing the 'before' and 'after' street types, and the turn angle. We believe that accurately representing that sort of information has the best chance of improving the quality of routes recommended by CycleStreets.
The way this route wiggles on and off Glasgow Road is a good example of this problem. (NB Click on the + button in the top-right corner of the map and switch to the OpenCycleMap view to get a better map background.)
Implementing turn delays at every node presents real challenges in terms of the use of a standard A* routing engine. This is further more difficult given the need to avoid very large amounts of RAM being needed.
Solving this problem is a significant technical challenge. It needs to be transparent so that there is a clear link between each piece of data and each rule used in the solution and the decisions that a knowledgeable cyclist would make. The solution must also be scalable, so that neither the amount of time taken to find a route nor the amount of data required to represent the problem, increase significantly. That will make it attractive and manageable. A solution also needs to avoid creating excessive hardware cost requirements.
Finding the shortest path
Shortest path algorithms are very familiar to computer science students, and there are many well established standard techniques that solve the problem. CycleStreets currently uses the A* algorithm, which is implemented in a combination of a Python program and a C++ module.
The 2 million streets and 20 million points of the map, become 5 million edges, and 4 million vertices in a graph. These are compressed by a non-standard technique, called Cell Optimisation (Cello), into 2 million edges and 1 million vertices.
The Python/C++ program can scan the compressed graph to find a route from Land's End to John O'Groats in 2.1 seconds. The vast majority of routes we're interested in are about 10 miles, which can be solved by this program in milli-seconds.
The process of interpreting the generated route from a list of edges and vertices into an itinerary that includes street names and turn directions – the 'unpacking stage' – and storing in a database takes much longer. Performance seems to vary quite a bit, sometimes its very quick but it can take several seconds to process, even for a typical-sized route.
This issue will grow in significance as the number of users of CycleStreets continues to grow, and as new ways of using the routing system emerge.
We want to make the performance aspect of the routing itself, as well as the 'unpacking' stage, as fast and efficient as possible.
More types of route
There is a growing clamour for many more customisable types of routing. Ultra-quiet routes, routes where there is no dismounting and routes that are indifferent about hills.
We should also consider taking into account intermittent links, such as ferry crossing times, or gates that are closed at night.
Our system uses pre-compiled graphs, so each new routing type potentially means adding a new graph. Ulitmately RAM sizeputs a cap on that approach.
Are you the kind of person who can tackle these interesting problems?
We think that these and other challenges form interesting computer science problems.
Do get in touch if you'd be interested in an informal discussion to discuss things further. A formal job description and other details, including application deadlines and details will be online very shortly.