Software like Tilemaker and Planetiler is great for generating a complete set of tiles, updated about once a day, but they can’t handle minutely updates. Most users are fine with daily or slower updates, but OSM.org users are different, and minutely updates are critical for them. All the current approaches to minutely map tiles involve loading the changes and regenerating any tiles whose data may have changed. I used osm2pgsql, the standard way to load OSM data for rendering, but the results should be applicable to other approaches, including different schemas.
Using the Shortbread schema from osm2pgsql-themepark, I loaded the data with osm2pgsql and ran updates. osm2pgsql can output a list of changed tiles (“expired tiles”), and I did this for zoom 1 to 14 for each update. Because I was running this on real data, sometimes a particularly large update took longer than 60 seconds to process, in which case the next run would combine multiple updates from OSM. Combining multiple updates reduces how much work the server has to do at the cost of less frequent updates. This has been well documented since 2012, but no one has looked at the impact on expired tiles from combining updates.
To do this testing I was using a Hetzner server with 2x1TB NVMe drives in RAID0, 64GB of RAM, and an Intel i7-8700 @ 3.2 GHz. osm2pgsql 1.10 was used, the latest version at the time. The version of themepark was equivalent to the latest version.
The updates were run for a week from 2023-12-30T08:24:00Z to 2024-01-06T20:31:45Z. There were some interruptions in the updates, but I did an update without expiring tiles after the interruptions so they wouldn’t impact the results.
To run the updates I used a simple shell script.
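A minimal sketch of such a loop, assuming osm2pgsql-replication is managing the updates; the flags, file names, and the `DRY_RUN` stub are my assumptions, not the original script:

```shell
#!/bin/sh
# Sketch of a minutely update loop. The osm2pgsql-replication invocation and
# file naming are assumptions; DRY_RUN=1 (the default here) stubs the real
# update so the loop logic can be exercised anywhere.
DRY_RUN=${DRY_RUN:-1}
RUNS=${RUNS:-2}

do_update() {
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "would update, expiring z1-z14 tiles to $1" >> update.log
    else
        osm2pgsql-replication update --once \
            -- --expire-tiles=1-14 --expire-output="$1"
    fi
}

i=0
while [ "$i" -lt "$RUNS" ]; do
    start=$(date +%s)
    do_update "expire-$(date -u +%Y%m%dT%H%M%SZ).list"
    elapsed=$(( $(date +%s) - start ))
    # Wait out the rest of the minute. If a diff took over 60 seconds to
    # apply, the next run consumes a combined diff covering several minutes.
    if [ "$DRY_RUN" -eq 0 ] && [ "$elapsed" -lt 60 ]; then
        sleep $((60 - elapsed))
    fi
    i=$((i + 1))
done
```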
Normally I’d set up a systemd service and timer as described in the manual, but this setup was an unusual test where I didn’t want it to automatically restart.
I then used grep to count the number of expired tiles at each zoom in each file, creating lists for each zoom.
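The counting step can be sketched like this. The expiry-list layout of one `z/x/y` tile per line matches osm2pgsql’s expire output, but the file names are assumptions, and two tiny sample lists stand in for the real ones:

```shell
#!/bin/sh
# Sample expiry lists, one z/x/y tile per line, as osm2pgsql writes them.
printf '13/4000/2000\n14/8000/4000\n14/8000/4001\n' > expire-0001.list
printf '14/8000/4000\n' > expire-0002.list

# For each zoom, build one file with a per-update count of expired tiles.
for z in $(seq 1 14); do
    for f in expire-*.list; do
        grep -c "^$z/" "$f"
    done > "zoom-$z.counts"
done
```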
This let me use a crude script to get percentiles and the mean, and assemble them into a CSV.
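A crude sort/awk pass in the same spirit, shown here on synthetic data (`seq` stands in for a real zoom count file; the percentile choices match the tables below):

```shell
#!/bin/sh
# Mean and percentiles from a per-update tile-count list, one count per line.
seq 1 100 > counts.txt

sort -n counts.txt | awk '
    { v[NR] = $1; sum += $1 }
    END {
        split("0 1 5 25 50 75 95 99 100", pct, " ")
        line = sum / NR                   # mean first, then the percentiles
        for (i = 1; i <= 9; i++) {
            idx = int(pct[i] / 100 * (NR - 1)) + 1
            line = line "," v[idx]
        }
        print line
    }' > percentiles.csv
cat percentiles.csv
```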
A look at the percentiles for zoom 14 immediately reveals some outliers: a mean of 249 tiles, median of 113, p99 of 6854, and p100 of 101824. I was curious what was making this so large and found the p100 was with sequence number 5880335, which was also the largest diff. This diff was surrounded by normal-sized diffs, so it wasn’t a lot of data. The data consumed would have been the diff 005/880/336.
A bit of shell magic got me a list of changesets that did something other than add a node: osmium cat 005880336.osc.gz -f opl| egrep -v '^n[[:digit:]]+ v1' | cut -d ' ' -f 4 | sort | uniq | sed 's/c\(.*\)/\1/'
Looking at the changesets with achavi, 145229319 stood out as taking some time to load. Two of the nodes modified were information boards that were part of the Belarus-Ukraine and Belarus-Russia borders. Thus, this changeset changed the Russia, Ukraine, and Belarus polygons. As these are large polygons, only the tiles along the edge were considered dirty, but that is still a lot of tiles!
After validating that the results make sense, I got the following means and percentiles, which may be useful to others.
Tiles per minute, with updates every minute
zoom | mean | p0 | p1 | p5 | p25 | p50 | p75 | p95 | p99 | p100 |
---|---|---|---|---|---|---|---|---|---|---|
z1 | 3.3 | 1 | 2 | 2 | 3 | 3 | 4 | 4 | 4 | 4 |
z2 | 5.1 | 1 | 2.6 | 3 | 4 | 5 | 6 | 7 | 7 | 10 |
z3 | 9.1 | 1 | 4 | 5 | 8 | 9 | 11 | 13 | 15 | 24 |
z4 | 12.8 | 1 | 5 | 7 | 10 | 12 | 15 | 20 | 24 | 52 |
z5 | 17.1 | 1 | 5 | 8 | 13 | 17 | 20 | 28 | 35 | 114 |
z6 | 21.7 | 1 | 6 | 9 | 15 | 21 | 26 | 37 | 48 | 262 |
z7 | 25.6 | 1 | 6 | 9 | 17 | 24 | 31 | 46 | 63 | 591 |
z8 | 29.2 | 1 | 6 | 9 | 17 | 26 | 34 | 55 | 92 | 1299 |
z9 | 34.5 | 1 | 6 | 10 | 18 | 28 | 37 | 64 | 173 | 2699 |
z10 | 44.6 | 1 | 7 | 10 | 20 | 31 | 41 | 80 | 330 | 5588 |
z11 | 65.6 | 1 | 7 | 12 | 23 | 35 | 49 | 125 | 668 | 11639 |
z12 | 111 | 1 | 8 | 14 | 29 | 44 | 64 | 238 | 1409 | 24506 |
z13 | 215 | 1 | 10 | 18 | 40 | 64 | 102 | 527 | 3150 | 52824 |
z14 | 468 | 1 | 14 | 27 | 66 | 113 | 199 | 1224 | 7306 | 119801 |
Based on historical OpenStreetMap Carto data, the capacity of a rendering server is about 1 tile/s per hardware thread; current performance is somewhat slower. The new OSMF general purpose servers are mid-range servers with 80 threads, so they should be able to render about 4800 tiles per minute. This means that approximately 95% of the time the server will be able to complete re-rendering tiles within the 60 seconds between updates. A couple of times an hour it will be slower.
As mentioned earlier, when updates take over 60 seconds, multiple updates combine into one and reduce the amount of work to be done. I simulated this by merging every k files together. Continuing the theme of patched-together scripts, I did this with a shell script based on a StackExchange answer.
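The merging can be sketched as below; the file names are assumptions, and `sort -u` does the deduplication, since a tile expired by several diffs only needs rendering once:

```shell
#!/bin/sh
# Merge every k=2 expiry lists into one deduplicated list.
# Two small sample lists stand in for real osm2pgsql output.
printf '14/100/200\n14/100/201\n' > expire-0001.list
printf '14/100/200\n14/100/202\n' > expire-0002.list

k=2
i=0
batch=""
for f in expire-*.list; do
    batch="$batch $f"
    i=$((i + 1))
    if [ "$i" -eq "$k" ]; then
        # Tiles expired by more than one diff collapse to a single entry
        sort -u $batch > "merged-$f"
        batch=""
        i=0
    fi
done
```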
Running the results through the same process for percentiles generates numbers in tiles per update. But updates are only 1/k as frequent, so in terms of work done per unit time, all the numbers need to be divided by k. Here are the results for a few values of k.
k=2
zoom | mean | p0 | p1 | p5 | p25 | p50 | p75 | p95 | p99 | p100 |
---|---|---|---|---|---|---|---|---|---|---|
z1 | 1.7 | 0.5 | 1 | 1 | 1.5 | 1.5 | 2 | 2 | 2 | 2 |
z2 | 2.5 | 0.5 | 1 | 1.5 | 2 | 2.5 | 3 | 3.5 | 3.5 | 5 |
z3 | 4.5 | 0.5 | 2 | 2.5 | 4 | 4.5 | 5.5 | 6.5 | 7.5 | 12 |
z4 | 6.4 | 0.5 | 2.5 | 3.5 | 5 | 6 | 7.5 | 10 | 12.5 | 26 |
z5 | 8.6 | 0.5 | 2.5 | 4 | 6.5 | 8.5 | 10 | 14 | 17.5 | 51 |
z6 | 10.9 | 0.5 | 2.9 | 4.5 | 7.5 | 10.5 | 13 | 18.5 | 24.5 | 107 |
z7 | 13.0 | 0.5 | 3 | 4.5 | 8.5 | 12 | 15.5 | 23 | 32 | 239 |
z8 | 14.9 | 0.5 | 3 | 4.5 | 9 | 13 | 17 | 27 | 50 | 535 |
z9 | 17.8 | 0.5 | 3 | 5 | 9.5 | 14 | 18.5 | 32 | 97 | 1127 |
z10 | 24 | 0.5 | 3 | 5 | 10 | 15.5 | 20.5 | 41 | 192 | 2347 |
z11 | 36 | 0.5 | 3.5 | 6 | 11.5 | 17.5 | 24 | 65 | 395 | 4888 |
z12 | 64 | 0.5 | 4 | 7 | 14.5 | 22 | 32 | 120 | 844 | 10338 |
z13 | 120 | 0.5 | 5 | 9 | 20 | 32 | 50 | 265 | 1786 | 22379 |
z14 | 263 | 0.5 | 7 | 14 | 33 | 56 | 99 | 617 | 3988 | 50912 |
k=5
zoom | mean | p0 | p1 | p5 | p25 | p50 | p75 | p95 | p99 | p100 |
---|---|---|---|---|---|---|---|---|---|---|
z1 | 0.66 | 0.20 | 0.40 | 0.40 | 0.60 | 0.60 | 0.80 | 0.80 | 0.80 | 0.80 |
z2 | 1.01 | 0.20 | 0.40 | 0.60 | 0.80 | 1.00 | 1.20 | 1.40 | 1.40 | 2.00 |
z3 | 1.82 | 0.20 | 0.80 | 1.00 | 1.60 | 1.80 | 2.20 | 2.60 | 3.00 | 4.60 |
z4 | 2.54 | 0.20 | 1.00 | 1.40 | 2.00 | 2.40 | 3.00 | 4.00 | 4.80 | 8.00 |
z5 | 3.40 | 0.20 | 1.00 | 1.60 | 2.60 | 3.40 | 4.00 | 5.40 | 7.00 | 18.80 |
z6 | 4.31 | 0.20 | 1.02 | 1.80 | 3.20 | 4.20 | 5.20 | 7.40 | 9.80 | 42.60 |
z7 | 5.08 | 0.20 | 1.20 | 1.80 | 3.40 | 4.80 | 6.20 | 9.20 | 12.60 | 93.60 |
z8 | 5.78 | 0.20 | 1.20 | 1.80 | 3.40 | 5.20 | 6.80 | 11.00 | 18.93 | 206.20 |
z9 | 6.78 | 0.20 | 1.20 | 2.00 | 3.60 | 5.60 | 7.40 | 13.00 | 35.40 | 430.40 |
z10 | 8.73 | 0.20 | 1.40 | 2.00 | 4.00 | 6.20 | 8.20 | 16.40 | 67.48 | 895.20 |
z11 | 12.76 | 0.20 | 1.40 | 2.40 | 4.60 | 7.00 | 9.60 | 25.16 | 150.32 | 1865.40 |
z12 | 21.60 | 0.40 | 1.60 | 2.80 | 5.80 | 8.80 | 12.80 | 47.00 | 328.89 | 3932.40 |
z13 | 41.88 | 0.40 | 2.00 | 3.60 | 8.00 | 12.80 | 20.60 | 102.08 | 712.36 | 8486.80 |
z14 | 91.76 | 0.40 | 2.80 | 5.40 | 13.00 | 22.80 | 40.40 | 239.88 | 1597.66 | 19274.40 |
Finally, we can reproduce the Geofabrik graph of tiles per minute against update interval, and get approximately work ∝ interval^(-1.05), where interval is the number of minutes between updates. This means combining multiple updates is very effective at reducing load.
This has been a lot of numbers, which is useful for someone in my position, but what does this mean at a practical level?
Big updates happen sometimes, which will slow everything down. Even a powerful server will slow down when multiple large country borders need to be regenerated.
As update interval slows down, the tile server has less work to do and can catch up. Updates every 10 minutes involve approximately 5 times less work than minutely updates, so when a particularly large update happens, the server can easily catch up.
A lower-end server capable of 10 tiles/second can still update every 3 minutes or faster 95% of the time, 3-15 minutes 4% of the time, and only 1% of the time fall farther behind.
You probably don’t want to keep a minutely updated tileset running on your laptop.
With a bit of work, I manipulated the files to give me the usage from the 10 countries with the most usage, for the first four months of 2023.
Perhaps more interesting is looking at the usage for each country by the day of week.
]]>Tile storage is a difficult problem. For a tileset going to zoom 14, there are 358 million tiles, and for one going to zoom 15, there are 1.4 billion. Most tiles are small, with about 80% typically around 100 bytes, while the largest tiles might be about 1 megabyte.
Tilekiln’s storage must be able to handle these numbers, but also handle incremental minutely updates, and maintenance work like deleting tilesets. A nice to have would be the ability to distribute tilesets easily, but this is not essential.
PMTiles is a file format designed to store an entire tileset in one file. It consists of a directory, which lists offsets for where tiles are within the larger file. Using range requests, any tile can be retrieved in 3 requests in the worst case, while any caching at all will bring this to 2 requests, and typical caching can bring it close to one.
It features de-duplication, both for tiles that are byte-wise identical and for adjacent directory entries pointing at the same tile.
There is client-side support in some browser-based map display libraries, but most applications will require a server that handles conventional z/x/y URLs, serving from the PMTiles file. As a fairly new format, support from other applications is limited.
Updating the PMTiles archive in place is possible, because clients use ETags to detect when the archive has changed, invalidating the client-side cache. With minutely updates this means that, in the worst case, one request per client each minute requires the full 3 requests. In practice this doesn’t matter: for a large tileset it is impossible to rewrite the entire archive every minute, since writing out the complete file takes longer than that.
Like PMTiles, MBTiles is a single-file archive format. It was developed by Mapbox for users to generate tiles and upload them to Mapbox’s servers. Its format is a SQLite database with tables of tile indexes and tile data as binary blobs. Because it’s based on SQLite and has been around longer, support is widespread, with several programs able to generate and serve it. Browser-based support is limited, and it wasn’t designed with that in mind.
Minutely updates are theoretically possible, but in practice, not a good idea. SQLite databases do not work well with high volumes of concurrent reads and writes, generally requiring all work to go through one process. This requires coupling the generation and serving systems.
Because Tilekiln already requires PostgreSQL, it would be possible to store tiles in it, the same way that MBTiles does.
Instead of an archive format, it’s possible to store tiles on disk as files. This is the most well-established method, and the simplest. Tiles can be updated atomically, and serving tiles is just serving files from disk. The downside comes when managing millions or billions of tiny files. File systems are not designed for this, and bulk operations become very slow. In particular, it can take a day or longer to delete a tileset.
A popular approach is to store tiles in some form of object store, like S3. All the commercial object stores I’ve looked at perform badly with large numbers of small objects. While there are sometimes workarounds for this, their pricing structure generally makes it very expensive to store tiles this way.
Tapalcatl 2 is a system of using zip files to combine tiles, reducing the number of tiles that need to be stored. It is similar to how raster tiles are combined into metatiles, except that the vector tiles are pre-sliced within the zipfile and can contain multiple zooms.
In a typical configuration, there are zip files generated for tiles on zooms 0, 4, 8, and 12. Each zip file contains the “root” tile and then tiles from the next three zooms that lie within it. This means that a zip archive contains 85 tiles, all tiles within a small area. By combining tiles into one zip archive, this reduces the number of files on disk to 16.8 million files, a small enough number to be reasonably managed on disk.
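The arithmetic above is easy to reproduce:

```shell
#!/bin/sh
# Tiles per archive: the root tile plus the next three zooms under it.
per_archive=$((1 + 4 + 16 + 64))
echo "tiles per archive: $per_archive"

# The archive count is dominated by the deepest root zoom: 4^12 zip files.
z12_archives=$((4096 * 4096))
echo "z12 archives: $z12_archives"
```

This gives 85 tiles per archive and 16 777 216 archives at zoom 12, the roughly 16.8 million files mentioned above.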
The format hasn’t had a great deal of usage since it was developed, so support is limited to some server-side programs that take Tapalcatl archives and present tiles to the user. These programs are known to have some issues, like not supporting updates to remote Tapalcatl tilesets.
Updates are possible in two ways. The first is by taking an existing zip file, replacing the changed tiles within it, and generating a new zip file. The second is to completely regenerate all the tiles in the zip file, which is simpler, but involves more tile generation.
The two options which require further investigation are PostgreSQL and Tapalcatl 2. Both support updates, but come with downsides.
]]>With the switch to a commercial CDN, we’ve improved our logging significantly and now have the tools to collect and analyze logs. We log information on both the incoming request and our response to it.
We log enough information to see what sites and programs are using the map, and additional debugging information. Our logs can easily be analyzed with a hosted Presto system, which allows querying large amounts of data in logfiles.
I couldn’t do this talk without the ability to easily query this data and dive into the logs. So, let’s take a look at what the logs tell us for two weeks in May.
Although the standard layer is used around the world, most of the usage correlates to when people are awake in the US and Europe. It’s tricky to break this down in more detail because we don’t currently log timezones. We’ve added logging information which might make this easier in the future.
Based on UTC time, which is close to European standard time, weekdays average 30 000 incoming requests per second while weekends average 21 000. The peaks, visible on the graph, show a greater difference. This is because the load on weekends is spread out over more of the day.
On average over the month we serve 27 000 requests per second, and of these, about 7 000 are blocked.
Seven thousand requests per second is a lot of blocked requests. We block programs that send bad requests or don’t follow the tile usage policy. They get served:

- HTTP 400 Bad Request if the request is invalid,
- HTTP 403 Forbidden if the client is misconfigured,
- HTTP 418 I'm a teapot if it is pretending to be a different client, or
- HTTP 429 Too Many Requests if it has been automatically blocked for making excessive requests by scraping.

Before blocking we attempt to contact those responsible, but this doesn’t always work if they’re hiding who they are, and they frequently don’t respond.
HTTP 400 responses are for tiles that don’t exist and will never exist. A quarter of these are for zoom 20, which we’ve never served.
For the HTTP 403 blocked requests, most are not sending a user-agent, a required piece of information. The others are a mix of blocked apps and generic user-agents which don’t allow us to identify the app.
Fake requests get a HTTP 418 response, and they’re nearly all scrapers pretending to be browsers.
In July we added automatic blocking of IPs that were scraping the standard layer, responding with HTTP 429 to IPs requesting far too many tiles from the backend. This only catches scrapers, but a tiny 0.001% of users were causing 13% of the load, and 0.1% of QGIS users were causing 38% of QGIS load.
]]>The OpenStreetMap Standard Layer is the default layer on openstreetmap.org, taking up most of the front page. It’s run by the OpenStreetMap Foundation, and the Operations Working Group is responsible for the planning, organisation and budgeting of OSMF-run services like this one and the servers running it. There are other map layers on the front page like Cycle Map and Transport Map, and I encourage you to try them, but they’re not hosted or planned by us.
At a high level, this is the overview of the technology the OWG is responsible for. The standard layer is divided into millions of parts, each of which is called a tile, and we serve tiles.
OSM updates flow into a tile server, where they go into a database. When a tile is needed, a program called renderd makes and stores the tile, and something called mod_tile serves it over the web. We have multiple render servers for redundancy and capacity. We’re completely responsible for these, although some of them run on donated hardware.
In front of the tile server we have a content delivery network. This is a commercial service that caches files closer to the users, serving 90% of user requests. It is much faster and closer to the users, but knows nothing about maps. We’re only responsible for the configuration.
The difference between the tile store and the tile cache is in how they operate, and in size: the tile store is much larger and stores more tiles.
Only the cache misses from the CDN impose a load on our servers. When looking at improving performance of the standard layer, I tend to look at cache misses and how to reduce them.
The OWG has a tile usage policy that sets out what you can and cannot do with our tile layer. We are in principle happy for our map tiles to be used by external users for creative and unexpected uses, but our priority is providing a quickly updating map to improve the editing cycle. This is a big difference between the standard layer and most other commercially available map layers, which might update weekly or monthly.
We prohibit some activities like bulk-downloading tiles for a large area (“scraping”) because it puts an excessive load on our servers. This is because we render tiles on-demand, and someone scraping all the tiles in an area is downloading tiles they will never view.
]]>I wanted to look at the correlation with OSM.org views. I already had a full day’s worth of logs on tile.openstreetmap.org accesses, so I filtered them for requests from www.openstreetmap.org and got a per-country count. This is from December 29th, 2020. Ideally it would be from a complete week, and not a holiday, but this is the data I had downloaded.
The big outlier is Italy. It has more visits than I would expect, so I wonder if the holiday had an influence. Like before, the US is overrepresented in the results, Russia and Poland are underrepresented, and Germany is about average.
Like before, I made a graph of the smaller countries.
More small countries are above the average line - probably an influence of Italy being so low.
]]>There’s lots of data for activity on OSM by country, but for this I took the numbers from joost for how many “active contributors” there are according to the contributor fee waiver criteria.
For the larger countries, Russia is the most underrepresented country. This is not surprising, as they are underrepresented in other venues like the OSMF membership.
The US and UK are both slightly overrepresented in the survey, but less so than I would have expected based on other surveys and OSMF membership.
The smaller countries are all crowded, so I did a graph of just them.
As with other surveys, Japan is underrepresented. Indonesia, although underrepresented, is less so than I would have expected.
]]>I’ve put up a world-wide demo at https://pnorman.dev.openstreetmap.org/cartographic/mapbox-gl.html, using data from 2020-03-16, and you can view the code at https://github.com/pnorman/openstreetmap-cartographic.
Only zooms 0 to 8 have been implemented so far. I started at zoom 0 and am working my way down.
Admin boundaries are not implemented. OpenStreetMap Carto uses Mapnik-specific tricks to deduplicate the rendering of these. I know how I can do this, but it requires the changes I intend to make with the flex backend.
Landuse, vegetation, and other natural features are not rendered until zoom 7. This is the scale of OpenStreetMap Carto zoom 8, and these features first appear at zoom 5. There are numerous problems with unprocessed OpenStreetMap data at these scales. OpenStreetMap Carto gets a result that looks acceptable but is poor at conveying information by tweaking Mapnik image rasterizing options. I’m looking for better options here involving preprocessed data, but haven’t found any.
I’m still investigating how to best distribute sprites.
The technology choices are designed to be suitable for a replacement for tile.osm.org. This means minutely updates, high traffic, high reliability, and multiple servers. Tilekiln, the vector tile generator, supports all of these. It’s designed to better share the rendering results among multiple servers, a significant flaw with renderd + mod_tile and the standard filesystem storage. It uses PostGIS’s ST_AsMVT, which is very fast with PostGIS 3.0; my home system generates z0-z8 in under 40 minutes.
Often forgotten are the development requirements. The style needs to support multiple developers working on similar areas without git merge conflicts, while maintaining an easy development workflow. I’m still figuring this out. Mapbox GL styles are written in JSON, and most of the tools overwrite any formatting. This means there’s no way to add comments to lines of code. Comments are a requirement for a style like this, so I’m investigating minimal pre-processing options. The downside is that this will make it harder to use existing GUI editors like Fresco or Maputnik.
The goal of this project isn’t to do big cartography changes yet, but client-side rendering opens up new tools. The biggest immediate change is that zoom is continuous, no longer an integer or fixed value. This means parameters like sizes can smoothly change as you zoom in and out, specified by their start and end sizes instead of having to specify each zoom.
Have a look at https://github.com/pnorman/openstreetmap-cartographic and have a go at setting it up and generating your own map. If you have issues, open an issue or pull request. Or, because OpenStreetMap Cartographic uses Tilekiln, have a look at its issue list.
]]>
Not every user will want all the zooms, so I’m creating multiple tarballs, going from zoom 0 to zoom 6, 0 to 8, and 0 to 10. This duplicates data between the files, but makes them more useful since only one file needs downloading.
tar will pack all of the tiles into one file, and can optionally compress them. Compressing a single PNG won’t normally save space, but compressing a bunch of PNGs, many of which are identical, will.
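A sketch of the packaging, assuming the usual osm_tiles/&lt;z&gt;/&lt;x&gt;/&lt;y&gt;.png layout; the placeholder tiles just make the sketch self-contained:

```shell
#!/bin/sh
# Pack overlapping tarballs: z0-6, z0-8, and z0-10.
mkdir -p osm_tiles/0/0 osm_tiles/6/32
: > osm_tiles/0/0/0.png       # placeholder tiles for the demonstration
: > osm_tiles/6/32/21.png

for maxzoom in 6 8 10; do
    dirs=""
    for z in $(seq 0 "$maxzoom"); do
        if [ -d "osm_tiles/$z" ]; then dirs="$dirs osm_tiles/$z"; fi
    done
    # gzip exploits the redundancy between near-identical PNGs
    tar -czf "osm_tiles_z0-$maxzoom.tar.gz" $dirs
done
```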
To make use of all the cores of my CPU, I’m going to use find to locate the PNGs, then the program parallel to run optipng in parallel.
OptiPNG is a program that performs lossless optimization on PNGs. Because low-zoom tiles are more likely to be viewed and there are fewer of them, I’ll call the program with different options, doing more aggressive optimizations on low-zoom tiles. There’s no magic right answer for how much time to spend compressing, but I found these options reasonable; they save up to 50% space on some zooms.
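A sketch of that pass; the `-o` optimization levels are illustrative rather than the values I actually used, and it only runs over zoom directories that exist:

```shell
#!/bin/sh
# Optimize tiles zoom by zoom, trying harder on low zooms where each tile
# is viewed more often and there are far fewer of them.
passes=0
for z in $(seq 0 10); do
    [ -d "osm_tiles/$z" ] || continue
    if [ "$z" -le 6 ]; then
        opts=-o7          # aggressive trials on the few low-zoom tiles
    else
        opts=-o2          # quicker pass on the bulk of the tiles
    fi
    find "osm_tiles/$z" -name '*.png' | parallel optipng "$opts" -quiet {}
    passes=$((passes + 1))
done
echo "$passes" > zooms_optimized
```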
The space used can be measured with du -hsc --apparent-size osm_tiles/*. The --apparent-size option is essential, since many of the tiles are smaller than one disk block.
All of this is of course not required, but it helps a bit, and is an interesting experiment regardless.
]]>Seeding is done with the mapproxy-seed program, using the previous config files. The only option needed besides the config file locations is -c, which sets how many CPU threads to use. For the machine I’m using, 7 works best: fewer leaves some capacity idle, while too many threads starve PostgreSQL and the system of CPU time.
```shell
mapproxy/bin/mapproxy-seed -s seed.yaml -f mapproxy.yaml -c 7
```
How long this takes depends on to what zoom you’re seeding, and how powerful the server is. On my server it takes about four hours to seed to zoom 10.
There’s a lot of documentation on MapProxy configuration files, and example ones can be created with mapproxy/bin/mapproxy-util create -t base-config. The first file is mapproxy.yaml, which defines the layers to be rendered.
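A minimal sketch of the shape mapproxy.yaml takes for this setup. The layer name `osm` and the built-in `GLOBAL_WEBMERCATOR` grid match the test URL; the cache and source names, and everything else, are illustrative, and the real file sets more options:

```yaml
services:
  demo:
  tms:

layers:
  - name: osm
    title: OpenStreetMap Carto
    sources: [osm_cache]

caches:
  osm_cache:
    grids: [GLOBAL_WEBMERCATOR]
    sources: [osm_mapnik]
    cache:
      type: file

sources:
  osm_mapnik:
    type: mapnik
    mapfile: openstreetmap-carto/mapnik.xml
```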
This file can be tested with the command mapproxy/bin/mapproxy-util serve-develop mapproxy.yaml; the URL http://localhost:8080/tiles/1.0.0/osm/GLOBAL_WEBMERCATOR/0/0/0.png should then be a single tile covering the world.
The second file is seed.yaml.
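A sketch of what that seed.yaml can look like; the seed name is arbitrary, and `osm_cache` assumes the cache name from the mapproxy.yaml side:

```yaml
seeds:
  world:
    caches: [osm_cache]
    levels:
      from: 0
      to: 8
```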
This sets up a seeding area covering the entire world from zoom 0 to zoom 8. The seeding can be run with mapproxy/bin/mapproxy-seed -s seed.yaml -f mapproxy.yaml, and the -c option can be added to set parallelism. After this is done the tiles are generated; they just need to be packaged.
While I was working at the Wikimedia Foundation, I developed brighmed, a CartoCSS style using vector tiles. Wikimedia decided not to flip the switch to deploy the style, but it is open source, so I can use it elsewhere. With that decision made, I spent a day implementing most of it in Tangram.
What’s next?
I’ve got some missing features like service roads and some railway values to add, then I can look at new stuff like POIs. For that I’ll need to look at icons and where to fit them into colourspace.
There’s a bunch of label work that needs to be done. What I have is just a first pass: some things like motorway names have big issues, and ref tags still need rendering. Label quality is of course an unending quest, but I should be able to get some big gains without much work.
Richard is planning to do some work on writing a schema, and if it works, I’d like to adopt it. At the same time, I don’t want to tie myself to an external schema which may have different cartographic aims, so I’ll have to see how that works out. Looking at past OpenStreetMap Carto changes to project.mml, I found that what would be breaking schema changes for a vector tile project are less common than I thought, happening about once every 4-6 months. Most of the schema changes were compatible and could be handled by regenerating tiles in the background.
The website code is mainly in Ruby on Rails, and you need to know this before starting the project. JavaScript is a good idea, as one implementation route requires client-side JavaScript changes.
It may seem odd for the first step of a coding project to have nothing to do with coding, but it’s essential. You need to learn about OSM’s data model, architecture, and what it’s used for, and the fastest way to do this is by mapping. You’ll also be looking at how editing software interacts with the API. It doesn’t matter too much what you map, but I’d suggest around your university, a past job, or some other area you’re familiar with.
Matt Amos wrote a blog post on API changes which puts this project into a wider context. Most of the work there isn’t part of the GSOC project, but it helps understand why we want to do this project.
The API documentation covers all of the API calls, but the ones that are particularly important for the project are the read calls for elements, full versions for ways and relations, ways for node call, relations for element, read and download calls for changesets, and read note call.
The map call, and changeset model are also important concepts to understand.
Start JOSM with a console window open, and it will show all the API calls it makes. When you’ve done this, edit some more. Make sure to use the show object, show object history, download relation, and other tools that download data. Watch what API calls are made, compare them against the API documentation, and understand what it’s doing.
There are a few ways to get object information. The obvious one is the “browse” pages at https://www.openstreetmap.org/way/&lt;N&gt;, but others include the history view in JOSM and OSM Deep History. The first page doesn’t use the API and the second two do. The goal of this project is to make the first page use the API.
The next two steps are a form of homework and necessary for writing your proposal. Look at the browse page for node 5324545411. Write down what API calls are needed to get all the information on it. It should be possible to do it in a fixed number of API calls; in this case, four.
For some browse pages it’s not possible to get all the information in a fixed number of API calls. Take a look at way 471813907 and see what information is missing or would require recursive API calls. Part of the project will be proposing and implementing new API calls to fill the missing needs.
Some more background can be found in some emails from a year ago.
OpenStreetMap Carto generates a Mapnik XML stylesheet, which can be used by any software that includes Mapnik. Some of the common options are
None of these options is perfect for anything. For this particular use the requirements are
The options which meet this are:
MapProxy or accessing the Mapnik API directly are the best two options. It’s a lot easier to set up MapProxy than write new code, so that’s the option I’ll go with.
With MapProxy selected, we need to install it. Unfortunately, this requires installing Mapnik. Mapnik has a reputation of being difficult to compile, having an API that changes between versions when it shouldn’t, poor support for bindings for other languages, versioning problems, and generally being tricky to work with. This reputation is accurate.
If I were trying to install Mapnik on anything other than a Debian system it would be tricky, but I can use the excellent work of the Debian GIS team. All that’s needed is apt-get install libmapnik3.0 mapnik-utils python-mapnik, and the required software is there. In addition to Mapnik, the virtualenv package provides virtualenv, a program for creating isolated Python environments.
The install script is a simple two lines.
```shell
virtualenv --system-site-packages mapproxy
mapproxy/bin/pip install MapProxy==1.11.0
```
The first line creates a virtualenv named mapproxy that has access to the system Python packages, most importantly Mapnik. The second installs MapProxy 1.11 in it.
]]>Loading can easily be done on a single CPU server and the RAM needed is less than you want for caching later on.
Like before, the first step is setting some variables.
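The variables block was roughly of this shape; every name and path here is a hypothetical stand-in, not the original values:

```shell
#!/bin/sh
# Hypothetical values; adjust the paths and names for your own setup.
PLANET=planet-latest.osm.pbf        # input planet file
DBNAME=gis                          # database the style expects
FLAT_NODES=/store/flat.nodes        # node location file for --slim
CACHE=40000                         # osm2pgsql node cache, in MB
```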
Next, a database is needed. OpenStreetMap Carto documents what extensions are needed by it, so we just need to follow those directions.
```shell
createdb -E utf8 "${DBNAME}"
psql -d "${DBNAME}" -c 'CREATE EXTENSION postgis;'
psql -d "${DBNAME}" -c 'CREATE EXTENSION hstore;'
```
OpenStreetMap Carto needs data loaded with osm2pgsql, like most styles. The osm2pgsql options can be broken down into three groups: style settings, performance, and locations.
The style settings control how the data in the database is represented. These are given by the style. We don’t have to know what they mean, so we just have to use what OpenStreetMap Carto’s documentation says: -G --hstore --style openstreetmap-carto.style --tag-transform-script openstreetmap-carto.lua
The locations are where to get the OSM data, database names, and other information that relates to where to read and save everything.
Performance options are the only ones that require some judgement to set. Because this script is intended for the full planet, we use --slim --flat-nodes ${FLAT_NODES}, just like the osm2pgsql documentation suggests. Also, we know the database will not be updated with --append, so we can use the --drop option, which skips indexing the slim tables and drops them instead, saving time and space.
We also need to set how much memory is used to cache node positions. This should never be set so high that the server runs out of RAM, but there’s no gain to setting it higher than is needed to cache every node. A general rule of thumb is to set it to 75% of RAM size, in MB. With the size of the planet right now, I also know that it doesn’t need more than 40GB, but this is subject to change.
This results in the osm2pgsql command
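A sketch of such a command, combining the flags quoted above with assumed variable names (the post’s exact invocation may differ):

```shell
osm2pgsql -G --hstore \
    --style ${CARTO_DIR}/openstreetmap-carto.style \
    --tag-transform-script ${CARTO_DIR}/openstreetmap-carto.lua \
    --slim --drop \
    --flat-nodes ${FLAT_NODES} \
    --cache 40000 \
    --database ${DATABASE_NAME} \
    ${PLANET_FILE}
```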
On an SSD-based server with 64GB RAM, this should take 10-20 hours to process the planet. On a tuned server with NVMe drives, it can be under 5 hours.
Last is building some indexes the stylesheet relies on. Normally we could use the indexes.sql file that is part of OpenStreetMap Carto, but because this database isn’t going to be updated, the fillfactor option can be set to build more efficient indexes.
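Without the fillfactor change, the one-liner would simply be the following (paths assumed; the post’s version additionally modified the index definitions to set a fillfactor, a detail not reproduced here):

```shell
psql -d ${DATABASE_NAME} -f ${CARTO_DIR}/indexes.sql
```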
Rearranging the order of some commands and adding cleanup, we get a script that we can run.
[26-line load script not preserved]
Edit: Information about indexes added
For the script, we’re going to assume that the carto binary is in the PATH. Unfortunately, this requires installation, which requires npm, which itself needs to be installed. Given that nodejs and npm are a huge headache of versions, the easiest route I’ve found is to install nvm, then install nodejs 6 with nvm install 6. CartoCSS is then installed with npm install -g carto.
The shell script starts off with some variables from last time.
OpenStreetMap Carto is hosted on GitHub, which offers the ability to download a project as a zip file. This is the logical way to get it, but it isn’t usable from a script because the internal structure of the zip file isn’t easily predicted. Instead, we’ll clone it with git, only getting the specific revision needed.
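A sketch of such a clone (the pinned tag here is hypothetical; the original revision wasn’t preserved):

```shell
CARTO_VERSION="v4.0.0"   # hypothetical tag
git clone -c advice.detachedHead=false --depth 1 \
    --branch ${CARTO_VERSION} \
    https://github.com/gravitystorm/openstreetmap-carto.git
```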
Setting advice.detachedHead=false for this command avoids a warning about a detached HEAD, which is expected here.
OpenStreetMap Carto sets the database name to “gis”. There are various ways to override this for development, but in this case we want to override it in the generated XML file. Fortunately, the database name only appears once, as dbname: "gis" in project.mml. One way to override it would be to remove the line and rely on the libpq environment variables like PGDATABASE. Another is replacing “gis” with a different name. It’s not clear which is better, but I decided to go with replacing the name, using a patch which git applies.
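The post applied the change as a git patch; purely to illustrate the replacement, here is the same single-line edit done with sed on a stand-in project.mml (the real file contains much more):

```shell
DATABASE_NAME="osmcarto"
mkdir -p openstreetmap-carto
# Stand-in for the real project.mml, which contains dbname: "gis" exactly once
printf 'dbname: "gis"\n' > openstreetmap-carto/project.mml
# Replace the database name in place
sed -i 's/dbname: "gis"/dbname: "'"${DATABASE_NAME}"'"/' openstreetmap-carto/project.mml
cat openstreetmap-carto/project.mml
```

A patch fails loudly if the file has changed underneath it, which is one reason to prefer it over sed in a pinned checkout.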
With project.mml patched, it’s easy to generate the Mapnik XML, because CartoCSS was installed earlier.
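A sketch of that one line (the output filename is an assumption):

```shell
carto openstreetmap-carto/project.mml > style.xml
```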
Lastly, OpenStreetMap Carto needs some data files like coastlines. It comes with a script to download them, so we run it.
Taking all of this and re-arranging it, we end up with the following script.
[13-line script not preserved]
Downloading the data manually is easy with curl or wget, or you can download it from the browser. If you want to script it, it’s a bit harder: you have to worry about error conditions, what can go wrong, and making sure everything can happen unattended. So, to make sure we can do this, we write a simple bash script.
The goal of the script is to download the OSM data to a known file name and return 0 if successful, or 1 if an error occurred. Also, to keep track of what was downloaded, we’ll make two files with information on what was downloaded and what state it’s in: state.txt and configuration.txt. These will be compatible with osmosis, the standard tool for updating OpenStreetMap data.
Before doing anything else, we specify that this is a bash script, and that if anything goes wrong, the script is supposed to exit.
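A sketch of that header; set -e is the standard way to make bash exit as soon as anything goes wrong:

```shell
#!/bin/bash
# Exit immediately if any command fails
set -e
```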
Next, we put the information about what’s being downloaded, and where, into variables. It’s traditional to use the Geofabrik Liechtenstein extract for testing, but the same scripts will work with the planet.
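A sketch with assumed variable names (the Geofabrik URLs are real; the names are not from the original):

```shell
# Assumed names -- the originals weren't preserved
EXTRACT_URL="https://download.geofabrik.de/europe/liechtenstein-latest.osm.pbf"
MD5_URL="${EXTRACT_URL}.md5"
PLANET_FILE="data.osm.pbf"
echo "Will fetch ${EXTRACT_URL} to ${PLANET_FILE}"
```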
We’ll be using curl to download the data, and every time we call it, we want to add the options -s and -L. Respectively, these make curl silent and cause it to follow redirects. Two files are needed: the data and its MD5 sum. The md5 file looks something like 27f7... liechtenstein-latest.osm.pbf. The problem with this is that we’re saving the file as $PLANET_FILE, not liechtenstein-latest.osm.pbf. A bit of manipulation with cut fixes this.
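A runnable illustration of the filename fix (the hash here is hypothetical, and a local file stands in for the downloaded .md5):

```shell
PLANET_FILE="data.osm.pbf"
# What the server's .md5 file looks like (hypothetical hash):
echo "d41d8cd98f00b204e9800998ecf8427e  liechtenstein-latest.osm.pbf" > server.md5
# Keep the hash (field 1) and pair it with our local filename instead
echo "$(cut -d' ' -f1 server.md5)  ${PLANET_FILE}" > "${PLANET_FILE}.md5"
cat "${PLANET_FILE}.md5"
```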
The reason for downloading the MD5 file first is that it reduces the time between when the two downloads are initiated, making it less likely that the server will start uploading a new version in that window.
The next step is easy: downloading the planet and checking that the download wasn’t corrupted. It helps to have a good connection here.
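The real script would curl the planet here; in this runnable sketch a locally created file stands in for the download, and the md5sum --check call is the verification step the script relies on:

```shell
PLANET_FILE="data.osm.pbf"
# Stand-in for: curl -s -L -o "$PLANET_FILE" "$EXTRACT_URL"
echo "not really a planet" > "${PLANET_FILE}"
md5sum "${PLANET_FILE}" > "${PLANET_FILE}.md5"   # normally rewritten from the server's .md5
# Abort (non-zero exit) if the file doesn't match its checksum
md5sum --quiet --check "${PLANET_FILE}.md5" && echo "download verified"
```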
Libosmium is a popular library for manipulating OpenStreetMap data, and the osmium command can show metadata from the header of the file. The command osmium fileinfo data.osm.pbf tells us
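For a Geofabrik extract, the trimmed output has roughly this shape (all values here are illustrative, not from the original run):

```
File:
  Name: data.osm.pbf
  Format: PBF
Header:
  Bounding boxes:
    (9.47108,47.0477,9.63622,47.2713)
  Options:
    generator=osmium/1.5
    osmosis_replication_base_url=http://download.geofabrik.de/europe/liechtenstein-updates
    osmosis_replication_sequence_number=1291
    osmosis_replication_timestamp=2017-06-26T20:43:03Z
```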
The osmosis properties tell us where to go for updates to the data we downloaded. Despite not needing the updates for this task, it’s useful to store this in the state.txt and configuration.txt files mentioned above.
Rather than trying to parse osmium’s output, we can use its option to extract just one field. We use this to get the base URL and save it to configuration.txt.
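A sketch of those two lines; osmium’s -g option extracts a single header field, and baseUrl is the key osmosis expects in configuration.txt:

```shell
BASE_URL=$(osmium fileinfo -g 'header.option.osmosis_replication_base_url' "${PLANET_FILE}")
echo "baseUrl=${BASE_URL}" > configuration.txt
```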
Replication sequence numbers need to be represented as a three-tiered directory structure, for example 123/456/789. By taking the number, padding it to 9 characters with zeros, and doing some sed magic, we get this format. From there, it’s easy to download the state.txt file representing the state of the data that was downloaded.
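A runnable sketch of the padding and sed step (variable names are assumptions; the final curl is shown commented out since it needs a connection):

```shell
SEQUENCE=1291
# Pad to 9 digits, then break into a three-tiered path: 000/001/291
SEQUENCE_PATH=$(printf '%09d' "${SEQUENCE}" | sed 's!\(...\)\(...\)\(...\)!\1/\2/\3!')
echo "${SEQUENCE_PATH}"
# curl -s -L -o state.txt "${BASE_URL}/${SEQUENCE_PATH}.state.txt"
```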
After all this has been run, we’ve got the planet, its MD5 file, and the state and configuration that correspond to the download.
Combining the code fragments, adding some comments, and cleaning up the files results in this shell script.
[29-line download script not preserved]
Over the next few posts, I’m going to walk through, step by step, how to generate these files, starting with downloading OpenStreetMap data and ending with rendered tiles.
Language: nodejs
Layer definitions: Mapnik layer definitions in XML, typically preprocessed from YAML
Vector tile formats: Mapbox Vector Tiles
Data source support: PostGIS
Kartotherian, tessera, and other servers based on tilelive all rely on Node bindings to Mapnik to produce vector tiles. They all work with Mapnik layer definitions. This is a reasonably well understood language and consists primarily of a SQL statement for each layer. It’s reasonably flexible, and it’s possible to do proper code review, git conflict resolution, and the other processes you need with an open style.
Some servers can turn the Mapbox Vector Tiles into GeoJSON, but not all do. There are other minor differences, but they all have the same major advantages and disadvantages.
The biggest problem with these options is that you have to either use the exact same versions of everything as the Mapbox developers while hoping their changes work with your code, or lock down your versions to a set of known good versions and periodically update when you need new features, retesting all your code. Neither of these is practical for an open-source style which wants to involve others.
If you don’t do this, you’ll find parts of your server failing with different combinations of Mapnik and node-mapnik.
Language: Python
Layer definitions: SQL in jinja2 templates, YAML
Vector tile formats: Mapbox Vector Tiles, TopoJSON, and GeoJSON
Data source support: PostGIS
Tilezen tileserver was written by Mapzen to replace their TileStache-based vector tile generation. Having been written by developers who wrote previous vector tile servers, it combines ideas and functionality other options don’t have.
The datasource definitions are written in SQL + YAML, a common choice, but unlike other options, the SQL is in its own files, which are preprocessed by the jinja2 templating engine. This adds some complexity, but a great deal of power. Selecting different features by zoom level normally requires repetitive SQL and lengthy UNION ALL queries, but the preprocessing allows queries to be written more naturally.
Tileserver’s unique feature is the post-processing capabilities it offers. These allow vector tiles to be operated on after they leave the database, altering geometries, changing attributes, and combining geometries. Post-processing to reduce size is a necessary feature if targeting mobile devices on slower connections. Mapbox had been working on this in the open, but now that they no longer use node-mapnik it’s not clear how they do so. MapQuest had developed Avecado specifically to target this, but it was abandoned when they stopped doing their own map serving.
You don’t need any AWS services for a basic Tilezen tileserver deployment, but there might be some dependencies in the more advanced features needed to set up a full production environment.
Language: Go
Layer definitions: SQL in TOML
Vector tile formats: Mapbox Vector Tiles
Data source support: PostGIS
Tegola is a new server written in Go. It operates with multiple providers which supply layers to maps, allowing them to be assembled different ways. It looks like it has most of the features needed for vector tiles for a basemap, but might be missing a few needed for changing data as zoom changes.
SQL in TOML is similar to SQL in YAML for layer definitions, and like it, is reasonably flexible and makes it possible to do proper code review, git conflict resolution, and the other processes you need with an open style.
I haven’t had a chance to deploy it yet, so I’m not sure what difficulties there are.
Language: Rust
Layer definitions: SQL in TOML
Vector tile formats: Mapbox Vector Tiles
Data source support: PostGIS
t-rex is a new server written in Rust. Its unique feature is that it can auto-configure layers from PostGIS tables. It does have all the required features for selecting appropriate data in a basemap.
Its layer definitions are different from Tegola’s, but they are both SQL in TOML and share the same strengths.
Like Tegola, I haven’t had a chance to deploy it.
Language: Python
Layer definitions: SQL in JSON
Vector tile formats: Mapbox Vector Tiles, TopoJSON, GeoJSON, and Arc GeoServices JSON
Data source support: PostGIS
TileStache is a general-purpose tile server; Mapzen used to serve their Tilezen schema with a fork of it. They’ve since switched to the Tilezen tileserver, but the functionality they added has been merged back into TileStache. Unfortunately, the documentation hasn’t caught up yet, so there’s not much information about all of its functionality.
Deploying TileStache tends to be reasonable - particularly compared to node-mapnik - but SQL in JSON is a language that’s a problem for open projects with multiple authors, preventing proper code review and git conflict resolution.
Language: C++
Layer definitions: Lua
Vector tile formats: Mapbox Vector Tiles
Data source support: OSM PBF and shapefiles
Tilemaker is built around the idea of vector tiles without a serving stack. It does this by converting directly, in memory, from OSM PBF data to pre-generated vector tiles, which can then be served using Apache, an S3 bucket, or any other means of serving files from disk. This vastly simplifies deployment and reduces sources of downtime.
For serving a city or most countries this can be the ideal method, but the same strengths that make it good for this are a problem for processing the planet. It takes large amounts of RAM, can’t consume minutely changes, and has to create vector tiles for the entire PBF at once.
Tilemaker is also the only server to support directly using shapefiles for low zoom data and OSM for high zoom. Other options require loading into PostGIS and using SQL that selects the appropriate data based on zoom.
Language: Python
Layer definitions: osmfilter options
Vector tile formats: o5m
Data source support: OSM PBF and other raw OSM data
VectorTileCreator is part of KDE Marble and takes the unique approach of creating tiles of raw OSM data. It uses osmfilter’s language for filtering OSM data, but lacks the means to use other data sources, something most maps will need. Support for o5m vector tiles is also limited. Like Tilemaker, it runs from the command line and produces a set of vector tiles.
What you should use depends on your needs. First figure out what support you need for the full planet, updates, data sources, and output formats. If you need diff update support, then you need something that can create a single vector tile at a time, and Tilemaker won’t work. If you need TopoJSON support, node-mapnik won’t work.
Server | Full planet | Diff updates | Non-OSM data | GeoJSON | TopoJSON | Mapbox Vector Tiles |
---|---|---|---|---|---|---|
node-mapnik | Yes | Yes | Yes | Some | No | Yes |
Tilezen tileserver | Yes | Yes | Yes | Yes | Yes | Yes |
Tegola | Yes | Yes | Yes | No | No | Yes |
t-rex | Yes | Yes | Yes | No | No | Yes |
TileStache | Yes | Yes | Yes | Yes | No | Yes |
Tilemaker | No | No | Yes | No | No | Yes |
VectorTileCreator | Unknown | No | No | No | No | No |