Paul’s Blog

A blog without a good name

OpenStreetMap Active Users

Periodically people claim that OpenStreetMap has over 2 million active users, but what does this mean? That figure is the total number of accounts, including those who never edited, those who left long ago, spammers, and actual active contributors.

The closest metric to a standard is active users over the last 30 days. Although we can’t get that number, we can look at the changeset dump and analyze it with ChangesetMD and some SQL.
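As a rough illustration of the 30-day metric, here is a minimal sketch that counts distinct users with at least one changeset in the window. It is written in Python rather than SQL and runs on made-up data; the function name and the (user_id, created_at) layout are assumptions for illustration, not ChangesetMD's actual schema.

```python
from datetime import datetime, timedelta


def count_active_users(changesets, as_of, window_days=30):
    """Count distinct users with a changeset in the window ending at as_of.

    changesets is an iterable of (user_id, created_at) pairs; this layout
    is hypothetical and not taken from any real changeset database schema.
    """
    cutoff = as_of - timedelta(days=window_days)
    return len({user_id for user_id, created_at in changesets
                if cutoff <= created_at <= as_of})


# Made-up sample data, for illustration only.
sample = [
    (1, datetime(2015, 5, 20)),
    (1, datetime(2015, 5, 25)),  # same user again, counted once
    (2, datetime(2015, 5, 28)),
    (3, datetime(2014, 1, 1)),   # outside the 30-day window
]
print(count_active_users(sample, as_of=datetime(2015, 6, 1)))
```

The set comprehension is what separates "active users" from "changesets": a prolific mapper with hundreds of changesets in the window still counts once.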

New Drive Testing

I had to buy a new hard drive for my array recently, which meant verifying that it works before I put it into service.

I don’t do burn-in tests of drives. Drives have a bathtub curve for reliability, like most components, but I find that if a drive is failing, it will start exhibiting performance problems that thorough testing reveals.

Running a sufficiently long burn-in is increasingly impractical. A burn-in would probably involve writing and reading everywhere on the disk multiple times, and disks have been getting bigger and bigger. Denser platters help with sequential speeds, but I’d estimate it would take several days to burn in a new drive.
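The "several days" estimate can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below assumes a burn-in means writing and then reading the full disk a fixed number of times at a constant sequential speed. The 8 TB capacity, 150 MB/s throughput, and four passes are hypothetical figures chosen for illustration, not measurements of any particular drive.

```python
def burn_in_days(capacity_tb, throughput_mb_s, passes):
    """Estimate days needed to write and then read the whole disk `passes` times.

    Assumes a constant sequential throughput, which real drives do not
    sustain across the whole platter, so treat the result as a lower bound.
    """
    capacity_bytes = capacity_tb * 1e12
    seconds_per_sweep = capacity_bytes / (throughput_mb_s * 1e6)
    # Each pass is one full write sweep plus one full read sweep.
    total_seconds = seconds_per_sweep * 2 * passes
    return total_seconds / 86400


# Hypothetical drive: 8 TB at 150 MB/s, four write+read passes.
print(round(burn_in_days(8, 150, 4), 1))
```

With these assumed figures the estimate comes out at just under five days, and it grows linearly with capacity unless sequential throughput grows with it.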

OpenStreetMap Carto Complexity

I often refer to OpenStreetMap Carto as the largest, most complex open multi-contributor map style, but what does that mean?

Broken down, it means

  • It’s the largest open stylesheet. If you measure in code size, features rendered, or complexity, nothing else is close;

  • It’s the largest multi-contributor map style that doesn’t have a company dictating what is worked on. This means we get merge conflicts. They got so bad we changed the technology we use to define layers to make them solvable; and

  • It’s the largest style using OpenStreetMap data. Some proprietary styles like OpenCycleMap, MapQuest Open, and Mapbox Streets are complex, but none of them render the range of features we do.

More OpenStreetMap Futures

Andy recently blogged the developer numbers from his OpenStreetMap Futures talk at SOTM US.

Wanting to play with the numbers myself, I took the osm100 code and added in additional projects. The original list of repos came from a list of “Core Software” from the Engineering Working Group, and since then some of the software has been replaced, and there’s other older software which used to be core, but isn’t.

Optimizing Osm2pgsql CXXFLAGS

While investigating an osm2pgsql bug report, I tested osm2pgsql node parsing speed with various CXXFLAGS. CXXFLAGS is a variable that can tell the compiler to apply various optimizations when compiling the code, potentially resulting in speed increases.

The two flags I tested were -O and -march. There are countless others, and many that are not optimization related, but these two are all that need to be adjusted. More flags are not generally useful.

Running the Redaction Bot

The redaction bot is a piece of software designed to remove legally problematic data from OpenStreetMap, generally data that violates copyright.

It’s not often needed, as OpenStreetMap users are careful about copyright, but it is available if needed. Requiring special knowledge and API permissions, a grand total of one person runs it regularly – me. This post might not be the most interesting for others…

Real Performance Numbers for a Style

The tile list used for performance testing a stylesheet is critical. It needs to represent a realistic mix of zooms and tile complexity, and be large enough to have reasonable caching behavior. The best source for a tile list is logs from a real rendering server.
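As a sketch of how a tile list might be pulled out of server logs, the code below extracts z/x/y coordinates from request lines and tallies them by zoom to check the mix. The /z/x/y.png URL layout, the sample log lines, and the function names are all assumptions for illustration; a real server's log format will differ.

```python
import re
from collections import Counter

# Matches tile paths like /12/1205/1540.png (z/x/y); an assumed URL layout.
TILE_RE = re.compile(r"/(\d+)/(\d+)/(\d+)\.png")


def tiles_from_log(lines):
    """Return (z, x, y) tuples for every log line containing a tile request."""
    tiles = []
    for line in lines:
        match = TILE_RE.search(line)
        if match:
            tiles.append(tuple(int(group) for group in match.groups()))
    return tiles


def zoom_histogram(tiles):
    """Count how many requested tiles fall on each zoom level."""
    return Counter(z for z, _x, _y in tiles)


# Made-up log lines, for illustration only.
sample_log = [
    '"GET /12/1205/1540.png HTTP/1.1" 200',
    '"GET /favicon.ico HTTP/1.1" 404',
    '"GET /12/1206/1540.png HTTP/1.1" 200',
    '"GET /5/9/11.png HTTP/1.1" 200',
]
tiles = tiles_from_log(sample_log)
print(zoom_histogram(tiles))
```

Duplicate requests are kept on purpose: repeated tiles are part of what makes the caching behavior of the list realistic.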

Avecado and a Stylesheet

My last post was about how to install Avecado and how it met the needs of some OpenStreetMap Carto benchmarking, but I didn’t cover how to set it up with a stylesheet. The basic idea is simple, but some adaptations need to be made so the benchmark workload matches a real-world workload.

The avecado_server program has a simple built-in HTTP server for testing purposes that can be used with an HTTP client like curl to produce a load on the database. It lacks sophisticated caching, but any caching would need to be disabled for this benchmarking anyway.