For the last little while I’ve been working on importing the addresses from the City of Surrey GIS data into OpenStreetMap, and now that the import is successfully completed I thought I’d share my thoughts. Because the area had no addresses in the database beforehand, this import was in many ways as simple as an import can be: I didn’t have to worry about colliding with existing data. Address data is also very well-suited for imports, as no one really likes collecting it, even though it is very important.
For many imports, this section would be a lengthy discussion of the license and its compatibility, but not so here. Surrey has taken the forward-thinking step of releasing their GIS data under the Public Domain Dedication and License, an excellent license for municipal data. As rweait said on talk-ca, this is good not just for OSM but also for the citizens of Surrey, as it allows anyone to use the data for their own projects.
How Surrey Helped
Surrey’s GIS department was very helpful, both in preparing this import and in helping me understand their data, which was essentially a direct dump from their internal database. One of the changes I made after talking to them was to use the ADDRID field instead of the GLOBALID field, a change that should make future additions easier.
I elected to do the conversion with ogr2osm, in a virtual machine running Debian. I chose it not because it was the easiest option, but because it uses a Python function for tag translation. This let me turn road names like “OCEAN PARK RD” into “Ocean Park Road.”
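To give a rough idea of what that kind of name cleanup looks like, here is a minimal sketch in Python. It is not the translation function used for this import: the function name expand_street_name and the SUFFIXES table are made up for illustration, the list of abbreviations is an assumption rather than what the Surrey data contains, and how such a function is hooked into ogr2osm depends on the version you run.

```python
# Hypothetical abbreviation table (an assumption for illustration);
# a real import would need the full list of suffixes found in the data.
SUFFIXES = {
    "RD": "Road",
    "ST": "Street",
    "AVE": "Avenue",
    "DR": "Drive",
    "BLVD": "Boulevard",
    "HWY": "Highway",
    "CRES": "Crescent",
    "PL": "Place",
}


def _title_word(word):
    """Capitalize a word, leaving words that start with a digit (e.g. 152A) alone."""
    if word[:1].isdigit():
        return word
    return word.capitalize()


def expand_street_name(raw):
    """Turn an upper-case, abbreviated street name into a readable one."""
    words = raw.strip().split()
    if not words:
        return ""
    # Only the last word is treated as a street-type suffix, so an
    # abbreviation-like word in the middle of a name is left unexpanded.
    head = [_title_word(w) for w in words[:-1]]
    last = words[-1]
    tail = SUFFIXES.get(last.upper(), _title_word(last))
    return " ".join(head + [tail])


print(expand_street_name("OCEAN PARK RD"))
```

A real translation also has to cope with things this sketch ignores, such as directional prefixes or names where the last word only looks like an abbreviation.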
Uploading nearly 100k nodes
I did the actual upload with JOSM: I split the .osm file into parts of approximately 20k nodes each, then uploaded each part in chunks of 500. I had one error on the second upload, which I had to revert and re-upload. It took over a day to get everything uploaded since I had to time my uploads for off-peak hours. If I were doing it over again, I’d use Upload.py or one of the other scripts.
Overall, the import was a success and saved mappers from having to collect about a hundred thousand address nodes by hand.
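The two-level splitting described above (parts of roughly 20k nodes, each sent in chunks of 500) can be sketched in a few lines of Python. This only illustrates the batching arithmetic; it is not how JOSM or Upload.py work internally, the names chunked and plan_upload are hypothetical, and the round figure of 100,000 nodes stands in for the real count.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_upload(node_ids, part_size=20000, request_size=500):
    """Split nodes into parts, and each part into request-sized chunks.

    Returns a list of parts; each part is a list of chunks (lists of ids).
    """
    return [
        list(chunked(part, request_size))
        for part in chunked(node_ids, part_size)
    ]


# Illustrative round number, not the actual node count of the import.
plan = plan_upload(list(range(100000)))
for number, part in enumerate(plan, start=1):
    print("part", number, "has", len(part), "chunks")
```

With these assumed numbers the plan comes out to 5 parts of 40 chunks each, which gives a sense of why timing everything for off-peak hours stretched the upload past a day.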