Part of optimizing a style or application that uses PostgreSQL is finding which queries take the most time and improving them. The first step is finding the slow queries.
Logging slow queries gives a misleading view of what is using the most resources. A query that takes 100 seconds to run but only runs once a day uses less total time than a 100 ms query running every second, which adds up to 8,640 seconds a day, yet the fast, often-repeated query might never appear in the logs. This is common with map rendering, where low-zoom queries are slow but infrequently run.
Logging all queries is comprehensive but can have a noticeable performance impact, since every query causes a write to the server logs. This can be an unacceptable drag on a production server.
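For reference, both approaches are controlled through PostgreSQL's logging settings; a minimal postgresql.conf sketch:

```
# Log queries that run longer than 1 second (value is in milliseconds)
log_min_duration_statement = 1000
# Or log every statement regardless of duration
log_statement = 'all'
```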
Both methods require adjusting log settings. This means changing the server configuration, having the setting apply to all databases, having to strip extraneous information from the logs, and the logs being generally hard to parse. In a shared environment this can be impossible to get done. If a statement contains newlines, they make it into the logfile unescaped, making it hard to parse with normal tools.
Fortunately, there is another method: sampling running queries.
pg_stat_activity is a system view that contains information about active server processes, including the queries they are running. By periodically querying it we can gather a sample of which queries are running and how often. This method isn't without downsides.
pg_stat_activity truncates very long query text, at track_activity_query_size bytes, 1024 by default. This matters with stylesheets, where a complex layer can involve several KB of SQL. Fortunately with rendering, we can identify the truncated queries by looking at the style definitions.
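If more of the query text is needed, the truncation limit can be raised in postgresql.conf (changing it requires a server restart):

```
# Keep up to 16 KB of query text in pg_stat_activity (default is 1024 bytes)
track_activity_query_size = 16384
```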
Not all the columns of pg_stat_activity are useful, and we only want queries on the rendering database. We can restrict both with some SQL:
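A sketch of such a query; the database name `gis` is an assumption:

```sql
-- Sample currently running queries on the rendering database.
-- The database name 'gis' is an assumption.
SELECT now() - query_start AS runtime,
       query
  FROM pg_stat_activity
 WHERE datname = 'gis'
   AND state = 'active'
   AND pid != pg_backend_pid();  -- skip the sampling query itself
```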
This restricts the output to what we are interested in, and instead of when the query was started we get how long it has been running. Because the query text can contain newlines, we wrap the entire statement in a COPY statement to take care of escaping, and call it in a loop from bash, logging to a file.
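A sketch of the sampling loop; the `gis` database name and `queries.log` filename are assumptions:

```shell
# Sample pg_stat_activity every 22 seconds, appending to a log file.
while true; do
    psql -X -q -d gis -c "COPY (
        SELECT now() - query_start AS runtime, query
          FROM pg_stat_activity
         WHERE datname = 'gis' AND state = 'active'
    ) TO STDOUT" >> queries.log
    sleep 22
done
```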
A 22 second interval is chosen because it does not divide evenly into 1 minute or 5 minutes. Monitoring and osm2pgsql updates often run at those intervals, and an interval that stays out of phase with them avoids sampling during every single update or during none of them.
Lots of tools exist for log analysis, but because understanding these logs requires knowledge of PostgreSQL, we'll use PostgreSQL for the analysis too. A suitable table for the data can be created and loaded with
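A sketch of the table, load, and a first analysis query; the table name, column names, and log file path are made up here:

```sql
-- Table matching the two columns written by the sampling loop.
CREATE TABLE query_log (
    runtime interval,
    query text
);

-- Load the samples; the path is an assumption. COPY FROM reads a
-- server-side file; from a client, psql's \copy works the same way
-- and handles the escaping that COPY ... TO STDOUT produced.
COPY query_log FROM '/path/to/queries.log';

-- Example analysis: queries that appear in the most samples.
SELECT count(*) AS samples, sum(runtime) AS total_runtime, query
  FROM query_log
 GROUP BY query
 ORDER BY samples DESC;
```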
Need help optimizing your PostGIS queries? I have experience finding slow parts of queries and improving them. If you want to hire me for some work, send me an email.