Slow Query with “LIMIT 1” in PostgreSQL

Recently I had issues with a production database running slow on tables with more than 1M rows.
This is a complex Ruby on Rails app connected to a PostgreSQL database, and I went through the whole stack to figure out that it was a simple index problem. I found very little related documentation about this issue, therefore I’m quickly posting a summary of my adventure.

In Rails I often write something as simple as ActivityLog.first or Player.find(123).activity_logs.first, where the model ActivityLog has a default scope that orders the logs.
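A minimal sketch of that scope, assuming a recent Rails version and newest-first ordering on created_at (the attribute the rest of this post revolves around):

```ruby
class ActivityLog < ApplicationRecord
  belongs_to :player

  # Sketch: always return the newest entries first.
  default_scope { order(created_at: :desc) }
end
```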

This leads to a SQL query looking like this:
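Something along these lines, given the scope above (the table name and the literal 123 are just for illustration):

```sql
SELECT "activity_logs".*
FROM "activity_logs"
WHERE "activity_logs"."player_id" = 123       -- illustrative player id
ORDER BY "activity_logs"."created_at" DESC
LIMIT 1;
```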

This is probably one of the most common queries in my Rails applications, and certainly not only in mine.

Unfortunately, this may become a slow query on big tables. Indeed, if this query is called often, your app may go from sluggish to unusable (the latter was my case).

The following example illustrates the issue.
Remember: this may be a very common query:
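Here it is again, wrapped in EXPLAIN (ANALYZE, BUFFERS) so the planner’s choices become visible (table and values are illustrative):

```sql
-- Illustrative: same query as above, asking PostgreSQL to show the actual plan
-- along with buffer/disk access statistics.
EXPLAIN (ANALYZE, BUFFERS)
SELECT "activity_logs".*
FROM "activity_logs"
WHERE "activity_logs"."player_id" = 123
ORDER BY "activity_logs"."created_at" DESC
LIMIT 1;
```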

On my dev machine, which is way more powerful than our production VPS, this query takes 1’124ms to run. Even if called only once per user click, this would immediately degrade your Apdex score.

As described in the PostgreSQL documentation, and as the EXPLAIN (ANALYZE, BUFFERS) output shows, a lot of disk access caused by the filter on player_id makes the query slow.

The numbers provided by BUFFERS help to identify which parts of the query are the most I/O-intensive.

The planner decided to use an index scan on created_at (the ORDER BY attribute) and then filter on player_id, which in this situation isn’t the best choice. Disabling index scans with SET enable_indexscan = OFF; improves performance drastically, but it’s not a production solution.
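For reference, the experiment is just a session-local setting before re-running the query, not something to ship:

```sql
-- Session-local experiment only: confirm the index scan choice is the culprit.
SET enable_indexscan = OFF;
-- ...re-run the EXPLAIN (ANALYZE, BUFFERS) query from above...
RESET enable_indexscan;
```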

The query execution time drops to 0.05ms, which is a factor of 22’480 compared to the previous planner decision!

As described here, another way to trick the planner is to add another field to the ORDER BY clause.
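The extra field can simply be the primary key, a natural pick since it also keeps the ordering deterministic (a sketch of the rewritten query):

```sql
SELECT "activity_logs".*
FROM "activity_logs"
WHERE "activity_logs"."player_id" = 123
ORDER BY "activity_logs"."created_at" DESC, "activity_logs"."id" DESC  -- extra field
LIMIT 1;
```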

This improves the execution time in a comparable way. Being in a hurry, this is the solution I adopted: I quickly patched my scope that way to improve performance globally. It was safe and easy to commit, with the advantage of not expanding the size of the database indices.
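The patched scope would then look something like this (a sketch, building on the default scope shown earlier):

```ruby
class ActivityLog < ApplicationRecord
  belongs_to :player

  # Adding id to the ORDER BY nudges the planner away from the lone created_at index scan.
  default_scope { order(created_at: :desc, id: :desc) }
end
```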

But for the sake of a better understanding, I dug a bit deeper into this issue. I have more than one big table, and I have more than one app online… I don’t like to keep such gray areas and hacky workarounds in my code.

First, I tried to help the planner change its decision by tweaking the statistics on every column, but it didn’t change the planner’s decision and didn’t improve the execution time.
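The kind of tweak I mean, for the record (the column and the statistics target are illustrative):

```sql
-- Ask PostgreSQL to collect finer-grained statistics for the filtered column,
-- then refresh the table's statistics.
ALTER TABLE activity_logs ALTER COLUMN player_id SET STATISTICS 1000;
ANALYZE activity_logs;
```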

Then I realized that simply building an index on player_id (my condition) and created_at (my ordering attribute) should help PostgreSQL resolve my query while sticking with the planner’s preferred index scan.
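In plain SQL this is a composite index, with the equality column first and the ordering column second (the index name is illustrative):

```sql
-- Composite index matching both the WHERE condition and the ORDER BY attribute.
CREATE INDEX index_activity_logs_on_player_id_and_created_at
  ON activity_logs (player_id, created_at);
```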

…and the end result is pleasant enough. 0.03ms – my work is almost done.

On the Rails side, a simple migration like the following changed my users’ lives and restored my peace of mind.
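Something along these lines (the migration class name and the superclass version depend on your app):

```ruby
class AddPlayerIdAndCreatedAtIndexToActivityLogs < ActiveRecord::Migration[5.2]
  def change
    # Composite index: equality column first, ordering column second.
    add_index :activity_logs, [:player_id, :created_at]
  end
end
```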