Skip to content

Case 19: Stop Guessing, Start Measuring with pg_stat_statements

Of course. This time, we won't optimize a specific query. Instead, we'll learn how to use a very powerful PostgreSQL tool to find out exactly which queries are slowing down your system: the pg_stat_statements extension.


The Story

  • System: A large web application has been running for a while. Recently, users have started complaining that the system "seems slower," but they can't point to a specific feature that is slow.
  • The Problem: The developers are "flying blind." They don't know where to start optimizing. They might guess that a few complex queries are the cause, but it's just a guess. Optimizing the wrong thing wastes time and doesn't solve the problem.

The Bottleneck

The bottleneck here isn't a query; it's a lack of visibility. We don't know:

  • Which queries are called the most?
  • Which queries consume the most total time?
  • Which queries have the highest average execution time?

Optimizing without data is like a doctor prescribing medicine without a diagnosis.

The Solution: Use pg_stat_statements for Diagnosis

  • The Logic: pg_stat_statements is a PostgreSQL extension that, when enabled, tracks and records performance statistics for all queries executed on the database.

  • It provides "hard data" so you can make optimization decisions based on evidence, not feelings.

  • How to Use It:

    1. Enable the Extension (usually requires superuser rights and is done only once):

      sql
      -- Add this to your postgresql.conf file and restart the server
      -- shared_preload_libraries = 'pg_stat_statements'
      
      -- After restarting, run this command in your database
      CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
    2. Let the System Run: Allow pg_stat_statements to collect enough data from real user traffic.

    3. Query the "Diagnosis Table": After a while, you can run the following query to see a "leaderboard" of the most expensive queries.

    sql
    SELECT
        (total_exec_time / 1000 / 60) as total_minutes, -- Total time in minutes
        (mean_exec_time)::numeric(10,2) as avg_ms, -- Average time in milliseconds
        calls, -- Number of times called
        query -- The normalized query text
    FROM
        pg_stat_statements
    ORDER BY
        total_exec_time DESC -- Sort by the most time-consuming queries
    LIMIT 10;

Analyzing the Result

You will get a result table that looks something like this:

total_minutesavg_mscallsquery
120.50.1548,200,000SELECT * FROM users WHERE id = $1
95.2850.506,700SELECT ... FROM posts p JOIN users u ON ... WHERE p.status = $1
78.15500.20850SELECT ... FROM reports r JOIN ... WHERE r.date > $1 ...
............

From this table, you can draw incredibly valuable conclusions:

  • Query 1: Although its average time is very fast (0.15ms), it's called so many times (nearly 50 million) that it takes up the most total time. The problem might be with caching in the application layer.
  • Query 2: The average time is high (850ms). This is a prime candidate for optimization using the techniques we've learned (checking indexes, joins, etc.).
  • Query 3: The average time is extremely high (5.5 seconds). Even though it's not called often, it has a major impact every time it runs. This is the most critical bottleneck and should be prioritized for immediate optimization.

Conclusion:

  • Never optimize based on guesswork.
  • pg_stat_statements is an essential diagnostic tool that gives you a data-driven overview of what is actually happening in your database.
  • By analyzing its results, you can pinpoint the "hot spots" and focus your optimization efforts where they will truly make the biggest impact.