Skip to content

Case 6: The Problem with Deep Pagination and OFFSET

The Story

  • Table: event_stream, a huge log of every user action, with 100,000,000 rows.
  • Columns: id, event_time (with an index), user_id, details.
  • The Problem: The app needs to show this history in pages. The simple way is to use OFFSET and LIMIT. But what happens when a user wants to see a page very deep in the history, like page 10,000?

The Problematic SQL Query

To get page 10,000, with 20 items per page, the OFFSET would be (10,000 - 1) * 20 = 199,800.

sql
EXPLAIN ANALYZE
SELECT id, event_time, details
FROM event_stream
ORDER BY event_time DESC
OFFSET 199800
LIMIT 20;

The Query Plan (BEFORE a fix)

                                                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=75000.50..75008.00 rows=20 width=50) (actual time=1350.75..1350.80 rows=20 loops=1)
   ->  Index Scan using idx_event_stream_event_time on event_stream  (cost=0.57..375000.00 rows=100000000 width=50) (actual time=0.035..1290.50 rows=199820 loops=1)
 Planning Time: 0.150 ms
 Execution Time: 1350.95 ms

Why is this a problem?

  1. Look at this line: rows=199820 in the Index Scan step.

  2. Why is OFFSET so slow?

    • You might think OFFSET 199800 tells the database to simply "jump" over the first 199,800 rows.
    • That's not what happens. The database can't just jump. It has to read every single one of the 199,820 rows from the index (199,800 for the offset + 20 for the limit), load them all into memory, and then throw away the first 199,800 rows to give you the final 20.
  3. The Warning Signs:

    • Execution Time: **1350.95 ms**: Over 1.3 seconds to load one page is far too slow.
    • This gets worse. If the user goes to page 20,000, the OFFSET doubles, and the query time will also roughly double. The performance gets worse the deeper you go.

The Solution: Keyset Pagination (The "Seek Method")

  • The Logic: Instead of telling the database to "skip N rows," we tell it, "get the next 20 rows after the last one I saw."

  • How to do it: We stop using OFFSET. Instead, we pass the event_time and id of the very last item from the previous page and use them in the WHERE clause.

  • The Optimized Query: Let's say the last item on the previous page had event_time = '2025-07-29 10:00:00' and id = 12345.

sql
EXPLAIN ANALYZE
SELECT id, event_time, details
FROM event_stream
-- Find rows that are "older" than the last one we saw
WHERE (event_time, id) < ('2025-07-29 10:00:00', 12345)
-- We order by id as well to handle cases where event_time is the same
ORDER BY event_time DESC, id DESC
LIMIT 20;

Note: For this to be super-fast, you need a composite index on (event_time, id).

The Query Plan (AFTER the fix)

                                                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..8.50 rows=20 width=50) (actual time=0.045..0.065 rows=20 loops=1)
   ->  Index Scan using idx_event_stream_event_time_id on event_stream  (cost=0.57..375000.00 rows=100000000 width=50) (actual time=0.040..0.055 rows=20 loops=1)
         Index Cond: ((event_time, id) < ('2025-07-29 10:00:00'::timestamp with time zone, 12345))
 Planning Time: 0.210 ms
 Execution Time: 0.085 ms

Analyzing the Result

  1. The Big Change: The Index Scan now only needs to read exactly 20 rows (rows=20). It uses the index to jump directly to the starting point and fetches the next 20.
  2. Performance Comparison:
    • Execution Time: Dropped from 1350.95 ms to just 0.085 ms. That's thousands of times faster.
    • Consistent Performance: Most importantly, this query will take the same amount of time whether you are on page 2 or page 20,000.

Conclusion

  • OFFSET is fine for the first few pages but is terrible for deep pagination in large datasets.
  • Keyset Pagination is the best solution for paginating large tables. It's perfect for "infinite scroll" interfaces, like those on social media feeds.
  • The Trade-off: With this method, you can't let users jump to a specific page number (like "Go to page 50"). You can only go to the "next" or "previous" page.