Case 5: Function in WHERE Clause Prevents Index Usage
The Story
- Table:
transactions, storing financial transactions, very large with 50,000,000 rows. - Columns:
id,amount,transaction_ts(typeTIMESTAMPTZ- stores timestamp with time zone),status. - Current State: A B-Tree
indexalready exists on thetransaction_tscolumn to speed up time-based queries. - The Problem: An accountant wants to see all transactions that occurred today, July 30, 2025. They write a query that uses
CASTto convert theTIMESTAMPTZto aDATE.
The Problematic SQL Query
EXPLAIN ANALYZE
SELECT id, amount, status
FROM transactions
WHERE CAST(transaction_ts AS date) = '2025-07-30';The Query Plan (BEFORE a fix)
Even with an index on transaction_ts, the execution plan will be:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Seq Scan on transactions (cost=0.00..1250000.00 rows=13700 width=24) (actual time=0.055..8750.123 rows=13688 loops=1)
Filter: (CAST(transaction_ts AS date) = '2025-07-30'::date)
Rows Removed by Filter: 49986312
Planning Time: 0.150 ms
Execution Time: 8755.850 msWhy is this a problem?
Look at this line:
Seq Scan on transactions. Once again, theindexwas completely ignored.Why wasn't the Index used?
- The index on the
transaction_tscolumn stores the raw, sortedTIMESTAMPTZvalues. - When you apply a
functionlikeCASTto thetransaction_tscolumn (CAST(transaction_ts AS date)), PostgreSQL can no longer "see" the original values in the index. It has to calculate the new value for every single row in the table, and only then can it compare the result with'2025-07-30'. - In other words, an index is only useful when you compare directly against the indexed column. Applying a function to that column makes the index useless for this query.
- The index on the
The Warning Signs:
Execution Time: **8755.850 ms**: Over 8.7 seconds, which is an unacceptable time.Rows Removed by Filter: **49986312**: It had to read almost 50 million rows and throw most of them away.
The Solution
There are two main ways to solve this problem.
Method 1: Rewrite the Query (Most Recommended) This is the best way. Instead of transforming the column, transform the value you are comparing against to create a range that the index can understand.
EXPLAIN ANALYZE
SELECT id, amount, status
FROM transactions
WHERE
transaction_ts >= '2025-07-30 00:00:00'
AND transaction_ts < '2025-07-31 00:00:00';This version doesn't change the transaction_ts column at all, so the index can be used effectively to scan a time range.
Method 2: Create a Function-Based Index This method is useful when you cannot change the query (for example, it comes from a third-party tool or a hard-to-modify ORM).
CREATE INDEX idx_transactions_ts_date ON transactions ((CAST(transaction_ts AS date)));This command creates a new index, not on the values of the transaction_ts column, but on the result of the function CAST(transaction_ts AS date). The original query can then use this new index.
The Query Plan (AFTER the fix - with Method 1)
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on transactions (cost=350.55..45870.15 rows=13700 width=24) (actual time=2.150..18.875 rows=13688 loops=1)
Heap Blocks: exact=11550
-> Bitmap Index Scan on idx_transactions_ts (cost=0.00..347.12 rows=13700 width=0) (actual time=1.550..1.550 rows=13688 loops=1)
Index Cond: ((transaction_ts >= '2025-07-30 00:00:00'::timestamp with time zone) AND (transaction_ts < '2025-07-31 00:00:00'::timestamp with time zone))
Planning Time: 0.180 ms
Execution Time: 19.543 msAnalyzing the Result
- The Big Change: The plan switched from a
Seq Scanto aBitmap Index Scan, using the originalidx_transactions_tsindex effectively. - Performance Comparison:
- Execution Time: Dropped from 8755.850 ms (~8.7 seconds) down to just 19.543 ms. That's about 450 times faster!
Conclusion
Avoid applying functions to indexed columns in your WHERE clause. Instead, transform the values on the other side of the comparison. This is one of the golden rules of SQL query optimization.