Case 11: Data Type Mismatch Disables Index Usage
The Story
- Table 1:
users, a modern table with 2,000,000 users. Theidcolumn is aBIGINT(a large number) and is the primary key (which is indexed). - Table 2:
user_actions_legacy, a very old and large table with 80,000,000 rows. Due to a mistake in the past, theuser_idcolumn here was stored as aVARCHAR(255)(a string of text). This column also has an index on it. - The Problem: We need to get the 10 most recent actions for a specific user, which requires joining these two tables.
The Problematic SQL Query
The JOIN statement looks completely normal.
sql
EXPLAIN ANALYZE
SELECT
u.name,
e.action,
e.created_at
FROM users u
JOIN user_actions_legacy e ON u.id = e.user_id
WHERE u.id = 987654
ORDER BY e.created_at DESC
LIMIT 10;The Query Plan (BEFORE a fix)
You would expect the database to find the user with id = 987654 and then use the index on user_actions_legacy to find their actions. But here is what really happens:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1450000.85..145000.87 rows=10 width=55) (actual time=9850.12..9850.15 rows=10 loops=1)
-> Sort (cost=1450000.85..145000.90 rows=20 width=55) (actual time=9850.10..9850.12 rows=10 loops=1)
Sort Key: e.created_at DESC
-> Nested Loop (cost=0.43..1450000.15 rows=20 width=55) (actual time=0.05..9845.50 rows=40 loops=1)
-> Index Scan using users_pkey on users u (cost=0.43..8.45 rows=1 width=25) (actual time=0.02..0.03 rows=1 loops=1)
Index Cond: (id = 987654)
-> Seq Scan on user_actions_legacy e (cost=0.00..1449991.50 rows=20 width=38) (actual time=0.02..9845.30 rows=40 loops=1)
Filter: (987654 = (user_id)::bigint)
Planning Time: 0.28 ms
Execution Time: 9850.25 msThe Bottleneck
Look at this: The
Seq Scan on user_actions_legacyline, and especially theFilter: (987654 = (user_id)::bigint)part.The Core Problem:
- To compare
u.id(aBIGINT) withe.user_id(aVARCHAR), PostgreSQL must convert one of them to match the other's type. Following its data type precedence rules, it chooses the safer option: it converts theVARCHARto aBIGINT. - It applied a casting function (
::bigint) to thee.user_idcolumn for every single row in theuser_actions_legacytable. - As we learned in
Case 5, applying a function to an indexed column completely disables that index.
- To compare
The Consequence: Instead of using the index for a fast lookup, PostgreSQL was forced to perform a
Seq Scan—reading all 80 million rows of theuser_actions_legacytable. A performance disaster.
The Solution
Method 1: Fix the Schema (The best long-term solution)
- Change the data type of the
user_actions_legacy.user_idcolumn toBIGINTso it matchesusers.id. ALTER TABLE user_actions_legacy ALTER COLUMN user_id TYPE BIGINT USING (user_id::bigint);- Warning: This is the correct fix, but it can be a risky and slow operation on a live production system and requires careful planning.
- Change the data type of the
Method 2: Fix the Query (The immediate workaround)
- If you can't change the schema, you have to "guide" the planner by casting in the other direction: cast the single value, not the column from the giant table.
sqlEXPLAIN ANALYZE SELECT u.name, e.action, e.created_at FROM users u -- Cast u.id (just one value) to TEXT to match the type of e.user_id JOIN user_actions_legacy e ON u.id::text = e.user_id WHERE u.id = 987654 ORDER BY e.created_at DESC LIMIT 10;Now, the comparison is between two text strings, and the index on the
e.user_id(VARCHAR) column can be used.
The Query Plan (AFTER the fix - with Method 2)
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Limit (cost=10.50..10.52 rows=10 width=55) (actual time=0.08..0.10 rows=10 loops=1)
-> Sort (cost=10.50..10.55 rows=20 width=55) (actual time=0.07..0.08 rows=10 loops=1)
Sort Key: e.created_at DESC
-> Nested Loop (cost=0.86..10.05 rows=20 width=55) (actual time=0.04..0.06 rows=40 loops=1)
-> Index Scan using users_pkey on users u (cost=0.43..8.45 rows=1 width=25) (actual time=0.01..0.01 rows=1 loops=1)
Index Cond: (id = 987654)
-> Index Scan using idx_user_actions_legacy_user_id on user_actions_legacy e (cost=0.43..1.55 rows=20 width=38) (actual time=0.02..0.04 rows=40 loops=1)
Index Cond: (user_id = (987654)::text)
Planning Time: 0.35 ms
Execution Time: 0.15 msAnalyzing the Result
- The Big Change: The
Seq Scanon the 80-million-row table is gone, replaced by a super-fastIndex Scanonidx_user_actions_legacy_user_id. - Performance Comparison:
- Execution Time: Dropped from 9850.25 ms (~9.8 seconds) to just 0.15 ms. An exponential improvement in performance.
Conclusion
- Data type consistency between columns used in
JOINorWHEREclauses is critically important for performance. - An "invisible" mismatch can cause the database to silently cast a value and disable your index.
- When you have a query that is unexpectedly slow and should be using an index, immediately check the data types of the columns involved.