Skip to content

Case 11: Data Type Mismatch Disables Index Usage

The Story

  • Table 1: users, a modern table with 2,000,000 users. The id column is a BIGINT (a large number) and is the primary key (which is indexed).
  • Table 2: user_actions_legacy, a very old and large table with 80,000,000 rows. Due to a mistake in the past, the user_id column here was stored as a VARCHAR(255) (a string of text). This column also has an index on it.
  • The Problem: We need to get the 10 most recent actions for a specific user, which requires joining these two tables.

The Problematic SQL Query

The JOIN statement looks completely normal.

sql
EXPLAIN ANALYZE
SELECT
    u.name,
    e.action,
    e.created_at
FROM users u
JOIN user_actions_legacy e ON u.id = e.user_id
WHERE u.id = 987654
ORDER BY e.created_at DESC
LIMIT 10;

The Query Plan (BEFORE a fix)

You would expect the database to find the user with id = 987654 and then use the index on user_actions_legacy to find their actions. But here is what really happens:

                                                          QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1450000.85..145000.87 rows=10 width=55) (actual time=9850.12..9850.15 rows=10 loops=1)
   ->  Sort  (cost=1450000.85..145000.90 rows=20 width=55) (actual time=9850.10..9850.12 rows=10 loops=1)
         Sort Key: e.created_at DESC
         ->  Nested Loop  (cost=0.43..1450000.15 rows=20 width=55) (actual time=0.05..9845.50 rows=40 loops=1)
               ->  Index Scan using users_pkey on users u  (cost=0.43..8.45 rows=1 width=25) (actual time=0.02..0.03 rows=1 loops=1)
                     Index Cond: (id = 987654)
               ->  Seq Scan on user_actions_legacy e  (cost=0.00..1449991.50 rows=20 width=38) (actual time=0.02..9845.30 rows=40 loops=1)
                     Filter: (987654 = (user_id)::bigint)
 Planning Time: 0.28 ms
 Execution Time: 9850.25 ms

The Bottleneck

  1. Look at this: The Seq Scan on user_actions_legacy line, and especially the Filter: (987654 = (user_id)::bigint) part.

  2. The Core Problem:

    • To compare u.id (a BIGINT) with e.user_id (a VARCHAR), PostgreSQL must convert one of them to match the other's type. Following its data type precedence rules, it chooses the safer option: it converts the VARCHAR to a BIGINT.
    • It applied a casting function (::bigint) to the e.user_id column for every single row in the user_actions_legacy table.
    • As we learned in Case 5, applying a function to an indexed column completely disables that index.
  3. The Consequence: Instead of using the index for a fast lookup, PostgreSQL was forced to perform a Seq Scan—reading all 80 million rows of the user_actions_legacy table. A performance disaster.

The Solution

  • Method 1: Fix the Schema (The best long-term solution)

    • Change the data type of the user_actions_legacy.user_id column to BIGINT so it matches users.id.
    • ALTER TABLE user_actions_legacy ALTER COLUMN user_id TYPE BIGINT USING (user_id::bigint);
    • Warning: This is the correct fix, but it can be a risky and slow operation on a live production system and requires careful planning.
  • Method 2: Fix the Query (The immediate workaround)

    • If you can't change the schema, you have to "guide" the planner by casting in the other direction: cast the single value, not the column from the giant table.
    sql
    EXPLAIN ANALYZE
    SELECT
        u.name,
        e.action,
        e.created_at
    FROM users u
    -- Cast u.id (just one value) to TEXT to match the type of e.user_id
    JOIN user_actions_legacy e ON u.id::text = e.user_id
    WHERE u.id = 987654
    ORDER BY e.created_at DESC
    LIMIT 10;

    Now, the comparison is between two text strings, and the index on the e.user_id (VARCHAR) column can be used.

The Query Plan (AFTER the fix - with Method 2)

                                                          QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=10.50..10.52 rows=10 width=55) (actual time=0.08..0.10 rows=10 loops=1)
   ->  Sort  (cost=10.50..10.55 rows=20 width=55) (actual time=0.07..0.08 rows=10 loops=1)
         Sort Key: e.created_at DESC
         ->  Nested Loop  (cost=0.86..10.05 rows=20 width=55) (actual time=0.04..0.06 rows=40 loops=1)
               ->  Index Scan using users_pkey on users u  (cost=0.43..8.45 rows=1 width=25) (actual time=0.01..0.01 rows=1 loops=1)
                     Index Cond: (id = 987654)
               ->  Index Scan using idx_user_actions_legacy_user_id on user_actions_legacy e  (cost=0.43..1.55 rows=20 width=38) (actual time=0.02..0.04 rows=40 loops=1)
                     Index Cond: (user_id = (987654)::text)
 Planning Time: 0.35 ms
 Execution Time: 0.15 ms

Analyzing the Result

  1. The Big Change: The Seq Scan on the 80-million-row table is gone, replaced by a super-fast Index Scan on idx_user_actions_legacy_user_id.
  2. Performance Comparison:
    • Execution Time: Dropped from 9850.25 ms (~9.8 seconds) to just 0.15 ms. An exponential improvement in performance.

Conclusion

  • Data type consistency between columns used in JOIN or WHERE clauses is critically important for performance.
  • An "invisible" mismatch can cause the database to silently cast a value and disable your index.
  • When you have a query that is unexpectedly slow and should be using an index, immediately check the data types of the columns involved.