Skip to content

Case 25: Don't Repeat Yourself: Reusing Calculations with CTEs

Of course, let's continue the series. This time, we'll look at a technique that makes complex queries easier to read and maintain. It can also improve performance by applying the "Don't Repeat Yourself" (DRY) principle.


The Scenario 📝

  • System: A financial analytics platform.
  • Table: A transactions table with 10,000,000 entries.
  • Columns: id, user_id, amount_usd, exchange_rate_eur, exchange_rate_vnd.
  • The Problem: We need to write a report that shows the value of each transaction in both EUR and VND. At the same time, we only want to retrieve transactions with a value greater than 25,000,000 VND.

The Problematic SQL (Repeating Calculation Logic)

A common way to write this query is to perform the calculation in both the SELECT clause and the WHERE clause.

sql
EXPLAIN ANALYZE
SELECT
    id,
    user_id,
    amount_usd * exchange_rate_eur AS amount_eur,
    amount_usd * exchange_rate_vnd AS amount_vnd -- First calculation
FROM
    transactions
WHERE
    (amount_usd * exchange_rate_vnd) > 25000000; -- Second calculation

Analyzing the Bottleneck 🧐

  1. Violates the DRY Principle: The logic for calculating amount_vnd (amount_usd * exchange_rate_vnd) is repeated in two places. If the formula changes later (for example, to add a fee), a developer must remember to change it in both places. Forgetting one will create a hard-to-find bug.
  2. Uncertain Performance: Although modern database planners are often smart enough to recognize this and only perform the calculation once, it's not guaranteed, especially in more complex queries. Repeating the logic might cause the planner to do the work twice unnecessarily.
  3. Hard to Read: The query becomes harder to read when complex expressions are repeated.

The Solution: Calculate Once with a CTE

  • The Logic: Use a Common Table Expression (CTE) or a subquery to perform all the calculations first. Then, the main query can simply work with the pre-calculated results.

  • The Optimized Query:

    sql
    EXPLAIN ANALYZE
    WITH transactions_with_local_currency AS (
        -- Step 1: Calculate all converted values just once.
        SELECT
            id,
            user_id,
            amount_usd * exchange_rate_eur AS amount_eur,
            amount_usd * exchange_rate_vnd AS amount_vnd
        FROM
            transactions
    )
    -- Step 2: Filter and select from the pre-calculated results.
    SELECT
        id,
        user_id,
        amount_eur,
        amount_vnd
    FROM
        transactions_with_local_currency
    WHERE
        amount_vnd > 25000000;

Analyzing the Results

  1. Clean, Maintainable Code: The calculation logic is now in one single place. If it needs to change, you only have to edit it inside the CTE. This reduces the risk of bugs.
  2. Guaranteed Single Calculation: This approach ensures that each calculation is performed only once per row. This is especially important if the calculation is more complex than a simple multiplication (e.g., calling a function or running another subquery).
  3. Clearer Execution Plan: You give the planner a clearer set of instructions: "First, calculate these values. Then, filter on those results." This helps create a more stable and predictable execution plan.

Conclusion:

  • Use CTEs or subqueries to follow the DRY principle, even in SQL.
  • Separating calculation logic from filtering logic not only makes your code easier to read and maintain but can also help the planner create a more efficient execution plan.
  • This is a good habit to adopt, especially as your calculation expressions become more complex.