Case 25: Don't Repeat Yourself: Reusing Calculations with CTEs
Of course, let's continue the series. This time, we'll look at a technique that makes complex queries easier to read and maintain. It can also improve performance by applying the "Don't Repeat Yourself" (DRY) principle.
The Scenario 📝
- System: A financial analytics platform.
- Table: A
transactionstable with 10,000,000 entries. - Columns:
id,user_id,amount_usd,exchange_rate_eur,exchange_rate_vnd. - The Problem: We need to write a report that shows the value of each transaction in both EUR and VND. At the same time, we only want to retrieve transactions with a value greater than 25,000,000 VND.
The Problematic SQL (Repeating Calculation Logic)
A common way to write this query is to perform the calculation in both the SELECT clause and the WHERE clause.
sql
EXPLAIN ANALYZE
SELECT
id,
user_id,
amount_usd * exchange_rate_eur AS amount_eur,
amount_usd * exchange_rate_vnd AS amount_vnd -- First calculation
FROM
transactions
WHERE
(amount_usd * exchange_rate_vnd) > 25000000; -- Second calculationAnalyzing the Bottleneck 🧐
- Violates the DRY Principle: The logic for calculating
amount_vnd(amount_usd * exchange_rate_vnd) is repeated in two places. If the formula changes later (for example, to add a fee), a developer must remember to change it in both places. Forgetting one will create a hard-to-find bug. - Uncertain Performance: Although modern database planners are often smart enough to recognize this and only perform the calculation once, it's not guaranteed, especially in more complex queries. Repeating the logic might cause the planner to do the work twice unnecessarily.
- Hard to Read: The query becomes harder to read when complex expressions are repeated.
The Solution: Calculate Once with a CTE ✅
The Logic: Use a Common Table Expression (CTE) or a subquery to perform all the calculations first. Then, the main query can simply work with the pre-calculated results.
The Optimized Query:
sqlEXPLAIN ANALYZE WITH transactions_with_local_currency AS ( -- Step 1: Calculate all converted values just once. SELECT id, user_id, amount_usd * exchange_rate_eur AS amount_eur, amount_usd * exchange_rate_vnd AS amount_vnd FROM transactions ) -- Step 2: Filter and select from the pre-calculated results. SELECT id, user_id, amount_eur, amount_vnd FROM transactions_with_local_currency WHERE amount_vnd > 25000000;
Analyzing the Results ✨
- Clean, Maintainable Code: The calculation logic is now in one single place. If it needs to change, you only have to edit it inside the CTE. This reduces the risk of bugs.
- Guaranteed Single Calculation: This approach ensures that each calculation is performed only once per row. This is especially important if the calculation is more complex than a simple multiplication (e.g., calling a function or running another subquery).
- Clearer Execution Plan: You give the planner a clearer set of instructions: "First, calculate these values. Then, filter on those results." This helps create a more stable and predictable execution plan.
Conclusion:
- Use CTEs or subqueries to follow the DRY principle, even in SQL.
- Separating calculation logic from filtering logic not only makes your code easier to read and maintain but can also help the planner create a more efficient execution plan.
- This is a good habit to adopt, especially as your calculation expressions become more complex.