Of course! We'll continue with an optimization technique for complex analytical queries and reports, where recalculating data every time is too expensive: Materialized Views.
Case 16: Accelerating Complex Reports with a Materialized View
The Story
- System: A business intelligence dashboard for a large e-commerce platform.
- Tables:
orders(50 million rows)order_items(200 million rows)products(50,000 rows)categories(1,000 rows)
- The Problem: Managers constantly view a report that summarizes revenue, number of orders, and number of products sold for each category in the previous month. This query is very heavy because it has to
JOINmultiple large tables and perform several aggregate functions (SUM,COUNT DISTINCT).
The Problematic SQL Query (The Original Report)
sql
-- This query runs every time a user loads the dashboard
EXPLAIN ANALYZE
SELECT
c.name,
SUM(oi.price * oi.quantity) as total_revenue,
COUNT(DISTINCT o.id) as total_orders,
COUNT(DISTINCT oi.product_id) as unique_products_sold
FROM categories c
JOIN products p ON c.id = p.category_id
JOIN order_items oi ON p.id = oi.product_id
JOIN orders o ON oi.order_id = o.id
WHERE o.order_date >= '2025-06-01' AND o.order_date < '2025-07-01'
GROUP BY c.name
ORDER BY total_revenue DESC;The Bottleneck
- Inherently Expensive Calculation: Even with full indexing, the nature of this query requires reading and processing tens of millions of rows from the
ordersandorder_itemstables.COUNT(DISTINCT)operations are particularly resource-intensive. - Repetitive Execution: The biggest problem is that this heavy query is run over and over again every time a user loads the dashboard page. This creates a constant and massive load on the production database.
The Solution: Use a Materialized View ✅
The Logic: Instead of calculating this report from scratch every time it's requested, we will calculate it once and save the results into a physical, table-like object called a
Materialized View. Then, users will only need to query this small, pre-computed view.The Steps:
Create the
Materialized View:sql-- Run this once to create the structure and compute the initial data CREATE MATERIALIZED VIEW category_monthly_summary AS SELECT c.name AS category_name, SUM(oi.price * oi.quantity) as total_revenue, COUNT(DISTINCT o.id) as total_orders, COUNT(DISTINCT oi.product_id) as unique_products_sold FROM categories c -- ... (the rest of the heavy query) GROUP BY c.name;Refresh the Data Periodically: Schedule this command to run (e.g., once an hour or once every night) to update the view with new data.
sqlREFRESH MATERIALIZED VIEW category_monthly_summary;Rewrite the Application's Query: The query for the dashboard page now becomes incredibly simple and fast.
sql-- The new, super-fast query SELECT * FROM category_monthly_summary ORDER BY total_revenue DESC;
Analyzing the Result
- Instant Read Speed: The user's query is now just a simple
SELECTon a pre-calculated view (which only has 1,000 rows, one for each category). The response time is nearly instantaneous. - Reduced Database Load: The calculation burden is shifted from "on-demand" to "scheduled." The database only has to do the heavy lifting once per refresh cycle, instead of hundreds or thousands of times per hour.
The Trade-off
- Stale Data: This is the biggest trade-off. The data in the view is only as fresh as the last
REFRESH. If the view was refreshed at 2 AM, it won't include orders placed at 3 AM. This must be acceptable for the business requirements. REFRESHCost: The refresh process itself is resource-intensive and should be scheduled during off-peak hours.- Storage Cost: The materialized view takes up physical disk space to store its results.
Conclusion
- A
Materialized Viewis the ideal solution for complex, expensive, and frequently accessed reporting and analytical queries where the data does not need to be real-time down to the second. - It helps shift the computational load away from user requests, leading to extremely high-performance dashboards and reports.