Scenario: Brightlane's customer revenue report has two observed problems at once: some customer totals are several times the expected value (suggesting child-row multiplication is inflating the parent total), and some customers report a missing-value revenue even though their orders are on record. The diagnostic must produce per-customer totals where each line item contributes exactly once and customers with no orders are still preserved.
Task: Write a query to return each customer's id, name, and total_revenue — the combined line-item value across their orders, with each line item contributing its value exactly once to the customer's total, reported as a missing value for customers with no orders on record.
Assumptions:
- An order may have multiple line items.
- A line item's value is
quantitymultiplied byunit_price. - A customer's
total_revenueis the combined line-item value across every order they have placed, with each line item counted exactly once. Customers with no orders on record appear withtotal_revenuereported as a missing value. - The result covers every customer.
Output:
- One row per customer.
- Columns in this order:
customer_id,customer_name,total_revenue. - Sorted by
customer_nameascending.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
WITH
order_item_totals AS (
SELECT
order_id,
SUM(quantity * unit_price) AS item_total
FROM
order_items
GROUP BY
order_id
),
customer_revenue AS (
SELECT
o.customer_id,
SUM(oit.item_total) AS total_revenue
FROM
orders o
JOIN order_item_totals oit ON oit.order_id = o.id
GROUP BY
o.customer_id
)
SELECT
c.id AS customer_id,
c.name AS customer_name,
cr.total_revenue
FROM
customers c
LEFT JOIN customer_revenue cr ON cr.customer_id = c.id
ORDER BY
c.name The shape
Two CTEs in sequence. The first collapses line items to one row per order, defusing the order-to-line-item fanout at its source. The second sums those per-order totals up to one row per customer. Then a final left join from customers to the customer-revenue summary preserves customers with no orders. The architecture fixes the inflated totals and the dropped customers in a single pass — each layer addresses one of the two observed problems.
Clause by clause
WITH order_item_totals AS (SELECT order_id, SUM(quantity * unit_price) AS item_total FROM order_items GROUP BY order_id)collapses every order's line items into one summary row per order. This is the fanout fix. Before this layer, joiningordersdirectly toorder_itemswould have produced multiple rows per order, and summing any order-level column over that result would inflate by the per-order line-item count.customer_revenue AS (SELECT o.customer_id, SUM(oit.item_total) AS total_revenue FROM orders o JOIN order_item_totals oit ON oit.order_id = o.id GROUP BY o.customer_id)rolls the per-order totals up to per-customer totals. The join is inner because every order inordershas a matching row inorder_item_totals(the CTE was built from the same key space), and customers with no orders don't appear in this CTE at all — they'll be reintroduced by the final left join.SELECT c.id AS customer_id, c.name AS customer_name, cr.total_revenuereturns the customer-level columns plus the revenue figure.FROM customers c LEFT JOIN customer_revenue cr ON cr.customer_id = c.idkeeps every customer. Customers absent fromcustomer_revenue(because they had no orders) get NULL fortotal_revenue, which is the second of the two diagnostic signals.ORDER BY c.namesorts alphabetically. The reference shows Cole Wood, Dan Carter, Eva Martinez, and others surfacing withtotal_revenueofnull— the preserved no-order customers — while customers like Henry Brown show 3085.96, the de-fanned correct total.
The trap
Combining the two fixes into one query is where this problem bites. A naive shape — customers LEFT JOIN orders LEFT JOIN order_items with SUM(quantity * unit_price) grouped by customer — looks like it should work and reads as correct. It does not. The chain produces one row per (customer, order, line-item) triple. SUM(quantity * unit_price) over that result is right per group, but if the analyst had needed to sum any order-level column (say o.total_amount) the same chain would multiply that value by the per-order line-item count and inflate it silently. The discipline is to collapse the many side in a CTE before the next join sees it. Each one-to-many relationship in a chain has to be aggregated to one row per parent before the next layer composes onto it; otherwise the next aggregation runs over a multiplied set and the error compounds upward without a visible signal.
You practiced a fanout-safe rollup — collapsing the many side per order before pairing up to the customer, while keeping the left-join chain intact so customers with no orders still appear with a missing-value revenue.