Scenario: Brightlane's customer revenue pipeline composes line-item totals through orders up to customers. An analyst suspects some customers report a missing-value revenue because the chain is excluding customers with no orders. The diagnostic surfaces every customer's revenue alongside an explicit missing-value indicator for those with no orders on record.
Task: Write a query to return each customer's id, name, and total_revenue — the combined line-item value across all of their orders, reported as a missing value for customers who have no orders on record.
Assumptions:
- A line item's value is
quantitymultiplied byunit_price. - A customer's
total_revenueis the combined line-item value across every order they have placed, reported as a missing value when they have no orders on record. - The result covers every customer.
Output:
- One row per customer.
- Columns in this order:
customer_id,customer_name,total_revenue. - Sorted by
customer_nameascending.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
WITH
order_totals AS (
SELECT
order_id,
SUM(quantity * unit_price) AS item_total
FROM
order_items
GROUP BY
order_id
)
SELECT
c.id AS customer_id,
c.name AS customer_name,
SUM(ot.item_total) AS total_revenue
FROM
customers c
LEFT JOIN orders o ON o.customer_id = c.id
LEFT JOIN order_totals ot ON ot.order_id = o.id
GROUP BY
c.id,
c.name
ORDER BY
c.name The shape
A CTE that collapses line items to one row per order, then a left-join chain from customers outward through orders and into the order-total CTE. The chain stays left at every level, so a customer with no orders carries a NULL through to total_revenue instead of dropping out — the diagnostic signal that locates which layer introduced the gap.
Clause by clause
WITH order_totals AS (SELECT order_id, SUM(quantity * unit_price) AS item_total FROM order_items GROUP BY order_id)aggregates line items per order. One row per order, with the combined line-item value. This collapses the many side of the order-to-line-item relationship before any join to the customer side runs.SELECT c.id AS customer_id, c.name AS customer_name, SUM(ot.item_total) AS total_revenuereturns one row per customer.SUM(ot.item_total)totals the per-order subtotals; for a customer with no orders, everyot.item_totalin the group is NULL andSUMof all NULLs returns NULL.FROM customers c LEFT JOIN orders o ON o.customer_id = c.idkeeps every customer regardless of whether they have orders. A customer with no orders gets one row witho.idset to NULL.LEFT JOIN order_totals ot ON ot.order_id = o.idkeeps the customer row even wheno.idis NULL (the second left join propagates the preserved row through the second layer). For customers with orders, this attaches each order's item total; for customers without,ot.item_totalis NULL.GROUP BY c.id, c.namecollapses back to one row per customer, withSUMrunning over the joined item totals.ORDER BY c.namesorts alphabetically. The reference shows Cole Wood and several others withtotal_revenueofnull— those are the customers with no orders, which is the exact diagnostic signal the prompt is checking for.
The trap
Switch any link in the chain to an INNER JOIN and the customers with no orders silently disappear from the result entirely. The output looks reasonable — a list of customers with revenue totals — but the rows the diagnostic was designed to surface are gone. The whole point of the left-join chain is to keep the parent on screen so the missing-value column tells the story; converting any join in the chain to inner removes that signal without raising an error. When a downstream filter or join could plausibly drop the row the diagnostic is hunting for, every join from the driving table outward has to stay LEFT JOIN for the full chain to remain leak-proof.
You practiced a left-join chain that preserves the parent at every level — customers with no orders surface with a missing-value revenue rather than dropping out, the diagnostic signal that locates the gap-introducing layer.