Brightlane's high-value customer pipeline materializes customers whose total spend exceeds the platform-wide average individual order amount into a temp table. The temp table feeds downstream segmentation analyses.
Write a query to return each qualifying customer's ID and total spend.
Assumptions:
- A customer's total spend is the combined total_amount across every order linked to that customer_id.
- The platform-wide average individual order amount is the average of total_amount across every order in the table.
- Only customers whose total spend is greater than the platform-wide average individual order amount should appear.
Output:
- One row per qualifying customer, with columns customer_id and total_spent.
Worked solution
SELECT
    customer_id,
    total_spent
FROM
    (
        SELECT
            customer_id,
            SUM(total_amount) AS total_spent
        FROM
            orders
        GROUP BY
            customer_id
    ) customer_totals
WHERE
    total_spent > (
        SELECT
            AVG(total_amount)
        FROM
            orders
    )
The shape
A derived table computes each customer's total spend, and the outer WHERE compares those totals against a scalar subquery that returns the platform-wide average individual order amount. Two aggregations at different grain in one statement: the inner one is per-customer, the comparison value is across all orders. Both have to be computed and then compared at row level for each customer.
Clause by clause
- (SELECT customer_id, SUM(total_amount) AS total_spent FROM orders GROUP BY customer_id) customer_totals is the inner derived table. It groups orders by customer_id, sums the order amounts in each group, and exposes a two-column row source with the per-customer totals.
- SELECT customer_id, total_spent FROM customer_totals reads the derived table and returns both columns.
- WHERE total_spent > (SELECT AVG(total_amount) FROM orders) is the threshold. The inner SELECT AVG(total_amount) FROM orders is an uncorrelated scalar subquery: it runs once, scans every order, and returns the single number 633.62865. The outer WHERE then compares each customer's total_spent against that constant.
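To see the clauses at work, here is a minimal sketch that lays each customer's total next to the threshold it is compared against. The sample_orders rows are invented for illustration and are not Brightlane data; the syntax assumes PostgreSQL or SQLite, and the derived table is written as a CTE only so the sample data can sit in the same statement.

```sql
-- Hypothetical sample data standing in for the real orders table.
WITH sample_orders (order_id, customer_id, total_amount) AS (
    VALUES
        (1, 1, 120.00),
        (2, 1, 300.00),
        (3, 2, 250.00),
        (4, 3,  50.00),
        (5, 3, 400.00),
        (6, 4,  80.00)
),
-- Per-customer grain: one row per customer_id.
customer_totals AS (
    SELECT
        customer_id,
        SUM(total_amount) AS total_spent
    FROM sample_orders
    GROUP BY customer_id
)
SELECT
    customer_id,
    total_spent,
    -- All-orders grain: the same single value repeated on every row.
    (SELECT AVG(total_amount) FROM sample_orders) AS avg_order_amount,
    CASE
        WHEN total_spent > (SELECT AVG(total_amount) FROM sample_orders)
            THEN 'yes'
        ELSE 'no'
    END AS qualifies
FROM customer_totals
ORDER BY customer_id;
```

On these six rows the average order amount is 200, so customers 1, 2, and 3 (totals 420, 250, and 450) qualify and customer 4 (total 80) does not.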
Why two different aggregations and not one
The two numbers being compared live at different grains. A customer's total spend is the sum across all their orders, computed inside the derived table. The platform-wide average is the average over individual orders across the whole company, computed by the scalar subquery against the raw orders table. Folding them into one GROUP BY would mean computing both at the same grain, which loses the meaning of either. Separating them — per-customer in one place, all-orders in the other — preserves what each number represents.
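A small sketch of what folding everything into one GROUP BY does, again on invented sample_orders rows and assuming PostgreSQL or SQLite. Inside a grouped query, AVG(total_amount) is evaluated per group, so it yields each customer's own average order, not the platform-wide figure.

```sql
-- Hypothetical sample data standing in for the real orders table.
WITH sample_orders (order_id, customer_id, total_amount) AS (
    VALUES
        (1, 1, 120.00),
        (2, 1, 300.00),
        (3, 2, 250.00),
        (4, 3,  50.00),
        (5, 3, 400.00),
        (6, 4,  80.00)
)
SELECT
    customer_id,
    SUM(total_amount) AS total_spent,
    -- Same grain as the GROUP BY: this customer's own average order.
    AVG(total_amount) AS own_avg_order,
    -- Separate grain: the average over every order, via a scalar subquery.
    (SELECT AVG(total_amount) FROM sample_orders) AS platform_avg_order
FROM sample_orders
GROUP BY customer_id
ORDER BY customer_id;
```

Here own_avg_order comes out as 210, 250, 225, and 80 for customers 1 through 4, while platform_avg_order is 200 on every row. A filter written as HAVING SUM(total_amount) > AVG(total_amount) would compare the first two columns and keep only customers 1 and 3, dropping customer 2, whose total of 250 does exceed the platform average. Keeping the scalar subquery, whether in an outer WHERE or in a HAVING clause, is what holds the two grains apart.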
The trap
It is tempting to compare per-customer totals to the average of those per-customer totals, which would be a different filter entirely. AVG(total_amount) over the orders table is not the average customer's total spend. It is the average individual order amount, a smaller number as soon as any customer has more than one order (assuming order amounts are positive). A customer with a single 700-dollar order would clear that bar (700 > 633.62865), but might not clear the average per-customer total, which would be much higher. The pipeline filters against the individual order average, so the scalar subquery has to be SELECT AVG(total_amount) FROM orders directly, with no GROUP BY and no derived table in front. Reading "average" in the spec without checking what is being averaged is the silent mistake here.
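The gap between the two candidate thresholds is easy to check side by side. The sketch below uses invented sample_orders rows, not Brightlane data, and assumes an engine that allows SELECT without FROM, such as PostgreSQL or SQLite.

```sql
-- Hypothetical sample data standing in for the real orders table.
WITH sample_orders (order_id, customer_id, total_amount) AS (
    VALUES
        (1, 1, 120.00),
        (2, 1, 300.00),
        (3, 2, 250.00),
        (4, 3,  50.00),
        (5, 3, 400.00),
        (6, 4,  80.00)
),
customer_totals AS (
    SELECT
        customer_id,
        SUM(total_amount) AS total_spent
    FROM sample_orders
    GROUP BY customer_id
)
SELECT
    -- The threshold the spec asks for: average over individual orders.
    (SELECT AVG(total_amount) FROM sample_orders) AS avg_order_amount,
    -- The tempting but wrong threshold: average over per-customer totals.
    (SELECT AVG(total_spent) FROM customer_totals) AS avg_customer_total;
```

On this sample the first value is 200 (1200 across six orders) and the second is 300 (1200 across four customers). Customer 2, with a single 250 order, clears the first threshold but not the second, so the choice of threshold changes which rows come back.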
You practiced comparing per-customer aggregates against a scalar subquery threshold: total spend per customer above the average individual order amount. This is the kind of restricted result worth materializing once and querying repeatedly.
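As a rough sketch of that materialization step, the query can feed a temp table that later analyses read from. The table name high_value_customers and the sample_orders rows are hypothetical, and the CREATE TEMPORARY TABLE ... AS syntax shown assumes PostgreSQL or SQLite; other engines spell this differently.

```sql
-- Hypothetical stand-in for the real orders table.
CREATE TEMPORARY TABLE sample_orders (
    order_id     INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL,
    total_amount NUMERIC(10, 2) NOT NULL
);

INSERT INTO sample_orders (order_id, customer_id, total_amount) VALUES
    (1, 1, 120.00),
    (2, 1, 300.00),
    (3, 2, 250.00),
    (4, 3,  50.00),
    (5, 3, 400.00),
    (6, 4,  80.00);

-- Materialize the qualifying customers once.
CREATE TEMPORARY TABLE high_value_customers AS
SELECT
    customer_id,
    total_spent
FROM (
    SELECT
        customer_id,
        SUM(total_amount) AS total_spent
    FROM sample_orders
    GROUP BY customer_id
) customer_totals
WHERE total_spent > (SELECT AVG(total_amount) FROM sample_orders);

-- Downstream queries read the stored result instead of re-aggregating.
SELECT customer_id, total_spent
FROM high_value_customers
ORDER BY total_spent DESC;
```

The temp table is a snapshot: orders added after it is created are not reflected until it is rebuilt.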