Customers Above Average Spend — CREATE TABLE AS in SQL

The problem

Brightlane's high-value customer pipeline materializes, into a temp table, every customer whose total spend exceeds the platform-wide average individual order amount. That temp table feeds downstream segmentation analyses.

Write a query to return each qualifying customer's ID and total spend.

Assumptions:

A customer's total spend is the combined total_amount across every order linked to that customer_id.
The platform-wide average individual order amount is the average of total_amount across every order in the table.
Only customers whose total spend is greater than the platform-wide average individual order amount should appear.

Output:

One row per qualifying customer, with columns customer_id and total_spent.

The shape

A derived table computes each customer's total spend, and the outer WHERE compares those totals against a scalar subquery that returns the platform-wide average individual order amount. Two aggregations at different grain in one statement: the inner one is per-customer, the comparison value is across all orders. Both have to be computed and then compared at row level for each customer.
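Put together, the shape described above can be sketched as a single statement. This is a minimal sketch, not the pipeline's actual code: the four sample rows, the order_id column, and the ORDER BY are assumptions added so the example runs on its own, and the inline VALUES list uses PostgreSQL/SQLite syntax. Against the real schema, the WITH clause would be dropped and the query would read the orders table directly.

```sql
-- Hypothetical sample rows standing in for the real orders table.
WITH orders(order_id, customer_id, total_amount) AS (
    VALUES
        (1, 1, 100.00),
        (2, 1, 200.00),
        (3, 2,  50.00),
        (4, 3, 160.00)
)
SELECT customer_id, total_spent
FROM (
    -- Per-customer grain: one row per customer with the summed order amounts.
    SELECT customer_id, SUM(total_amount) AS total_spent
    FROM orders
    GROUP BY customer_id
) customer_totals
-- All-orders grain: a single average computed over every order row.
WHERE total_spent > (SELECT AVG(total_amount) FROM orders)
ORDER BY customer_id;  -- added only to make the sample output deterministic

-- Average order amount on the sample rows: (100 + 200 + 50 + 160) / 4 = 127.5
-- Rows returned: customer 1 (total 300) and customer 3 (total 160).
-- Customer 2 (total 50) is filtered out.
```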

Clause by clause

(SELECT customer_id, SUM(total_amount) AS total_spent FROM orders GROUP BY customer_id) customer_totals is the inner derived table. It groups orders by customer_id, sums the order amounts in each group, and exposes a two-column row source with the per-customer totals.
SELECT customer_id, total_spent FROM customer_totals reads the derived table and returns both columns.
WHERE total_spent > (SELECT AVG(total_amount) FROM orders) is the threshold. The inner SELECT AVG(total_amount) FROM orders is an uncorrelated scalar subquery: it runs once, scans every order, and returns a single number (633.62865 on this dataset). The outer WHERE then compares each customer's total_spent against that constant.
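To see the row-level comparison the clauses set up, it can help to project the threshold next to each total instead of filtering on it. The sketch below assumes the same hypothetical sample rows and order_id column as before (PostgreSQL/SQLite syntax); avg_order_amount and qualifies are illustrative column names, not part of the required output.

```sql
-- Hypothetical sample rows standing in for the real orders table.
WITH orders(order_id, customer_id, total_amount) AS (
    VALUES
        (1, 1, 100.00),
        (2, 1, 200.00),
        (3, 2,  50.00),
        (4, 3, 160.00)
)
SELECT
    customer_id,
    total_spent,
    -- The uncorrelated scalar subquery: the same constant on every row.
    (SELECT AVG(total_amount) FROM orders) AS avg_order_amount,
    -- The test that the WHERE clause applies, shown as a column instead.
    CASE
        WHEN total_spent > (SELECT AVG(total_amount) FROM orders) THEN 'yes'
        ELSE 'no'
    END AS qualifies
FROM (
    SELECT customer_id, SUM(total_amount) AS total_spent
    FROM orders
    GROUP BY customer_id
) customer_totals
ORDER BY customer_id;

-- On the sample rows, avg_order_amount is 127.5 for every customer:
--   customer 1: total 300, qualifies = yes
--   customer 2: total  50, qualifies = no
--   customer 3: total 160, qualifies = yes
```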

Why two different aggregations and not one

The two numbers being compared live at different grains. A customer's total spend is the sum across all their orders, computed inside the derived table. The platform-wide average is the average over individual orders across the whole company, computed by the scalar subquery against the raw orders table. Folding them into one GROUP BY would mean computing both at the same grain, which loses the meaning of either. Separating them — per-customer in one place, all-orders in the other — preserves what each number represents.
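A short counterexample may make the grain problem concrete. In the sketch below, both aggregates are folded into one GROUP BY, so AVG(total_amount) is evaluated inside each customer's group and becomes that customer's own average order, not the platform-wide one. The sample rows and order_id column are hypothetical (PostgreSQL/SQLite syntax).

```sql
-- Hypothetical sample rows standing in for the real orders table.
WITH orders(order_id, customer_id, total_amount) AS (
    VALUES
        (1, 1, 100.00),
        (2, 1, 200.00),
        (3, 2,  50.00),
        (4, 3, 160.00)
)
-- NOT the intended filter: both aggregates share the per-customer grain.
SELECT customer_id, SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id
HAVING SUM(total_amount) > AVG(total_amount)
ORDER BY customer_id;

-- Per group on the sample rows:
--   customer 1: SUM 300 > AVG 150  -> kept
--   customer 2: SUM  50 > AVG  50  -> dropped
--   customer 3: SUM 160 > AVG 160  -> dropped
-- Only customer 1 is returned, whereas the two-grain query returns
-- customers 1 and 3 (platform-wide average order amount is 127.5).
```

With positive amounts, this folded version effectively keeps any customer who has more than one order, which is a different question from the one in the spec.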

The trap

It is tempting to compare per-customer totals to the per-customer average, which would be a different filter entirely. AVG(total_amount) over the orders table is not the average customer's total spend. It is the average individual order amount, a smaller number. A customer with a single 700-dollar order would clear that bar, but might not clear the per-customer average total, which would be much higher. The threshold the pipeline materializes is the one against the individual order average, so the scalar subquery has to be SELECT AVG(total_amount) FROM orders directly, with no GROUP BY and no derived table in front. Reading "average" in the spec without checking what is being averaged is the silent mistake here.
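The two candidate thresholds can be computed side by side to show how far apart they sit. This is a sketch on hypothetical sample rows (the order_id column and the output column names avg_order_amount and avg_customer_total are assumptions; PostgreSQL/SQLite syntax).

```sql
-- Hypothetical sample rows standing in for the real orders table.
WITH orders(order_id, customer_id, total_amount) AS (
    VALUES
        (1, 1, 100.00),
        (2, 1, 200.00),
        (3, 2,  50.00),
        (4, 3, 160.00)
)
SELECT
    -- The threshold the spec asks for: average over individual orders.
    (SELECT AVG(total_amount) FROM orders) AS avg_order_amount,
    -- The tempting wrong threshold: average of per-customer totals.
    (SELECT AVG(total_spent)
     FROM (
         SELECT customer_id, SUM(total_amount) AS total_spent
         FROM orders
         GROUP BY customer_id
     ) customer_totals) AS avg_customer_total;

-- On the sample rows:
--   avg_order_amount   = 510 / 4 orders    = 127.5
--   avg_customer_total = 510 / 3 customers = 170
-- Customer 3, with a single order of 160, clears the first bar
-- but would be excluded by the second.
```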

You practiced comparing per-customer aggregates against a scalar subquery threshold: total spend per customer above the average individual order amount. That is the kind of restricted result worth materializing once and querying repeatedly.
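For the materializing step itself, one possible form is CREATE TEMPORARY TABLE ... AS wrapped around the same query. This is a hedged sketch: the table name high_value_customers is hypothetical, the sample rows and order_id column are stand-ins, and the syntax shown is the PostgreSQL/SQLite form. In a real pipeline the WITH clause would be omitted and the query would read the existing orders table.

```sql
-- high_value_customers is a hypothetical name for the temp table.
CREATE TEMPORARY TABLE high_value_customers AS
WITH orders(order_id, customer_id, total_amount) AS (
    -- Hypothetical sample rows standing in for the real orders table.
    VALUES
        (1, 1, 100.00),
        (2, 1, 200.00),
        (3, 2,  50.00),
        (4, 3, 160.00)
)
SELECT customer_id, total_spent
FROM (
    SELECT customer_id, SUM(total_amount) AS total_spent
    FROM orders
    GROUP BY customer_id
) customer_totals
WHERE total_spent > (SELECT AVG(total_amount) FROM orders);

-- Downstream queries read the stored rows without recomputing the aggregates.
SELECT customer_id, total_spent
FROM high_value_customers
ORDER BY customer_id;

-- On the sample rows the temp table holds customers 1 and 3.
```

The temp table is a snapshot: it does not change when new orders arrive, so it would need to be rebuilt whenever the segmentation analyses require fresh totals.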
