Brightlane's customer analysis pipeline materializes each customer's order count and total spend into a temp table, alongside the platform-wide average individual order amount for benchmarking. That temp table feeds multiple comparative reports in the same session.
Write a query to return each customer's ID, order count, total spend, and the average individual order amount across every order.
Assumptions:
- A customer's order count is the number of orders linked to that `customer_id`. A customer's total spend is the combined `total_amount` across those orders.
- The platform-wide average is the average of `total_amount` across every order in the table. The same value should appear on every output row.
- Every customer with at least one order should appear once.
Output:
- One row per customer, with columns `customer_id`, `order_count`, `total_spent`, and `overall_avg`.
Worked solution
    SELECT
        customer_id,
        COUNT(*) AS order_count,
        SUM(total_amount) AS total_spent,
        (
            SELECT
                AVG(total_amount)
            FROM
                orders
        ) AS overall_avg
    FROM
        orders
    GROUP BY
        customer_id

The shape
The per-customer aggregation runs in the main SELECT, and a scalar subquery in the SELECT list provides the platform-wide average. The scalar subquery returns exactly one value, which is then broadcast onto every output row by the engine. So each customer's row carries its own metrics alongside a benchmark that is identical across the whole result.
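To see the broadcast on a small scale, here is a minimal sketch that runs the same query shape against a four-row table. The table name `demo_orders` and its rows are hypothetical, made up for illustration, and the sketch assumes `total_amount` is a numeric column.

```sql
-- Hypothetical scratch table; the exercise's real orders table has more rows and columns.
CREATE TEMP TABLE demo_orders (
    customer_id  integer NOT NULL,
    total_amount numeric(10, 2) NOT NULL
);

INSERT INTO demo_orders (customer_id, total_amount) VALUES
    (101, 100.00),
    (101, 300.00),
    (102,  50.00),
    (103, 150.00);

-- Same shape as the worked solution, pointed at the scratch table.
SELECT
    customer_id,
    COUNT(*) AS order_count,
    SUM(total_amount) AS total_spent,
    (SELECT AVG(total_amount) FROM demo_orders) AS overall_avg
FROM demo_orders
GROUP BY customer_id
ORDER BY customer_id;

-- Three rows come back:
--   101 has 2 orders totalling 400.00
--   102 has 1 order  totalling  50.00
--   103 has 1 order  totalling 150.00
-- overall_avg is 600 / 4 = 150 on every row
-- (PostgreSQL may print it with trailing zeros).
```

Note that 150 is the average per order. Averaging the three customer totals instead would give 600 / 3 = 200, which is a different benchmark from the one the task asks for.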
Clause by clause
- `SELECT customer_id, COUNT(*) AS order_count, SUM(total_amount) AS total_spent` returns the three per-customer columns. `GROUP BY customer_id` further down makes these one-row-per-customer aggregates.
- `(SELECT AVG(total_amount) FROM orders) AS overall_avg` is the scalar subquery. It runs independently of the outer grouping, scans the entire `orders` table once, and returns the single number 633.62865. Because the subquery returns one row with one column, PostgreSQL treats it as a constant expression in the outer SELECT and the same value lands on every output row.
- `FROM orders` reads the orders table for the outer aggregation.
- `GROUP BY customer_id` partitions the rows by customer so that `COUNT(*)` and `SUM(total_amount)` are per-customer.
Why a scalar subquery and not a separate query
Producing the benchmark inline keeps the materialization to a single CREATE TABLE AS (CTAS) statement. The downstream reports want the per-customer metrics and the platform average side by side; if the benchmark were computed in a separate query, the temp table would have to be assembled in two steps. The scalar subquery delivers the constant in one statement, and the result is a single relation already in the shape the reports need.
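As a rough sketch of what that single-statement materialization might look like, the query can be wrapped in `CREATE TEMP TABLE ... AS`. The table name `customer_benchmarks` and the follow-up report are assumptions for illustration; the exercise does not name the temp table or show the reports. The sketch runs against the exercise's `orders` table.

```sql
-- customer_benchmarks is a hypothetical name, not taken from the exercise.
CREATE TEMP TABLE customer_benchmarks AS
SELECT
    customer_id,
    COUNT(*) AS order_count,
    SUM(total_amount) AS total_spent,
    (SELECT AVG(total_amount) FROM orders) AS overall_avg
FROM orders
GROUP BY customer_id;

-- One possible downstream report in the same session:
-- customers whose own average order is above the platform benchmark.
-- The cast avoids integer division if total_spent happens to be an integer type.
SELECT
    customer_id,
    total_spent / order_count::numeric AS customer_avg,
    overall_avg
FROM customer_benchmarks
WHERE total_spent / order_count::numeric > overall_avg
ORDER BY customer_id;
```

Because `overall_avg` is already stored on every row, the report needs no second pass over `orders` to fetch the benchmark.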
The trap
The scalar subquery looks like it should run once per row because it sits inside the SELECT list, but it does not depend on any outer column. PostgreSQL recognises this and evaluates it once for the entire query, caches the value, and reuses it for every row. The pattern is safe and cheap precisely because the inner SELECT is uncorrelated.
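One way to check this on your own instance is to compare query plans with `EXPLAIN`. The sketch below is illustrative: the second query is a correlated variant written only for contrast and is not part of the exercise. Exact plan text varies by PostgreSQL version and data, so no output is shown.

```sql
-- Uncorrelated: the inner SELECT references no outer column.
-- PostgreSQL typically plans it as an InitPlan, evaluated once for the query.
EXPLAIN
SELECT
    customer_id,
    COUNT(*) AS order_count,
    SUM(total_amount) AS total_spent,
    (SELECT AVG(total_amount) FROM orders) AS overall_avg
FROM orders
GROUP BY customer_id;

-- Correlated, for contrast only: the inner SELECT references o.customer_id,
-- so it typically shows up as a SubPlan that is re-evaluated for outer rows.
EXPLAIN
SELECT
    o.customer_id,
    o.total_amount,
    (SELECT AVG(i.total_amount)
     FROM orders AS i
     WHERE i.customer_id = o.customer_id) AS customer_avg
FROM orders AS o;
```

Look for the label `InitPlan` in the first plan and `SubPlan` in the second. The moment the inner query mentions an outer column, the once-per-query guarantee no longer applies.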
You practiced pairing a per-customer aggregation with a scalar subquery for a benchmark value: every row carries its own per-customer metrics plus the same platform-wide average.