Brightlane's sales team wants to know which order statuses are performing above average — statuses whose mean order value exceeds the overall mean across all orders.
Write a query to return each qualifying status and its average order value.
Assumptions:
- The threshold (the overall average) is computed across every row in orders, regardless of status.
- Each status's average is computed across the rows that share that status.
- A qualifying status has a per-status average that exceeds the overall average.
Output:
- One row per qualifying status, with columns status and avg_order_value.
Worked solution
SELECT
    status,
    avg_order_value
FROM
    (
        SELECT
            status,
            AVG(total_amount) AS avg_order_value
        FROM
            orders
        GROUP BY
            status
    ) AS status_averages
WHERE
    avg_order_value > (
        SELECT
            AVG(total_amount)
        FROM
            orders
    )
The shape
The derived table produces the per-status averages, and a scalar subquery on the right-hand side of the outer WHERE computes the overall average to compare against. Two aggregates over the same orders table, each used at a different layer to answer one question.
Clause by clause
- The inner block computes one row per status with its average order value:
  SELECT status, AVG(total_amount) AS avg_order_value
  FROM orders
  GROUP BY status
  This is the row set the outer query reads from.
- FROM (...) AS status_averages exposes that result to the outer query as a derived table named status_averages.
- WHERE avg_order_value > (SELECT AVG(total_amount) FROM orders) compares each per-status average against the overall average. The scalar subquery in parentheses runs once over the whole orders table and produces a single value — the overall mean. Each row of the derived table is then tested against that single value.
- Two statuses qualify: delivered (648.02...) and shipped (644.41...), both above the overall mean.
- SELECT status, avg_order_value returns the surviving status name and its average. The qualifying-status list is short by design — only the above-average performers.
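To see each layer on its own, here is a minimal sketch on a hypothetical four-row table. The table name orders_toy, the order_id column, and every amount are invented for illustration and are not the Brightlane data; only status and total_amount are taken from the solution above.

```sql
-- Hypothetical toy data, not the real orders table.
CREATE TABLE orders_toy (
    order_id     INTEGER PRIMARY KEY,
    status       VARCHAR(20),
    total_amount DECIMAL(10, 2)
);

INSERT INTO orders_toy (order_id, status, total_amount) VALUES
    (1, 'delivered', 300.00),
    (2, 'delivered', 500.00),
    (3, 'cancelled', 100.00),
    (4, 'cancelled', 300.00);

-- Layer 1: the scalar subquery run by itself. One row, one column.
-- (300 + 500 + 100 + 300) / 4 = 300
SELECT AVG(total_amount) AS overall_avg
FROM orders_toy;

-- Layer 2: the derived table run by itself. One row per status.
-- delivered: (300 + 500) / 2 = 400
-- cancelled: (100 + 300) / 2 = 200
SELECT status, AVG(total_amount) AS avg_order_value
FROM orders_toy
GROUP BY status;

-- Combined: only delivered (400) is above the overall mean (300),
-- so the outer WHERE keeps that single row.
SELECT status, avg_order_value
FROM (
    SELECT status, AVG(total_amount) AS avg_order_value
    FROM orders_toy
    GROUP BY status
) AS status_averages
WHERE avg_order_value > (SELECT AVG(total_amount) FROM orders_toy);
```

Running the first two statements separately before the combined query is a quick way to check each aggregate against hand arithmetic.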
Why this, and not both averages in the inner query
The two averages aren't computed over the same row set. The per-status average is computed over one status's rows at a time; the overall average is computed over every order regardless of status. A plain GROUP BY status aggregate only sees one status's rows at a time, so it cannot also return the all-orders average in the same row.
The scalar subquery solves this by running on its own — independent of the outer query, independent of the derived table — and returning the overall average as a single value that any row of the derived table can compare against. Each aggregate gets the row set it needs.
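A small sketch of why the threshold needs its own pass over the base table rather than being derived from the per-status rows. The table orders_skewed and its values are hypothetical, chosen so that the statuses have different row counts.

```sql
-- Hypothetical toy data with uneven group sizes.
CREATE TABLE orders_skewed (
    status       VARCHAR(20),
    total_amount DECIMAL(10, 2)
);

INSERT INTO orders_skewed (status, total_amount) VALUES
    ('delivered', 100.00),
    ('delivered', 200.00),
    ('delivered', 300.00),
    ('cancelled', 600.00);

-- Overall average over every row: (100 + 200 + 300 + 600) / 4 = 300
SELECT AVG(total_amount) AS overall_avg
FROM orders_skewed;

-- Average of the per-status averages: (200 + 600) / 2 = 400
-- This weights each status equally instead of each order equally,
-- so it is not the threshold the exercise asks for.
SELECT AVG(avg_order_value) AS avg_of_status_averages
FROM (
    SELECT status, AVG(total_amount) AS avg_order_value
    FROM orders_skewed
    GROUP BY status
) AS status_averages;
```

The two results differ (300 versus 400) because the row sets differ, which is the reason the scalar subquery reads from the base table directly.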
You practiced combining a derived table with a scalar subquery in the same query: the derived table in FROM, the scalar subquery in WHERE. The recurring shape: when the threshold itself is an aggregate over the same population, the scalar subquery produces it once and the outer filter on the derived table compares each row against it.