Customers Beating the Average Order Value

The problem

Brightlane's analytics team wants to identify customers whose personal average order value exceeds the mean of all customers' individual averages — outperformers compared to the typical customer, not just the overall order-amount average.

Write a query to return the customer ID and personal average order value for every qualifying customer.

Assumptions:

A customer's personal average is the AVG(total_amount) over their own orders.
The threshold is the average of customers' personal averages — not AVG(total_amount) over the whole orders table. (A customer with one $1,000 order and a customer with twenty $500 orders contribute equally to this mean — one personal average each — even though their order counts differ wildly.)
A qualifying customer's personal average exceeds that across-customers mean.

Output:

One row per qualifying customer, with columns customer_id and avg_order_value.

Schema · ecommerce 5 tables

categories

id integer

name text

parent_id? integer

products

id integer

name text

category_id integer

price numeric

stock_qty integer

attributes? jsonb

order_items

id integer

order_id integer

product_id integer

quantity integer

unit_price numeric

customers

id integer

name text

email text

city? text

country text

created_at timestamptz

is_active boolean

orders

id integer

customer_id integer

ordered_at timestamptz

status text

total_amount numeric

Check answerShift Ctrl ↵

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first

Solution query

SELECT
  customer_id,
  avg_order_value
FROM
  (
    SELECT
      customer_id,
      AVG(total_amount) AS avg_order_value
    FROM
      orders
    GROUP BY
      customer_id
  ) AS customer_avgs
WHERE
  avg_order_value > (
    SELECT
      AVG(cust_avg)
    FROM
      (
        SELECT
          AVG(total_amount) AS cust_avg
        FROM
          orders
        GROUP BY
          customer_id
      ) AS all_cust_avgs
  )

The shape

The outer derived table holds each customer's personal average. The threshold on the right side of WHERE is the average of those per-customer averages — which requires its own derived table, nested inside a scalar subquery. Two passes for the answer set, two passes for the threshold, and the layering keeps the two definitions of average from collapsing into one another.

Clause by clause

The outer derived table computes one row per customer with their personal average:

SELECT customer_id, AVG(total_amount) AS avg_order_value
FROM orders
GROUP BY customer_id

Materialised as customer_avgs — the row set the outer SELECT reads from. - The threshold is computed in a scalar subquery on the right of WHERE:

SELECT AVG(cust_avg) FROM (
  SELECT AVG(total_amount) AS cust_avg
  FROM orders
  GROUP BY customer_id
) AS all_cust_avgs

The inner block produces the same per-customer averages again, this time as all_cust_avgs, and the surrounding AVG(cust_avg) collapses them to a single number — the mean of the personal averages. - WHERE avg_order_value > (...) then compares each customer's personal average against that single threshold value. Twenty-nine customers clear the bar. - SELECT customer_id, avg_order_value returns each qualifying customer with their personal average. Customer 24 tops the list at 1274, customer 10 next at 999.

Why this and not `AVG(total_amount)` over the whole orders table

A customer with one $1,000 order contributes one personal average of 1000 to the across-customers mean. A customer with twenty $500 orders contributes one personal average of 500. The across-customers mean weights each customer equally regardless of volume. AVG(total_amount) over the whole orders table weights each order equally, so the high-volume customer dominates. The prompt's language about outperforming the typical customer locks in the first definition.

The trap

The trap is using (SELECT AVG(total_amount) FROM orders) as the threshold and assuming it's the same number. It isn't. That scalar subquery computes a different statistic — the mean order amount across all orders, with high-volume customers carrying disproportionate weight. A customer's personal average can sit above one threshold and below the other, so swapping the two definitions silently changes who qualifies.

The rule: when the threshold is the average of a per-group statistic, the threshold needs its own group-and-aggregate pass first, and a derived table is the scaffold that makes that pass nameable.

You practiced nested derived tables: a derived table for the per-customer averages, with a scalar-subquery threshold whose own subquery is another derived table for the average-of-averages. The recurring shape any time the answer requires aggregating an aggregate — the layering makes each pass over the data explicit.

Return the customer ID and personal average order value for every qualifying customer

The shape

Clause by clause

Why this and not `AVG(total_amount)` over the whole orders table

The trap

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

The shape

Clause by clause

Why this and not AVG(total_amount) over the whole orders table

The trap

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

Why this and not `AVG(total_amount)` over the whole orders table