Average Delivered Value per Customer — Conditional Aggregation in SQL

The problem

Brightlane's finance team needs per-customer average order values, computed only against the customer's fulfilled orders.

Write a query to return the customer ID and the average delivered-order value for every customer who has any orders on file.

Assumptions:

The orders table contains every order Brightlane has processed; a delivered order has status = 'delivered'.
Every customer with any orders should appear in the result — even customers whose orders are all non-delivered.
For a customer with zero delivered orders, the average must be missing, not 0 — there is no delivered-order data to average for that customer.

Output:

One row per customer with any orders, with columns customer_id and avg_delivered_value. Customers with no delivered orders will have a missing value in the second column.

Schema · ecommerce 5 tables

The shape

AVG(CASE WHEN status = 'delivered' THEN total_amount END) computes a per-customer average over only the delivered orders in each group. The other orders contribute NULL from the CASE, and AVG excludes NULL from both the sum and the count. Customers with at least one delivered order get a real average; customers 22, 5, and 13 — whose orders are all non-delivered — get NULL, because there were no delivered amounts to average.

Clause by clause

customer_id is the grouping column. Every customer with any orders becomes one group, regardless of whether those orders are delivered.
AVG(CASE WHEN status = 'delivered' THEN total_amount END) AS avg_delivered_value evaluates the CASE once per row. Delivered rows contribute their total_amount to the input set for AVG; non-delivered rows fall through to NULL and are skipped. The result is the average of the delivered amounts within that customer's group — the customer's typical fulfilled-order value, ignoring their pending, shipped, and cancelled orders.
FROM orders GROUP BY customer_id partitions every order into per-customer groups before the aggregate runs.

Why this and not `WHERE status = 'delivered'`

Filtering with WHERE status = 'delivered' before the GROUP BY would produce the same average for customers who have delivered orders. But it would drop customers 22, 5, and 13 from the output entirely — WHERE removes their rows before grouping, so they form no group. The prompt is explicit that every customer with any orders has to appear.

Conditional aggregation keeps the full per-customer group intact and lets the CASE decide which rows feed AVG. The group still exists for a customer with zero delivered orders; the input set for AVG inside that group is just empty.

The trap

The trap is reading the NULL averages as a query bug. They are the structurally correct answer for "average of an empty set." Customer 22 has three orders, all non-delivered; there is no delivered-order total to average. 0 would be wrong — that would imply Brightlane delivered orders to customer 22 with an average value of zero dollars. NULL says correctly that there is no delivered-order data for that customer.

When a downstream report needs a defined number instead of NULL, the right place to convert it is at the report boundary, not inside the CASE. The aggregate's NULL carries real information about which customers have no fulfilled orders.

You practiced AVG(CASE) where some groups have no matching rows. When the conditional input is missing for every row in the group, the average is missing too — the right answer for 'no data'.

Return the customer ID and the average delivered-order value for every customer who has any orders on file

The shape

Clause by clause

Why this and not `WHERE status = 'delivered'`

The trap

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

The shape

Clause by clause

Why this and not WHERE status = 'delivered'

The trap

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

Why this and not `WHERE status = 'delivered'`