N066-H1 Tier 5 · Expert · hard ecommerce · Brightlane

Return each customer's `id`, name, and `total_revenue` — the combined line-item value across their `orders`, with each line item contributing its value exactly once to the customer's total, reported as a missing value for customers with no orders on record

Part of Analyst Debugging Patterns in SQL

The problem

Scenario: Brightlane's customer revenue report has two observed problems at once: some customer totals are several times the expected value (suggesting child-row multiplication is inflating the parent total), and some customers report a missing-value revenue even though their orders are on record. The diagnostic must produce per-customer totals where each line item contributes exactly once and customers with no orders are still preserved.

Task: Write a query to return each customer's id, name, and total_revenue — the combined line-item value across their orders, with each line item contributing its value exactly once to the customer's total, reported as a missing value for customers with no orders on record.

Assumptions:

  • An order may have multiple line items.
  • A line item's value is quantity multiplied by unit_price.
  • A customer's total_revenue is the combined line-item value across every order they have placed, with each line item counted exactly once. Customers with no orders on record appear with total_revenue reported as a missing value.
  • The result covers every customer.

Output:

  • One row per customer.
  • Columns in this order: customer_id, customer_name, total_revenue.
  • Sorted by customer_name ascending.
Schema · ecommerce 5 tables
categories
id integer
name text
parent_id? integer
products
id integer
name text
category_id integer
price numeric
stock_qty integer
attributes? jsonb
order_items
id integer
order_id integer
product_id integer
quantity integer
unit_price numeric
customers
id integer
name text
email text
city? text
country text
created_at timestamptz
is_active boolean
orders
id integer
customer_id integer
ordered_at timestamptz
status text
total_amount numeric

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first
Solution query
WITH
  order_item_totals AS (
    SELECT
      order_id,
      SUM(quantity * unit_price) AS item_total
    FROM
      order_items
    GROUP BY
      order_id
  ),
  customer_revenue AS (
    SELECT
      o.customer_id,
      SUM(oit.item_total) AS total_revenue
    FROM
      orders o
      JOIN order_item_totals oit ON oit.order_id = o.id
    GROUP BY
      o.customer_id
  )
SELECT
  c.id AS customer_id,
  c.name AS customer_name,
  cr.total_revenue
FROM
  customers c
  LEFT JOIN customer_revenue cr ON cr.customer_id = c.id
ORDER BY
  c.name

The shape

Two CTEs in sequence. The first collapses line items to one row per order, defusing the order-to-line-item fanout at its source. The second sums those per-order totals up to one row per customer. Then a final left join from customers to the customer-revenue summary preserves customers with no orders. The architecture fixes the inflated totals and the dropped customers in a single pass — each layer addresses one of the two observed problems.

Clause by clause

  • WITH order_item_totals AS (SELECT order_id, SUM(quantity * unit_price) AS item_total FROM order_items GROUP BY order_id) collapses every order's line items into one summary row per order. This is the fanout fix. Before this layer, joining orders directly to order_items would have produced multiple rows per order, and summing any order-level column over that result would inflate by the per-order line-item count.
  • customer_revenue AS (SELECT o.customer_id, SUM(oit.item_total) AS total_revenue FROM orders o JOIN order_item_totals oit ON oit.order_id = o.id GROUP BY o.customer_id) rolls the per-order totals up to per-customer totals. The join is inner because every order in orders has a matching row in order_item_totals (the CTE was built from the same key space), and customers with no orders don't appear in this CTE at all — they'll be reintroduced by the final left join.
  • SELECT c.id AS customer_id, c.name AS customer_name, cr.total_revenue returns the customer-level columns plus the revenue figure.
  • FROM customers c LEFT JOIN customer_revenue cr ON cr.customer_id = c.id keeps every customer. Customers absent from customer_revenue (because they had no orders) get NULL for total_revenue, which is the second of the two diagnostic signals.
  • ORDER BY c.name sorts alphabetically. The reference shows Cole Wood, Dan Carter, Eva Martinez, and others surfacing with total_revenue of null — the preserved no-order customers — while customers like Henry Brown show 3085.96, the de-fanned correct total.

The trap

Combining the two fixes into one query is where this problem bites. A naive shape — customers LEFT JOIN orders LEFT JOIN order_items with SUM(quantity * unit_price) grouped by customer — looks like it should work and reads as correct. It does not. The chain produces one row per (customer, order, line-item) triple. SUM(quantity * unit_price) over that result is right per group, but if the analyst had needed to sum any order-level column (say o.total_amount) the same chain would multiply that value by the per-order line-item count and inflate it silently. The discipline is to collapse the many side in a CTE before the next join sees it. Each one-to-many relationship in a chain has to be aggregated to one row per parent before the next layer composes onto it; otherwise the next aggregation runs over a multiplied set and the error compounds upward without a visible signal.

You practiced a fanout-safe rollup — collapsing the many side per order before pairing up to the customer, while keeping the left-join chain intact so customers with no orders still appear with a missing-value revenue.

How you actually get good at SQL

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

That's the whole SQLMaxx loop: 600+ real problems, instant AI feedback, mastery you can actually see, and spaced review that won't let you forget.

A stack of SQL practice problem cards, the top card showing an employees table.
615 problems · 66 concepts

Real problems. Not toy examples.

615 hand-built problems spanning all 66 concepts, from basic SELECTs to window functions, built on real schemas and real business questions, the kind you'll actually get asked on the job. Enough reps to make SQL automatic.

A retro computer showing a SQL query marked correct with a green checkmark.
Instant AI feedback

Write a query. Know if it's right in one second.

No copying an answer and hoping it clicked. The AI grader checks your real query against real data, catches exactly what's wrong, and explains the fix in plain English, like a senior analyst reading over your shoulder on every problem.

A circular mastery progress dial filling from blue to green, the SQLMaxx diamond at its center.
Mastery tracking

Stop guessing whether you actually know it.

SQLMaxx tracks every concept and shows you what you've mastered and what's still shaky. Your skills fill in one concept at a time, so 'I think I get joins' becomes something you can prove.

A SQL query editor circled by a blue return arrow with a clock, scheduled to come back for review.
Spaced review

Learn it once. Keep it for good.

Most of what you learn this week fades by next week. So when a concept comes due for review, SQLMaxx hands you a fresh problem to solve from a blank editor, not a flashcard to re-read. A research-backed spaced-repetition algorithm (FSRS) times each return for right before you'd forget, so your SQL is still there months later, when the interview or the job actually needs it.

Practice, feedback, mastery, review. That's the loop that turns reading into real skill.

Start free

No account, no credit card. Start solving in under a minute.