Line Items and Revenue per Customer

The problem

Scenario: Brightlane's CRM team is profiling each customer's purchasing depth — both the number of line items they have purchased and the revenue those line items represent.

Task: Write a query to return each customer's customer_id, the total number of line items purchased across all their orders, and the total revenue across those line items.

Assumptions:

A line item's revenue is quantity multiplied by unit_price.
A customer's total_line_items is the combined count of line items across every order they have placed; their total_item_revenue is the combined revenue across the same set.
The result covers only customers who have placed at least one order containing line items.

Output:

One row per qualifying customer.
Columns in this order: customer_id, total_line_items, total_item_revenue.

Schema · ecommerce 5 tables

The shape

Both COUNT(oi.id) and SUM(oi.quantity * oi.unit_price) aggregate columns that live on order_items — the child side of the join — so the per-line-item row shape is the correct shape for both. Grouping by o.customer_id rolls those line-item values up through their parent orders to one row per customer in a single pass.

Clause by clause

SELECT o.customer_id, COUNT(oi.id) AS total_line_items, SUM(oi.quantity * oi.unit_price) AS total_item_revenue returns each customer's ID alongside two metrics on their line items. COUNT(oi.id) totals the line-item rows inside the customer group; SUM totals the per-line revenue values across those same rows.
FROM orders o reads the orders, carrying the customer_id that the customer-level grouping needs.
JOIN order_items oi ON oi.order_id = o.id pairs each order with its line items. Customers with multiple orders, and orders with multiple line items, both contribute their full row counts to the joined result.
GROUP BY o.customer_id collapses the joined rows down to one per customer. Both aggregates run inside each group against the same row set.

Why both metrics can share the same query

Because every aggregated column lives on order_items, the row-multiplied shape that the join produces is the right shape for both COUNT and SUM. Adding a third metric that came from orders — say, the number of distinct orders, or the sum of an order-level total — would require either COUNT(DISTINCT o.id) or a separate pre-aggregation pass, because those parent-side values would inflate by the line-item count under the current shape. The rule that lets these two metrics coexist is "the column being aggregated comes from the table that is fanning out."

The trap

The instinct that fails here is reaching for SUM(o.order_total) to get total revenue instead of computing it from line items. An order with three line items appears in the joined result three times, and its order_total would be counted three times — a customer with a single $500 order and three line items on it would report $1500 in revenue. The line-item revenue formula SUM(oi.quantity * oi.unit_price) avoids the trap entirely because every value being summed is unique to its row. Any time a query needs both a child-row count and a revenue total, the safest shape is the one that computes both from the child table itself; reach for parent-side aggregates only when the join shape is already one-to-one.

You practiced totaling row counts and revenue across the same child layer — a count and a sum drawn from one line-item set, rolled up to the customer.

Return each customer's `customer_id`, the total number of line items purchased across all their `orders`, and the total revenue across those line items

The shape

Clause by clause

Why both metrics can share the same query

The trap

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.