Scenario: Brightlane's merchandising team needs each product category's name paired with the total revenue it has generated from line items.
Task: Write a query to return each category_name and its revenue — the combined line-item revenue across its products.
Assumptions:
- A line item's revenue is quantity multiplied by unit_price.
- A category's revenue is the combined line-item revenue across every product in that category.
- The result covers only categories with at least one line item across their products.
Output:
- One row per qualifying category.
- Columns in this order: category_name, revenue.
Worked solution
WITH
  category_revenue AS (
    SELECT
      p.category_id,
      SUM(oi.quantity * oi.unit_price) AS revenue
    FROM
      order_items oi
      JOIN products p ON oi.product_id = p.id
    GROUP BY
      p.category_id
  )
SELECT
  c.name AS category_name,
  cr.revenue
FROM
  categories c
  JOIN category_revenue cr ON cr.category_id = c.id
The shape
A CTE aggregates line-item revenue by category id once, and an inner JOIN to categories attaches the category name. The CTE makes the aggregation step its own named layer; the join in the outer query handles the labeling step.
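To see the shape run end to end, here is a minimal sketch on sample data. The table definitions and rows are hypothetical: they include only the columns the worked solution references, and the real ecommerce schema may carry more columns and constraints. It is meant for a scratch database, since it drops and recreates the three tables.

```sql
-- Hypothetical minimal schema (assumption): only the columns the solution uses.
DROP TABLE IF EXISTS order_items;
DROP TABLE IF EXISTS products;
DROP TABLE IF EXISTS categories;

CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER);
CREATE TABLE order_items (product_id INTEGER, quantity INTEGER, unit_price NUMERIC);

-- Made-up sample rows. Clocks (product 30) has no line items.
INSERT INTO categories (id, name) VALUES (1, 'Lamps'), (2, 'Rugs'), (3, 'Clocks');
INSERT INTO products (id, category_id) VALUES (10, 1), (11, 1), (20, 2), (30, 3);
INSERT INTO order_items (product_id, quantity, unit_price) VALUES
  (10, 2, 15),   -- Lamps: 2 * 15  = 30
  (11, 1, 40),   -- Lamps: 1 * 40  = 40
  (20, 3, 100);  -- Rugs:  3 * 100 = 300

-- Step 1 (CTE): total revenue per category_id.
-- Step 2 (outer query): attach the category name.
WITH category_revenue AS (
  SELECT p.category_id, SUM(oi.quantity * oi.unit_price) AS revenue
  FROM order_items oi
  JOIN products p ON oi.product_id = p.id
  GROUP BY p.category_id
)
SELECT c.name AS category_name, cr.revenue
FROM categories c
JOIN category_revenue cr ON cr.category_id = c.id
ORDER BY category_name;  -- ordering added only to make the output easy to compare
-- Two rows: Lamps with revenue 70, Rugs with revenue 300. Clocks does not appear.
```

The ORDER BY is not part of the exercise's required output; it is there only so the sample result is stable.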
Clause by clause
- WITH category_revenue AS (SELECT p.category_id, SUM(oi.quantity * oi.unit_price) AS revenue FROM order_items oi JOIN products p ON oi.product_id = p.id GROUP BY p.category_id) joins each line item to its product to reach category_id, then totals line-item revenue per category. The result is one row per category that has at least one line item across its products.
- SELECT c.name AS category_name, cr.revenue FROM categories c JOIN category_revenue cr ON cr.category_id = c.id brings the category name in. The inner JOIN is deliberate: categories with no line items are excluded, which matches "covers only categories with at least one line item across their products."
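To make the join choice concrete, the sketch below swaps the inner JOIN for a LEFT JOIN so the difference is visible. The schema and rows are hypothetical stand-ins with only the columns the solution references, intended for a scratch database.

```sql
-- Hypothetical minimal schema and sample rows (assumption, not the real dataset).
DROP TABLE IF EXISTS order_items;
DROP TABLE IF EXISTS products;
DROP TABLE IF EXISTS categories;

CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER);
CREATE TABLE order_items (product_id INTEGER, quantity INTEGER, unit_price NUMERIC);

INSERT INTO categories (id, name) VALUES (1, 'Lamps'), (2, 'Rugs'), (3, 'Clocks');
INSERT INTO products (id, category_id) VALUES (10, 1), (20, 2), (30, 3);
-- Clocks (product 30) never appears in order_items.
INSERT INTO order_items (product_id, quantity, unit_price) VALUES
  (10, 2, 35),   -- Lamps: 70
  (20, 3, 100);  -- Rugs:  300

-- Counter-example: LEFT JOIN keeps every category, including those with no line items.
WITH category_revenue AS (
  SELECT p.category_id, SUM(oi.quantity * oi.unit_price) AS revenue
  FROM order_items oi
  JOIN products p ON oi.product_id = p.id
  GROUP BY p.category_id
)
SELECT c.name AS category_name, cr.revenue
FROM categories c
LEFT JOIN category_revenue cr ON cr.category_id = c.id
ORDER BY category_name;
-- Three rows: Clocks with NULL revenue, Lamps with 70, Rugs with 300.
-- The extra Clocks row violates the "at least one line item" requirement,
-- which is why the worked solution uses an inner JOIN.
```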
Why pre-aggregate in a CTE and not aggregate in the final query
SELECT c.name, SUM(oi.quantity * oi.unit_price) FROM categories c JOIN products p ON p.category_id = c.id JOIN order_items oi ON oi.product_id = p.id GROUP BY c.name produces the same numbers as long as category names are unique. The shapes differ in how they handle row counts:
-- Pre-aggregate in CTE, then join for the label
WITH category_revenue AS (
  SELECT p.category_id, SUM(oi.quantity * oi.unit_price) AS revenue
  FROM order_items oi JOIN products p ON oi.product_id = p.id
  GROUP BY p.category_id
)
SELECT c.name AS category_name, cr.revenue
FROM categories c JOIN category_revenue cr ON cr.category_id = c.id;

-- Single flat query: join everything, then aggregate
SELECT c.name AS category_name, SUM(oi.quantity * oi.unit_price) AS revenue
FROM categories c JOIN products p ON p.category_id = c.id
JOIN order_items oi ON oi.product_id = p.id
GROUP BY c.name;
In the flat shape, the joins produce one row per line item, and categories is joined at that size before the GROUP BY collapses it. The pre-aggregated shape collapses to one row per category_id before the final join, so the join to categories handles only as many rows as there are categories with line items. When the many side is large, pre-aggregating is the cleaner separation between "compute the metric" and "label the result."
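The row-count difference can be checked directly by counting the rows that reach the categories join in each shape. This is a sketch on hypothetical sample data (the schema below is an assumption limited to the columns the queries reference), intended for a scratch database; it says nothing about how a particular query planner actually executes either form.

```sql
-- Hypothetical minimal schema and sample rows (assumption, not the real dataset).
DROP TABLE IF EXISTS order_items;
DROP TABLE IF EXISTS products;
DROP TABLE IF EXISTS categories;

CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER);
CREATE TABLE order_items (product_id INTEGER, quantity INTEGER, unit_price NUMERIC);

INSERT INTO categories (id, name) VALUES (1, 'Lamps'), (2, 'Rugs');
INSERT INTO products (id, category_id) VALUES (10, 1), (11, 1), (20, 2);
-- Five line items spread over two categories.
INSERT INTO order_items (product_id, quantity, unit_price) VALUES
  (10, 2, 15), (10, 1, 15), (11, 1, 40), (20, 3, 100), (20, 1, 100);

-- Flat shape: rows that exist after all joins, before GROUP BY collapses them.
SELECT COUNT(*) AS rows_before_group_by
FROM categories c
JOIN products p ON p.category_id = c.id
JOIN order_items oi ON oi.product_id = p.id;
-- 5: one row per line item.

-- Pre-aggregated shape: rows the CTE hands to the final join.
SELECT COUNT(*) AS rows_entering_final_join
FROM (
  SELECT p.category_id
  FROM order_items oi
  JOIN products p ON oi.product_id = p.id
  GROUP BY p.category_id
) AS cr;
-- 2: one row per category that has line items.
```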
The trap
The aggregation has to happen by category_id, not by category name. If two categories share a name (rare but possible in a real catalog), grouping on name silently merges their revenue. Grouping on category_id first and joining for the label keeps each category distinct regardless of name collisions.
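The collision is easy to reproduce. The sketch below uses a hypothetical schema and two made-up categories that share the name 'Accessories' (for example, one per department), then runs both groupings. It is intended for a scratch database.

```sql
-- Hypothetical minimal schema and sample rows (assumption, not the real dataset).
DROP TABLE IF EXISTS order_items;
DROP TABLE IF EXISTS products;
DROP TABLE IF EXISTS categories;

CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER);
CREATE TABLE order_items (product_id INTEGER, quantity INTEGER, unit_price NUMERIC);

-- Two distinct categories with the same name.
INSERT INTO categories (id, name) VALUES (1, 'Accessories'), (2, 'Accessories');
INSERT INTO products (id, category_id) VALUES (10, 1), (20, 2);
INSERT INTO order_items (product_id, quantity, unit_price) VALUES
  (10, 2, 10),  -- category 1: 20
  (20, 1, 50);  -- category 2: 50

-- Grouping on name: the two categories are silently merged.
SELECT c.name AS category_name, SUM(oi.quantity * oi.unit_price) AS revenue
FROM categories c
JOIN products p ON p.category_id = c.id
JOIN order_items oi ON oi.product_id = p.id
GROUP BY c.name;
-- One row: Accessories with revenue 70.

-- Grouping on category_id first, then joining for the label: both survive.
WITH category_revenue AS (
  SELECT p.category_id, SUM(oi.quantity * oi.unit_price) AS revenue
  FROM order_items oi
  JOIN products p ON oi.product_id = p.id
  GROUP BY p.category_id
)
SELECT c.name AS category_name, cr.revenue
FROM categories c
JOIN category_revenue cr ON cr.category_id = c.id
ORDER BY cr.revenue;
-- Two rows: Accessories with revenue 20, Accessories with revenue 50.
```

If the flat shape is preferred, grouping on both c.id and c.name would also keep the two categories apart.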
You practiced precomputing per-category revenue in a CTE before pairing it with the category lookup — separating the per-category calculation from the name attachment.