N062-H1 · Tier 5 · Expert · hard · ecommerce · Brightlane

Return each qualifying `category_name` and its `revenue`

Part of Choosing Between Subqueries, CTEs, and Joins in SQL

The problem

Scenario: Brightlane's product performance team is identifying which product categories are generating above-average revenue from line items.

Task: Write a query to return each qualifying category_name and its revenue.

Assumptions:

  • A line item's revenue is quantity multiplied by unit_price.
  • A category's revenue is the combined line-item revenue across its products.
  • The result covers only categories whose revenue is strictly greater than the average revenue across every category in the company-wide line-item set.

Output:

  • One row per qualifying category.
  • Columns in this order: category_name, revenue.

Schema · ecommerce · 5 tables

categories
  • id integer
  • name text
  • parent_id? integer

products
  • id integer
  • name text
  • category_id integer
  • price numeric
  • stock_qty integer
  • attributes? jsonb

order_items
  • id integer
  • order_id integer
  • product_id integer
  • quantity integer
  • unit_price numeric

customers
  • id integer
  • name text
  • email text
  • city? text
  • country text
  • created_at timestamptz
  • is_active boolean

orders
  • id integer
  • customer_id integer
  • ordered_at timestamptz
  • status text
  • total_amount numeric

Worked solution (try it yourself first)

Solution query
WITH
  category_revenue AS (
    SELECT
      p.category_id,
      SUM(oi.quantity * oi.unit_price) AS revenue
    FROM
      order_items oi
      JOIN products p ON oi.product_id = p.id
    GROUP BY
      p.category_id
  )
SELECT
  c.name AS category_name,
  cr.revenue
FROM
  category_revenue cr
  JOIN categories c ON c.id = cr.category_id
WHERE
  cr.revenue > (
    SELECT
      AVG(revenue)
    FROM
      category_revenue
  )

The shape

The CTE category_revenue computes one row of revenue per category, and the outer query references that same CTE twice — once as the driver and once inside a scalar subquery that takes AVG(revenue) across every category. Each category's revenue is then compared against that average to decide whether it qualifies. Referencing the CTE twice is exactly the case where naming the intermediate pays off.

Clause by clause

  • WITH category_revenue AS (SELECT p.category_id, SUM(oi.quantity * oi.unit_price) AS revenue FROM order_items oi JOIN products p ON oi.product_id = p.id GROUP BY p.category_id) joins line items to products and totals revenue per category. One row per category that has any line items across its products.
  • FROM category_revenue cr JOIN categories c ON c.id = cr.category_id drives the outer query off the aggregated CTE and joins to categories for the name.
  • WHERE cr.revenue > (SELECT AVG(revenue) FROM category_revenue) is the load-bearing filter. The scalar subquery computes the average revenue across every category in the CTE. Because it is uncorrelated, PostgreSQL evaluates it once and compares each outer row against that single number.
  • SELECT c.name AS category_name, cr.revenue returns only the qualifying categories' names and totals.
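
To see the clauses work together on concrete numbers, here is a minimal self-contained sketch in PostgreSQL syntax. The sample rows (three categories, four line items) are hypothetical and exist only for illustration; the leading CTEs shadow the real table names with inline VALUES lists so the query runs without any tables. The category_revenue CTE and the outer query are the same as in the solution.

```sql
-- Hypothetical sample data: names and numbers are invented for illustration.
WITH
  categories (id, name) AS (
    VALUES (1, 'Audio'), (2, 'Cables'), (3, 'Lighting')
  ),
  products (id, category_id) AS (
    VALUES (10, 1), (11, 1), (20, 2), (30, 3)
  ),
  order_items (product_id, quantity, unit_price) AS (
    VALUES
      (10, 2, 50.00),   -- Audio line:    2 * 50  = 100
      (11, 1, 200.00),  -- Audio line:    1 * 200 = 200
      (20, 5, 4.00),    -- Cables line:   5 * 4   = 20
      (30, 3, 40.00)    -- Lighting line: 3 * 40  = 120
  ),
  -- Same aggregation as the solution: one row per category.
  category_revenue AS (
    SELECT
      p.category_id,
      SUM(oi.quantity * oi.unit_price) AS revenue
    FROM
      order_items oi
      JOIN products p ON oi.product_id = p.id
    GROUP BY
      p.category_id
  )
SELECT
  c.name AS category_name,
  cr.revenue
FROM
  category_revenue cr
  JOIN categories c ON c.id = cr.category_id
WHERE
  -- Category totals are 300, 20, and 120, so the average is 440 / 3 ≈ 146.67.
  cr.revenue > (
    SELECT
      AVG(revenue)
    FROM
      category_revenue
  );
```

With these rows only Audio (300.00) clears the average; Lighting (120.00) and Cables (20.00) fall below it.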

Why the CTE and not two separate aggregations

If category_revenue were inlined twice (once as the outer driver and once inside the average subquery), the aggregation over order_items JOIN products would have to be written out twice and would generally be computed twice. The CTE states the aggregation once and lets both the outer driver and the average reference the same intermediate; PostgreSQL materializes a CTE by default when it is referenced more than once. The structure also makes the intent legible: "compute revenue per category, then keep the ones above the average across categories."
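
For comparison, the inlined form might look like the sketch below, written against the schema tables from this problem. It is shown as a contrast, not as a recommended solution: it should return the same rows, but the per-category aggregation appears twice and the two copies have to be kept in sync by hand. The alias t is introduced here for illustration.

```sql
-- Inlined alternative: the same aggregation is spelled out in two places.
SELECT
  c.name AS category_name,
  cr.revenue
FROM
  (
    -- First copy: drives the outer query.
    SELECT
      p.category_id,
      SUM(oi.quantity * oi.unit_price) AS revenue
    FROM
      order_items oi
      JOIN products p ON oi.product_id = p.id
    GROUP BY
      p.category_id
  ) cr
  JOIN categories c ON c.id = cr.category_id
WHERE
  cr.revenue > (
    SELECT
      AVG(t.revenue)
    FROM
      (
        -- Second copy: exists only to feed the average.
        SELECT
          SUM(oi.quantity * oi.unit_price) AS revenue
        FROM
          order_items oi
          JOIN products p ON oi.product_id = p.id
        GROUP BY
          p.category_id
      ) t
  );
```

If the revenue definition ever changes (for example, to exclude some line items), both copies must change together, which is the maintenance risk the named CTE avoids.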

The trap

The average has to be AVG(revenue) over the CTE, not AVG over the raw line items. Averaging line-item revenue gives the average value of a single line, which is a different quantity from the average revenue per category. The baseline of any "above average" filter has to be computed at the same grain as the value being compared. The CTE form keeps both at category grain, so the comparison stays honest.

A second trap to watch for is > versus >=. The prompt says strictly greater than the average, which excludes any category whose revenue exactly equals the average; >= would include it.
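
The grain mismatch is easy to see on a small dataset. The sketch below uses hypothetical rows (four line items across three categories, invented for illustration) supplied through inline VALUES lists in PostgreSQL syntax, and computes the average at both grains side by side. The column names avg_per_line_item and avg_per_category are introduced here for the comparison.

```sql
-- Hypothetical sample data: numbers are invented for illustration.
WITH
  products (id, category_id) AS (
    VALUES (10, 1), (11, 1), (20, 2), (30, 3)
  ),
  order_items (product_id, quantity, unit_price) AS (
    VALUES
      (10, 2, 50.00),   -- category 1: 100
      (11, 1, 200.00),  -- category 1: 200
      (20, 5, 4.00),    -- category 2: 20
      (30, 3, 40.00)    -- category 3: 120
  ),
  category_revenue AS (
    SELECT
      p.category_id,
      SUM(oi.quantity * oi.unit_price) AS revenue
    FROM
      order_items oi
      JOIN products p ON oi.product_id = p.id
    GROUP BY
      p.category_id
  )
SELECT
  -- Wrong grain: 440 spread over 4 line items = 110.
  (SELECT AVG(quantity * unit_price) FROM order_items) AS avg_per_line_item,
  -- Right grain: 440 spread over 3 categories ≈ 146.67.
  (SELECT AVG(revenue) FROM category_revenue) AS avg_per_category;
```

Against the line-item average of 110, category 3 (revenue 120) would wrongly qualify; against the category-grain average of about 146.67, it does not.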

You practiced computing per-category revenue in a CTE, then referring back to that same CTE in a scalar subquery for the cross-category average — a shape that needs the named layer because the same intermediate is used twice.
