N031-H2 Tier 3 · Intermediate · hard ecommerce · Brightlane

Return the category ID and average price for every premium, well-stocked category

Part of Chained CTEs in SQL

The problem

Brightlane's catalog team wants to find premium, well-stocked categories — categories that carry 3 or more products and whose average product price exceeds the average across every well-stocked category.

Write a query to return the category ID and average price for every premium, well-stocked category.

Assumptions:

  • A category's product count is the number of rows in products whose category_id matches that category. A category's average price is the average of price across those rows.
  • A well-stocked category has 3 or more products. The well-stocked group consists of every well-stocked category.
  • The well-stocked-group average is the average of the per-category averages across every well-stocked category.
  • Only well-stocked categories whose average price exceeds the well-stocked-group average should appear.

Output:

  • One row per qualifying category, with columns category_id and avg_price.

Schema · ecommerce · 5 tables

categories
  id            integer
  name          text
  parent_id?    integer

products
  id            integer
  name          text
  category_id   integer
  price         numeric
  stock_qty     integer
  attributes?   jsonb

order_items
  id            integer
  order_id      integer
  product_id    integer
  quantity      integer
  unit_price    numeric

customers
  id            integer
  name          text
  email         text
  city?         text
  country       text
  created_at    timestamptz
  is_active     boolean

orders
  id            integer
  customer_id   integer
  ordered_at    timestamptz
  status        text
  total_amount  numeric

Worked solution

Solution query

WITH
  cat_stats AS (
    SELECT
      category_id,
      COUNT(*) AS product_count,
      AVG(price) AS avg_price
    FROM
      products
    GROUP BY
      category_id
  ),
  large_cats AS (
    SELECT
      category_id,
      product_count,
      avg_price
    FROM
      cat_stats
    WHERE
      product_count >= 3
  ),
  premium_large AS (
    SELECT
      category_id,
      avg_price
    FROM
      large_cats
    WHERE
      avg_price > (
        SELECT
          AVG(avg_price)
        FROM
          large_cats
      )
  )
SELECT
  category_id,
  avg_price
FROM
  premium_large

The shape

Three cascading layers. The first computes two statistics per category (product count and average price), the second restricts to well-stocked categories, and the third compares each well-stocked category's average price against the average of those averages, computed over the restricted set. The third layer reads its source twice, once row by row and once through a scalar subquery, so the comparison value is the well-stocked-group average, not the cross-catalog average.

Clause by clause

The first CTE produces one statistics row per category:

WITH cat_stats AS (
  SELECT category_id, COUNT(*) AS product_count, AVG(price) AS avg_price
  FROM products
  GROUP BY category_id
)

GROUP BY category_id produces one row per category. COUNT(*) and AVG(price) are both computed in the same pass, since both are needed downstream: the count qualifies the category and the average is what the final comparison uses.

The second CTE keeps the well-stocked categories:

large_cats AS (
  SELECT category_id, product_count, avg_price
  FROM cat_stats
  WHERE product_count >= 3
)

WHERE product_count >= 3 drops the sparse categories. The SELECT list carries avg_price forward unchanged because the next layer needs it on both sides of the comparison.

The third CTE compares each remaining row's avg_price to the well-stocked-group average:

premium_large AS (
  SELECT category_id, avg_price
  FROM large_cats
  WHERE avg_price > (SELECT AVG(avg_price) FROM large_cats)
)

The scalar subquery (SELECT AVG(avg_price) FROM large_cats) reads the restricted layer and reduces it to a single number. WHERE avg_price > ... then compares each well-stocked category's average to that number. Same set on both sides of the comparison; both references see only well-stocked categories.

Finally, the outer query SELECT category_id, avg_price FROM premium_large returns the three premium well-stocked categories: 5, 6, and 7.
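
The assumptions define the well-stocked-group average as an average of per-category averages, which is not the same as pooling every product price in those categories. The sketch below illustrates the difference on made-up data; sample_products is a hypothetical stand-in for the products table, and the prices are invented for the arithmetic.

```sql
-- Hypothetical data: sample_products stands in for the products table.
WITH
  sample_products (category_id, price) AS (
    VALUES
      (1, 10.00), (1, 20.00), (1, 30.00),                              -- 3 products, avg 20
      (2, 40.00), (2, 60.00), (2, 80.00), (2, 100.00), (2, 120.00)     -- 5 products, avg 80
  ),
  cat_stats AS (
    SELECT category_id, COUNT(*) AS product_count, AVG(price) AS avg_price
    FROM sample_products
    GROUP BY category_id
  ),
  large_cats AS (
    SELECT category_id, avg_price
    FROM cat_stats
    WHERE product_count >= 3
  )
SELECT
  -- Average of the per-category averages: (20 + 80) / 2 = 50
  (SELECT AVG(avg_price) FROM large_cats) AS avg_of_category_averages,
  -- Average over all eight product rows: 460 / 8 = 57.5
  (SELECT AVG(p.price)
   FROM sample_products AS p
   WHERE p.category_id IN (SELECT category_id FROM large_cats)) AS pooled_product_average;
```

The pooled figure weights each category by its product count, so the larger category pulls it upward. The problem asks for the first figure, where every well-stocked category counts once.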

Why the scalar subquery reads large_cats and not cat_stats

The well-stocked-group average is the average of the well-stocked categories' averages, not the average of every category's average. Writing (SELECT AVG(avg_price) FROM cat_stats) would pull the sparse categories back into the comparison value, and a high-priced sparse category would shift the threshold. Reading from large_cats on both sides keeps the comparison anchored to the restricted set the prompt actually describes.
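
To see how a sparse category can shift the threshold, here is a minimal sketch on made-up data that computes both candidate thresholds side by side. The sample_products CTE and its prices are assumptions for illustration only; cat_stats and large_cats follow the solution's definitions.

```sql
-- Hypothetical data: sample_products stands in for the products table.
WITH
  sample_products (category_id, price) AS (
    VALUES
      (1, 10.00), (1, 20.00), (1, 30.00),  -- well-stocked, avg 20
      (2, 40.00), (2, 50.00), (2, 60.00),  -- well-stocked, avg 50
      (3, 500.00)                          -- sparse (1 product), avg 500
  ),
  cat_stats AS (
    SELECT category_id, COUNT(*) AS product_count, AVG(price) AS avg_price
    FROM sample_products
    GROUP BY category_id
  ),
  large_cats AS (
    SELECT category_id, avg_price
    FROM cat_stats
    WHERE product_count >= 3
  )
SELECT
  (SELECT AVG(avg_price) FROM large_cats) AS well_stocked_threshold,  -- (20 + 50) / 2 = 35
  (SELECT AVG(avg_price) FROM cat_stats)  AS all_category_threshold;  -- (20 + 50 + 500) / 3 = 190
```

Against the threshold of 35, category 2 (average 50) qualifies as premium. Against 190, no well-stocked category qualifies, even though the only thing that changed is which CTE the scalar subquery reads.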

The trap

The restriction "well-stocked" lives in the second CTE, and it has to be respected by both references in the third. The row-by-row read FROM large_cats makes that automatic on the left side of the comparison, but the scalar subquery has to spell it out on the right by also reading from large_cats. Letting the scalar subquery slip back to cat_stats silently changes the threshold the surviving rows are measured against, and the change is invisible in the output until a borderline category flips one way or the other.
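
One way to check whether the slip would matter on a given dataset is to list the well-stocked categories whose verdict differs between the two thresholds. The query below is a sketch of that check against the schema's products table; the thresholds CTE and its column names are hypothetical additions, not part of the worked solution.

```sql
WITH
  cat_stats AS (
    SELECT category_id, COUNT(*) AS product_count, AVG(price) AS avg_price
    FROM products
    GROUP BY category_id
  ),
  large_cats AS (
    SELECT category_id, avg_price
    FROM cat_stats
    WHERE product_count >= 3
  ),
  -- Hypothetical helper: one row holding both candidate thresholds.
  thresholds AS (
    SELECT
      (SELECT AVG(avg_price) FROM large_cats) AS correct_threshold,
      (SELECT AVG(avg_price) FROM cat_stats)  AS slipped_threshold
  )
SELECT
  l.category_id,
  l.avg_price,
  t.correct_threshold,
  t.slipped_threshold
FROM large_cats AS l
CROSS JOIN thresholds AS t
-- Keep only categories that pass under one threshold and fail under the other.
WHERE (l.avg_price > t.correct_threshold) <> (l.avg_price > t.slipped_threshold)
ORDER BY l.category_id;
```

An empty result means the two versions happen to agree on the current data, which is exactly why the bug can go unnoticed. Any rows returned are the borderline categories that would flip.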

You practiced cascading three WITH stages where the second applies a count threshold and the third compares each remaining row's value against an aggregate of that same restricted set.
