High Value Categories in Two Steps

The problem

Brightlane's catalog team wants to identify product categories with a high average price.

Write a query to return the category ID and average price for every category whose average product price exceeds $300.

Assumptions:

The products table has one row per product with a category_id and a price.
A category's average price is the average of every product's price linked to that category_id.
Only categories whose average price exceeds $300 should appear.

Output:

One row per qualifying category, with columns category_id and avg_price.

Schema · ecommerce 5 tables

categories

id integer

name text

parent_id? integer

products

id integer

name text

category_id integer

price numeric

stock_qty integer

attributes? jsonb

order_items

id integer

order_id integer

product_id integer

quantity integer

unit_price numeric

customers

id integer

name text

email text

city? text

country text

created_at timestamptz

is_active boolean

orders

id integer

customer_id integer

ordered_at timestamptz

status text

total_amount numeric

Check answerShift Ctrl ↵

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first

Solution query

WITH
  category_stats AS (
    SELECT
      category_id,
      AVG(price) AS avg_price
    FROM
      products
    GROUP BY
      category_id
  ),
  high_value AS (
    SELECT
      category_id,
      avg_price
    FROM
      category_stats
    WHERE
      avg_price > 300
  )
SELECT
  category_id,
  avg_price
FROM
  high_value

The shape

Two named layers stacked in one WITH clause. The first reduces products to one row per category with its average price, and the second reads that result and keeps only the categories whose average exceeds 300. The threshold runs against the already-averaged rows, not against the raw product prices.

Clause by clause

The first CTE collapses the catalog into per-category averages:

WITH category_stats AS (
  SELECT category_id, AVG(price) AS avg_price
  FROM products
  GROUP BY category_id
)

GROUP BY category_id produces one row per category, and AVG(price) is the value computed inside each group. AS avg_price names the aggregate so the next layer can reference it by name.

The second CTE reads that result and keeps the qualifying categories:

high_value AS (
  SELECT category_id, avg_price
  FROM category_stats
  WHERE avg_price > 300
)

FROM category_stats treats the averaged result as the source table, and WHERE avg_price > 300 drops the categories at or below the threshold. The three categories that survive are 5, 6, and 7, with averages of 782.33, 1459, and 882.33.

SELECT category_id, avg_price FROM high_value returns the second layer unchanged. The main query is thin on purpose: each step of the work is named in its own CTE, and the final query just reads the last one.

You practiced layering two WITH stages — compute the per-category average in one named layer, then read from that layer in a second named layer that applies the threshold check.

Return the category ID and average price for every category whose average product price exceeds `$300`

The shape

Clause by clause

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.