Categories with High Average Price — CTEs in SQL

The problem

Brightlane's product team wants to identify high-price-point categories — those whose average product price exceeds $200.

Write a query to return the category ID and average price for every category meeting that threshold.

Assumptions:

The products table has one row per product with a category_id and a price.
A category's average price is the average of every product's price linked to that category_id.
Only categories whose average price exceeds $200 should appear.

Output:

One row per qualifying category, with columns category_id and avg_price.

Schema · ecommerce 5 tables

categories

id integer

name text

parent_id? integer

products

id integer

name text

category_id integer

price numeric

stock_qty integer

attributes? jsonb

order_items

id integer

order_id integer

product_id integer

quantity integer

unit_price numeric

customers

id integer

name text

email text

city? text

country text

created_at timestamptz

is_active boolean

orders

id integer

customer_id integer

ordered_at timestamptz

status text

total_amount numeric

Check answerShift Ctrl ↵

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first

Solution query

WITH
  category_prices AS (
    SELECT
      category_id,
      AVG(price) AS avg_price
    FROM
      products
    GROUP BY
      category_id
  )
SELECT
  category_id,
  avg_price
FROM
  category_prices
WHERE
  avg_price > 200

The shape

The aggregation and the threshold check live in two named layers. The WITH layer category_prices computes AVG(price) per category; the main query reads that result and applies the WHERE avg_price > 200 filter against the aggregate as a named column.

Clause by clause

The WITH clause defines category_prices:

WITH category_prices AS (
  SELECT category_id, AVG(price) AS avg_price
  FROM products
  GROUP BY category_id
)

GROUP BY category_id partitions products by category, and AVG(price) collapses each partition to a single average price. Every category gets one row in the named layer, including the ones whose averages are below the threshold.

SELECT category_id, avg_price FROM category_prices WHERE avg_price > 200 is the main query. It reads category_prices and keeps only the rows where the aggregate exceeds 200. Categories 5, 6, and 7 survive with averages of 782.33, 1459, and 882.33.

Why this and not a derived table in `FROM`

A derived table would put the aggregation inside the main query's FROM and apply the threshold in the same WHERE. Both shapes return identical results. The WITH version pulls the aggregation out, names it, and lets the main query read top to bottom in the order the work happens: compute the per-category average first, then keep the ones above the cutoff. The derived-table version nests the same logic inside a single statement and forces the reader to look inward to find the aggregation. For "find groups whose aggregate exceeds X," WITH is the cleaner spelling.

The trap

The filter has to run on the aggregate, not on individual product prices. Putting WHERE price > 200 inside the layer would keep only products priced above 200 and then average those, which is a different calculation entirely. The threshold belongs in the main query, against the named aggregate column avg_price, which only exists once the layer's grouping is complete.

You practiced computing a per-category average in a WITH layer and applying a threshold check in the main query — the canonical shape for 'find groups whose aggregate exceeds X'.

Return the category ID and average price for every category meeting that threshold

The shape

Clause by clause

Why this and not a derived table in `FROM`

The trap

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

The shape

Clause by clause

Why this and not a derived table in FROM

The trap

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

Why this and not a derived table in `FROM`