Brightlane's product team wants to identify high-price-point categories — those whose average product price exceeds $200.
Write a query to return the category ID and average price for every category meeting that threshold.
Assumptions:
- The
productstable has one row per product with acategory_idand aprice. - A category's average price is the average of every product's
pricelinked to thatcategory_id. - Only categories whose average price exceeds
$200should appear.
Output:
- One row per qualifying category, with columns
category_idandavg_price.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
WITH
category_prices AS (
SELECT
category_id,
AVG(price) AS avg_price
FROM
products
GROUP BY
category_id
)
SELECT
category_id,
avg_price
FROM
category_prices
WHERE
avg_price > 200 The shape
The aggregation and the threshold check live in two named layers. The WITH layer category_prices computes AVG(price) per category; the main query reads that result and applies the WHERE avg_price > 200 filter against the aggregate as a named column.
Clause by clause
- The
WITHclause definescategory_prices:
WITH category_prices AS (
SELECT category_id, AVG(price) AS avg_price
FROM products
GROUP BY category_id
)GROUP BY category_id partitions products by category, and AVG(price) collapses each partition to a single average price. Every category gets one row in the named layer, including the ones whose averages are below the threshold.
SELECT category_id, avg_price FROM category_prices WHERE avg_price > 200is the main query. It readscategory_pricesand keeps only the rows where the aggregate exceeds200. Categories 5, 6, and 7 survive with averages of782.33,1459, and882.33.
Why this and not a derived table in FROM
A derived table would put the aggregation inside the main query's FROM and apply the threshold in the same WHERE. Both shapes return identical results. The WITH version pulls the aggregation out, names it, and lets the main query read top to bottom in the order the work happens: compute the per-category average first, then keep the ones above the cutoff. The derived-table version nests the same logic inside a single statement and forces the reader to look inward to find the aggregation. For "find groups whose aggregate exceeds X," WITH is the cleaner spelling.
The trap
The filter has to run on the aggregate, not on individual product prices. Putting WHERE price > 200 inside the layer would keep only products priced above 200 and then average those, which is a different calculation entirely. The threshold belongs in the main query, against the named aggregate column avg_price, which only exists once the layer's grouping is complete.
You practiced computing a per-category average in a WITH layer and applying a threshold check in the main query — the canonical shape for 'find groups whose aggregate exceeds X'.