Categories with Average Price Above 100

The problem

Brightlane's buying team is evaluating premium-category positioning and needs to identify which product groupings command a higher average price point.

Write a query to return the category ID and average list price for every category whose mean price exceeds $100.

Assumptions:

The products table contains every product in Brightlane's catalogue.
The average is taken across the price of every product in the category.
Categories whose mean price is exactly $100 do not qualify; only those above qualify.

Output:

One row per qualifying category_id, with columns category_id and avg_price.

The shape

GROUP BY category_id builds the per-category set of products; AVG(price) computes each category's mean list price; HAVING AVG(price) > 100 keeps only the categories above $100. The result is six surviving categories, ranging from category_id 6 at 1459 down to category_id 11 at 182.12. Categories whose mean price is at or below $100 fall out, including any that land on exactly $100, because the comparison is strict.
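A minimal sketch of that shape, run against an in-memory SQLite database through Python's sqlite3 module so it can be executed as-is. The three-category table below is made-up illustration data, not Brightlane's catalogue, and the product_id column is an assumption about the schema.

```python
import sqlite3

# Hypothetical miniature of the products table (illustration data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "product_id INTEGER PRIMARY KEY, category_id INTEGER, price REAL)"
)
conn.executemany(
    "INSERT INTO products (category_id, price) VALUES (?, ?)",
    [
        (1, 250.0), (1, 150.0),  # mean 200: qualifies
        (2, 40.0), (2, 60.0),    # mean 50: drops out
        (3, 100.0), (3, 100.0),  # mean exactly 100: drops out (strict >)
    ],
)

QUERY = """
SELECT category_id, AVG(price) AS avg_price
FROM products
GROUP BY category_id
HAVING AVG(price) > 100
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, 200.0)]
```

Only category 1 survives; category 3 sits exactly on the threshold and is excluded by the strict comparison.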

Clause by clause

SELECT category_id, AVG(price) AS avg_price returns the grouping column with the per-category mean. AVG reads the price of every product in the category and returns the arithmetic mean, skipping any NULL prices.
FROM products is the source set: every product in Brightlane's catalogue.
GROUP BY category_id partitions the catalogue into one group per category. After this clause, each row in the working set represents one category with its average price attached.
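HAVING AVG(price) > 100 filters those category rows by the mean. The aggregate expression is repeated rather than referenced by the avg_price alias, because standard SQL does not put SELECT aliases in scope when HAVING is evaluated. Some engines, such as MySQL and SQLite, accept the alias as an extension, but repeating the expression is portable.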
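To see the clauses as stages, the sketch below runs the query twice on hypothetical data: once stopping after GROUP BY, and once with HAVING added. It uses SQLite through Python's sqlite3 module; the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (category_id INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(1, 80.0), (1, 160.0), (2, 30.0), (2, 50.0), (3, 300.0)],
)

BASE = (
    "SELECT category_id, AVG(price) AS avg_price "
    "FROM products GROUP BY category_id"
)

# Stage 1: SELECT + FROM + GROUP BY gives one row per category.
# ORDER BY is added here only to make the printed output deterministic.
grouped = conn.execute(BASE + " ORDER BY category_id").fetchall()

# Stage 2: HAVING filters those per-category rows by the aggregate.
filtered = conn.execute(
    BASE + " HAVING AVG(price) > 100 ORDER BY category_id"
).fetchall()

print(grouped)   # [(1, 120.0), (2, 40.0), (3, 300.0)]
print(filtered)  # [(1, 120.0), (3, 300.0)]
```

HAVING never changes the averages themselves; it only decides which grouped rows are returned.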

Why this and not `WHERE price > 100`

This is the cleanest illustration of the HAVING vs WHERE split. WHERE price > 100 keeps only the individual products priced above $100 and averages those. A category with three products at $90, $110, and $130 would report an average of $120, because the $90 product is excluded before the average runs. HAVING AVG(price) > 100 keeps every product, computes the honest mean across all of them, and only then checks the threshold. That same category correctly averages $110.
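The $90, $110, $130 example can be checked directly. The sketch below loads just that one category into an in-memory SQLite table (the category_id value 1 is arbitrary) and runs both forms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (category_id INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(1, 90.0), (1, 110.0), (1, 130.0)],
)

# WHERE form: the $90 row is removed before AVG runs.
where_form = conn.execute(
    "SELECT category_id, AVG(price) AS avg_price "
    "FROM products WHERE price > 100 GROUP BY category_id"
).fetchall()

# HAVING form: all three rows feed AVG, then the mean is tested.
having_form = conn.execute(
    "SELECT category_id, AVG(price) AS avg_price "
    "FROM products GROUP BY category_id HAVING AVG(price) > 100"
).fetchall()

print(where_form)   # [(1, 120.0)]
print(having_form)  # [(1, 110.0)]
```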

Which form is correct depends on the question. "Categories whose mean product price exceeds $100" is the per-category metric, which needs HAVING. "The average price of products that cost more than $100" is the per-row filter, which needs WHERE. The two read similarly in English; in SQL they produce different numbers.

The trap

The trap is that both queries run cleanly and both return a result that looks like "average prices by category." There is no error to flag the wrong choice. The WHERE form silently changes which rows feed into the average, and the answer comes back as a plausible-looking number that is off by however much the excluded rows would have pulled the mean.

The rule: a threshold on an aggregate goes in HAVING. A threshold on the raw inputs to that aggregate goes in WHERE.
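A sketch of how the trap plays out, using invented data in SQLite via Python's sqlite3 module. Category 1 has a true mean well under $100, yet the WHERE form still lists it, and nothing in the output signals the mistake.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (category_id INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [
        (1, 20.0), (1, 30.0), (1, 150.0),  # true mean is about 66.67
        (2, 120.0), (2, 140.0),            # true mean is 130
    ],
)

# Threshold on the raw inputs: rows are dropped before averaging.
where_form = conn.execute(
    "SELECT category_id, AVG(price) AS avg_price FROM products "
    "WHERE price > 100 GROUP BY category_id ORDER BY category_id"
).fetchall()

# Threshold on the aggregate: every row counts toward the mean.
having_form = conn.execute(
    "SELECT category_id, AVG(price) AS avg_price FROM products "
    "GROUP BY category_id HAVING AVG(price) > 100 ORDER BY category_id"
).fetchall()

print(where_form)   # [(1, 150.0), (2, 130.0)]
print(having_form)  # [(2, 130.0)]
```

Both result sets have the same columns and plausible values; only the HAVING form answers the question that was asked.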

You practiced filtering categorical groupings by an aggregate threshold. The shape generalises to any "which segments meet this metric bar" question — the segment is the GROUP BY column, the metric is the aggregate, the bar is the HAVING comparison.
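As one illustration of that generalisation, the sketch below swaps the metric and the bar while keeping the same segment: "which categories contain at least three products". The question and the data are hypothetical, chosen only to show the same GROUP BY and HAVING pattern with a different aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (category_id INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [
        (1, 10.0), (1, 20.0), (1, 30.0),             # 3 products
        (2, 500.0),                                  # 1 product
        (3, 5.0), (3, 15.0), (3, 25.0), (3, 35.0),   # 4 products
    ],
)

# Segment: category_id. Metric: COUNT(*). Bar: at least 3.
rows = conn.execute(
    "SELECT category_id, COUNT(*) AS product_count FROM products "
    "GROUP BY category_id HAVING COUNT(*) >= 3 ORDER BY category_id"
).fetchall()

print(rows)  # [(1, 3), (3, 4)]
```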

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.
