Brightlane's procurement team is reviewing pricing patterns for products that appear on high-value order lines.
Write a query to return the product ID and average unit price for every product whose mean line price exceeds $300.
Assumptions:
- The
order_itemstable contains one row per product per order. - The average is taken across all of the product's line-item
unit_pricevalues. - The threshold (
> $300) applies to the per-product average.
Output:
- One row per qualifying product, with columns
product_idandavg_unit_price.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
product_id,
AVG(unit_price) AS avg_unit_price
FROM
order_items
GROUP BY
product_id
HAVING
AVG(unit_price) > 300 The shape
GROUP BY product_id builds the per-product line set; AVG(unit_price) computes each product's mean line price; HAVING AVG(unit_price) > 300 keeps the ones above $300. Product 6 lands at 1999, product 9 at 1499, down to product 21 at 349.
Clause by clause
SELECT product_id, AVG(unit_price) AS avg_unit_pricereturns the grouping column with its per-product average price.AVGreads theunit_pricevalue from each of the product's lines and returns the mean.FROM order_itemsis the source set: one row per (order, product) line, each with its ownunit_priceon that order.GROUP BY product_idpartitions those lines into one group per product. After this clause, each row in the working set represents one product with its average line price attached.HAVING AVG(unit_price) > 300filters those product rows by the average. The aggregate expression is repeated rather than referenced by theavg_unit_pricealias, because aliases don't exist at the pointHAVINGruns.
Why this and not WHERE unit_price > 300
The two filters answer different questions. WHERE unit_price > 300 keeps only individual lines priced above $300 and then averages those. A product with three lines at $310, $320, and $50 would be reported with an average around $315, because the $50 line never reaches the aggregation. HAVING AVG(unit_price) > 300 keeps every line, computes the honest mean across all of them, and only then checks the threshold. The same product would correctly average $226.67 and drop out.
The trap
The trap is reaching for WHERE because unit_price looks like a row-level thing. It is a row-level column, but the threshold is on the per-product average, which is a group-level value. WHERE silently filters the inputs to the average rather than the average itself, and the answer comes back as a plausible-looking number. The rule: a condition on an aggregate goes in HAVING, even when the underlying column is plain row data.
You practiced filtering with an AVG-based threshold in HAVING. The same shape applies to any aggregate — SUM, COUNT, MIN, MAX, AVG — the only constraint is that the filtering expression must be an aggregate, not a raw column reference.