Products with Their Category Average Price

The problem

Scenario: Brightlane's merchandising team is benchmarking each product against its category's average price.

Task: Write a query to return each product_id and the average price of all products in that same category.

Assumptions:

The products table holds one row per product, with id identifying it, category_id recording its category, and price storing its current price.
The category_avg_price for a product is the average price across every product sharing the same category_id.
The result covers every product in the catalog.

Output:

One row per product.
Columns in this order: product_id, category_avg_price.

Schema · ecommerce 5 tables

categories

id integer

name text

parent_id? integer

products

id integer

name text

category_id integer

price numeric

stock_qty integer

attributes? jsonb

order_items

id integer

order_id integer

product_id integer

quantity integer

unit_price numeric

customers

id integer

name text

email text

city? text

country text

created_at timestamptz

is_active boolean

orders

id integer

customer_id integer

ordered_at timestamptz

status text

total_amount numeric

Check answerShift Ctrl ↵

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first

Solution query

SELECT
  p.id AS product_id,
  cat_avg.avg_price AS category_avg_price
FROM
  products p
  JOIN (
    SELECT
      category_id,
      AVG(price) AS avg_price
    FROM
      products
    GROUP BY
      category_id
  ) AS cat_avg ON cat_avg.category_id = p.category_id

The shape

Every product's row needs a value computed across the whole category that product belongs to. A derived table computes the per-category average once per category, then the main query joins each product to its category's row to attach the benchmark.

Clause by clause

SELECT p.id AS product_id, cat_avg.avg_price AS category_avg_price returns the product's identifier and the per-category average pulled from the derived table.
FROM products p is the parent set — every catalog product appears once.
The derived table:

SELECT category_id, AVG(price) AS avg_price
FROM products
GROUP BY category_id

One row per category, with the category's average price already computed. - AS cat_avg ON cat_avg.category_id = p.category_id joins each product to its category's row. Every product has a category, and every category that contains products appears in the derived table, so the inner join loses nothing.

Why this and not a correlated subquery per row

The same numbers come out if the average is written as a correlated subquery in the SELECT list: SELECT p.id, (SELECT AVG(price) FROM products p2 WHERE p2.category_id = p.category_id) AS category_avg_price FROM products p. The result is identical. The cost is not: the correlated form re-runs the subquery once per outer row, computing the same category average over and over. The derived-table form computes each category's average exactly once and broadcasts it to every product in that category through the join. On a large catalog this is the structural difference between a quadratic and a linear pass over the data.

The trap

AVG(price) averages over the rows actually in the group. Products with a null price would be excluded from the average automatically — AVG ignores nulls. That is the standard behavior, and it is the right one here. The trap is assuming AVG divides by every row including nulls; it does not, so the average reflects only the priced rows.

You practiced precomputing per-category averages in a derived table before reattaching to each product — separating the per-category calculation from the per-product output.

Return each `product_id` and the average price of all `products` in that same category

The shape

Clause by clause

Why this and not a correlated subquery per row

The trap

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.