Brightlane's catalog team needs the lowest-priced product for each category. Products with no category on record form their own group — the lowest-priced uncategorized product should also appear in the result.
Write a query to return one row per category group, showing the category ID, ID of the lowest-priced product in that group, the product name, and the price. Sort the final result by category_id ascending.
Assumptions:
- The lowest-priced product in a category is the product with the smallest
pricefor thatcategory_id. When two products in the same category share the sameprice, the product with the smalleridwins. - Products with a missing
category_idform their own group; the lowest-priced product among that group appears in the result with a missingcategory_id. - The final result is sorted by
category_idascending; the missing-category_idrow appears at the end.
Output:
- One row per category group (each
category_idvalue plus the missing-category_idgroup), with columnscategory_id,product_id,name, andprice. Sorted bycategory_idascending; missing-category_idrow last.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT DISTINCT
ON (category_id) category_id,
id AS product_id,
name,
price
FROM
products
ORDER BY
category_id,
price,
id The shape
A missing category_id is treated like any other distinct value by DISTINCT ON. DISTINCT ON (category_id) keeps one row per distinct category_id, and the missing value is one of those distinct values, so the missing-category products form their own group and get their own row. ORDER BY category_id, price, id picks the cheapest product in each group, with the smaller id winning ties.
Clause by clause
SELECT DISTINCT ON (category_id) category_id, id AS product_id, name, pricereturns the four columns the catalog review needs.DISTINCT ON (category_id)declares one row per distinctcategory_idvalue, including the missing value as its own distinct group.FROM productsreads the product records.ORDER BY category_id, price, idsorts the products by three ascending keys. The leadingcategory_idascending satisfies theDISTINCT ONrequirement and gives the final result its category-ordered shape; PostgreSQL puts missing values last in an ascending sort by default, which is why the missing-category row appears at the end. The second key,priceascending, makes the cheapest product sit first in each category's group. The third key,idascending, breaks price ties: when two products in the same category share a price, the one with the smalleridsorts first and wins the per-category pick.
Why this and not ROW_NUMBER
The window-function form behaves the same way on missing values:
SELECT category_id, product_id, name, price
FROM (
SELECT category_id, id AS product_id, name, price,
ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY price, id) AS rn
FROM products
) ranked
WHERE rn = 1
ORDER BY category_idPARTITION BY category_id puts all missing-category rows into the same partition, exactly the way DISTINCT ON (category_id) puts them in the same group. Both forms return the missing-category row as part of the result.
The trap
The instinct on missing values is that they get filtered out or swept aside. With DISTINCT ON they do not. The missing value is its own distinct group, with its own kept row, sorted to wherever the ORDER BY puts it. If the product team had wanted to exclude uncategorized products, the query would have needed an explicit WHERE category_id IS NOT NULL before the per-category pick. Without that filter, the missing-category bucket is in the result by default. The corollary on the sort side: in an ascending ORDER BY, missing values sort last, which is why the uncategorized row appears at the end of the result rather than at the top.
You practiced DISTINCT ON (category_id) over a column that contains missing values — PostgreSQL treats every distinct value (including the missing value itself) as a separate group, so the missing-category_id records form their own bucket.