Brightlane's product team wants to identify well-stocked categories. Products that have not been assigned to any category should not factor into the analysis.
Write a query to return the category ID and product count for every assigned category that contains more than five products.
Assumptions:
- The products table contains every product in the catalogue.
- Some products have a missing category_id (unassigned). These rows must be excluded before the per-category count runs; otherwise they would form a missing-category_id group that the report intends to leave out.
- The threshold (> 5) applies to the per-category product count, after the unassigned rows are removed.
Output:
- One row per qualifying category, with columns category_id and product_count.
Worked solution
SELECT
    category_id,
    product_count
FROM
    (
        SELECT
            category_id,
            COUNT(*) AS product_count
        FROM
            products
        WHERE
            category_id IS NOT NULL
        GROUP BY
            category_id
    ) AS category_counts
WHERE
    product_count > 5
The shape
Two filters at two different layers do two different jobs. The inner WHERE drops unassigned products before the per-category count runs, so the missing-category_id group never forms. The outer WHERE then narrows the per-category counts to categories with more than five products.
Clause by clause
- The inner block computes a per-category product count, but only over categorised products:
    SELECT category_id, COUNT(*) AS product_count
    FROM products
    WHERE category_id IS NOT NULL
    GROUP BY category_id
  The inner WHERE runs before GROUP BY. Rows where category_id is NULL are gone before grouping starts, so no NULL group lands in the derived table.
- FROM (...) AS category_counts exposes those per-category counts as a derived table named category_counts, which the outer query reads like an ordinary table.
- WHERE product_count > 5 is the outer filter. It operates on the per-category count, not on individual product rows. Five categories survive (1, 4, 5, 11, 12), each holding more than five products.
- SELECT category_id, product_count returns the two columns the product team needs for the well-stocked shortlist.
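The first bullet's claim, that the inner WHERE removes NULL rows before GROUP BY forms its groups, can be checked on a toy table. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in engine (the exercise does not say which database it runs on), and the product_id column and all sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal schema: only category_id matters for this check.
conn.execute(
    "CREATE TABLE products (product_id INTEGER PRIMARY KEY, category_id INTEGER)"
)
# Hypothetical rows: two products in category 1, one in category 2,
# three unassigned (NULL category_id).
conn.executemany(
    "INSERT INTO products (category_id) VALUES (?)",
    [(1,), (1,), (2,), (None,), (None,), (None,)],
)

# Grouping without the row filter: NULL forms a group of its own.
grouped_all = dict(conn.execute(
    "SELECT category_id, COUNT(*) FROM products GROUP BY category_id"
).fetchall())

# Grouping with the row filter: the NULL rows never reach GROUP BY.
grouped_assigned = dict(conn.execute(
    "SELECT category_id, COUNT(*) FROM products "
    "WHERE category_id IS NOT NULL GROUP BY category_id"
).fetchall())

print(grouped_all)       # includes a None key for the unassigned group
print(grouped_assigned)  # only real category IDs
```

The results are collected into dictionaries because the queries carry no ORDER BY, so row order is not guaranteed.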
Why not put both filters at the same layer
The two filters target different things. WHERE category_id IS NOT NULL is a row filter: it looks at individual product rows and removes uncategorised ones. WHERE product_count > 5 is a group filter: it looks at per-category totals and removes thinly populated categories.
Moving the row filter to the outer layer wouldn't work cleanly. By that point the rows are already grouped, and the missing-category_id group exists as one row with a NULL category and a real product count. Excluding it after the fact is possible but leaves a footprint: the count of unassigned products was computed and then thrown away. Filtering inside the inner query keeps the unassigned products out of the aggregation entirely, which matches the prompt's intent.
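To see the trade-off concretely, here is a small sketch that runs both placements side by side. It again assumes Python's built-in sqlite3 as a stand-in engine, a made-up product_id column, and invented sample rows in which the unassigned group happens to be larger than five.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id INTEGER PRIMARY KEY, category_id INTEGER)"
)
# Hypothetical rows: 6 products in category 1, 2 in category 2, 7 unassigned.
conn.executemany(
    "INSERT INTO products (category_id) VALUES (?)",
    [(1,)] * 6 + [(2,)] * 2 + [(None,)] * 7,
)

# Row filter inside the derived table (the worked solution's placement).
inner_filter = conn.execute("""
    SELECT category_id, product_count
    FROM (
        SELECT category_id, COUNT(*) AS product_count
        FROM products
        WHERE category_id IS NOT NULL
        GROUP BY category_id
    ) AS category_counts
    WHERE product_count > 5
""").fetchall()

# Row filter moved to the outer layer: the NULL group is counted, then dropped.
outer_filter = conn.execute("""
    SELECT category_id, product_count
    FROM (
        SELECT category_id, COUNT(*) AS product_count
        FROM products
        GROUP BY category_id
    ) AS category_counts
    WHERE product_count > 5
      AND category_id IS NOT NULL
""").fetchall()

# No NULL filter at either layer: the unassigned group passes the threshold.
no_null_filter = set(conn.execute("""
    SELECT category_id, product_count
    FROM (
        SELECT category_id, COUNT(*) AS product_count
        FROM products
        GROUP BY category_id
    ) AS category_counts
    WHERE product_count > 5
""").fetchall())

print(inner_filter, outer_filter, no_null_filter)
```

On this data both placements return the same single row. The difference is what the derived table holds along the way, and what leaks into the report if the outer NULL check is ever omitted.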
You practiced narrowing inside the derived table before aggregation. WHERE inside the inner query removes rows before the inner grouping runs; WHERE outside narrows the per-group results.