Brightlane's catalogue team is reviewing product groupings for quality control and needs to identify categories with meaningful product depth.
Write a query to return the category ID and product count for every category that contains at least three products.
Assumptions:
- The
productstable contains every product in the catalogue. - The threshold (
>= 3) applies to the per-category product count. - Some products have a missing
category_id; if there are at least three such products, the result will include a row whosecategory_idis missing.
Output:
- One row per qualifying group (including the missing-
category_idgroup if it qualifies), with columnscategory_idandproduct_count.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
category_id,
product_count
FROM
(
SELECT
category_id,
COUNT(*) AS product_count
FROM
products
GROUP BY
category_id
) AS category_counts
WHERE
product_count >= 3 The shape
The inner query groups every product row — including the ones with no category_id — and the outer query filters down to groups with at least three products. The NULL-category group survives the filter on the same terms as any other category and lands in the result with category_id shown as missing.
Clause by clause
- The inner block counts products per category, without any pre-filter:
SELECT category_id, COUNT(*) AS product_count
FROM products
GROUP BY category_idThis is the part that quietly produces the NULL group. GROUP BY collapses every product whose category_id is NULL into a single bucket — NULL here is treated as one distinct grouping key, not as "no group." That bucket has its own count.
- FROM (...) AS category_counts materialises the per-category counts.
- WHERE product_count >= 3 keeps every group with three or more products. The threshold is on the count column, which is never NULL here — every group has a real, positive count regardless of whether its category_id is missing.
- SELECT category_id, product_count returns each surviving group. The result includes ten rows; one of them carries category_id = NULL and product_count = 3, which represents three uncategorised products that, taken together, met the depth threshold.
The trap
GROUP BY treats NULL as a single grouping key. Every row with a NULL category collapses into one group, that group's count is computed alongside every other category's count, and the outer WHERE has no special handling for it. product_count >= 3 is true or false based on the count alone, with no awareness of whether the underlying category_id was real.
The consequence cuts both ways. If you want the NULL group out, you have to remove it explicitly — either with WHERE category_id IS NOT NULL inside the inner query (the M2 pattern), or with category_id IS NOT NULL added to the outer WHERE. If you don't, the NULL group shows up in the result like any other, and a report meant to list categories with three or more products silently includes a row whose category isn't a category at all.
You practiced an aggregate-then-filter pattern that includes the NULL group. The recurring rule: GROUP BY collapses all NULLs into one group — that group survives the outer filter on the same terms as any other category, which is sometimes desired and sometimes a surprise.