Product Count per Category — GROUP BY in SQL

The problem

Brightlane's product team wants to see how catalogue items are distributed across categories.

Write a query to return each category_id alongside the number of products assigned to it.

Assumptions:

The products table contains every product in Brightlane's catalogue.
Some products have a recorded category_id; some have category_id set to NULL (unassigned, awaiting classification).
The output must include unassigned products as their own group — the row will have category_id of NULL and a count reflecting how many products are unassigned.

Output:

One row per distinct category_id (including one row where category_id is NULL), with columns category_id and product_count.


The shape

GROUP BY category_id treats NULL as its own group. Every product with an unassigned category collapses into one bucket together, and COUNT(*) returns the number of unassigned products as a single row whose category_id value is NULL. The product team gets the unassigned count for free, without writing a separate query for it.

Clause by clause

SELECT category_id, COUNT(*) AS product_count returns the category identifier and the number of products in it. The output includes the NULL row because NULL is a valid grouping key for GROUP BY.
FROM products reads the catalogue.
GROUP BY category_id partitions the rows by category. Every row whose category_id is NULL ends up in the same NULL bucket. In this exercise's data, COUNT(*) returns 3 for that bucket, alongside the counts for the assigned categories.
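The three clauses above can be assembled and run end to end. What follows is a minimal sketch, not the exercise's real schema: it uses Python's built-in sqlite3 module and an in-memory products table with made-up rows. The category ids 10 and 20 and the product_id column are assumptions for illustration; the three NULL rows are chosen to match the count quoted in the text.

```python
import sqlite3

# Hypothetical sample data: category ids 10 and 20 are invented for this sketch.
# Three rows have category_id NULL (None in Python) to mirror the lesson's count.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id INTEGER PRIMARY KEY, category_id INTEGER)"
)
conn.executemany(
    "INSERT INTO products (product_id, category_id) VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 20), (4, None), (5, None), (6, None)],
)

query = """
SELECT category_id, COUNT(*) AS product_count
FROM products
GROUP BY category_id
"""

# The query has no ORDER BY, so row order is not guaranteed.
# Collecting the rows into a dict keyed by category_id sidesteps that.
counts = dict(conn.execute(query).fetchall())

for category_id, product_count in counts.items():
    print(category_id, product_count)
```

With this sample data, the dict holds one entry per category plus one entry keyed by None, whose value is the number of unassigned products.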

Why this and not filtering NULLs out

Adding WHERE category_id IS NOT NULL would drop the unassigned products before the grouping ran, and the NULL row would disappear from the result. That is the wrong answer for this prompt: the product team explicitly wants the unassigned products visible as their own group, because that count tells them how much classification work is outstanding. Leaving the filter off is the deliberate choice that produces the row with a category_id of NULL and a product_count of 3.
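To see the difference concretely, here is a small sketch that runs the query with and without the filter. It again uses Python's sqlite3 module and a hypothetical in-memory products table: the category ids 10 and 20 are made up, and three unassigned rows are assumed.

```python
import sqlite3

# Hypothetical sample data, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id INTEGER PRIMARY KEY, category_id INTEGER)"
)
conn.executemany(
    "INSERT INTO products (product_id, category_id) VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 20), (4, None), (5, None), (6, None)],
)

# Without the filter: the NULL group survives.
unfiltered = dict(conn.execute("""
    SELECT category_id, COUNT(*) AS product_count
    FROM products
    GROUP BY category_id
""").fetchall())

# With the filter: unassigned rows are removed before grouping happens,
# so there is nothing left to form a NULL group.
filtered = dict(conn.execute("""
    SELECT category_id, COUNT(*) AS product_count
    FROM products
    WHERE category_id IS NOT NULL
    GROUP BY category_id
""").fetchall())

print("NULL group without filter:", None in unfiltered)
print("NULL group with filter:", None in filtered)
```

The assigned categories get the same counts either way; only the unassigned group is lost when the filter is added.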

The trap

GROUP BY treats NULL as one distinct group, but comparison predicates do not: under SQL's three-valued logic, NULL is not equal to anything, including itself. WHERE category_id = NULL would match zero rows, not the three unassigned products, because a comparison against NULL evaluates to UNKNOWN rather than TRUE. The test that does find unassigned products is category_id IS NULL. Grouping NULLs together is the exception to how comparisons behave, not the rule. Whenever a column has NULLs and you group on it, you will get a NULL row in the output, whether or not you wanted one. If a downstream consumer cannot handle a NULL key, that has to be addressed in the query (with a filter such as IS NOT NULL, or a substitution such as COALESCE), not assumed away.
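The sketch below illustrates both halves of the trap on a hypothetical in-memory table, using Python's sqlite3 module. The sample rows and the sentinel value -1 are assumptions made for this example. A real substitution would need a value that cannot collide with an actual category_id.

```python
import sqlite3

# Hypothetical sample data, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id INTEGER PRIMARY KEY, category_id INTEGER)"
)
conn.executemany(
    "INSERT INTO products (product_id, category_id) VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 20), (4, None), (5, None), (6, None)],
)

# Equality against NULL is never true, so this predicate keeps no rows.
equals_null = conn.execute(
    "SELECT COUNT(*) FROM products WHERE category_id = NULL"
).fetchone()[0]

# IS NULL is the predicate that actually finds the unassigned products.
is_null = conn.execute(
    "SELECT COUNT(*) FROM products WHERE category_id IS NULL"
).fetchone()[0]

# One possible substitution for consumers that cannot handle a NULL key:
# COALESCE swaps NULL for a sentinel (-1 here, a made-up value).
relabelled = dict(conn.execute("""
    SELECT COALESCE(category_id, -1) AS category_key, COUNT(*) AS product_count
    FROM products
    GROUP BY COALESCE(category_id, -1)
""").fetchall())

print("rows matched by = NULL:", equals_null)
print("rows matched by IS NULL:", is_null)
print("sentinel group present:", -1 in relabelled)
```

The substitution keeps the unassigned count visible while giving downstream code a non-NULL key. Whether a sentinel is acceptable depends on what the consumer expects.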

You practiced relying on GROUP BY's NULL-as-its-own-group behavior. The recurring rule: every NULL row collapses into a single group in the output, which is convenient when you want unassigned items visible and inconvenient if you forget the NULL group will appear.
