Brightlane's navigation team wants to identify products that belong to a leaf category — a category that is not itself the parent of any other category.
Write a query to return the name of every such product.
Assumptions:
- The
productstable contains every product in the catalogue. - The
categoriestable contains every defined category;parent_idis the parent reference. A category is a parent if itsidappears as some other category'sparent_id. - Some categories are top-level — their
parent_idis missing.
Output:
- One row per leaf-category product, with a single column
name.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
name
FROM
products
WHERE
category_id NOT IN (
SELECT
parent_id
FROM
categories
WHERE
parent_id IS NOT NULL
) The shape
A leaf category is one whose id does not appear as any other category's parent_id. So the products that belong to leaf categories are the ones whose category_id is not in the set of parent_id values drawn from categories. The IS NOT NULL inside the subquery is what makes the NOT IN test actually return rows. Strip it out and the result silently collapses to zero.
Clause by clause
SELECT name FROM productsreads every product in the catalogue. The filter keeps only the products whosecategory_idisn't a parent of some other category.SELECT parent_id FROM categories WHERE parent_id IS NOT NULLis the inner subquery. It collects every non-nullparent_idacross the categories table. A top-level category hasparent_id = NULL, so theIS NOT NULLfilter ensures it contributes nothing to the set. The result is a clean set of category ids referenced as a parent by at least one other category.WHERE category_id NOT IN (...)is the outer membership test, negated. For each product, PostgreSQL checks whether the product'scategory_idappears in the parent-id set. If it does, the product belongs to a branch category and the row drops. If it doesn't, the category is a leaf and the product name comes back.
Why this and not bare NOT IN
The natural first draft is WHERE category_id NOT IN (SELECT parent_id FROM categories), without the IS NOT NULL filter. It looks right and runs cleanly. It also returns zero rows, every time, no matter the data.
The reason is three-valued logic. Top-level categories have parent_id = NULL, so the inner subquery returns a set that includes NULL alongside the real parent ids. For each outer product, PostgreSQL has to evaluate category_id <> NULL as part of the NOT IN test, and comparisons with NULL produce NULL. The whole test collapses to NULL for every product, the WHERE treats NULL as falsy, and no row passes.
The trap
NOT IN is the silent-zero-rows operator. The query runs, returns an empty result, and there's no error to indicate that the NULL in the subquery is what killed it. Any time NOT IN runs against a subquery whose column can be nullable, pre-filter with IS NOT NULL, or switch the negation to NOT EXISTS. The defensive filter is cheap; the silent failure is expensive.
You practiced the defensive IS NOT NULL check inside a NOT IN subquery. The recurring rule: NOT IN propagates NULL from the inner result into the outer comparison; pre-removing the missing values is what keeps the outer condition truth-valued.