Brightlane's category team wants to identify all pairs of sibling categories — categories that share the same parent.
Write a query to return the name of each category in every sibling pair.
Assumptions:
- The
categoriestable contains every defined category in the catalogue. - Siblings share the same
parent_id. A category is not its own sibling, so the result must exclude pairs where both rows refer to the same category. - Each pair appears twice in the result — once with category A on the left and B on the right, once with B on the left and A on the right.
Output:
- One row per ordered sibling pair, with columns
category_nameandsibling_name.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
c1.name AS category_name,
c2.name AS sibling_name
FROM
categories c1
JOIN categories c2 ON c1.parent_id = c2.parent_id
WHERE
c1.id <> c2.id The shape
Two categories are siblings when they share a parent_id, so the join condition pairs rows on that shared attribute (c1.parent_id = c2.parent_id) rather than on the usual primary-key link. The inequality c1.id <> c2.id then strips out the self-pairs, where the same category matches itself.
Clause by clause
SELECT c1.name AS category_name, c2.name AS sibling_namereturns one name from each aliased instance. Each output row is an ordered pair of sibling names: Phones / Tablets, Phones / Laptops, and so on.FROM categories c1readscategoriesas the left side of the pair.JOIN categories c2 ON c1.parent_id = c2.parent_idjoins on the sharedparent_id. Every Electronics child matches every Electronics child, including itself; every Clothing child matches every Clothing child, including itself.WHERE c1.id <> c2.iddrops the diagonal — the rows where a category matched itself. Phones-and-Phones is gone, but Phones-and-Tablets and Tablets-and-Phones both remain, which is why each pair appears twice as ordered pairs in the result.
Why this and not a join on id
Most self-joins so far have used the primary-key form: e.manager_id = m.id or c.parent_id = p.id. That form expresses a directed relationship — one row points to another by storing its ID. Siblings are not a directed relationship. Neither sibling stores a pointer to the other; they only share a third value (parent_id). The join condition has to match on that shared value directly. The general rule: when the relationship is "share an attribute" rather than "reference each other," the join condition is c1.col = c2.col on the shared attribute, not a foreign-key-to-primary-key link.
The trap
Without WHERE c1.id <> c2.id, every category pairs with itself in the result — Phones-and-Phones, Tablets-and-Tablets, every row. The join condition c1.parent_id = c2.parent_id is reflexive: any row's parent_id equals its own parent_id, so each row matches itself before it matches any genuine sibling. The fix is the explicit inequality on the primary key, which excludes the diagonal of self-matches without disturbing the cross-pairs. Anytime a self-join's ON condition would be true when both aliases land on the same row, the WHERE clause needs an inequality to drop those self-pairs.
You practiced a self-join where the join condition is on a non-primary-key column shared by both aliases (parent_id), with an inequality on id to exclude self-pairs. The recurring shape any time "things that share an attribute but aren't the same thing" is the question.