Scenario: Brightlane's product analytics team needs every product that has appeared on at least one order line, paired with how many line items it has been included in.
Task: Write a query to return each product_id and its total line-item count.
Assumptions:
- A product's
line_item_countis the count of line items recorded against it. - The result covers only products that appear on at least one line item.
Output:
- One row per qualifying product.
- Columns in this order:
product_id,line_item_count.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
p.id AS product_id,
item_counts.line_item_count
FROM
products p
JOIN (
SELECT
product_id,
COUNT(*) AS line_item_count
FROM
order_items
GROUP BY
product_id
) AS item_counts ON item_counts.product_id = p.id The shape
The output is per-product, but the count has to come from order_items. A derived table aggregates order_items to one row per product_id first, then joins back to products. The aggregation runs on a narrow intermediate set, and the parent lookup happens once on the already-collapsed result.
Clause by clause
SELECT p.id AS product_id, item_counts.line_item_countreturns the product's identifier alongside the count computed inside the derived table.FROM products pis the parent set. Every catalog product appears here once.- The derived table:
SELECT product_id, COUNT(*) AS line_item_count
FROM order_items
GROUP BY product_idOne row per product that has at least one line item, with the count of line items already computed.
- AS item_counts ON item_counts.product_id = p.id joins the aggregated set back to products on the matching key. Because the derived table holds only products that appear on at least one line item, the inner join silently drops the products with zero line items — which is exactly the "appears on at least one line item" qualifier the prompt requires.
Why this and not an unaggregated join then group
Joining products to order_items first, then grouping the result by p.id, produces the same counts. The intermediate row count, though, is the full line-item count rather than the product count. The derived-table shape keeps the intermediate set narrow — count per product first, lookup second — which is the pattern this problem is practicing.
The trap
COUNT(*) counts rows in the group. COUNT(product_id) would count rows where product_id is non-null. On order_items the two are equivalent because product_id is the join key and cannot be null in a valid row. On other shapes the distinction bites: counting rows is not the same as counting non-null values in a column.
You practiced precomputing per-product line-item counts in a derived table before reattaching to the product list — keeping the count step decoupled from the parent lookup.