Brightlane's merchandising team wants to identify loyal repeat buyers by analyzing how often each customer has purchased each product.
Write a query to return one row per customer-product pair on record, showing the customer ID, customer name, product ID, and the number of times that customer has purchased that product.
Assumptions:
- A customer's purchase count for a given product is the number of line items linked to any of that customer's orders for that product.
- Each customer-product pair on record should appear once. Customers who have never placed an order do not appear in the result.
Output:
- One row per customer-product pair on record, with columns
customer_id,name,product_id, andpurchase_count.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
c.id AS customer_id,
c.name,
purchase_data.product_id,
purchase_data.purchase_count
FROM
customers c
CROSS JOIN LATERAL (
SELECT
oi.product_id,
COUNT(*) AS purchase_count
FROM
order_items oi
WHERE
oi.order_id IN (
SELECT
id
FROM
orders o
WHERE
o.customer_id = c.id
)
GROUP BY
oi.product_id
) purchase_data The shape
The lateral runs per customer and produces one row per distinct product that customer has ever purchased, with the count of times they bought it. The outer customer row is duplicated once per product-pair, which is the per-customer-per-product shape the merchandising team needs. Customers who have never placed an order produce zero lateral rows and CROSS JOIN drops them, which matches the prompt.
Clause by clause
SELECT c.id AS customer_id, c.name, purchase_data.product_id, purchase_data.purchase_countreturns the customer's identity from the outer table and the per-product roll-up from the lateral.FROM customers cis the driving table; the lateral is evaluated once per customer.CROSS JOIN LATERAL ( ... ) purchase_datais where the work happens. Inside it,FROM order_items oireads every line item,WHERE oi.order_id IN (SELECT id FROM orders o WHERE o.customer_id = c.id)restricts to line items belonging to this customer's orders,GROUP BY oi.product_idcollapses those line items into one row per distinct product, andSELECT oi.product_id, COUNT(*) AS purchase_countreturns the product id and its purchase count.
Why this and not a flat three-table join
FROM customers c JOIN orders o ON o.customer_id = c.id JOIN order_items oi ON oi.order_id = o.id GROUP BY c.id, c.name, oi.product_id returns the same rows on this data and is a perfectly valid solution. The lateral form expresses the question as "for each customer, group their purchases by product," which mirrors how the merchandising team is actually thinking about the data. Two-stage joins flatten the customer scope into the same plane as the products; the lateral keeps the two scopes nested, which is easier to extend (a per-customer ORDER BY ... LIMIT on top products would slot straight in).
The trap
Two distinct correlations live inside this lateral: the IN subquery against orders correlates on c.id, not on oi.order_id. If the inner subquery is written without that correlation (SELECT id FROM orders with no WHERE), the IN clause matches every order in the system, and every customer ends up with every product anyone ever bought. The lateral's outer reference is what scopes the work to this customer's orders; lose it and the result silently expands into a Cartesian-style explosion that still parses and still runs.
You practiced CROSS JOIN LATERAL over an inner query that itself groups over a related-records set — the lateral returns one row per group within each customer's purchase history.