Brightlane's sales team is identifying active buyers for a loyalty campaign and wants to know which customers have placed at least one order.
Write a query to return each customer ID that appears in the orders data, with no duplicates.
Assumptions:
- The
orderstable contains every order Brightlane has processed. - The
customer_idcolumn links each order to the customer who placed it; customers with many orders appear many times.
Output:
- One row per customer who has placed at least one order, with a single column
customer_id.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT DISTINCT
customer_id
FROM
orders The shape
DISTINCT customer_id turns the orders table — which has one row per order — into a presence-list of buyers, one row per customer who has placed at least one order. A customer with twelve orders shows up once; a customer with one shows up once; a customer with zero doesn't show up at all.
Clause by clause
SELECT DISTINCT customer_idreturns the uniquecustomer_idvalues present in whatever rowsFROMhands up. Every order contributes itscustomer_idto the candidate set, and the deduplication runs across that set to produce one row per buyer.FROM ordersis the row source. Critically, this is the transactional table — the orders themselves — not thecustomerstable. That asymmetry is what makes the result a presence-list rather than a customer roster.
Why this and not SELECT customer_id FROM customers
Reading from customers would give back every customer Brightlane has on file, including the ones who have never placed an order. The loyalty campaign is targeting active buyers, not signups. The presence semantics come from reading the orders table — being in the result means having a row in orders, which by definition means having placed at least one order.
This is the everyday pattern behind cohort-building and audience-derivation: read the fact table, deduplicate to the entity column, and the result is exactly the population that has the behaviour the fact table records. The same shape recovers "users who have logged a session," "products that have been reviewed," "employees who have submitted an expense report" — wherever the question is "who shows up here."
You practiced using DISTINCT to derive a presence-list from a transactional table. "Which entities show up in this fact table" is the recurring question behind cohort-building, audience-derivation, and join-key sanity checks.