Statuses with More Than Ten Customers

The problem

Brightlane's CRM team wants to identify the order statuses that touch the broadest customer base.

Write a query that returns each status and its unique-customer count, keeping only statuses whose orders were placed by more than ten different customers.

Assumptions:

The orders table contains every order Brightlane has processed.
A customer with multiple orders in the same status counts once for that status (not once per order).
The threshold (> 10) applies to the per-status unique-customer count.

Output:

One row per qualifying status, with columns status and unique_customers.

The shape

The statuses are the groups; the per-group metric is the count of distinct customers, not the count of orders. COUNT(DISTINCT customer_id) collapses repeat customers to one per status, and HAVING COUNT(DISTINCT customer_id) > 10 keeps only the statuses that span a broad customer base. delivered reaches 59 unique customers, shipped reaches 17, pending reaches 11. Other statuses fall below the threshold and drop out.
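
Put together, the query described here can be sketched as follows. It assumes only the orders table and the status and customer_id columns named in the text; nothing else about the schema is relied on.

```sql
-- One row per status, counting each customer once per status.
SELECT
    status,
    COUNT(DISTINCT customer_id) AS unique_customers
FROM orders
GROUP BY status
-- Keep only statuses reached by more than ten distinct customers.
HAVING COUNT(DISTINCT customer_id) > 10;
```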

Clause by clause

SELECT status, COUNT(DISTINCT customer_id) AS unique_customers returns each status with its unique-customer count. The DISTINCT inside COUNT is what makes a customer with three delivered orders contribute 1 to delivered's count rather than 3.
FROM orders is the source set.
GROUP BY status partitions the orders by their status value. After this clause, each row in the working set represents one status with its underlying order rows aggregated behind it.
HAVING COUNT(DISTINCT customer_id) > 10 filters those status rows by the unique-customer metric. Statuses placed by ten or fewer distinct customers drop out; eleven or more survive.

Why this and not `COUNT(*)`

COUNT(*) and COUNT(DISTINCT customer_id) answer different questions on the same data. COUNT(*) would return the number of orders in each status — delivered would land somewhere above 100 because most orders are delivered and many customers ordered multiple times. COUNT(DISTINCT customer_id) returns the number of customers behind those orders, which is what "placed by more than ten different customers" actually asks. A status with a thousand orders from three customers would clear a COUNT(*) > 10 bar but fail the breadth test the CRM team is running.
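
A small self-contained sketch makes the difference visible. The table demo_orders and its rows are invented for illustration (they are not Brightlane's data), and only a handful of rows are used so the two counts can be checked by eye.

```sql
-- Hypothetical toy table; names and rows are illustrative only.
CREATE TABLE demo_orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    status      VARCHAR(20) NOT NULL
);

INSERT INTO demo_orders (order_id, customer_id, status) VALUES
    (1, 101, 'delivered'),
    (2, 101, 'delivered'),
    (3, 101, 'delivered'),
    (4, 102, 'delivered'),
    (5, 103, 'returned'),
    (6, 103, 'returned'),
    (7, 103, 'returned');

-- Both aggregates side by side, per status.
SELECT
    status,
    COUNT(*)                    AS order_count,
    COUNT(DISTINCT customer_id) AS unique_customers
FROM demo_orders
GROUP BY status
ORDER BY status;
-- delivered: order_count = 4, unique_customers = 2
-- returned:  order_count = 3, unique_customers = 1
```

On this toy data, a filter of HAVING COUNT(*) > 1 would keep both statuses, while HAVING COUNT(DISTINCT customer_id) > 1 keeps only delivered, because returned is three orders from a single customer.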

The shape generalises. Once an aggregate is computing a per-group number, HAVING can compare it to anything: a literal threshold, another aggregate, even an arithmetic combination of aggregates. The only constraint is that the condition is evaluated once per group, so it can reference aggregates and grouping columns but not a raw, ungrouped column.
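
As a sketch of that flexibility, the variations below reuse the same orders table and columns. The specific conditions (a repeat-order ratio, a minimum of 20 orders) are hypothetical examples chosen for illustration, not part of the exercise.

```sql
-- An aggregate compared with an arithmetic combination of another aggregate:
-- statuses where customers average more than two orders each.
SELECT
    status,
    COUNT(*)                    AS order_count,
    COUNT(DISTINCT customer_id) AS unique_customers
FROM orders
GROUP BY status
HAVING COUNT(*) > 2 * COUNT(DISTINCT customer_id);

-- Two aggregate conditions combined with AND:
-- broad customer base and at least 20 orders overall (hypothetical threshold).
SELECT
    status,
    COUNT(DISTINCT customer_id) AS unique_customers
FROM orders
GROUP BY status
HAVING COUNT(DISTINCT customer_id) > 10
   AND COUNT(*) >= 20;
```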

You practiced filtering on a COUNT(DISTINCT col) aggregate. The recurring shape is that HAVING composes with any aggregate.
