Unique Customers per Status — GROUP BY in SQL

The problem

Brightlane's CRM team is assessing how broadly each order status touches the customer base.

Write a query to return each status and the number of unique customers who have placed at least one order in that status.

Assumptions:

The orders table contains every order Brightlane has processed.
A customer with multiple orders in the same status should be counted once for that status (not once per order).

Output:

One row per status value, with columns status and unique_customers.

Schema · ecommerce (5 tables)

categories
  id integer
  name text
  parent_id? integer

products
  id integer
  name text
  category_id integer
  price numeric
  stock_qty integer
  attributes? jsonb

order_items
  id integer
  order_id integer
  product_id integer
  quantity integer
  unit_price numeric

customers
  id integer
  name text
  email text
  city? text
  country text
  created_at timestamptz
  is_active boolean

orders
  id integer
  customer_id integer
  ordered_at timestamptz
  status text
  total_amount numeric

Worked solution

Solution query

SELECT
  status,
  COUNT(DISTINCT customer_id) AS unique_customers
FROM
  orders
GROUP BY
  status

The shape

COUNT(DISTINCT customer_id) deduplicates customers inside each status bucket before counting them. A buyer with three delivered orders contributes one to the delivered count, not three. In the result, the delivered row shows 59 unique customers across 161 delivered orders; that gap is the difference between a customer count and an order count.
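
To see the deduplication on a small scale, here is a sketch that runs against a made-up inline data set rather than Brightlane's orders table. The sample_orders name and its five rows are assumptions invented for illustration; the syntax is PostgreSQL-flavored, to match the schema's column types.

```sql
-- Hypothetical sample rows, not Brightlane data.
WITH sample_orders (id, customer_id, status) AS (
  VALUES
    (1, 101, 'delivered'),
    (2, 101, 'delivered'),
    (3, 101, 'delivered'),  -- customer 101 has three delivered orders
    (4, 102, 'delivered'),
    (5, 102, 'cancelled')
)
SELECT
  status,
  COUNT(DISTINCT customer_id) AS unique_customers
FROM
  sample_orders
GROUP BY
  status
ORDER BY
  status;
-- cancelled: 1 (customer 102)
-- delivered: 2 (customers 101 and 102), although four rows are delivered
```

Customer 101's three delivered orders collapse to a single entry in the delivered count.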

Clause by clause

SELECT status, COUNT(DISTINCT customer_id) AS unique_customers returns the status label and the deduplicated customer count for that status. status is in GROUP BY; the count is an aggregate.
FROM orders is the input population.
GROUP BY status partitions the orders by status before the count runs. COUNT(DISTINCT customer_id) then runs inside each partition, and DISTINCT only deduplicates within the partition it is running in.
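
One way to picture what happens inside each partition is to split the work into two explicit steps. The sketch below should return the same rows as the solution query, assuming customer_id is never NULL (COUNT(DISTINCT ...) skips NULLs, while this form would count a NULL as one customer). The status_customers alias is a name chosen for illustration.

```sql
-- Step 1 (inner query): collapse orders to one row per (status, customer_id) pair.
-- Step 2 (outer query): count the surviving rows in each status.
SELECT
  status,
  COUNT(*) AS unique_customers
FROM (
  SELECT DISTINCT
    status,
    customer_id
  FROM
    orders
) AS status_customers  -- illustrative alias
GROUP BY
  status;
```

The single-query COUNT(DISTINCT customer_id) form is shorter and is usually the one to write; the two-step form only makes the per-partition deduplication visible.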

Why this and not `COUNT(*)`

COUNT(*) per status answers "how many orders are in each status," which double-counts a buyer whose three orders all landed in the same status. The CRM team wants reach, not volume. COUNT(DISTINCT customer_id) strips the repeats inside each bucket, so the result is the number of unique buyers touched by each status. The two answers differ by exactly the amount of repeat business inside each pipeline stage.
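
Putting both aggregates in one query makes the difference concrete. This is a sketch against the same orders table; the order_count and repeat_orders column names are introduced here for illustration and are not part of the expected output.

```sql
SELECT
  status,
  COUNT(*) AS order_count,                          -- volume: every order row
  COUNT(DISTINCT customer_id) AS unique_customers,  -- reach: each buyer once
  COUNT(*) - COUNT(DISTINCT customer_id) AS repeat_orders
FROM
  orders
GROUP BY
  status;
```

With the delivered figures quoted earlier (161 orders, 59 unique customers), repeat_orders for that row would be 102.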

The trap

The same customer can appear in the count for multiple statuses, and that is correct, not a bug. DISTINCT deduplicates within a group, not across groups. A buyer with one delivered order and one cancelled order adds one to the delivered count and one to the cancelled count. Summing the four unique_customers values across statuses therefore does not give you the total number of distinct customers in the orders table. For that, you would need COUNT(DISTINCT customer_id) over the whole table with no GROUP BY.
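
The gap between the two totals can be checked directly. The following sketch, with illustrative column and alias names, returns the table-wide distinct customer count next to the sum of the per-status counts.

```sql
SELECT
  -- Each customer counted once across the whole table.
  (SELECT COUNT(DISTINCT customer_id) FROM orders) AS distinct_customers_overall,
  -- Each customer counted once per status they appear in.
  SUM(unique_customers) AS sum_of_per_status_counts
FROM (
  SELECT
    status,
    COUNT(DISTINCT customer_id) AS unique_customers
  FROM
    orders
  GROUP BY
    status
) AS per_status;
```

The second number is never smaller than the first, and it is larger whenever at least one customer has orders in more than one status.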

You practiced combining COUNT(DISTINCT col) with GROUP BY. DISTINCT deduplicates within each group independently, so the same customer can appear in the counts for two different statuses without contradiction.
