N060-M1 Tier 5 · Expert · medium ecommerce · Brightlane

Return each `status` value and the number of `orders` recorded with that status, so the analyst can see how skewed the distribution actually is

Part of Reading EXPLAIN Output in SQL

The problem

Scenario: Brightlane's data analyst ran EXPLAIN on a status report and saw the planner had assumed roughly even distribution across orders.status values, which led it to choose a sequential scan for a specific-status restriction. The analyst suspects one status dominates the order base and that the even-distribution assumption is what the statistics are missing.

Task: Write a query to return each status value and the number of orders recorded with that status, so the analyst can see how skewed the distribution actually is.

Assumptions:

  • The orders table holds one row per placed order, with the order's outcome stored in status.
  • One row in the result covers every order sharing the same status value.

Output:

  • One row per status value present in the data.
  • Columns in this order: status, order_count.
Schema · ecommerce 5 tables
categories
id integer
name text
parent_id? integer
products
id integer
name text
category_id integer
price numeric
stock_qty integer
attributes? jsonb
order_items
id integer
order_id integer
product_id integer
quantity integer
unit_price numeric
customers
id integer
name text
email text
city? text
country text
created_at timestamptz
is_active boolean
orders
id integer
customer_id integer
ordered_at timestamptz
status text
total_amount numeric

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first
Solution query
SELECT
  status,
  COUNT(*) AS order_count
FROM
  orders
GROUP BY
  status

The shape

The planner assumed roughly even distribution across the four status values, which is exactly what GROUP BY status plus COUNT(*) will refute when one status dominates. The grouped count shows delivered at 161 against cancelled and pending at 11 each — the kind of skew that breaks the uniform-distribution assumption in the planner's statistics.

Clause by clause

  • SELECT status, COUNT(*) AS order_count returns the status value and the row count behind it, so each group's contribution to the total is visible alongside its label.
  • FROM orders reads the order records.
  • GROUP BY status partitions the rows by status value, so COUNT(*) runs once per group instead of once over the whole table. Each output row is one status bucket.

Why this and not a WHERE per status

The analyst could run WHERE status = 'delivered' and then WHERE status = 'cancelled' and so on, four queries, four counts. The grouped version returns the same information in one pass — and crucially, in the same shape as the planner's per-group estimate. When EXPLAIN ANALYZE annotates a group-aggregate node with actual rows=4 (four groups out) and an estimate of rows=4, the matching shape is what makes the gap at the per-group row counts visible. The planner publishes one estimate per group; this query publishes one actual per group; the comparison is direct.

The trap

GROUP BY produces one row per distinct group present in the data. If a status value exists in the schema but has zero matching orders, it won't appear in the result — and won't appear in the planner's estimate either, because the planner is working from the same statistics. A "missing" status in the output is not a bug in the count; it's a real signal that the status has no rows. Don't read absence as zero; read it as absent.

You practiced computing the actual distribution behind a planner assumption — a per-status count reveals the skew that uniform-distribution statistics would hide.

How you actually get good at SQL

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

That's the whole SQLMaxx loop: 600+ real problems, instant AI feedback, mastery you can actually see, and spaced review that won't let you forget.

A stack of SQL practice problem cards, the top card showing an employees table.
615 problems · 66 concepts

Real problems. Not toy examples.

615 hand-built problems spanning all 66 concepts, from basic SELECTs to window functions, built on real schemas and real business questions, the kind you'll actually get asked on the job. Enough reps to make SQL automatic.

A retro computer showing a SQL query marked correct with a green checkmark.
Instant AI feedback

Write a query. Know if it's right in one second.

No copying an answer and hoping it clicked. The AI grader checks your real query against real data, catches exactly what's wrong, and explains the fix in plain English, like a senior analyst reading over your shoulder on every problem.

A circular mastery progress dial filling from blue to green, the SQLMaxx diamond at its center.
Mastery tracking

Stop guessing whether you actually know it.

SQLMaxx tracks every concept and shows you what you've mastered and what's still shaky. Your skills fill in one concept at a time, so 'I think I get joins' becomes something you can prove.

A SQL query editor circled by a blue return arrow with a clock, scheduled to come back for review.
Spaced review

Learn it once. Keep it for good.

Most of what you learn this week fades by next week. So when a concept comes due for review, SQLMaxx hands you a fresh problem to solve from a blank editor, not a flashcard to re-read. A research-backed spaced-repetition algorithm (FSRS) times each return for right before you'd forget, so your SQL is still there months later, when the interview or the job actually needs it.

Practice, feedback, mastery, review. That's the loop that turns reading into real skill.

Start free

No account, no credit card. Start solving in under a minute.