N060-M4 Tier 5 · Expert · medium hr · Helix Systems

Return each department name and the number of `employees` assigned to it, so the analyst can compare the real group count against the planner's estimate

Part of Reading EXPLAIN Output in SQL

The problem

Scenario: Helix Systems' data analyst ran EXPLAIN on a per-department employee count and saw the per-department grouping step reporting 3 output groups — well below the number of departments shown on the current organization chart. The analyst suspects table statistics were never refreshed after the most recent restructuring.

Task: Write a query to return each department name and the number of employees assigned to it, so the analyst can compare the real group count against the planner's estimate.

Assumptions:

  • One row in the result covers every employee assigned to the same department.

Output:

  • One row per department with at least one employee assigned to it.
  • Columns in this order: department_name, employee_count.
Schema · hr 4 tables
departments
id integer
name text
location text
budget numeric
salaries
id integer
employee_id integer
amount numeric
effective_date date
end_date? date
employees
id integer
name text
email text
department_id integer
manager_id? integer
hire_date date
title text
is_active boolean
job_history
id integer
employee_id integer
title text
department_id integer
start_date date
end_date? date

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first
Solution query
SELECT
  d.name AS department_name,
  COUNT(e.id) AS employee_count
FROM
  employees e
  JOIN departments d ON e.department_id = d.id
GROUP BY
  d.name

The shape

The planner reported 3 department groups at the grouping step; the real count is 8 — the gap is exactly the kind of distinct-value misestimate that happens when ANALYZE hasn't run since new departments were added. The query joins employees to departments, groups by department name, and counts. The shape mirrors what EXPLAIN was estimating, so the actual group count lines up against the plan's number directly.

Clause by clause

  • SELECT d.name AS department_name, COUNT(e.id) AS employee_count returns the department name and the count of employees in it. COUNT(e.id) counts each joined employee row, so each department's total comes from its own group.
  • FROM employees e reads the employee records.
  • JOIN departments d ON e.department_id = d.id resolves each employee's department to its name. The inner join drops employees whose department_id doesn't resolve to a department row — which shouldn't happen in a clean schema, but is the right default if it does.
  • GROUP BY d.name partitions the joined rows by department name. Each output row is one department's tally.

The trap

The planner's per-group estimate at a hash-aggregate node depends on the distinct-value statistic for the grouped column. Restructuring that adds new departments without a subsequent ANALYZE departments leaves that statistic frozen at the pre-restructure value — the planner keeps reporting 3 groups even though the table now has 8. The fix isn't a query rewrite; it's running ANALYZE on the table whose distinct-value count went stale. Until then, the planner sizes its hash table for 3 buckets and rehashes when the 4th, 5th, and 6th group show up at runtime — visible as a higher actual cost on the plan node even though the row counts are right.

You practiced cross-checking a planner's per-group estimate against reality — stale statistics often hide newly-added partitions at the grouping step.

How you actually get good at SQL

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

That's the whole SQLMaxx loop: 600+ real problems, instant AI feedback, mastery you can actually see, and spaced review that won't let you forget.

A stack of SQL practice problem cards, the top card showing an employees table.
615 problems · 66 concepts

Real problems. Not toy examples.

615 hand-built problems spanning all 66 concepts, from basic SELECTs to window functions, built on real schemas and real business questions, the kind you'll actually get asked on the job. Enough reps to make SQL automatic.

A retro computer showing a SQL query marked correct with a green checkmark.
Instant AI feedback

Write a query. Know if it's right in one second.

No copying an answer and hoping it clicked. The AI grader checks your real query against real data, catches exactly what's wrong, and explains the fix in plain English, like a senior analyst reading over your shoulder on every problem.

A circular mastery progress dial filling from blue to green, the SQLMaxx diamond at its center.
Mastery tracking

Stop guessing whether you actually know it.

SQLMaxx tracks every concept and shows you what you've mastered and what's still shaky. Your skills fill in one concept at a time, so 'I think I get joins' becomes something you can prove.

A SQL query editor circled by a blue return arrow with a clock, scheduled to come back for review.
Spaced review

Learn it once. Keep it for good.

Most of what you learn this week fades by next week. So when a concept comes due for review, SQLMaxx hands you a fresh problem to solve from a blank editor, not a flashcard to re-read. A research-backed spaced-repetition algorithm (FSRS) times each return for right before you'd forget, so your SQL is still there months later, when the interview or the job actually needs it.

Practice, feedback, mastery, review. That's the loop that turns reading into real skill.

Start free

No account, no credit card. Start solving in under a minute.