N066-M2 Tier 5 · Expert · medium hr · Helix Systems

Return each department's `id`, name, total `employee_count`, and `avg_salary` — the average current salary across employees in that department

Part of Analyst Debugging Patterns in SQL

The problem

Scenario: Helix Systems' HR salary report is showing a missing-value average salary for two departments. To determine whether the cause is departments with no employees or employees with no current salary records, the diagnostic places both metrics side by side per department.

Task: Write a query to return each department's id, name, total employee_count, and avg_salary — the average current salary across employees in that department.

Assumptions:

  • An active salary record has end_date recorded as a missing value.
  • A department's employee_count is the count of employees assigned to it; departments with no employees appear with employee_count of 0.
  • A department's avg_salary is the average of every active salary across its employees, reported as a missing value when no qualifying salary record is on file.
  • The result covers every department.

Output:

  • One row per department.
  • Columns in this order: department_id, department_name, employee_count, avg_salary.
  • Sorted by department_name ascending.
Schema · hr 4 tables
departments
id integer
name text
location text
budget numeric
salaries
id integer
employee_id integer
amount numeric
effective_date date
end_date? date
employees
id integer
name text
email text
department_id integer
manager_id? integer
hire_date date
title text
is_active boolean
job_history
id integer
employee_id integer
title text
department_id integer
start_date date
end_date? date

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first
Solution query
WITH
  dept_headcount AS (
    SELECT
      d.id AS department_id,
      d.name AS department_name,
      COUNT(e.id) AS employee_count
    FROM
      departments d
      LEFT JOIN employees e ON e.department_id = d.id
    GROUP BY
      d.id,
      d.name
  ),
  dept_avg_salary AS (
    SELECT
      e.department_id,
      AVG(s.amount) AS avg_salary
    FROM
      employees e
      LEFT JOIN salaries s ON s.employee_id = e.id
      AND s.end_date IS NULL
    GROUP BY
      e.department_id
  )
SELECT
  dh.department_id,
  dh.department_name,
  dh.employee_count,
  das.avg_salary
FROM
  dept_headcount dh
  LEFT JOIN dept_avg_salary das ON das.department_id = dh.department_id
ORDER BY
  dh.department_name

The shape

Two CTEs, each measuring one thing about a department, joined back to each other on department_id. dept_headcount answers "does this department have employees?"; dept_avg_salary answers "do those employees have active salary records?" Separating them is what makes a missing-value avg_salary legible — the row's employee_count tells you which side of the chain is empty.

Clause by clause

  • WITH dept_headcount AS (SELECT d.id AS department_id, d.name AS department_name, COUNT(e.id) AS employee_count FROM departments d LEFT JOIN employees e ON e.department_id = d.id GROUP BY d.id, d.name) counts employees per department. The left join keeps every department, and COUNT(e.id) returns 0 for departments with no employees because the unmatched e.id is NULL and COUNT(col) ignores NULLs.
  • dept_avg_salary AS (SELECT e.department_id, AVG(s.amount) AS avg_salary FROM employees e LEFT JOIN salaries s ON s.employee_id = e.id AND s.end_date IS NULL GROUP BY e.department_id) averages active salaries per department. The end_date IS NULL predicate is on the join condition so unmatched employees still contribute a row to the per-department group, and AVG returns NULL when every salary in the group is NULL.
  • SELECT dh.department_id, dh.department_name, dh.employee_count, das.avg_salary pulls the headcount column from one CTE and the salary column from the other onto a single row per department.
  • FROM dept_headcount dh LEFT JOIN dept_avg_salary das ON das.department_id = dh.department_id keeps every department from the headcount side. A department with no employees has no row in dept_avg_salary at all, so the left join produces NULL for avg_salary — the visible signal of an empty-headcount cause.
  • ORDER BY dh.department_name sorts alphabetically. The reference shows Data through Support, each with a non-zero headcount and a real average — meaning on this particular dataset both sides are populated everywhere, but the structure is ready to surface either failure mode the moment one appears.

Why this and not one combined aggregate

A single query joining departments to employees to salaries and aggregating both COUNT and AVG in one SELECT collapses the two measurements into one layer. When the average comes back missing, the analyst has no way to tell whether the chain broke at departments → employees or at employees → salaries. Splitting them into independent CTEs gives each measurement its own provenance, which is the entire point of the diagnostic.

You practiced separating two diagnostic measurements into independent layers — headcount on one side, salary average on the other — so each can be inspected on its own and the source of a missing-value average is obvious from the row.

How you actually get good at SQL

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

That's the whole SQLMaxx loop: 600+ real problems, instant AI feedback, mastery you can actually see, and spaced review that won't let you forget.

A stack of SQL practice problem cards, the top card showing an employees table.
615 problems · 66 concepts

Real problems. Not toy examples.

615 hand-built problems spanning all 66 concepts, from basic SELECTs to window functions, built on real schemas and real business questions, the kind you'll actually get asked on the job. Enough reps to make SQL automatic.

A retro computer showing a SQL query marked correct with a green checkmark.
Instant AI feedback

Write a query. Know if it's right in one second.

No copying an answer and hoping it clicked. The AI grader checks your real query against real data, catches exactly what's wrong, and explains the fix in plain English, like a senior analyst reading over your shoulder on every problem.

A circular mastery progress dial filling from blue to green, the SQLMaxx diamond at its center.
Mastery tracking

Stop guessing whether you actually know it.

SQLMaxx tracks every concept and shows you what you've mastered and what's still shaky. Your skills fill in one concept at a time, so 'I think I get joins' becomes something you can prove.

A SQL query editor circled by a blue return arrow with a clock, scheduled to come back for review.
Spaced review

Learn it once. Keep it for good.

Most of what you learn this week fades by next week. So when a concept comes due for review, SQLMaxx hands you a fresh problem to solve from a blank editor, not a flashcard to re-read. A research-backed spaced-repetition algorithm (FSRS) times each return for right before you'd forget, so your SQL is still there months later, when the interview or the job actually needs it.

Practice, feedback, mastery, review. That's the loop that turns reading into real skill.

Start free

No account, no credit card. Start solving in under a minute.