N058-H2 Tier 5 · Expert · hard hr · Helix Systems

Return each active employee with a current salary on record, with their name, department name, current salary, the average current salary across all active employees in their department, and the difference between their salary and that department average

Part of Multi-CTE Query Architecture in SQL

The problem

Scenario: Helix Systems' compensation team is conducting a salary equity review and needs each active employee's current salary placed alongside their department's average current salary.

Task: Write a query to return each active employee with a current salary on record, with their name, department name, current salary, the average current salary across all active employees in their department, and the difference between their salary and that department average.

Assumptions:

  • An active employee has is_active equal to TRUE.
  • The current salary is the salary record whose end_date is a missing value.
  • The salary_vs_avg value is the employee's current salary minus the average current salary of their department; positive values indicate above-average compensation.

Output:

  • One row per active employee with a current salary on record.
  • Columns in this order: employee_name, department_name, salary, dept_avg_salary, salary_vs_avg.
  • Sorted by department_name ascending, then salary descending within each department.
Schema · hr 4 tables
departments
id integer
name text
location text
budget numeric
salaries
id integer
employee_id integer
amount numeric
effective_date date
end_date? date
employees
id integer
name text
email text
department_id integer
manager_id? integer
hire_date date
title text
is_active boolean
job_history
id integer
employee_id integer
title text
department_id integer
start_date date
end_date? date

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first
Solution query
WITH
  current_salaries AS (
    SELECT
      e.id AS employee_id,
      e.name AS employee_name,
      e.department_id,
      d.name AS department_name,
      s.amount AS salary
    FROM
      employees e
      JOIN departments d ON e.department_id = d.id
      JOIN salaries s ON s.employee_id = e.id
    WHERE
      e.is_active = TRUE
      AND s.end_date IS NULL
  ),
  dept_averages AS (
    SELECT
      department_id,
      AVG(salary) AS dept_avg_salary
    FROM
      current_salaries
    GROUP BY
      department_id
  ),
  salary_comparison AS (
    SELECT
      cs.employee_name,
      cs.department_name,
      cs.salary,
      da.dept_avg_salary,
      cs.salary - da.dept_avg_salary AS salary_vs_avg
    FROM
      current_salaries cs
      JOIN dept_averages da ON da.department_id = cs.department_id
  )
SELECT
  employee_name,
  department_name,
  salary,
  dept_avg_salary,
  salary_vs_avg
FROM
  salary_comparison
ORDER BY
  department_name,
  salary DESC

The shape

Three CTEs that compute the per-department average in its own named layer and then re-attach each employee back to that average for the row-level comparison. The first builds the active-employee-with-current-salary set, the second collapses it to one average per department, and the third joins those two layers together so every employee sits next to their department's number.

Clause by clause

WITH current_salaries AS (
    SELECT e.id AS employee_id, e.name AS employee_name, e.department_id, d.name AS department_name, s.amount AS salary
    FROM employees e
    JOIN departments d ON e.department_id = d.id
    JOIN salaries s ON s.employee_id = e.id
    WHERE e.is_active = TRUE
    AND s.end_date IS NULL
)

Three tables come together to build the qualifying employee set: employees for identity, departments for the readable name, salaries for the current amount. The two predicates restrict to active employees with an open salary record. department_id is carried alongside department_name because the next two layers use the ID as the join key.

dept_averages AS (
    SELECT department_id, AVG(salary) AS dept_avg_salary
    FROM current_salaries
    GROUP BY department_id
)

The per-department average is computed in its own layer over the qualifying set. GROUP BY department_id produces one row per department with a single dept_avg_salary value. Data sits at 155,333.33, Engineering at the upper end of the range.

salary_comparison AS (
    SELECT cs.employee_name, cs.department_name, cs.salary, da.dept_avg_salary,
           cs.salary - da.dept_avg_salary AS salary_vs_avg
    FROM current_salaries cs
    JOIN dept_averages da ON da.department_id = cs.department_id
)

Every employee row is joined back to its department's average through department_id. The join is the mechanism that broadcasts the single per-department number across every employee in that department. cs.salary - da.dept_avg_salary computes the gap. Carlos Vega ends up at +94,666.67 against Data's average.

  • SELECT employee_name, department_name, salary, dept_avg_salary, salary_vs_avg FROM salary_comparison ORDER BY department_name, salary DESC orders the output alphabetically by department, then by salary within each department.

Why a separate CTE for the averages and not a window function

AVG(salary) OVER (PARTITION BY department_id) on current_salaries would produce the same per-row average without the second CTE. Both shapes are correct. The reason to factor the average into its own named layer is auditability: the department average is itself a deliverable the compensation team will want to see, and naming it as dept_averages gives that intermediate set a place to be inspected in isolation. A window function buries the computation inside a row-by-row result. A separate CTE puts it on the wall.

The trap

The set the average is computed over has to match the set the comparison runs against. If dept_averages were computed against the raw employees table — including non-active employees or those without a current salary record — the per-employee gap would be measured against a different denominator than the one the prompt describes. Reading dept_averages from current_salaries, not from employees, is what keeps the average anchored to the qualifying set. Same population on both sides of the comparison.

You practiced computing the department average in its own CTE and re-attaching each employee back to that average, so the comparison runs in a separate layer from the summary calculation.

How you actually get good at SQL

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

That's the whole SQLMaxx loop: 600+ real problems, instant AI feedback, mastery you can actually see, and spaced review that won't let you forget.

A stack of SQL practice problem cards, the top card showing an employees table.
615 problems · 66 concepts

Real problems. Not toy examples.

615 hand-built problems spanning all 66 concepts, from basic SELECTs to window functions, built on real schemas and real business questions, the kind you'll actually get asked on the job. Enough reps to make SQL automatic.

A retro computer showing a SQL query marked correct with a green checkmark.
Instant AI feedback

Write a query. Know if it's right in one second.

No copying an answer and hoping it clicked. The AI grader checks your real query against real data, catches exactly what's wrong, and explains the fix in plain English, like a senior analyst reading over your shoulder on every problem.

A circular mastery progress dial filling from blue to green, the SQLMaxx diamond at its center.
Mastery tracking

Stop guessing whether you actually know it.

SQLMaxx tracks every concept and shows you what you've mastered and what's still shaky. Your skills fill in one concept at a time, so 'I think I get joins' becomes something you can prove.

A SQL query editor circled by a blue return arrow with a clock, scheduled to come back for review.
Spaced review

Learn it once. Keep it for good.

Most of what you learn this week fades by next week. So when a concept comes due for review, SQLMaxx hands you a fresh problem to solve from a blank editor, not a flashcard to re-read. A research-backed spaced-repetition algorithm (FSRS) times each return for right before you'd forget, so your SQL is still there months later, when the interview or the job actually needs it.

Practice, feedback, mastery, review. That's the loop that turns reading into real skill.

Start free

No account, no credit card. Start solving in under a minute.