Scenario: Helix Systems' HR analytics team needs the total current salary cost broken out by department, drawing only from active employees and their current salary records.
Task: Write a query to return each department_id and its total_salary — the combined current salary across active employees in that department.
Assumptions:
- An active employee has
is_activeequal toTRUE. - A current salary record has
end_daterecorded as a missing value. - A department's
total_salaryis the combinedamountacross the current salary records of its active employees. - The result covers only departments with at least one qualifying record.
Output:
- One row per qualifying department.
- Columns in this order:
department_id,total_salary.
Schema · hr 4 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
WITH
active_employees AS (
SELECT
id AS employee_id,
department_id
FROM
employees
WHERE
is_active = TRUE
),
current_dept_salaries AS (
SELECT
ae.department_id,
SUM(s.amount) AS total_salary
FROM
active_employees ae
JOIN salaries s ON s.employee_id = ae.employee_id
WHERE
s.end_date IS NULL
GROUP BY
ae.department_id
)
SELECT
department_id,
total_salary
FROM
current_dept_salaries The shape
Two CTEs cut the problem into two named layers: the first names "who counts" (active employees), the second sums their current salaries by department. Each layer reads as a labeled step; the final SELECT is just a hand-off.
Clause by clause
WITH active_employees AS (SELECT id AS employee_id, department_id FROM employees WHERE is_active = TRUE)names the pool of qualifying employees. The CTE keeps onlyis_active = TRUErows and exposes the two columns the next stage needs.current_dept_salaries AS (SELECT ae.department_id, SUM(s.amount) AS total_salary FROM active_employees ae JOIN salaries s ON s.employee_id = ae.employee_id WHERE s.end_date IS NULL GROUP BY ae.department_id)joins the active pool tosalaries, restricts to current rows withs.end_date IS NULL, and totals by department. TheJOIN(notLEFT JOIN) is deliberate — a department with no active employee or no current salary should not appear, which matches "covers only departments with at least one qualifying record."SELECT department_id, total_salary FROM current_dept_salariesreads the second CTE through unchanged.
Why this and not a single query
SELECT e.department_id, SUM(s.amount) FROM employees e JOIN salaries s ON s.employee_id = e.id WHERE e.is_active = TRUE AND s.end_date IS NULL GROUP BY e.department_id produces the same numbers. The CTE form earns its space here because "active employees" is a concept worth naming — if the next analyst reads the query, the active_employees label tells them what the filter chain means before they trace it. For a one-shot aggregation, the flat form is leaner; pick on whether the named layer pays for itself.
The trap
Switching the second CTE's JOIN to LEFT JOIN would silently change the contract. A department with no current salary record would show up with SUM = NULL instead of being excluded, which contradicts "covers only departments with at least one qualifying record." The inner JOIN is doing the exclusion work; the outer query has no WHERE to fall back on.
You practiced staging the qualifying employees and their current salaries across two CTEs before producing the per-department total — each layer doing one transformation.