Helix Systems' HR team wants to find large departments — those with 5 or more staff on record.
Write a query to return the department ID and headcount for every large department.
Assumptions:
- The
employeestable has one row per employee with adepartment_id. - A department's headcount is the number of
employeesrecords linked to thatdepartment_id. - Only departments with
5or more employees should appear.
Output:
- One row per qualifying department, with columns
department_idandheadcount.
Schema · hr 4 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
WITH
dept_counts AS (
SELECT
department_id,
COUNT(*) AS headcount
FROM
employees
GROUP BY
department_id
),
large_depts AS (
SELECT
department_id,
headcount
FROM
dept_counts
WHERE
headcount >= 5
)
SELECT
department_id,
headcount
FROM
large_depts The shape
Two named layers in one WITH clause. The first counts employees per department, and the second reads that result and keeps only departments with 5 or more on the headcount. The threshold check runs against the already-counted rows, not against the raw employees table.
Clause by clause
The first CTE reduces the employee roster to one row per department with its headcount:
WITH dept_counts AS (
SELECT department_id, COUNT(*) AS headcount
FROM employees
GROUP BY department_id
)GROUP BY department_id produces one row per department, and COUNT(*) counts the rows inside each group. AS headcount names the aggregate so the next layer can refer to it.
The second CTE reads those counts and keeps the large ones:
large_depts AS (
SELECT department_id, headcount
FROM dept_counts
WHERE headcount >= 5
)FROM dept_counts reads the counted result as if it were a table. WHERE headcount >= 5 drops the smaller departments. The six survivors are departments 1, 2, 3, 4, 7, and 8, with headcounts ranging from 5 to 17.
SELECT department_id, headcount FROM large_deptsreturns the final layer unchanged. The work happens in the two CTEs; the main query just reads the last one.
You practiced layering two WITH stages — compute the per-department count in one named layer, then read from that layer in a second named layer that applies the threshold check.