The HR team at Helix Systems is auditing team size across the organization.
Write a query to return the department ID and employee count for every department that has 3 or more employees on record.
Assumptions:
- The
employeestable has one row per employee with adepartment_id. - A department's employee count is the number of
employeesrecords linked to thatdepartment_id. - Only departments with
3or more employees should appear.
Output:
- One row per qualifying department, with columns
department_idandheadcount.
Schema · hr 4 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
WITH
dept_headcount AS (
SELECT
department_id,
COUNT(*) AS headcount
FROM
employees
GROUP BY
department_id
)
SELECT
department_id,
headcount
FROM
dept_headcount
WHERE
headcount >= 3 The shape
The WITH layer computes a headcount per department; the main query filters that named result with WHERE headcount >= 3 against the aggregate column. The threshold check happens after the grouping, in a separate named layer.
Clause by clause
- The
WITHclause definesdept_headcount:
WITH dept_headcount AS (
SELECT department_id, COUNT(*) AS headcount
FROM employees
GROUP BY department_id
)GROUP BY department_id partitions employees by department; COUNT(*) counts each partition. Every department with at least one employee gets a row in the layer, headcount attached.
SELECT department_id, headcount FROM dept_headcount WHERE headcount >= 3is the main query. It reads the named layer and keeps the rows whose count is3or more. Department 1 stays in with17; department 3 stays in with11; the other six departments also clear the threshold on this data.
Why this and not a derived table in FROM
A derived table would put the per-department aggregation inside the main query's FROM and apply the same threshold in the same WHERE. The result set is identical either way. The WITH version pulls the aggregation out, names it, and lets the main query read as two clear steps: compute the per-department count, then keep the ones at the cutoff. The threshold against an aggregate is the recurring shape that named layers were built to make legible.
The trap
COUNT(*) only produces headcount after the grouping runs. The named column does not exist on employees itself, only on the layer's output. Trying to write WHERE COUNT(*) >= 3 directly against employees without the layer fails, because the aggregate has not been computed yet. The two-stage shape — group first, filter the named aggregate second — is the structural answer to that ordering problem.
You practiced computing a per-category count in a WITH layer and applying a threshold check in the main query.