Large Departments Staged — Chained CTEs in SQL

The problem

Helix Systems' HR team wants to find large departments — those with 5 or more staff on record.

Write a query to return the department ID and headcount for every large department.

Assumptions:

The employees table has one row per employee with a department_id.
A department's headcount is the number of employees records linked to that department_id.
Only departments with 5 or more employees should appear.

Output:

One row per qualifying department, with columns department_id and headcount.

Schema · hr 4 tables

departments

id integer

name text

location text

budget numeric

salaries

id integer

employee_id integer

amount numeric

effective_date date

end_date? date

employees

id integer

name text

email text

department_id integer

manager_id? integer

hire_date date

title text

is_active boolean

job_history

id integer

employee_id integer

title text

department_id integer

start_date date

end_date? date

Check answerShift Ctrl ↵

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first

Solution query

WITH
  dept_counts AS (
    SELECT
      department_id,
      COUNT(*) AS headcount
    FROM
      employees
    GROUP BY
      department_id
  ),
  large_depts AS (
    SELECT
      department_id,
      headcount
    FROM
      dept_counts
    WHERE
      headcount >= 5
  )
SELECT
  department_id,
  headcount
FROM
  large_depts

The shape

Two named layers in one WITH clause. The first counts employees per department, and the second reads that result and keeps only departments with 5 or more on the headcount. The threshold check runs against the already-counted rows, not against the raw employees table.

Clause by clause

The first CTE reduces the employee roster to one row per department with its headcount:

WITH dept_counts AS (
  SELECT department_id, COUNT(*) AS headcount
  FROM employees
  GROUP BY department_id
)

GROUP BY department_id produces one row per department, and COUNT(*) counts the rows inside each group. AS headcount names the aggregate so the next layer can refer to it.

The second CTE reads those counts and keeps the large ones:

large_depts AS (
  SELECT department_id, headcount
  FROM dept_counts
  WHERE headcount >= 5
)

FROM dept_counts reads the counted result as if it were a table. WHERE headcount >= 5 drops the smaller departments. The six survivors are departments 1, 2, 3, 4, 7, and 8, with headcounts ranging from 5 to 17.

SELECT department_id, headcount FROM large_depts returns the final layer unchanged. The work happens in the two CTEs; the main query just reads the last one.

You practiced layering two WITH stages — compute the per-department count in one named layer, then read from that layer in a second named layer that applies the threshold check.

Return the department ID and headcount for every large department

The shape

Clause by clause

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.