Helix Systems' workforce analytics team wants to identify which depth levels in the organizational hierarchy carry significant headcount — depth levels with more than 5 employees.
Write a query to return every qualifying depth level and its employee count.
Assumptions:
- The CEO has a missing
manager_idand sits at depth1. Employees one step below the CEO sit at depth2, and so on. - The headcount at a depth level is the number of employees whose computed depth equals that level.
- Only depth levels whose headcount is strictly greater than
5should appear in the result.
Output:
- One row per qualifying depth level, with columns
depthandemployee_count.
Schema · hr 4 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
WITH RECURSIVE
org AS (
SELECT
id,
1 AS depth
FROM
employees
WHERE
manager_id IS NULL
UNION ALL
SELECT
e.id,
o.depth + 1
FROM
employees e
JOIN org o ON e.manager_id = o.id
),
level_counts AS (
SELECT
depth,
COUNT(id) AS employee_count
FROM
org
GROUP BY
depth
)
SELECT
depth,
employee_count
FROM
level_counts
WHERE
employee_count > 5 The shape
Three stages, two in a chained WITH clause and one in the main query. The first CTE walks the hierarchy and stamps each employee with their depth. The second CTE aggregates that flat result into one row per depth level. The main query then applies the > 5 threshold against the per-level count.
Clause by clause
- The recursive CTE computes per-employee depth:
WITH RECURSIVE org AS (
SELECT id, 1 AS depth
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.id, o.depth + 1
FROM employees e
JOIN org o ON e.manager_id = o.id
)Same top-down walk as the unbounded hierarchy traversal: anchor on the CEO at depth 1, recurse downward by matching each employee's manager_id to a row already in org, stamp each pass with one more than the source depth.
- The chained CTE aggregates the recursion's output by level:
level_counts AS (
SELECT depth, COUNT(id) AS employee_count
FROM org
GROUP BY depth
)The second CTE reads org and collapses it into one row per depth value, with COUNT(id) returning the headcount at that level. Depth 1 shows 1, depth 2 shows 9, depth 3 shows 19, depth 4 shows 31.
- The main query thresholds the per-level counts:
SELECT depth, employee_count
FROM level_counts
WHERE employee_count > 5The filter employee_count > 5 runs against the named aggregate column. Depth 1 falls below the threshold and drops out; depths 2, 3, and 4 survive.
Why three stages and not two
The aggregation cannot live inside the recursive member — an aggregate would collapse rows the next pass needs to join against, and the recursion would break. The threshold cannot live in the level_counts CTE either, because that CTE's WHERE runs over the pre-aggregation rows in org, not over the aggregate employee_count. Each layer has one job. The recursion produces per-employee depth, the chained CTE produces per-level counts, and the main query applies the cutoff against the aggregate column. The three-stage shape is the standard way to layer "recurse, then aggregate, then threshold."
The trap
The threshold runs on the aggregate, not on individual employees. WHERE employee_count > 5 in the main query is comparing against the named column produced by COUNT(id) in the chained CTE. The named aggregate only exists once level_counts finishes computing; pushing the same filter into level_counts's WHERE would not work, because that WHERE runs over the pre-aggregation rows in org, where employee_count is not yet a column. The filter has to sit downstream of the aggregation, which is what the three-stage layering makes structurally explicit.
You practiced chaining a WITH RECURSIVE traversal into a follow-up CTE — recurse first to compute per-record depth, aggregate by depth in a second layer, then threshold-check in the main query.