Scenario: Helix Systems' HR analytics team is sizing a planned compensation report that pulls each active employee's salary records and job-history records together, before committing to its compute cost.
Task: Write a query to return the total number of pairings produced when each active employee is matched up with each of their salary records and each of their job-history records simultaneously.
Assumptions:
- An active employee has
is_activeequal toTRUE. - An employee can have multiple salary records.
- An employee can have multiple job-history entries.
- An active employee with 3 salary records and 4 job-history entries contributes 12 pairings to the total — every salary paired with every job-history entry.
Output:
- One row, holding the total pairing count.
- Columns in this order:
row_count.
Schema · hr 4 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
COUNT(*) AS row_count
FROM
employees e
JOIN salaries s ON s.employee_id = e.id
JOIN job_history jh ON jh.employee_id = e.id
WHERE
e.is_active = TRUE The shape
Joining a parent to two independent child collections multiplies the per-parent row count by the cardinality of each child set. An employee with 3 salary records and 4 job-history entries doesn't produce 3 + 4 = 7 rows after both joins — it produces 3 × 4 = 12, because every salary record gets paired with every job-history entry for that employee.
Clause by clause
SELECT COUNT(*) AS row_countcounts the rows produced by the full joined, filtered result. Each row is one (employee, salary record, job-history entry) triple.FROM employees ereads the employees as the driving parent.JOIN salaries s ON s.employee_id = e.idattaches every salary record to its employee. An employee with three salary records contributes three rows after this join.JOIN job_history jh ON jh.employee_id = e.idattaches every job-history entry to its employee. Crucially, this join runs against the already-fanned-out result from the previous join — each of the three salary rows now gets paired with every one of the employee's four job-history entries, producing 12 rows total for that employee.WHERE e.is_active = TRUEkeeps only active employees. The filter applies on the employee side, so all salary records and job-history entries of inactive employees drop out together.
Why this and not summing the two child counts
The compute-cost sizing is asking the right question — what is the post-join row volume? — and the only way to get that number is to actually run the joined count. Adding COUNT(*) FROM salaries and COUNT(*) FROM job_history separately would tell the HR team how many salary records and job-history entries exist on their own, but it would miss the multiplicative interaction between them entirely. The 12-row example in the prompt is the whole point: two collections joined to the same parent multiply, they don't add.
The trap
This is the canonical multi-way fanout shape, and the danger isn't visible until someone tries to aggregate a salary column or a job-history column over this result. Summing s.salary_amount here would count each salary record once per matching job-history entry — four times for the employee in the example. Summing jh.tenure_months would count each job-history entry once per matching salary record — three times. The pairing count is the right metric for compute sizing, but it is the wrong shape for any per-employee, per-salary, or per-history aggregation. The fix in those cases is to pre-aggregate each child collection to one row per employee before bringing it into the same query, so the parent table joins to two one-to-one summaries instead of two one-to-many tables.
You practiced sizing a multi-way join's row explosion — pairing two child collections per parent multiplies the per-parent count by the cardinality of each child set.