Join Fanout and Aggregate Correctness in SQL
Join fanout is a correctness problem that occurs when a join multiplies rows beyond what the query intends, causing subsequent aggregations to compute over inflated row counts and produce wrong results. The query runs without error, returns plausible-looking numbers, and gives no indication anything is wrong.
Before this Joining Multiple Tables, GROUP BY, NULL Handling in Joins and Aggregates
Builds toward Analyst Debugging Patterns
Join fanout is what happens when a join multiplies rows beyond what you intended, causing every aggregation that follows to compute over inflated data. The query runs without error, returns plausible-looking numbers, and gives no indication anything is wrong.
Here's the mechanism. You have an orders table and an order_items table, where each order has multiple line items. When you join them on order_id, you get one row per line item — not per order. An order with five items appears five times in the joined result. If you then sum orders.revenue (an order-level column), that revenue figure gets counted five times for the five-item order. The sum is five times too large, and nothing in the query or output flags the problem.
-- Fanout inflating revenue
SELECT
o.customer_id,
SUM(o.order_revenue) AS total_revenue -- wrong: counted once per line item
FROM orders o
JOIN order_items oi ON oi.order_id = o.order_id
GROUP BY o.customer_idCustomers with more line items appear to have disproportionately higher revenue. The error is invisible without comparing the output to a known-correct total.
Diagnosing fanout
Before aggregating, check the row count at the join level:
SELECT COUNT(*) FROM orders o
JOIN order_items oi ON oi.order_id = o.order_idIf this exceeds SELECT COUNT(*) FROM orders, the join is fanning out. That's expected behavior for a one-to-many join — the problem is aggregating an order-level column over the multiplied rows.
Make this check a habit: verify the join row count before writing the aggregation. It's faster to catch fanout here than to trace wrong numbers back through the query after the fact.
The fix: pre-aggregate before joining
Collapse the right-side table to one row per key before the join runs:
WITH item_totals AS (
SELECT order_id, SUM(item_revenue) AS total_item_revenue
FROM order_items
GROUP BY order_id
)
SELECT
o.customer_id,
SUM(o.order_revenue) AS total_revenue
FROM orders o
JOIN item_totals it ON it.order_id = o.order_id
GROUP BY o.customer_idNow the join is one-to-one on order_id. No fanout. The aggregation is correct.
The one thing that trips people up
Fanout compounds across multiple joins. If you join orders to both order_items (5 items) and order_shipments (2 shipments), you get 10 rows per order — one per item-shipment combination. Aggregating any order-level column inflates it by a factor of 10.
The precondition for avoiding fanout is knowing the cardinality of every join before writing it: for each join key, can the right side have more than one row per key value? If yes, aggregating anything from the left side over that join is dangerous unless the right side is pre-aggregated first.
10 Join Fanout and Aggregate Correctness practice problems
Write a query to return the total number of (order, line-item) pairings on record — each line item paired with its parent order.
Write a query to return each order's `id` and the total revenue across its line items in `order_items`.
Write a query to return each order's `id` and the number of line items it contains.
Write a query to return each customer's `customer_id` and their total revenue across all line items in their `orders`.
Write a query to return each category name and the total revenue generated from line items across all `products` in that category.
Write a query to return each department name and the number of salary records on file for active employees in that department.
Write a query to return each order's `id` and the number of line items recorded against it.
Write a query to return the total number of pairings produced when each active employee is matched up with each of their salary records and each of their job-history records simultaneously.
Write a query to return each order's `id` and its `item_revenue` — the combined revenue across its line items, reported as `0` for orders with no recorded line items.
Write a query to return each customer's `customer_id`, the total number of line items purchased across all their `orders`, and the total revenue across those line items.
These problems are part of the Join Fanout and Aggregate Correctness lesson in SQLMaxx, with instant grading and a worked solution on each.
Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.
That's the whole SQLMaxx loop: 600+ real problems, instant AI feedback, mastery you can actually see, and spaced review that won't let you forget.
Real problems. Not toy examples.
615 hand-built problems spanning all 66 concepts, from basic SELECTs to window functions, built on real schemas and real business questions, the kind you'll actually get asked on the job. Enough reps to make SQL automatic.
Write a query. Know if it's right in one second.
No copying an answer and hoping it clicked. The AI grader checks your real query against real data, catches exactly what's wrong, and explains the fix in plain English, like a senior analyst reading over your shoulder on every problem.
Stop guessing whether you actually know it.
SQLMaxx tracks every concept and shows you what you've mastered and what's still shaky. Your skills fill in one concept at a time, so 'I think I get joins' becomes something you can prove.
Learn it once. Keep it for good.
Most of what you learn this week fades by next week. So when a concept comes due for review, SQLMaxx hands you a fresh problem to solve from a blank editor, not a flashcard to re-read. A research-backed spaced-repetition algorithm (FSRS) times each return for right before you'd forget, so your SQL is still there months later, when the interview or the job actually needs it.
Practice, feedback, mastery, review. That's the loop that turns reading into real skill.
Start freeNo account, no credit card. Start solving in under a minute.