Analyst Debugging Patterns in SQL
Analyst debugging is the practice of systematically locating the layer in a complex query where a result diverges from the expected output. The process applies to two distinct classes of problem: wrong results, where the query returns a plausible but incorrect value, and unexpected NULLs or missing rows, where the output is incomplete in ways that are not immediately traceable to the query's structure.
Analyst debugging is the practice of systematically finding the layer in a complex query where the result went wrong. Every debugging problem is one of three things: wrong results, inflated aggregates, or unexpected NULLs. Each has a specific diagnostic move.
Wrong results: isolate each CTE
The most important property of a multi-CTE query for debugging is that each CTE can be run independently. When the final output is wrong, don't read the entire query — run each CTE in isolation and find the first layer where the output diverges from what it should be.
WITH monthly_totals AS (
SELECT
user_id,
date_trunc('month', order_date)::date AS month,
SUM(revenue) AS monthly_revenue
FROM orders
GROUP BY user_id, date_trunc('month', order_date)
)
-- Run this first: is monthly_totals correct for user_id = 42?
SELECT * FROM monthly_totals WHERE user_id = 42 ORDER BY month;Comment out all CTEs after the one you're inspecting and SELECT from that CTE directly. The output either matches expectations or it doesn't. If it doesn't, the bug is in that CTE's logic. If it does, move to the next layer. The bug is at the first layer where the output is wrong — everything above is a consequence.
Inflated aggregates: count rows at the join stage
If a SUM or AVG is producing numbers that are too large, check for fanout before aggregating:
SELECT COUNT(*) AS joined_rows FROM orders o
JOIN order_items oi ON oi.order_id = o.order_id;
-- Compare to: SELECT COUNT(*) FROM orders;If joined_rows exceeds the order count, the join is multiplying rows. Any order-level column summed over that result will be inflated. Fix: pre-aggregate the many-side table in a CTE before joining.
Unexpected NULLs: trace to the entry point
Steps through the CTE chain — check whether the NULL is present at each stage. The first layer where the NULL appears identifies its source:
- NULL from a LEFT JOIN with no match: fix with COALESCE at the join output layer
- NULL from arithmetic on a NULL column: fix with
COALESCE(column, 0)before the expression - NULL from a window function operating on NULL input: fix in the CTE that feeds the window layer
The one thing that trips people up
All three diagnostic moves follow the same principle: limit scope before drawing conclusions. The full query is too complex to debug as a unit. Run one CTE, count one join, read one node in the EXPLAIN plan. The method — narrow first, then inspect — works regardless of query length.
A query structured for readability (one operation per CTE, descriptive names) is also a query structured for debugging. The same architecture that makes it legible makes it tractable when it produces wrong results.
10 Analyst Debugging Patterns practice problems
Write a query to return two counts: the total number of (order, line-item) pairings on record (`joined_row_count`), and the total number of `orders` (`order_count`).
Write a query to return two counts: the total number of (employee, salary) pairings on record (`joined_row_count`), and the total number of `employees` (`employee_count`).
Write a query to return each order's `id` and its `item_count` — the number of `order_items` matched to that order.
Write a query to return each order's `id`, the `recorded_total` stored on the order, the `item_count` of its line items, and the `item_total` computed from those line items.
Write a query to return each department's `id`, name, total `employee_count`, and `avg_salary` — the average current salary across employees in that department.
Write a query to return each user's `user_id` and `purchase_count` — the count of `'purchase'` events recorded for that user.
Write a query to return each customer's `id`, name, and `total_revenue` — the combined line-item value across all of their `orders`, reported as a missing value for customers who have no orders on record.
Write a query to return each customer's `id`, name, and `total_revenue` — the combined line-item value across their `orders`, with each line item contributing its value exactly once to the customer's total, reported as a missing value for customers with no orders on record.
Write a query to return each department's `id`, name, `employee_count`, and `avg_salary` — the average current salary across that department's employees — for every department.
Write a query to return two metrics: `avg_items_per_order` (the average number of `order_items` per order) and `max_items_per_order` (the largest number of `order_items` attached to any single order).
These problems are part of the Analyst Debugging Patterns lesson in SQLMaxx, with instant grading and a worked solution on each.
Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.
That's the whole SQLMaxx loop: 600+ real problems, instant AI feedback, mastery you can actually see, and spaced review that won't let you forget.
Real problems. Not toy examples.
615 hand-built problems spanning all 66 concepts, from basic SELECTs to window functions, built on real schemas and real business questions, the kind you'll actually get asked on the job. Enough reps to make SQL automatic.
Write a query. Know if it's right in one second.
No copying an answer and hoping it clicked. The AI grader checks your real query against real data, catches exactly what's wrong, and explains the fix in plain English, like a senior analyst reading over your shoulder on every problem.
Stop guessing whether you actually know it.
SQLMaxx tracks every concept and shows you what you've mastered and what's still shaky. Your skills fill in one concept at a time, so 'I think I get joins' becomes something you can prove.
Learn it once. Keep it for good.
Most of what you learn this week fades by next week. So when a concept comes due for review, SQLMaxx hands you a fresh problem to solve from a blank editor, not a flashcard to re-read. A research-backed spaced-repetition algorithm (FSRS) times each return for right before you'd forget, so your SQL is still there months later, when the interview or the job actually needs it.
Practice, feedback, mastery, review. That's the loop that turns reading into real skill.
Start freeNo account, no credit card. Start solving in under a minute.