Correlated Subqueries in SQL
A correlated subquery references a column from its outer query, creating a dependency that forces the subquery to be re-evaluated once for each row the outer query processes. The subquery cannot execute independently; its result changes based on which outer row is currently being examined.
A correlated subquery re-runs for each row the outer query processes, using that row's values as input. The subquery's result is different for every outer row — it's not a fixed value computed once and reused.
The clearest use case: find every employee earning above the average salary for their own department. The challenge is that "their department" changes for every employee. Computing one average for the whole table isn't useful — you need a per-department average, evaluated row by row.
With a correlated subquery, you compute the average once per employee, dynamically, using that employee's department:
SELECT
e.employee_id,
e.name,
e.salary,
e.department_id
FROM employees e
WHERE e.salary > (
SELECT AVG(salary)
FROM employees dept_avg
WHERE dept_avg.department_id = e.department_id
)For each employee row, the subquery runs with that employee's department_id. Employee 12 in department 3 triggers the subquery with department_id = 3. Employee 45 in department 7 triggers it with department_id = 7. The outer WHERE uses whichever result corresponds to the current row.
Correlated subqueries also work in the SELECT list to compute a value per outer row:
SELECT
d.department_id,
d.department_name,
(
SELECT COUNT(*)
FROM employees e
WHERE e.department_id = d.department_id
) AS headcount
FROM departments dFor each department, the subquery counts its employees and attaches the result as a column. This is equivalent to a LEFT JOIN with GROUP BY — same output, different structure. The correlated form is sometimes cleaner when you only need one computed value per outer row and the join version would add complexity.
Here's the SELECT-list form on real data:
EXISTS and correlated subqueries
EXISTS — which you've already seen — relies on this same mechanism. The subquery inside EXISTS references the outer row to check whether any matching rows exist at all:
SELECT customer_id
FROM customers c
WHERE EXISTS (
SELECT 1
FROM orders o
WHERE o.customer_id = c.customer_id
)For each customer, the subquery checks whether any orders reference that customer. EXISTS stops as soon as it finds one match — it doesn't need to count them all. This makes it efficient for "does at least one match exist?" questions.
The one thing that trips people up
Correlated subqueries can be slow on large tables because the subquery executes once per outer row. A query with 500,000 outer rows runs the subquery 500,000 times. PostgreSQL's planner can sometimes rewrite them as joins internally, but it's not guaranteed.
When a correlated subquery can be expressed as a window function or a join, those alternatives are usually faster. Correlated subqueries are most appropriate when the per-row computation is genuinely unique and can't be cleanly expressed as a join without significantly complicating the query.
9 Correlated Subqueries practice problems
Write a query to return every customer's ID, name, and total order count.
Write a query to return every customer's ID, name, and highest order amount.
Write a query to return the ID and name of every customer who has placed at least one order.
Write a query to return the order ID, customer ID, and total amount for every order whose `total_amount` exceeds that same customer's average order amount across all their orders.
Write a query to return the ID and name of every customer who has placed more than `3` orders on record.
Write a query to return every department's ID, name, and total employee count.
Write a query to return every order's ID, customer ID, total amount, and the highest unit price among the items in that order.
Write a query to return the order ID, customer ID, and total amount for every order whose `total_amount` equals that same customer's highest order amount.
Write a query to return the product ID, category ID, name, and price for every product whose `price` equals the highest `price` within its own category.
These problems are part of the Correlated Subqueries lesson in SQLMaxx, with instant grading and a worked solution on each.
Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.
That's the whole SQLMaxx loop: 600+ real problems, instant AI feedback, mastery you can actually see, and spaced review that won't let you forget.
Real problems. Not toy examples.
615 hand-built problems spanning all 66 concepts, from basic SELECTs to window functions, built on real schemas and real business questions, the kind you'll actually get asked on the job. Enough reps to make SQL automatic.
Write a query. Know if it's right in one second.
No copying an answer and hoping it clicked. The AI grader checks your real query against real data, catches exactly what's wrong, and explains the fix in plain English, like a senior analyst reading over your shoulder on every problem.
Stop guessing whether you actually know it.
SQLMaxx tracks every concept and shows you what you've mastered and what's still shaky. Your skills fill in one concept at a time, so 'I think I get joins' becomes something you can prove.
Learn it once. Keep it for good.
Most of what you learn this week fades by next week. So when a concept comes due for review, SQLMaxx hands you a fresh problem to solve from a blank editor, not a flashcard to re-read. A research-backed spaced-repetition algorithm (FSRS) times each return for right before you'd forget, so your SQL is still there months later, when the interview or the job actually needs it.
Practice, feedback, mastery, review. That's the loop that turns reading into real skill.
Start freeNo account, no credit card. Start solving in under a minute.