Customer Order Totals as a Scalar Subquery — Subqueries vs CTEs vs Joins in SQL

The problem

Scenario: Brightlane's customer service team needs a complete customer list paired with the combined value of each customer's orders.

Task: Write a query to return each customer's name and their total_order_value — the combined total_amount across their orders, reported as a missing value for customers who have no orders on record.

Assumptions:

The result covers every customer.
A customer with no orders on record appears with total_order_value reported as a missing value.

Output:

One row per customer.
Columns in this order: customer_name, total_order_value.

The shape

A correlated scalar subquery in the SELECT list runs once per customer, summing that customer's order amounts. The outer query is just FROM customers — every customer appears, and each row's total_order_value is computed in place.

Clause by clause

SELECT c.name AS customer_name reads the customer's name from the outer driver.
(SELECT SUM(o.total_amount) FROM orders o WHERE o.customer_id = c.id) AS total_order_value is the per-row computation. The reference to c.id inside the subquery is what makes it correlated: each outer customer row supplies a different c.id, the subquery filters orders to that customer, and SUM returns one scalar. For a customer with no matching orders, no rows pass the filter and SUM over that empty set returns NULL, which is exactly what the task asks for.
FROM customers c drives the outer loop. Every customer appears once because nothing filters them out.
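
To see the clauses working together, here is a minimal sketch that can be run in a scratch database. The column names (customers.id, customers.name, orders.customer_id, orders.total_amount) come from the walkthrough above; the column types, the orders.id column, and every data value are assumptions made up for illustration. The syntax is intended to work in SQLite and PostgreSQL, and other engines may need small changes.

```sql
-- Hypothetical sample data; only the column names come from the walkthrough.
DROP TABLE IF EXISTS orders;
DROP TABLE IF EXISTS customers;

CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE orders (
    id           INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customers (id),
    total_amount DECIMAL(10, 2) NOT NULL
);

INSERT INTO customers (id, name) VALUES
    (1, 'Ada'),
    (2, 'Grace'),
    (3, 'Linus');   -- no orders on record

INSERT INTO orders (id, customer_id, total_amount) VALUES
    (10, 1, 25.00),
    (11, 1, 75.00),
    (12, 2, 40.00);

-- The correlated scalar subquery is evaluated against each customer row.
SELECT c.name AS customer_name,
       (SELECT SUM(o.total_amount)
        FROM orders o
        WHERE o.customer_id = c.id) AS total_order_value
FROM customers c
ORDER BY c.id;   -- added only so the row order is predictable

-- Result (number formatting depends on the engine):
--   Ada    100
--   Grace   40
--   Linus  NULL
```

Linus still appears because the outer query reads from customers alone; the subquery only decides what value lands in the second column.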

Why this and not a `LEFT JOIN` with `GROUP BY`

SELECT c.name, SUM(o.total_amount) FROM customers c LEFT JOIN orders o ON o.customer_id = c.id GROUP BY c.id, c.name produces the same rows. Grouping on c.id as well as c.name matters: grouping on the name alone would merge two customers who happen to share a name into a single row. The two shapes have different costs:

-- Correlated form: reads as "for each customer, sum their orders"
-- Logically evaluated once per customer row, which is fine on small customer tables.
SELECT c.name AS customer_name,
    (SELECT SUM(o.total_amount) FROM orders o WHERE o.customer_id = c.id) AS total_order_value
FROM customers c;

-- Join + GROUP BY form: aggregates the joined result as a whole.
-- Generally better at scale; the per-row framing is less direct.
-- Grouping on c.id keeps customers with identical names in separate rows.
SELECT c.name AS customer_name, SUM(o.total_amount) AS total_order_value
FROM customers c LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name;
Both are correct. The correlated form is more directly expressive of "for each customer, find their total"; the join form scales better as orders grows. On Brightlane's customer table either is fine.
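
One detail worth checking in the join form is the grouping key. The sketch below uses hypothetical data in which two customers share a name, and it assumes customers.id is the primary key. It compares grouping on the name alone with grouping on the key plus the name. Table definitions and values are invented for illustration.

```sql
-- Hypothetical data: customers 1 and 2 are different people with the same name.
DROP TABLE IF EXISTS orders;
DROP TABLE IF EXISTS customers;

CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE orders (
    id           INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customers (id),
    total_amount DECIMAL(10, 2) NOT NULL
);

INSERT INTO customers (id, name) VALUES
    (1, 'Sam Lee'),
    (2, 'Sam Lee'),
    (3, 'Ana Ruiz');   -- no orders on record

INSERT INTO orders (id, customer_id, total_amount) VALUES
    (10, 1, 30.00),
    (11, 2, 50.00);

-- Grouping on name alone: two rows, in no guaranteed order.
--   Sam Lee   80    (both customers merged into one total)
--   Ana Ruiz  NULL
SELECT c.name AS customer_name, SUM(o.total_amount) AS total_order_value
FROM customers c LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.name;

-- Grouping on key plus name: three rows, one per customer.
--   Sam Lee   30
--   Sam Lee   50
--   Ana Ruiz  NULL
SELECT c.name AS customer_name, SUM(o.total_amount) AS total_order_value
FROM customers c LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name;
```

The correlated form does not have this problem, because it never groups: it emits one row per row of customers regardless of the names.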

The trap

SUM returns NULL on an empty input set, not 0. The prompt requires NULL for customers with no orders, so leaving the scalar uncoalesced is exactly right. Wrapping it in COALESCE(..., 0) would break the contract.
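
The difference is easy to see side by side. The sketch below uses a hypothetical one-row orders table and a customer_id (99) that has no orders, so both subqueries aggregate over an empty set. A SELECT without FROM is assumed to be supported, as it is in SQLite and PostgreSQL.

```sql
-- Hypothetical data: a single order belonging to customer 1.
DROP TABLE IF EXISTS orders;

CREATE TABLE orders (
    id           INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL,
    total_amount DECIMAL(10, 2) NOT NULL
);

INSERT INTO orders (id, customer_id, total_amount) VALUES (10, 1, 25.00);

SELECT
    -- No rows match customer 99, so SUM sees an empty set and yields NULL.
    (SELECT SUM(o.total_amount)
     FROM orders o
     WHERE o.customer_id = 99) AS raw_total,
    -- COALESCE replaces that NULL with 0, which is not what this task wants.
    (SELECT COALESCE(SUM(o.total_amount), 0)
     FROM orders o
     WHERE o.customer_id = 99) AS coalesced_total;

-- Result: raw_total is NULL, coalesced_total is 0.
```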

You practiced expressing a per-customer summary as a correlated scalar subquery — a shape that reads naturally as 'for each customer, find their total' even though executing once per customer costs more at scale than a set-based shape.
