GROUP BY in SQL
GROUP BY partitions a query's rows into groups and directs aggregate functions to operate on each group independently. Without it, an aggregate function collapses the entire row set into one value. With it, the same function produces one value per group.
GROUP BY divides your result into groups and runs an aggregate function on each group separately.
You're building a sales report and the data is already in SQL. You don't want one total. You want revenue broken down by order status. How much came from delivered orders? How much is still pending? That kind of breakdown is exactly what GROUP BY is for.
Without GROUP BY, an aggregate function like COUNT(*) or SUM() collapses the whole table into a single number. GROUP BY changes that: instead of one number for everything, you get one number per group.
Think of it like sorting a stack of receipts into piles before you start counting. One pile per status, one pile per customer. Once sorted, each pile gets its own total. Here's what that looks like:
SELECT status, COUNT(*) AS order_count
FROM orders
GROUP BY statusSQL partitions the orders table into groups, one per distinct status value, then counts the rows in each group. The result has one row per status, not one row per order.
You can use any aggregate function alongside GROUP BY. Revenue per status instead of a count:
SELECT status, SUM(total_amount) AS total_revenue
FROM orders
GROUP BY statusThe column you're grouping by appears in SELECT and in GROUP BY. That's the pattern.
GROUP BY works with WHERE. SQL filters first, then groups and aggregates what survives:
SELECT customer_id, COUNT(*) AS delivered_order_count
FROM orders
WHERE status = 'delivered'
GROUP BY customer_idOnly delivered orders reach the grouping step. The result shows a count per customer, but only for their delivered orders.
The one thing that trips people up: every column in SELECT must either appear in the GROUP BY clause or be wrapped in an aggregate function.
When you group by status, a single status = 'delivered' group can contain orders from dozens of different customers. SQL has no basis for deciding which customer_id to show you, so it refuses the query. That's the right behavior:
The error identifies exactly which column caused the problem. Fix it by adding customer_id to GROUP BY, giving you one row per customer-status pair, or by dropping customer_id from SELECT if you don't need it.
You write: SELECT category_id, name, COUNT(*) FROM products GROUP BY category_id. What happens?
9 GROUP BY practice problems
Write a query to return the number of orders in each `status`.
Write a query to return each `status` and its total order value.
Write a query to return each `department_id` and the number of employees assigned to it.
Write a query to return each customer's ID alongside the count of their delivered orders.
Write a query to return each unique customer-status pairing and the number of orders that fall into it.
Write a query to return each product's ID, the total units sold, and the total revenue across all order line items.
Write a query to return each `status` and the number of unique customers who have placed at least one order in that status.
Write a query to return each `category_id` alongside the number of products assigned to it.
Write a query to return each customer's ID alongside their total spend across non-cancelled orders.
These problems are part of the GROUP BY lesson in SQLMaxx, with instant grading and a worked solution on each.
Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.
That's the whole SQLMaxx loop: 600+ real problems, instant AI feedback, mastery you can actually see, and spaced review that won't let you forget.
Real problems. Not toy examples.
615 hand-built problems spanning all 66 concepts, from basic SELECTs to window functions, built on real schemas and real business questions, the kind you'll actually get asked on the job. Enough reps to make SQL automatic.
Write a query. Know if it's right in one second.
No copying an answer and hoping it clicked. The AI grader checks your real query against real data, catches exactly what's wrong, and explains the fix in plain English, like a senior analyst reading over your shoulder on every problem.
Stop guessing whether you actually know it.
SQLMaxx tracks every concept and shows you what you've mastered and what's still shaky. Your skills fill in one concept at a time, so 'I think I get joins' becomes something you can prove.
Learn it once. Keep it for good.
Most of what you learn this week fades by next week. So when a concept comes due for review, SQLMaxx hands you a fresh problem to solve from a blank editor, not a flashcard to re-read. A research-backed spaced-repetition algorithm (FSRS) times each return for right before you'd forget, so your SQL is still there months later, when the interview or the job actually needs it.
Practice, feedback, mastery, review. That's the loop that turns reading into real skill.
Start freeNo account, no credit card. Start solving in under a minute.