Frequent Customers from a Subquery — CREATE TABLE AS in SQL

The problem

Brightlane's analyst is building an intermediate dataset of high-activity customers — those with more than 3 orders on record — for a multi-step retention analysis. The dataset is materialized into a temp table once and then read by every downstream stage.

Write a query to return each high-activity customer's ID and order count.

Assumptions:

The orders table has one row per order with a customer_id.
A customer's order count is the number of orders linked to that customer_id.
Only customers whose order count is greater than 3 should appear.

Output:

One row per qualifying customer, with columns customer_id and order_count.

Schema · ecommerce (5 tables)

categories
  id integer
  name text
  parent_id? integer

products
  id integer
  name text
  category_id integer
  price numeric
  stock_qty integer
  attributes? jsonb

order_items
  id integer
  order_id integer
  product_id integer
  quantity integer
  unit_price numeric

customers
  id integer
  name text
  email text
  city? text
  country text
  created_at timestamptz
  is_active boolean

orders
  id integer
  customer_id integer
  ordered_at timestamptz
  status text
  total_amount numeric

Worked solution

Solution query

SELECT
  customer_id,
  order_count
FROM
  (
    SELECT
      customer_id,
      COUNT(*) AS order_count
    FROM
      orders
    GROUP BY
      customer_id
  ) customer_stats
WHERE
  order_count > 3

The shape

Compute the per-customer order count in an inner step, then filter that result to keep only customers whose count exceeds 3. The inner aggregation has to finish before the threshold can be applied, because the threshold is a condition on order_count, a column that does not exist until the aggregation has run. A derived table is what gives the outer query something with that column to filter against.
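The same two-step shape can also be written with a common table expression, which some readers find easier to follow from top to bottom. This is only an equivalent sketch of the solution query, reusing its customer_stats name; it is not a separate graded answer.

```sql
-- Step 1: aggregate orders into one row per customer.
WITH customer_stats AS (
  SELECT
    customer_id,
    COUNT(*) AS order_count
  FROM orders
  GROUP BY customer_id
)
-- Step 2: filter the aggregated rows by the threshold.
SELECT
  customer_id,
  order_count
FROM customer_stats
WHERE order_count > 3;
```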

Clause by clause

(SELECT customer_id, COUNT(*) AS order_count FROM orders GROUP BY customer_id) customer_stats is the inner derived table. It groups orders by customer_id, computes the per-customer order count, and exposes a two-column result that the outer query can treat exactly like a real table. The alias customer_stats names that result.
SELECT customer_id, order_count FROM customer_stats reads the derived table and returns both columns to the final result.
WHERE order_count > 3 filters the derived table to high-activity customers only. The filter runs after the aggregation, so order_count is a real value by the time the comparison runs. Customers whose count is 3 or below are dropped before the result is materialized.
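To see the three clauses work together, the sketch below runs the solution query against a handful of made-up rows. The sample data is purely illustrative (it is not Brightlane's data), and the CTE named orders stands in for the real table for the duration of this one query.

```sql
-- Hypothetical sample rows: customer 10 has 4 orders,
-- customer 20 has exactly 3, customer 30 has 1.
WITH orders (id, customer_id) AS (
  VALUES (1, 10), (2, 10), (3, 10), (4, 10),
         (5, 20), (6, 20), (7, 20),
         (8, 30)
)
SELECT
  customer_id,
  order_count
FROM
  (
    SELECT
      customer_id,
      COUNT(*) AS order_count
    FROM orders
    GROUP BY customer_id
  ) customer_stats
WHERE
  order_count > 3;
-- Returns a single row: customer_id = 10, order_count = 4.
```

Customer 20 sits exactly on the boundary with 3 orders and is excluded, because the condition is strictly greater than 3.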

Why this and not a `WHERE` on the raw orders table

WHERE runs before GROUP BY, on individual rows, before any aggregate has been computed. A condition like WHERE order_count > 3 written directly on orders would refer to a column that does not exist on the raw row. The aggregation has to happen first, the derived table exposes its result as a row source, and only then can WHERE apply a condition that compares against the aggregate.
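For comparison, standard SQL also offers HAVING, which filters groups after aggregation and so can express the same threshold without a derived table. The sketch below should return the same rows as the solution query; it is shown as an alternative form, not as the approach this exercise is built around.

```sql
-- HAVING is evaluated after GROUP BY, so the aggregate is available to it.
-- The aggregate is repeated here because PostgreSQL does not accept the
-- output alias order_count inside HAVING.
SELECT
  customer_id,
  COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 3;
```

The derived-table form keeps the aggregated result available as a named row source, which is convenient when the outer query needs to do more than filter.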

You practiced computing a per-customer count and applying a threshold against it — the kind of restricted aggregate worth caching in a temp table when downstream reports reference it repeatedly.
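As a rough sketch of that caching step, the solution query can be wrapped in CREATE TABLE AS so that later stages read the stored result instead of re-aggregating orders. The table name high_activity_customers is a hypothetical choice for illustration, and the syntax assumes PostgreSQL, which the schema's jsonb and timestamptz types suggest.

```sql
-- Materialize the filtered aggregate once. A TEMP table lasts only for
-- the current session. high_activity_customers is an illustrative name.
CREATE TEMP TABLE high_activity_customers AS
SELECT
  customer_id,
  order_count
FROM
  (
    SELECT
      customer_id,
      COUNT(*) AS order_count
    FROM orders
    GROUP BY customer_id
  ) customer_stats
WHERE
  order_count > 3;

-- A downstream stage can then join against the stored rows.
SELECT
  c.id,
  c.name,
  h.order_count
FROM high_activity_customers AS h
JOIN customers AS c ON c.id = h.customer_id;
```

The temp table is a snapshot: orders placed after it is created are not reflected until it is rebuilt.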
