Purchase Counts per User — Debugging Queries in SQL

The problem

Scenario: Streamhub's analytics pipeline tracking per-user conversion is reporting unexpectedly low purchase counts for some users. The analyst inspects the purchase-event layer in isolation to verify the data is complete before checking how it composes into the downstream rollup.

Task: Write a query to return each user's user_id and purchase_count — the count of 'purchase' events recorded for that user.

Assumptions:

A purchase event has event_type equal to 'purchase'.
A user's purchase_count is the count of 'purchase' events recorded for them.
The result covers only users with at least one recorded purchase event.

Output:

One row per qualifying user.
Columns in this order: user_id, purchase_count.
Sorted by purchase_count descending.

Schema · analytics 5 tables

users

id integer

name text

email text

country text

plan text

signed_up_at timestamptz

is_active boolean

conversions

id integer

user_id integer

converted_at timestamptz

plan text

amount numeric

sessions

id integer

user_id integer

started_at timestamptz

ended_at? timestamptz

event_count integer

events

id integer

user_id integer

session_id? integer

event_type text

occurred_at timestamptz

properties? jsonb

periods

id integer

name text

start_month integer

end_month integer

Check answerShift Ctrl ↵

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first

Solution query

SELECT
  user_id,
  COUNT(*) AS purchase_count
FROM
  events
WHERE
  event_type = 'purchase'
GROUP BY
  user_id
ORDER BY
  purchase_count DESC

The shape

Filter events down to purchase rows, group by user, and count. Isolating this single layer is the diagnostic move — running it on its own, before checking how it composes into a larger pipeline, verifies the source data is what the downstream rollup is reading.

Clause by clause

SELECT user_id, COUNT(*) AS purchase_count returns one row per user with their purchase-event count.
FROM events reads the raw event log without any further composition. This is what makes the query a diagnostic: it touches the source layer directly, before any downstream CTE has a chance to compress or transform the data.
WHERE event_type = 'purchase' restricts the rows to purchase events before the grouping happens. The filter runs before the aggregation, so every counted row is already a purchase — the count is "purchases per user," not "events per user filtered down."
GROUP BY user_id collapses the filtered rows into one group per user. COUNT(*) then resolves per group.
ORDER BY purchase_count DESC puts the highest-purchase users first. The reference shows nineteen users with exactly one purchase each, which is itself the diagnostic finding — the unexpectedly-low conversion downstream isn't a bug in the rollup, it's a property of the source layer. Every active purchaser in this dataset has exactly one purchase on file.

Why this and not joining straight to `users`

The prompt is about verifying the purchase-event layer in isolation, not about producing a final conversion report. Joining users in this query would compose two layers at once and reintroduce exactly the kind of opacity the diagnostic is trying to avoid. The next step (after this one returns the expected shape) is to join the counts back to users and check the downstream rollup — but that's the next step, not this one.

You practiced isolating a single intermediate layer of a downstream pipeline and inspecting it directly — the per-user purchase count by itself, before checking how it composes with later layers.

Return each user's `user_id` and `purchase_count` — the count of `'purchase'` events recorded for that user

The shape

Clause by clause

Why this and not joining straight to `users`

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

The shape

Clause by clause

Why this and not joining straight to users

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

Why this and not joining straight to `users`