N065-H1 Tier 5 · Expert · hard analytics · Streamhub

Return three counts: the number of users who performed at least one `'page_view'` (`reached_step_1`), the number who performed `'upgrade_clicked'` strictly after their first `'page_view'` (`reached_step_2`), and the number who performed `'purchase'` strictly after their first qualifying `'upgrade_clicked'` (`reached_step_3`)

Part of Sessionization and Funnel Analysis Patterns in SQL

The problem

Scenario: Streamhub's product team is measuring the full three-step conversion funnel from initial engagement through first purchase.

Task: Write a query to return three counts: the number of users who performed at least one 'page_view' (reached_step_1), the number who performed 'upgrade_clicked' strictly after their first 'page_view' (reached_step_2), and the number who performed 'purchase' strictly after their first qualifying 'upgrade_clicked' (reached_step_3).

Assumptions:

  • A user reaches step 1 by performing at least one 'page_view' event at any time.
  • A user reaches step 2 by performing at least one 'upgrade_clicked' event strictly after their earliest 'page_view'.
  • A user reaches step 3 by performing at least one 'purchase' event strictly after their earliest qualifying 'upgrade_clicked'.
  • Every user who reaches step 3 also reaches step 2 and step 1.

Output:

  • One row, holding the three step counts.
  • Columns in this order: reached_step_1, reached_step_2, reached_step_3.
Schema · analytics 5 tables
users
id integer
name text
email text
country text
plan text
signed_up_at timestamptz
is_active boolean
conversions
id integer
user_id integer
converted_at timestamptz
plan text
amount numeric
sessions
id integer
user_id integer
started_at timestamptz
ended_at? timestamptz
event_count integer
events
id integer
user_id integer
session_id? integer
event_type text
occurred_at timestamptz
properties? jsonb
periods
id integer
name text
start_month integer
end_month integer

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first
Solution query
WITH
  step_1 AS (
    SELECT
      user_id,
      MIN(occurred_at) AS step_1_time
    FROM
      events
    WHERE
      event_type = 'page_view'
    GROUP BY
      user_id
  ),
  upgrade_events AS (
    SELECT
      user_id,
      occurred_at
    FROM
      events
    WHERE
      event_type = 'upgrade_clicked'
  ),
  step_2 AS (
    SELECT
      ue.user_id,
      MIN(ue.occurred_at) AS step_2_time
    FROM
      upgrade_events ue
      JOIN step_1 s ON s.user_id = ue.user_id
    WHERE
      (ue.occurred_at - s.step_1_time) > INTERVAL '0 seconds'
    GROUP BY
      ue.user_id
  ),
  purchase_events AS (
    SELECT
      user_id,
      occurred_at
    FROM
      events
    WHERE
      event_type = 'purchase'
  ),
  step_3 AS (
    SELECT
      pe.user_id,
      MIN(pe.occurred_at) AS step_3_time
    FROM
      purchase_events pe
      JOIN step_2 s ON s.user_id = pe.user_id
    WHERE
      (pe.occurred_at - s.step_2_time) > INTERVAL '0 seconds'
    GROUP BY
      pe.user_id
  )
SELECT
  (
    SELECT
      COUNT(*)
    FROM
      step_1
  ) AS reached_step_1,
  (
    SELECT
      COUNT(*)
    FROM
      step_2
  ) AS reached_step_2,
  COUNT(*) AS reached_step_3
FROM
  step_3

The shape

A three-step funnel is the two-step pattern extended by one more CTE. Each step CTE restricts to the users who reached the previous step and finds their earliest qualifying event after the prior step's timestamp. The final select pulls the three user counts.

Clause by clause

  • step_1 filters events to event_type = 'page_view' and groups by user_id with MIN(occurred_at) AS step_1_time. One row per user who ever viewed a page, holding their earliest such timestamp.
  • upgrade_events pre-filters the event log to event_type = 'upgrade_clicked' rows. Staging this once keeps the step-2 join narrow.
  • step_2 joins upgrade_events to step_1 on user_id and keeps only upgrade clicks where (ue.occurred_at - s.step_1_time) > INTERVAL '0 seconds' — strictly after the user's first page view. MIN(ue.occurred_at) picks each user's earliest qualifying upgrade as step_2_time.
  • purchase_events mirrors upgrade_events, pre-filtering to event_type = 'purchase'.
  • step_3 joins purchase_events to step_2 on user_id and keeps only purchases where (pe.occurred_at - s.step_2_time) > INTERVAL '0 seconds' — strictly after the user's qualifying upgrade. The result is one row per user who completed all three in order.
  • The final SELECT uses scalar subqueries (SELECT COUNT(*) FROM step_1) and (SELECT COUNT(*) FROM step_2) for the upstream counts and a plain COUNT(*) FROM step_3 for the deepest step. Each CTE is already one row per user, so the counts are per-user.

Why this and not joining step_3 back to step_1 directly

You could write step_3 to join purchase_events to step_1 rather than to step_2, with a single condition pe.occurred_at > s.step_1_time. That would count users whose purchase came after their first page view, regardless of whether they ever clicked upgrade. That is a different question. The funnel requires each step to happen after the previous step, not after step 1, so each CTE has to chain off the one immediately before it.

The trap

Counting COUNT(DISTINCT user_id) in the outer query against step_3 looks safe but reads as a hedge against duplicates that the CTE already eliminated. Each step CTE is grouped by user_id, so it produces exactly one row per qualifying user. A plain COUNT(*) is correct, and adding DISTINCT here would only mask a different bug (a missing GROUP BY) if one ever crept in. Keep the count honest to the CTE's grain.

You practiced extending the funnel pattern to three steps — each step CTE narrows to users who completed the previous step and finds their earliest qualifying event after the prior step's timestamp.

How you actually get good at SQL

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

That's the whole SQLMaxx loop: 600+ real problems, instant AI feedback, mastery you can actually see, and spaced review that won't let you forget.

A stack of SQL practice problem cards, the top card showing an employees table.
615 problems · 66 concepts

Real problems. Not toy examples.

615 hand-built problems spanning all 66 concepts, from basic SELECTs to window functions, built on real schemas and real business questions, the kind you'll actually get asked on the job. Enough reps to make SQL automatic.

A retro computer showing a SQL query marked correct with a green checkmark.
Instant AI feedback

Write a query. Know if it's right in one second.

No copying an answer and hoping it clicked. The AI grader checks your real query against real data, catches exactly what's wrong, and explains the fix in plain English, like a senior analyst reading over your shoulder on every problem.

A circular mastery progress dial filling from blue to green, the SQLMaxx diamond at its center.
Mastery tracking

Stop guessing whether you actually know it.

SQLMaxx tracks every concept and shows you what you've mastered and what's still shaky. Your skills fill in one concept at a time, so 'I think I get joins' becomes something you can prove.

A SQL query editor circled by a blue return arrow with a clock, scheduled to come back for review.
Spaced review

Learn it once. Keep it for good.

Most of what you learn this week fades by next week. So when a concept comes due for review, SQLMaxx hands you a fresh problem to solve from a blank editor, not a flashcard to re-read. A research-backed spaced-repetition algorithm (FSRS) times each return for right before you'd forget, so your SQL is still there months later, when the interview or the job actually needs it.

Practice, feedback, mastery, review. That's the loop that turns reading into real skill.

Start free

No account, no credit card. Start solving in under a minute.