Users with Repeat High Event Sessions

The problem

Streamhub's product team wants to know which users have had more than one high-intensity session — a session with at least 5 events.

Write a query to return each qualifying user's ID and their count of high-intensity sessions.

Assumptions:

A high-intensity session has an event_count of 5 or more.
A user's qualifying-session count is the number of high-intensity sessions linked to that user_id.
Only users with more than 1 qualifying session should appear.

Output:

One row per qualifying user, with columns user_id and qualifying_sessions.

Schema · analytics 5 tables

users

id integer

name text

email text

country text

plan text

signed_up_at timestamptz

is_active boolean

conversions

id integer

user_id integer

converted_at timestamptz

plan text

amount numeric

sessions

id integer

user_id integer

started_at timestamptz

ended_at? timestamptz

event_count integer

events

id integer

user_id integer

session_id? integer

event_type text

occurred_at timestamptz

properties? jsonb

periods

id integer

name text

start_month integer

end_month integer

Check answerShift Ctrl ↵

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first

Solution query

WITH
  high_event_sessions AS (
    SELECT
      user_id,
      event_count
    FROM
      sessions
    WHERE
      event_count >= 5
  ),
  user_counts AS (
    SELECT
      user_id,
      COUNT(*) AS qualifying_sessions
    FROM
      high_event_sessions
    GROUP BY
      user_id
  )
SELECT
  user_id,
  qualifying_sessions
FROM
  user_counts
WHERE
  qualifying_sessions > 1

The shape

Three pieces of work spread across two CTEs and the main query. The first CTE keeps only the high-intensity sessions, the second counts those qualifying sessions per user, and the main query applies the "more than one" threshold on top of the counted rows. Two restrictions of different shapes, with a count between them.

Clause by clause

The first CTE drops every low-intensity session:

WITH high_event_sessions AS (
  SELECT user_id, event_count
  FROM sessions
  WHERE event_count >= 5
)

WHERE event_count >= 5 runs against the raw session rows, before any grouping. After this layer, every row is by definition a high-intensity session.

The second CTE counts those rows per user:

user_counts AS (
  SELECT user_id, COUNT(*) AS qualifying_sessions
  FROM high_event_sessions
  GROUP BY user_id
)

GROUP BY user_id produces one row per user, and COUNT(*) counts only the qualifying sessions because that is the only thing in the source layer. A user with zero qualifying sessions never appears here.

The main query applies the threshold on the per-user count:

SELECT user_id, qualifying_sessions
FROM user_counts
WHERE qualifying_sessions > 1

WHERE qualifying_sessions > 1 references the aggregate from the previous layer by its alias. Because the count was materialized into a named column in user_counts, the main query's WHERE can read it as if it were any other column.

Why the count threshold lives in the main query and not back in the first CTE

The two thresholds operate on different things. event_count >= 5 is a per-row condition on the raw sessions table, and it has to run before grouping because the column it tests exists only on the source rows. qualifying_sessions > 1 is a condition on an aggregated value, and the aggregate does not exist until GROUP BY runs. Splitting them across layers is what lets each one run at the level it makes sense at: the per-row filter on the per-row layer, the per-group filter on the per-group layer.

The trap

The two thresholds look similar in the prompt and easy to swap. Reading "more than one high-intensity session" as the row-level threshold would lead to WHERE event_count > 1 in the first CTE, which keeps almost every session and then counts them all. The > 1 is on the count of qualifying sessions per user, not on the events inside each session. The level at which each > applies is what the chained structure is making explicit.

You practiced layering two WITH stages with a final main-query restriction — pre-restrict in the first stage, count in the second, threshold-check in the main query.

Return each qualifying user's ID and their count of high-intensity sessions

The shape

Clause by clause

Why the count threshold lives in the main query and not back in the first CTE

The trap

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.