Streamhub's product team wants to identify users who have had at least one high-activity session — a session with more than 10 events.
Write a query to return the user ID and total session count for every user with at least one such session.
Assumptions:
- A user's total session count is the number of
sessionsrecords linked to thatuser_id. A user's peak event count is the largestevent_countacross those records. - A user qualifies when their peak event count is greater than
10. - Only qualifying users should appear.
Output:
- One row per qualifying user, with columns
user_idandsession_count.
Schema · analytics 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
WITH
user_stats AS (
SELECT
user_id,
COUNT(*) AS session_count,
MAX(event_count) AS max_events
FROM
sessions
GROUP BY
user_id
),
active_users AS (
SELECT
user_id,
session_count
FROM
user_stats
WHERE
max_events > 10
)
SELECT
user_id,
session_count
FROM
active_users The shape
Two named layers, with the first stage producing two statistics per user and the second stage filtering on one of them while returning the other. The session count and the peak event count are computed side by side in one pass, then the threshold check on max_events decides which users qualify and the session_count is what comes out.
Clause by clause
The first CTE collapses every user's sessions into a single statistics row:
WITH user_stats AS (
SELECT user_id, COUNT(*) AS session_count, MAX(event_count) AS max_events
FROM sessions
GROUP BY user_id
)GROUP BY user_id produces one row per user. COUNT(*) counts that user's sessions and MAX(event_count) reports the largest event count across those sessions. Both aggregates compute in the same pass, so the cost of carrying the extra column is small.
The second CTE keeps the users whose peak session was a high-activity one:
active_users AS (
SELECT user_id, session_count
FROM user_stats
WHERE max_events > 10
)WHERE max_events > 10 checks the peak from the first layer. The SELECT list drops max_events because the spec only asks for user_id and session_count; once max_events has done its qualifying work, it does not need to leave the layer.
SELECT user_id, session_count FROM active_usersreturns the qualifying users and their session counts.
The trap
max_events > 10 is a peak check, not an average check. A user with two zero-event sessions and one twelve-event session qualifies because their single high session pushed the MAX above the threshold. Reading the prompt as "average events per session above ten" would lead to AVG(event_count) > 10 and a different result set. The wording matters: "at least one high-activity session" is exactly what MAX(...) > 10 tests.
You practiced layering two WITH stages where the first computes multiple per-user statistics (a count and a max) and the second reads only one of them through a threshold check.