Streamhub's platform team is identifying highly active users for an outreach campaign.
Write a query to return the user ID and session count for every user who has recorded more than five sessions.
Assumptions:
- The
sessionstable contains every session ever recorded on the Streamhub platform. user_idlinks each session to the user who initiated it.- The threshold (
> 5) applies to the per-user session count.
Output:
- One row per qualifying user, with columns
user_idandsession_count.
Schema · analytics 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
user_id,
COUNT(*) AS session_count
FROM
sessions
GROUP BY
user_id
HAVING
COUNT(*) > 5 The shape
Same canonical shape as orders-per-customer, in a different domain. GROUP BY user_id builds the per-user session set; HAVING COUNT(*) > 5 keeps only the users with more than five sessions. The outreach campaign's eligible cohort comes out as user_id 1 and user_id 3, both with 9 sessions each.
Clause by clause
SELECT user_id, COUNT(*) AS session_countreturns each qualifying user's id with their session count.COUNT(*)counts the rows in each user's group — every session that landed in the group contributes1.FROM sessionsis the source set: every session ever recorded on Streamhub.GROUP BY user_idpartitions the rows into one group per user. After this clause, each row in the working set represents one user with their total session count aggregated behind it.HAVING COUNT(*) > 5filters those user rows by their session count. Users with five or fewer sessions drop out; six or more survive.
Why this and not WHERE COUNT(*) > 5
The per-user count doesn't exist until grouping has finished. WHERE runs row by row before any grouping happens, so it has no COUNT(*) to compare against. The condition has to wait until the groups have been built, which is exactly the moment HAVING runs.
The structural shape generalises. Anywhere a fact table records repeated events keyed to an entity — sessions per user here, orders per customer in N015-E1, plays per song, opens per email, logins per device — the question "which entities have more than N events" is GROUP BY entity plus HAVING COUNT(*) > N. The schema names change; the shape doesn't.
You practiced applying the HAVING COUNT(*) > N pattern in a different domain (sessions per user). The same structural shape recurs anywhere a fact table records repeated events keyed to an entity — orders per customer, plays per song, opens per email.