Scenario: Streamhub's product team is mapping each session to a user's visit history to understand how session behavior changes over time.
Task: Write a query to return each session's id, the user_id it belongs to, its started_at, and its visit_number — 1 for the user's earliest session, 2 for their second, and so on.
Assumptions:
- Within a user's history, sessions are ordered by started_at ascending.
- A session's visit_number is its position in the chronological sequence of sessions for the same user, starting at 1 for the earliest session.
Output:
- One row per session.
- Columns in this order: session_id, user_id, started_at, visit_number.
- Sorted by user_id ascending, then started_at ascending.
Worked solution
SELECT
    id AS session_id,
    user_id,
    started_at,
    COUNT(*) OVER (
        PARTITION BY
            user_id
        ORDER BY
            started_at
    ) AS visit_number
FROM
    sessions
ORDER BY
    user_id,
    started_at
The shape
A user's visit_number is just the chronological position of their session inside their own history, which is exactly what a running COUNT(*) produces when partitioned by user_id and ordered by started_at. No CTE is needed; the window function does the work in a single pass.
Clause by clause
- SELECT id AS session_id, user_id, started_at returns the three identifying columns, with id renamed to session_id to match the requested output.
- COUNT(*) OVER (PARTITION BY user_id ORDER BY started_at) AS visit_number is the load-bearing piece. With PARTITION BY user_id, the count restarts for each user. With ORDER BY started_at, PostgreSQL applies the window's default frame (rows from the partition start up through the current row's ordering peer group), which makes the count advance by one with each later session. The result is 1, 2, 3, ... in chronological order per user.
- FROM sessions reads every session row.
- ORDER BY user_id, started_at matches the requested sort and aligns with the window's reading order.
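To make the default frame concrete, here is a minimal sketch that writes the frame out explicitly next to the implicit version. The three rows in the VALUES list are made-up sample data standing in for the real sessions table, and the column name visit_number_explicit is only illustrative.

```sql
-- Hypothetical sample rows; the real sessions table is not shown in the exercise.
WITH sessions (id, user_id, started_at) AS (
    VALUES
        (301, 1, TIMESTAMP '2024-02-01 10:00:00'),
        (302, 1, TIMESTAMP '2024-02-05 10:00:00'),
        (303, 2, TIMESTAMP '2024-02-03 10:00:00')
)
SELECT
    id AS session_id,
    user_id,
    started_at,
    -- Implicit frame: what the worked solution relies on.
    COUNT(*) OVER (
        PARTITION BY user_id
        ORDER BY started_at
    ) AS visit_number,
    -- Same window with the default frame spelled out.
    COUNT(*) OVER (
        PARTITION BY user_id
        ORDER BY started_at
        RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS visit_number_explicit
FROM
    sessions
ORDER BY
    user_id,
    started_at;
```

On this sample both columns agree on every row: sessions 301 and 302 get 1 and 2 for user 1, and session 303 gets 1 for user 2.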
Why this and not ROW_NUMBER() OVER (...)
ROW_NUMBER() would give the same 1, 2, 3, ... sequence on this data and is the more idiomatic choice for a "position in order" question. COUNT(*) OVER (...) works identically here because started_at is unique per user in the sessions table, so each row is its own peer group inside the window's ordering and the running count advances by exactly one. If two sessions for the same user shared a started_at, COUNT(*) OVER would give both the higher number (tied peers are counted together in the default RANGE frame), while ROW_NUMBER would still assign them distinct consecutive numbers, in an order the database picks arbitrarily.
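The tie behavior is easier to see on a small example. This is a sketch over made-up rows in which user 1 has two sessions with the same started_at; the ids, timestamps, and the column names running_count and row_num are assumptions for illustration only.

```sql
-- Hypothetical sample rows: sessions 102 and 103 share a started_at.
WITH sessions (id, user_id, started_at) AS (
    VALUES
        (101, 1, TIMESTAMP '2024-01-01 09:00:00'),
        (102, 1, TIMESTAMP '2024-01-02 09:00:00'),
        (103, 1, TIMESTAMP '2024-01-02 09:00:00'),
        (104, 1, TIMESTAMP '2024-01-03 09:00:00')
)
SELECT
    id AS session_id,
    started_at,
    -- Running count: tied rows are peers and are counted together.
    COUNT(*) OVER (
        PARTITION BY user_id
        ORDER BY started_at
    ) AS running_count,
    -- ROW_NUMBER: always consecutive, but the order within a tie is unspecified.
    ROW_NUMBER() OVER (
        PARTITION BY user_id
        ORDER BY started_at
    ) AS row_num
FROM
    sessions
ORDER BY
    started_at,
    id;
```

Here running_count comes back as 1, 3, 3, 4, so the value 2 never appears, while row_num still covers 1 through 4 with sessions 102 and 103 receiving 2 and 3 in either order. If ties are possible and a stable numbering matters, one option is to add a tie-breaker such as id to the window's ORDER BY.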
The trap
Forgetting PARTITION BY user_id is the silent failure. The window would still run, ordered globally by started_at, and the count would advance across all users as one stream. User 1's first session might come back as visit 12 because eleven other users had earlier sessions, which would look plausible until someone cross-checked. The partition is what restarts the counter at every new user.
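A side-by-side sketch shows how the missing partition goes wrong without raising any error. The rows below are invented sample data with two users whose sessions interleave in time; global_count is an illustrative name for the unpartitioned version.

```sql
-- Hypothetical sample rows: two users with interleaved sessions.
WITH sessions (id, user_id, started_at) AS (
    VALUES
        (201, 2, TIMESTAMP '2024-01-01 08:00:00'),
        (202, 1, TIMESTAMP '2024-01-01 09:00:00'),
        (203, 2, TIMESTAMP '2024-01-02 08:00:00'),
        (204, 1, TIMESTAMP '2024-01-03 09:00:00')
)
SELECT
    id AS session_id,
    user_id,
    started_at,
    -- PARTITION BY forgotten: one counter runs across all users.
    COUNT(*) OVER (
        ORDER BY started_at
    ) AS global_count,
    -- Correct version: the counter restarts for each user.
    COUNT(*) OVER (
        PARTITION BY user_id
        ORDER BY started_at
    ) AS visit_number
FROM
    sessions
ORDER BY
    user_id,
    started_at;
```

For user 1 the unpartitioned count returns 2 and 4, and for user 2 it returns 1 and 3, whereas visit_number is 1 and 2 for both users. Both columns look like plausible increasing integers, which is why the mistake is easy to miss.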
You practiced numbering ordered partitions with a running count over (PARTITION BY user_id ORDER BY started_at), which is equivalent to ROW_NUMBER when started_at is unique within each user, with the position resetting at each user boundary.