Streamhub's product team needs every user's most recent session to identify current engagement.
Write a query to return one row per user with at least one session, showing the user's ID, the ID of their most recent session, when it started, and the event count. Sort the final result by user_id ascending.
Assumptions:
- A user's most recent session is the session with the largest
started_atfor thatuser_id. - Users with no sessions on record do not appear in the result.
- The final result is sorted by
user_idascending.
Output:
- One row per user with at least one session, with columns
user_id,session_id,started_at, andevent_count. Sorted byuser_id.
Schema · analytics 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT DISTINCT
ON (user_id) user_id,
id AS session_id,
started_at,
event_count
FROM
sessions
ORDER BY
user_id,
started_at DESC The shape
DISTINCT ON (user_id) keeps one row per user, and ORDER BY user_id, started_at DESC decides which session that row comes from — the one with the largest started_at. The result is one session row per user: their most recent one, with the session id and event count attached.
Clause by clause
SELECT DISTINCT ON (user_id) user_id, id AS session_id, started_at, event_countreturns the four columns the engagement review needs. TheDISTINCT ON (user_id)part declares the deduplication key: one row per distinctuser_id. Theid AS session_idalias names the session identifier column.FROM sessionsreads the session records. Users with no sessions never enter this row source, so they cannot appear in the result, which matches the prompt.ORDER BY user_id, started_at DESCsorts the sessions so that within each user's group, the most recent session sits first. PostgreSQL walks the sorted rows and keeps the first row for each newuser_idvalue. The leadinguser_idascending also gives the final result the user-ordered shape the prompt asks for.
Why this and not ROW_NUMBER
The same result is reachable with a window function and a subquery:
SELECT user_id, session_id, started_at, event_count
FROM (
SELECT user_id, id AS session_id, started_at, event_count,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY started_at DESC) AS rn
FROM sessions
) ranked
WHERE rn = 1
ORDER BY user_idBoth return the same rows. DISTINCT ON is the idiomatic PostgreSQL shape for one-row-per-group with the top-sorted row kept. ROW_NUMBER is the portable equivalent and is the right tool when the same query needs to run on a database that doesn't support DISTINCT ON.
You practiced the latest-per-group shape with a non-customer key — same DISTINCT ON pattern, different entity column.