Streamhub's analytics team is reviewing the last recorded event for each session.
Write a query to return one row per session with at least one event, showing the session ID, ID of the most recent event in that session, the event type, and when the event occurred. Sort the final result by session_id ascending.
Assumptions:
- A session's most recent event is the event with the largest occurred_at for that session_id.
- Sessions with no events on record do not appear in the result.
- The final result is sorted by session_id ascending.
Output:
- One row per session with at least one event, with columns session_id, event_id, event_type, and occurred_at. Sorted by session_id.
Worked solution
SELECT DISTINCT ON (session_id)
  session_id,
  id AS event_id,
  event_type,
  occurred_at
FROM events
ORDER BY session_id, occurred_at DESC
The shape
DISTINCT ON (session_id) keeps one row per session, and ORDER BY session_id, occurred_at DESC picks the last event recorded in each session. The pattern is the same latest-per-group shape used for latest-per-customer, applied to a different schema's primary entity relationship: sessions contain events, and the review wants the freshest event per session.
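To see the shape on concrete rows, here is a minimal sketch that can be run without the exercise schema. The three sample rows and their values are hypothetical, made up for illustration; the CTE is named events only so the query body matches the worked solution.

```sql
-- Hypothetical sample data standing in for the real events table.
WITH events (id, session_id, event_type, occurred_at) AS (
    VALUES
        (1, 10, 'play',  TIMESTAMP '2024-01-01 09:00:00'),
        (2, 10, 'pause', TIMESTAMP '2024-01-01 09:05:00'),
        (3, 20, 'play',  TIMESTAMP '2024-01-01 10:00:00')
)
SELECT DISTINCT ON (session_id)
    session_id,
    id AS event_id,
    event_type,
    occurred_at
FROM events
-- Within each session the newest event sorts first, and DISTINCT ON keeps it.
ORDER BY session_id, occurred_at DESC;
-- Session 10 keeps event 2 (pause); session 20 keeps event 3 (play).
```

Session 10 has two events, and only the later one survives. Session 20 has a single event, which is returned as is.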
Clause by clause
- SELECT DISTINCT ON (session_id) session_id, id AS event_id, event_type, occurred_at returns the four columns the analytics review needs. DISTINCT ON (session_id) declares the deduplication key: one row per distinct session_id. The id AS event_id alias names the event identifier column.
- FROM events reads the event records. Sessions with no events never enter this row source, so they cannot appear in the result.
- ORDER BY session_id, occurred_at DESC sorts the events so that within each session's group, the most recent event sits first. PostgreSQL keeps the first row in each session_id group, which is that session's last event. The leading session_id ascending also gives the final result the session-ordered shape the prompt asks for.
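One edge case the clauses above leave open: if two events in the same session share the same occurred_at, PostgreSQL may keep either of them, because the ORDER BY does not say which sorts first. The sketch below adds a tie-breaker. Using id DESC for this is an assumption (that a higher id means a later-recorded event), not something the exercise states.

```sql
SELECT DISTINCT ON (session_id)
    session_id,
    id AS event_id,
    event_type,
    occurred_at
FROM events
ORDER BY
    session_id,
    occurred_at DESC,
    id DESC;  -- assumed tie-breaker: prefer the higher id when timestamps are equal
```

When occurred_at values are unique within each session, this returns the same rows as the worked solution.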
Why this and not ROW_NUMBER
The same pick is reachable with a window function:
SELECT session_id, event_id, event_type, occurred_at
FROM (
SELECT session_id, id AS event_id, event_type, occurred_at,
ROW_NUMBER() OVER (PARTITION BY session_id ORDER BY occurred_at DESC) AS rn
FROM events
) ranked
WHERE rn = 1
ORDER BY session_id
Both return the same rows. DISTINCT ON says the intent in one clause; ROW_NUMBER says it across a subquery, a window function, and a filter. On event-stream tables where the latest-per-key pattern is common, DISTINCT ON keeps the query readable as one thought.
You practiced DISTINCT ON over an event-stream table — same per-group-latest shape, applied to a different schema's primary entity relationship.