Streamhub's product analytics team wants every event type recorded in each session as a structured collection rather than flattened into a string.
Write a query to return every session ID and an array of all event types that occurred in that session, with the array elements arranged in alphabetical order.
Assumptions:
- The
eventstable has one row per event with asession_idand anevent_type. - Each
session_idwith at least one event should appear once in the result. - For each session, the array contains every
event_typevalue across that session's events (one element per event, no de-duplication), arranged alphabetically.
Output:
- One row per session with at least one event, with columns
session_idandevent_types.
Schema · analytics 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
session_id,
ARRAY_AGG(
event_type
ORDER BY
event_type
) AS event_types
FROM
events
GROUP BY
session_id The shape
ARRAY_AGG(event_type ORDER BY event_type) packages every event in a session into a single typed array, with the elements arranged alphabetically inside the array. The product team gets one row per session and a structured collection they can hand to downstream code instead of a flattened string.
Clause by clause
SELECT session_id, ARRAY_AGG(event_type ORDER BY event_type) AS event_typesreturns the session and its array of event types.ARRAY_AGGcollects the values into a PostgreSQL array; the result column carries a text-array type because the input expression is text. TheORDER BY event_typeinside the aggregate fixes the element sequence, which is why session 4 reads['checkout', 'feature_used', 'feature_used', 'page_view', 'purchase', 'upgrade_clicked']rather than an unpredictable order.FROM eventsreads the event rows. Every event in the table contributes.GROUP BY session_idpartitions the rows so the aggregate runs once per session. One output row comes out for each distinctsession_id.
Why this and not STRING_AGG
STRING_AGG would also collect the values, but it would flatten them into a comma-separated text value. The prompt asks for a structured collection so downstream code can iterate, index, or unnest. ARRAY_AGG keeps the values typed and addressable; STRING_AGG would force a re-parse downstream.
You practiced ARRAY_AGG(column ORDER BY column) — collect grouped values into a typed PostgreSQL array; downstream code can index, unnest, or compare elements directly.