Distinct Event Types per Session — STRING_AGG and ARRAY

The problem

Streamhub's product team needs every unique event type each session produced — each event type listed once per session, no repeats.

Write a query to return every session ID and a comma-separated list of the unique event types that occurred in that session, in alphabetical order.

Assumptions:

The events table has one row per event with a session_id and an event_type.
Each session_id with at least one event should appear once.
For each session, the types list contains each event_type value at most once (regardless of how many events of that type the session generated), arranged alphabetically and separated by ', '.

Output:

One row per session with at least one event, with columns session_id and distinct_event_types.

Schema · analytics 5 tables

users

id integer

name text

email text

country text

plan text

signed_up_at timestamptz

is_active boolean

conversions

id integer

user_id integer

converted_at timestamptz

plan text

amount numeric

sessions

id integer

user_id integer

started_at timestamptz

ended_at? timestamptz

event_count integer

events

id integer

user_id integer

session_id? integer

event_type text

occurred_at timestamptz

properties? jsonb

periods

id integer

name text

start_month integer

end_month integer

Check answerShift Ctrl ↵

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first

Solution query

SELECT
  session_id,
  STRING_AGG(
    DISTINCT event_type,
    ', '
    ORDER BY
      event_type
  ) AS distinct_event_types
FROM
  events
GROUP BY
  session_id

The shape

STRING_AGG(DISTINCT event_type, ', ' ORDER BY event_type) collapses every event in a session into a single text value, but the DISTINCT modifier removes repeated event types before the join happens, so each type appears exactly once. The product team gets one row per session showing only the unique event types that occurred, in alphabetical order.

Clause by clause

SELECT session_id, STRING_AGG(DISTINCT event_type, ', ' ORDER BY event_type) AS distinct_event_types returns the session and its deduplicated event-type list. DISTINCT inside the aggregate runs the deduplication pass over the input values before they are concatenated, which is why a session with three 'feature_used' events and two 'page_view' events comes out as 'feature_used, page_view' and not 'feature_used, feature_used, feature_used, page_view, page_view'. The ORDER BY event_type fixes the alphabetical sequence inside the value.
FROM events reads the event rows. Every event is in scope; the deduplication happens inside the aggregate, not at the row level.
GROUP BY session_id partitions the rows by session so the aggregate runs once per session. One output row per distinct session_id, matching the per-session output spec.

The trap

DISTINCT and ORDER BY inside the same aggregate have to agree on what counts as a unique value. STRING_AGG(DISTINCT event_type, ', ' ORDER BY event_type) is fine because both expressions reference the same column. Writing ORDER BY some_other_column next to DISTINCT event_type raises a syntax error in PostgreSQL: the planner cannot deduplicate on one expression and order on another, because the order of duplicates inside each unique value would be undefined. When you reach for DISTINCT inside the aggregate, the ORDER BY expression has to match the DISTINCT expression exactly.

You practiced STRING_AGG(DISTINCT column, separator ORDER BY column) — deduplicate values inside the aggregate before joining; the ORDER BY expression must match the DISTINCT expression.

Return every session ID and a comma-separated list of the unique event types that occurred in that session, in alphabetical order

The shape

Clause by clause

The trap

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.