Average Events Within Each User — Window Functions in SQL

The problem

Streamhub's engagement team needs to compare each session's event count to the average event count across all sessions belonging to the same user.

Write a query that returns the ID, user ID, and event count of every session, with that user's average event count across all of their sessions repeated on each row.

Assumptions:

The sessions table has one row per session with an id, a user_id, and an event_count.
A user's average event count is the average of event_count across every session linked to that user_id. The same value should appear on every row that shares a user_id.
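
To make the second assumption concrete, here is a minimal Python sketch of the expected result computed by hand, without any SQL. The three sample sessions are hypothetical and not part of Streamhub's data; the point is only that both of user 10's rows end up carrying the same average.

```python
from collections import defaultdict

# Hypothetical sessions as (id, user_id, event_count) tuples.
sessions = [(1, 10, 4), (2, 10, 8), (3, 20, 5)]

# Collect every event_count under its user_id.
counts_by_user = defaultdict(list)
for _session_id, user_id, event_count in sessions:
    counts_by_user[user_id].append(event_count)

# One average per user...
avg_by_user = {
    user_id: sum(counts) / len(counts)
    for user_id, counts in counts_by_user.items()
}

# ...repeated on each of that user's session rows.
expected = [
    (session_id, user_id, event_count, avg_by_user[user_id])
    for session_id, user_id, event_count in sessions
]

for row in expected:
    print(row)
```

The output keeps one row per session; only the last column is shared between rows of the same user.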

Output:

One row per session, with columns id, user_id, event_count, and user_avg_events.

Schema · analytics (5 tables)

users
  id integer
  name text
  email text
  country text
  plan text
  signed_up_at timestamptz
  is_active boolean

conversions
  id integer
  user_id integer
  converted_at timestamptz
  plan text
  amount numeric

sessions
  id integer
  user_id integer
  started_at timestamptz
  ended_at? timestamptz
  event_count integer

events
  id integer
  user_id integer
  session_id? integer
  event_type text
  occurred_at timestamptz
  properties? jsonb

periods
  id integer
  name text
  start_month integer
  end_month integer

Worked solution

Solution query

SELECT
  id,
  user_id,
  event_count,
  AVG(event_count) OVER (
    PARTITION BY
      user_id
  ) AS user_avg_events
FROM
  sessions

The shape

AVG(event_count) OVER (PARTITION BY user_id) computes a separate average for each user_id and writes that user's average onto every session belonging to the user. Two sessions from the same user share an average; sessions from different users see different averages. Every session row stays in the output.
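
A minimal sketch of that behaviour, using Python's built-in sqlite3 module as a stand-in database. The sample rows are made up for illustration, the exercise's own database may be a different engine, and the ORDER BY is added only so the printed order is predictable.

```python
import sqlite3

# Hypothetical sample data: three sessions for user 10, one for user 20.
sample_rows = [(1, 10, 4), (2, 10, 8), (3, 10, 6), (4, 20, 5)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id INTEGER, user_id INTEGER, event_count INTEGER)"
)
conn.executemany("INSERT INTO sessions VALUES (?, ?, ?)", sample_rows)

query = """
SELECT
  id,
  user_id,
  event_count,
  AVG(event_count) OVER (PARTITION BY user_id) AS user_avg_events
FROM sessions
ORDER BY id  -- not part of the solution; keeps the output order stable
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With this sample data, user 10's three sessions each show 6.0 and user 20's single session shows 5.0 in SQLite. Four rows go in and four rows come out.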

Clause by clause

SELECT id, user_id, event_count returns each session's identifier, the user who owned it, and the session's individual event count, one row per session.
The window column is:

AVG(event_count) OVER (PARTITION BY user_id) AS user_avg_events

PARTITION BY user_id splits the row set into one group per distinct user_id. AVG(event_count) runs inside each group independently, so the value attached to a given session is the average of event_count across every session that shares its user_id. All sessions belonging to the same user see the same user_avg_events.

FROM sessions reads every session. The engagement team is comparing each session to its user's average, so every row stays in.
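
If the comparison itself is wanted as a column, the window expression can sit inside ordinary arithmetic. The sketch below is an illustration rather than part of the expected answer: it assumes SQLite through Python's sqlite3 module, made-up rows, and a hypothetical column name, diff_from_user_avg.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id INTEGER, user_id INTEGER, event_count INTEGER)"
)
# Hypothetical data: user 10 averages 6 events, user 20 averages 5.
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    [(1, 10, 4), (2, 10, 8), (3, 20, 5)],
)

query = """
SELECT
  id,
  event_count,
  -- Each session's own count minus its user's average.
  event_count - AVG(event_count) OVER (PARTITION BY user_id)
    AS diff_from_user_avg
FROM sessions
ORDER BY id
"""

diffs = conn.execute(query).fetchall()
for row in diffs:
    print(row)
```

A negative value marks a session below its user's average and a positive value marks one above it. A user with a single session always gets 0.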

Why this and not `GROUP BY user_id`

GROUP BY user_id would collapse the table to one row per user holding the per-user average, and the individual session rows would be gone. The comparison the engagement team is making, this session's event count versus this user's typical event count, requires both numbers on the same row. PARTITION BY is the construct that produces the per-group aggregate without throwing away the rows it summarises.
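
The difference can be seen side by side in a small sketch, again assuming SQLite through Python's sqlite3 module and hypothetical sample rows. It also shows the join-back that a GROUP BY approach would need in order to reach the same result as the window function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id INTEGER, user_id INTEGER, event_count INTEGER)"
)
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    [(1, 10, 4), (2, 10, 8), (3, 20, 5)],
)

# GROUP BY: one row per user, session rows are gone.
grouped = conn.execute("""
SELECT user_id, AVG(event_count) AS user_avg_events
FROM sessions
GROUP BY user_id
ORDER BY user_id
""").fetchall()

# Window function: one row per session, average attached to each.
windowed = conn.execute("""
SELECT id, user_id, event_count,
       AVG(event_count) OVER (PARTITION BY user_id) AS user_avg_events
FROM sessions
ORDER BY id
""").fetchall()

# GROUP BY joined back to sessions: same rows as the window version,
# at the cost of a subquery and a join.
joined = conn.execute("""
SELECT s.id, s.user_id, s.event_count, a.user_avg_events
FROM sessions AS s
JOIN (
  SELECT user_id, AVG(event_count) AS user_avg_events
  FROM sessions
  GROUP BY user_id
) AS a ON a.user_id = s.user_id
ORDER BY s.id
""").fetchall()

print(len(grouped), "rows from GROUP BY")
print(len(windowed), "rows from the window query")
print("join-back matches window query:", joined == windowed)
```

On this data the grouped query returns two rows and the window query returns three, one per session.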

You practiced partitioning an AVG window by a foreign-key column: every session row carries its user's average alongside its own event count.

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.
