Scenario: Streamhub's engagement team flags users whose monthly activity is shifting and needs each user's monthly event count shown alongside their own prior-month count.
Task: Write a query to return each user's user_id, calendar month, total event count in that month, and event count in the immediately preceding month within that same user's own activity history.
Assumptions:
- A user is present in a month only if they have at least one event recorded in that month.
- Each user's
prev_month_countis drawn solely from that same user's earlier activity history. - The earliest month in each user's own history has no preceding month within that user; its
prev_month_countvalue is missing.
Output:
- One row per (
user_id,month) pair present in the data. - Columns in this order:
user_id,month(the first day of the calendar month),event_count,prev_month_count. - Sorted by
user_idascending, thenmonthascending.
Schema · analytics 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
user_id,
DATE_TRUNC('month', occurred_at)::date AS MONTH,
COUNT(*) AS event_count,
LAG(COUNT(*)) OVER (
PARTITION BY
user_id
ORDER BY
DATE_TRUNC('month', occurred_at)
) AS prev_month_count
FROM
events
GROUP BY
user_id,
DATE_TRUNC('month', occurred_at)
ORDER BY
user_id,
MONTH The shape
Each user's monthly activity is its own series, and the prior-month count has to come from that user's history rather than from whoever happened to come before them in the data. PARTITION BY user_id is what isolates the lookback to a single user at a time.
Clause by clause
SELECT user_id, DATE_TRUNC('month', occurred_at)::date AS month, COUNT(*) AS event_count, LAG(COUNT(*)) OVER (PARTITION BY user_id ORDER BY DATE_TRUNC('month', occurred_at)) AS prev_month_countreturns each user's month bucket, that month's event count, and the same user's prior-month count.PARTITION BY user_idresets the lookback at every user boundary;ORDER BY DATE_TRUNC('month', occurred_at)sets the chronological sequence inside each user's partition.FROM eventsreads the event stream.GROUP BY user_id, DATE_TRUNC('month', occurred_at)reduces the events to one row per (user, month) pair so the count is a per-user monthly total.ORDER BY user_id, monthprints each user's history as a contiguous block, in time order.
The trap
The first month of every user's history returns NULL for prev_month_count because there is no prior row inside that user's partition. That NULL is correct — there is no prior-month measurement for the user's first appearance — but the count of NULL rows scales with the number of users. A multi-entity period-over-period report can have a large fraction of its rows reporting NULL in the prior-period column, one per partition, and a downstream aggregate that doesn't account for them will quietly under-count. The behavior is the right one; the consumer needs to know it is there.
You practiced partitioning LAG by user_id so each user's prior-month count is drawn from their own history, never from another user's row.