Scenario: Streamhub's campaign analytics dashboard tracks daily purchase activity for the first week of March 2024 to evaluate a recent promotion's reach.
Task: Write a query to return each date from March 1, 2024 through March 7, 2024 alongside the number of purchase events recorded on that date.
Assumptions:
- The
eventstable holds one row per recorded event, with the timestamp stored inoccurred_atand the kind of activity stored inevent_type. - A purchase event has
event_typeequal to'purchase'. - Some dates in the range have no recorded purchase
events; those dates must still appear in the result with a count of zero.
Output:
- One row per date in the range, including dates with no purchase
events. - Columns in this order:
day,purchase_count. - Sorted by
dayascending.
Schema · analytics 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
WITH
spine AS (
SELECT
GENERATE_SERIES('2024-03-01'::date, '2024-03-07'::date, '1 day'::INTERVAL)::date AS DAY
)
SELECT
s.day,
COUNT(p.id) AS purchase_count
FROM
spine s
LEFT JOIN (
SELECT
id,
occurred_at
FROM
events
WHERE
event_type = 'purchase'
) AS p ON p.occurred_at::date = s.day
GROUP BY
s.day
ORDER BY
s.day The shape
The count has to be restricted to purchase events but the spine still has to produce all seven days, so the event_type filter is pushed into a derived table on the right side of the LEFT JOIN. Filtering inside the subquery preserves the join's outer-ness; filtering in the outer WHERE would destroy it.
Clause by clause
WITH spine AS (SELECT generate_series('2024-03-01'::date, '2024-03-07'::date, '1 day'::interval)::date AS day)builds the seven-day backbone for the first week of March.- The derived table
(SELECT id, occurred_at FROM events WHERE event_type = 'purchase') AS ppre-filters the fact rows down to purchase events only, leaving every other row out of the join input entirely. SELECT s.day, COUNT(p.id) AS purchase_countreturns the spine's date and the count of matched purchase events.COUNT(p.id)ignores nulls, so days with no purchases report zero.FROM spine s LEFT JOIN ... AS p ON p.occurred_at::date = s.dayattaches each purchase to its day. TheLEFT JOINkeeps every spine row whether or not a purchase matches.GROUP BY s.daycollapses the joined rows back to one row per spine date.ORDER BY s.dayreturns the seven dates in calendar order.
Why this and not a WHERE event_type = 'purchase' after the join
Filtering on the right table of a LEFT JOIN in the outer WHERE silently converts it to an inner join. The rule is positional: WHERE p.event_type = 'purchase' rejects any row where p.event_type is anything other than the literal 'purchase' — and an unmatched spine row carries p.event_type = NULL, which fails that equality check. Every zero-purchase day drops out. Pushing the filter into the derived table fixes this by restricting the right side before the join sees it, so the unmatched-spine rows survive intact.
The trap
The two filters look equivalent on a whiteboard — both restrict to purchases. They behave the same on an inner join. On a LEFT JOIN they don't: the outer-WHERE form quietly drops the zero rows the spine pattern exists to keep. Whenever you're filtering the right side of a left join, put the filter inside a derived table or move it into the ON clause.
You practiced restricting the right side of a left-join to a category-specific subset so the spine still produces one row per day, with zero on days where no purchases occurred.