Scenario: Brightlane's finance team tracks year-to-date revenue that resets at the start of each calendar year.
Task: Write a query to return each order month, its calendar order_year, the revenue for that month, and the ytd_revenue — the running total from the start of that calendar year through that month, restarting at the start of each new year.
Assumptions:
- An order month is identified by its first day; the
order_yearis the four-digit calendar year of the order. - A month's
monthly_revenueis the combinedtotal_amountacrossordersplaced in that month. - A month's
ytd_revenueis the combinedmonthly_revenuefrom January of the same calendar year through that month inclusive — drawn only from months sharing the sameorder_year, restarting at January every year.
Output:
- One row per order month present in the data.
- Columns in this order:
order_month,order_year,monthly_revenue,ytd_revenue. - Sorted by
order_monthascending.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
DATE_TRUNC('month', ordered_at) AS order_month,
EXTRACT(
YEAR
FROM
ordered_at
) AS order_year,
SUM(total_amount) AS monthly_revenue,
SUM(SUM(total_amount)) OVER (
PARTITION BY
EXTRACT(
YEAR
FROM
ordered_at
)
ORDER BY
DATE_TRUNC('month', ordered_at) ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
) AS ytd_revenue
FROM
orders
GROUP BY
DATE_TRUNC('month', ordered_at),
EXTRACT(
YEAR
FROM
ordered_at
)
ORDER BY
order_month The shape
PARTITION BY EXTRACT(year FROM ordered_at) is what makes the running total reset every January. The window function still orders by month and uses an unbounded-to-current frame, but the partition isolates each year so the accumulation starts over at the first month of each year. Without PARTITION BY, the running total would carry across year boundaries; with it, the December-to-January jump zeroes out and the next year's YTD begins again at that January's revenue.
Clause by clause
SELECT DATE_TRUNC('month', ordered_at) AS order_month, EXTRACT(year FROM ordered_at) AS order_year, SUM(total_amount) AS monthly_revenueproduces one row per(order_year, order_month)pair with the month's revenue. The two date expressions are independent:DATE_TRUNCreturns the month's first day;EXTRACTreturns the four-digit year.SUM(SUM(total_amount)) OVER (PARTITION BY EXTRACT(year FROM ordered_at) ORDER BY DATE_TRUNC('month', ordered_at) ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ytd_revenueaccumulates monthly revenue within each year.PARTITION BY EXTRACT(year FROM ordered_at)splits the row stream by calendar year;ORDER BY DATE_TRUNC('month', ordered_at)orders the months within each year; the unbounded-preceding frame anchors the accumulation at January of the current year, not at the start of the data.FROM orders GROUP BY DATE_TRUNC('month', ordered_at), EXTRACT(year FROM ordered_at)produces the per-month rows the window function then accumulates over.ORDER BY order_monthsorts the final output chronologically. Becauseorder_monthalready encodes the year in its date, sorting by it produces correct year-then-month ordering for free.
Why PARTITION BY year and not a WHERE per year
Filtering to one year at a time would compute YTD correctly for that single year but would require a separate query per year and a way to stitch them together. PARTITION BY EXTRACT(year FROM ordered_at) does the same logical isolation inside one query: each year is a separate partition, so the window function computes a clean restart-from-January YTD for every year present in the data in a single pass.
The trap
PARTITION BY EXTRACT(year FROM ordered_at) and ORDER BY DATE_TRUNC('month', ordered_at) are doing different jobs and they have to be written separately. The partition controls when the running total resets; the order controls which row counts as previous inside the partition. Replacing one with the other (partitioning by month and ordering by year, for instance) would produce twelve partitions of one row each and a YTD that never accumulates. The two clauses are independent and both are load-bearing — partition for the reset boundary, order for the sequence within each partition.
You practiced partitioning the cumulative window by calendar year, so the YTD line restarts at January each year — the partition is what scopes the cumulative layer to one year at a time.