Scenario: Brightlane's operations team is tracking cumulative order volume separately for each status, so each status has its own independent running total over time.
Task: Write a query to return each (order_month, status) combination, the count of orders with that status in that month, and the running total of orders in that status from the earliest such month through the current month — restarting at the earliest month for each status independently.
Assumptions:
- An order month is identified by its first day.
- A combination's
monthly_countis the count oforderswith thatstatusplaced in that month. - A combination's
cumulative_countis the combinedmonthly_countfor the samestatusfrom the earliest month containing thatstatusthrough the current month — drawn only from entries of the samestatus, never across different statuses.
Output:
- One row per (
order_month,status) combination present in the data. - Columns in this order:
order_month,status,monthly_count,cumulative_count. - Sorted by
statusascending, thenorder_monthascending.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
DATE_TRUNC('month', ordered_at) AS order_month,
status,
COUNT(*) AS monthly_count,
SUM(COUNT(*)) OVER (
PARTITION BY
status
ORDER BY
DATE_TRUNC('month', ordered_at) ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
) AS cumulative_count
FROM
orders
GROUP BY
DATE_TRUNC('month', ordered_at),
status
ORDER BY
status,
order_month The shape
Two things compose. The GROUP BY aggregates orders to one row per (month, status) pair with that pair's monthly count, and PARTITION BY status inside the window function keeps each status's running total isolated. Without PARTITION BY, the accumulation would cross status boundaries; with it, every status restarts at its own earliest month.
Clause by clause
SELECT DATE_TRUNC('month', ordered_at) AS order_month, status, COUNT(*) AS monthly_countproduces one row per(month, status)combination with that pair's order count. Orders in March 2022 withstatus = 'cancelled'group together and producemonthly_count = 1for that pair.SUM(COUNT(*)) OVER (PARTITION BY status ORDER BY DATE_TRUNC('month', ordered_at) ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cumulative_countaccumulates the monthly counts within each status independently.PARTITION BY statusis the load-bearing piece: the running total for'cancelled'ignores the rows for'delivered'entirely, and the frame resets at each status's earliest month.FROM orders GROUP BY DATE_TRUNC('month', ordered_at), statusaggregates the raw orders to one row per(month, status)pair.ORDER BY status, order_monthsorts the final output by status first so each status's running total reads as a contiguous block.
Why PARTITION BY and not a separate query per status
Without PARTITION BY status, the window function would treat the full result as one stream and the running total would jump between statuses in whatever order the ORDER BY produced. A separate query per status would work but requires knowing every status in advance and stitching the results together. PARTITION BY lets the window function compute every status's independent running total in one pass.
The trap
The query's final ORDER BY status, order_month only changes the display order of the output. The window function's own ORDER BY DATE_TRUNC('month', ordered_at) is what determines which row counts as "previous" inside each partition. These are two independent ordering instructions. Changing the final ORDER BY to anything else (sorting by monthly_count, for instance) would change how the output prints but would not affect the cumulative values, because the window has already computed them against its own ordering.
You practiced partitioning the running total by status, so each status has its own independent cumulative line — without partitioning, the total would cross status boundaries and produce a single global running count.