Scenario: Streamhub's customer success team wants to identify high-value converters within each plan tier — users whose total conversion revenue stands out compared to others on the same plan.
Task: Write a query to return every user whose total conversion revenue exceeds the average total conversion revenue across all converting users on the same plan, with their id, plan, country, and total conversion revenue.
Assumptions:
- A converting user has at least one conversion on record.
- A user's total conversion revenue is the combined
amountacross all of theirconversions. - The result covers only users whose total conversion revenue is strictly greater than the average total conversion revenue across all converting users on the same plan.
Output:
- One row per qualifying user.
- Columns in this order:
user_id,plan,country,total_revenue. - Sorted by
total_revenuedescending.
Schema · analytics 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
WITH
user_conversions AS (
SELECT
u.id AS user_id,
u.plan,
u.country,
c.amount
FROM
users u
JOIN conversions c ON c.user_id = u.id
),
user_totals AS (
SELECT
user_id,
plan,
country,
SUM(amount) AS total_revenue
FROM
user_conversions
GROUP BY
user_id,
plan,
country
),
plan_benchmarks AS (
SELECT
user_id,
plan,
country,
total_revenue,
AVG(total_revenue) OVER (
PARTITION BY
plan
) AS plan_avg_revenue
FROM
user_totals
)
SELECT
user_id,
plan,
country,
total_revenue
FROM
plan_benchmarks
WHERE
total_revenue > plan_avg_revenue
ORDER BY
total_revenue DESC The shape
Three CTEs that compute each user's total in one layer, attach a per-plan benchmark via a window function in the next, and keep only the users above their plan's number in the main query. The benchmark is partitioned by plan, so every user gets compared against the average of users on the same plan, not the global average.
Clause by clause
WITH user_conversions AS (
SELECT u.id AS user_id, u.plan, u.country, c.amount
FROM users u
JOIN conversions c ON c.user_id = u.id
)The inner join attaches each conversion to its user. Users with no conversions never appear because the join drops them, which matches the "converting users" requirement. plan and country are carried forward because both belong in the output and plan is the partition key downstream.
user_totals AS (
SELECT user_id, plan, country, SUM(amount) AS total_revenue
FROM user_conversions
GROUP BY user_id, plan, country
)GROUP BY user_id, plan, country collapses the conversions to one row per user. SUM(amount) is the user's lifetime conversion revenue. The identifying columns ride along because the comparison layer needs them.
plan_benchmarks AS (
SELECT user_id, plan, country, total_revenue,
AVG(total_revenue) OVER (PARTITION BY plan) AS plan_avg_revenue
FROM user_totals
)AVG(total_revenue) OVER (PARTITION BY plan) computes the average of total_revenue within each plan and broadcasts that single value onto every row in the partition. Every user row now sits next to its plan's average, without a self-join or scalar subquery.
SELECT user_id, plan, country, total_revenue FROM plan_benchmarks WHERE total_revenue > plan_avg_revenue ORDER BY total_revenue DESCkeeps only the users whose total exceeds their plan's benchmark. Fourprousers at499clear that plan's bar, eightstarterusers at49clear theirs.
Why a window function and not a separate aggregate CTE
The benchmark could be written as a separate CTE that groups by plan and produces one average per plan, then joined back to user_totals. The result matches. The window form is tighter for this shape because the comparison is the only thing the benchmark is used for: there is no other layer that consumes the per-plan average. Computing it inline via OVER (PARTITION BY plan) keeps the calculation next to the row it is compared against. When the benchmark itself becomes a deliverable that other layers want to read, the aggregate-and-rejoin shape pays off; here it would just add a layer for no gain.
The trap
The partition key controls which population the comparison measures against. AVG(total_revenue) OVER () with an empty window would compute one global average and compare every user to it — a pro user at 499 would lose to the high pro average while a starter user at 49 would clear the much lower global number, even though the prompt asks for a per-plan comparison. PARTITION BY plan is what makes "compared to others on the same plan" mean exactly that. The clause is small, but it is the difference between answering the question and answering a different one.
You practiced computing each user's total in one CTE, attaching a per-plan window benchmark in another, then keeping only users above their plan's benchmark — three layers, one comparison.