N031-H1 Tier 3 · Intermediate · hard analytics · Streamhub

Return the count of top-engagement users and their average total event count as a single row

Part of Chained CTEs in SQL

The problem

Streamhub's analytics team wants to profile its top-engagement segment: users whose session count exceeds the platform-wide per-user session average. The team needs the segment's size and its average total event count.

Write a query to return the count of top-engagement users and their average total event count as a single row.

Assumptions:

  • A user's session count is the number of sessions records linked to that user_id. A user's total events is the combined event_count across those records. Only users with at least one session contribute to the per-user session average.
  • The per-user session average is the average of those per-user session counts.
  • A user belongs to the top-engagement segment when their session count exceeds the per-user session average.
  • The output reports the size of that segment and the average total events across users in the segment.

Output:

  • A single row with columns top_user_count and avg_events_per_top_user.
Schema · analytics 5 tables
users
id integer
name text
email text
country text
plan text
signed_up_at timestamptz
is_active boolean
conversions
id integer
user_id integer
converted_at timestamptz
plan text
amount numeric
sessions
id integer
user_id integer
started_at timestamptz
ended_at? timestamptz
event_count integer
events
id integer
user_id integer
session_id? integer
event_type text
occurred_at timestamptz
properties? jsonb
periods
id integer
name text
start_month integer
end_month integer

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first
Solution query
WITH
  user_stats AS (
    SELECT
      user_id,
      COUNT(*) AS session_count,
      SUM(event_count) AS total_events
    FROM
      sessions
    GROUP BY
      user_id
  ),
  top_users AS (
    SELECT
      user_id,
      total_events
    FROM
      user_stats
    WHERE
      session_count > (
        SELECT
          AVG(session_count)
        FROM
          user_stats
      )
  ),
  summary AS (
    SELECT
      COUNT(*) AS top_user_count,
      AVG(total_events) AS avg_events_per_top_user
    FROM
      top_users
  )
SELECT
  top_user_count,
  avg_events_per_top_user
FROM
  summary

The shape

Three cascading layers, each doing one thing. The first computes per-user statistics. The second reads that layer twice, once row-by-row and once through a scalar subquery that aggregates it to a single comparison value, and keeps the top-engagement users. The third reduces the survivors to a one-row summary. The shape mirrors the prompt's three questions: who are the users, which of them exceed the average, and what does the segment look like in aggregate.

Clause by clause

The first CTE collapses every user's sessions into a single statistics row:

WITH user_stats AS (
  SELECT user_id, COUNT(*) AS session_count, SUM(event_count) AS total_events
  FROM sessions
  GROUP BY user_id
)

COUNT(*) and SUM(event_count) are computed in the same pass. Both aggregates are needed downstream: session_count qualifies the user, and total_events is what the summary will average over.

The second CTE keeps users whose session count exceeds the per-user average:

top_users AS (
  SELECT user_id, total_events
  FROM user_stats
  WHERE session_count > (SELECT AVG(session_count) FROM user_stats)
)

The scalar subquery (SELECT AVG(session_count) FROM user_stats) reads user_stats again and reduces it to a single number. That number is then compared row-by-row against each user's session_count. The subquery is evaluated once and its result is reused on every row.

The third CTE reduces the surviving rows to a one-row summary:

summary AS (
  SELECT COUNT(*) AS top_user_count, AVG(total_events) AS avg_events_per_top_user
  FROM top_users
)

No GROUP BY is needed because there is no grouping key. Aggregating without GROUP BY collapses the whole source to one row, which is what the prompt asks for. COUNT(*) counts the segment's size; AVG(total_events) averages the carried-forward sum.

  • SELECT top_user_count, avg_events_per_top_user FROM summary returns the single row: 9 top-engagement users with an average of 77.11 total events.

The trap

The two averages in the query operate at different levels. AVG(session_count) in the second CTE is the average of per-user session counts, computed across the user table. AVG(total_events) in the third CTE is the average of per-user totals, computed across the qualifying segment. Pulling either one against the raw sessions table instead of the layer above it would produce a different number, weighted differently. The chain is what makes "average across users" mean exactly that, by giving each user one row in the layer the average is taken over.

You practiced cascading three WITH stages where the second references the first twice (row-by-row and through a scalar subquery) and the third reduces the second's row set to a one-row summary.

How you actually get good at SQL

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

That's the whole SQLMaxx loop: 600+ real problems, instant AI feedback, mastery you can actually see, and spaced review that won't let you forget.

A stack of SQL practice problem cards, the top card showing an employees table.
615 problems · 66 concepts

Real problems. Not toy examples.

615 hand-built problems spanning all 66 concepts, from basic SELECTs to window functions, built on real schemas and real business questions, the kind you'll actually get asked on the job. Enough reps to make SQL automatic.

A retro computer showing a SQL query marked correct with a green checkmark.
Instant AI feedback

Write a query. Know if it's right in one second.

No copying an answer and hoping it clicked. The AI grader checks your real query against real data, catches exactly what's wrong, and explains the fix in plain English, like a senior analyst reading over your shoulder on every problem.

A circular mastery progress dial filling from blue to green, the SQLMaxx diamond at its center.
Mastery tracking

Stop guessing whether you actually know it.

SQLMaxx tracks every concept and shows you what you've mastered and what's still shaky. Your skills fill in one concept at a time, so 'I think I get joins' becomes something you can prove.

A SQL query editor circled by a blue return arrow with a clock, scheduled to come back for review.
Spaced review

Learn it once. Keep it for good.

Most of what you learn this week fades by next week. So when a concept comes due for review, SQLMaxx hands you a fresh problem to solve from a blank editor, not a flashcard to re-read. A research-backed spaced-repetition algorithm (FSRS) times each return for right before you'd forget, so your SQL is still there months later, when the interview or the job actually needs it.

Practice, feedback, mastery, review. That's the loop that turns reading into real skill.

Start free

No account, no credit card. Start solving in under a minute.