Streamhub's sales team wants to identify users who have generated platform events but have never converted.
Write a query to return the user IDs of users who appear in the events table but do not appear in the conversions table.
Assumptions:
- The
eventstable records each platform event;user_ididentifies the user. - The
conversionstable records each paid conversion;user_ididentifies the user. - Streamhub's data quality is currently strong — every user generating events on the platform has converted at least once. The result set will be empty.
Output:
- One row per qualifying user, with a single column
user_id. The result set will be empty.
Schema · analytics 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
user_id
FROM
events
EXCEPT
SELECT
user_id
FROM
conversions The shape
EXCEPT returns rows from the left query that don't appear in the right query. Here, every user that shows up in events also shows up in conversions, so the left set is a subset of the right and the difference is empty. The empty result is the answer the sales team needs: there are currently no event-only users to chase.
Clause by clause
SELECT user_id FROM eventsis the left input: every user who has triggered any platform event. The sales team's universe of "engaged but maybe not yet converted" users sits inside this set.EXCEPTis the set operator. PostgreSQL evaluates both queries, deduplicates each side, then returns rows from the left with no matching row in the right. When every left-side row has a counterpart on the right, the difference is empty and the result set has zero rows.SELECT user_id FROM conversionsis the right input: every user who has converted. The question being asked is whether any user in the engagement set is missing from this conversion set.- The empty result is structural. The set operator ran cleanly; the difference is simply empty.
Why this and not the swapped order
SELECT user_id FROM conversions EXCEPT SELECT user_id FROM events answers a completely different question: users with conversions but no events. That's a referential-integrity check on the data pipeline, not an engagement-without-conversion list. Swap the queries and the empty result still comes back here — because of the same data — but the meaning of the empty is different. One empty says "every engaged user has converted, no sales leads on this list." The other says "every converted user has events logged, no broken pipeline rows."
The SQL is symmetric; the questions are not. Reading the query top-down — "events EXCEPT conversions" — is the only way to be sure which question is being answered, because the empty result itself doesn't distinguish them.
You practiced EXCEPT directionality and accepted an empty result as the correct answer. The recurring rule: an empty result from EXCEPT is a meaningful finding ("the left set is a subset of the right") — not a bug.