Streamhub's conversion team wants to know which users have both started at least one session and made at least one conversion.
Write a query to return the user IDs that appear in both the sessions table and the conversions table.
Assumptions:
- The
sessionstable records each session;user_ididentifies the user. - The
conversionstable records each paid conversion;user_ididentifies the user. - A user appears in the result exactly once, regardless of how many sessions or conversions they have.
Output:
- One row per user who appears in both tables, with a single column
user_id.
Schema · analytics 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
user_id
FROM
sessions
INTERSECT
SELECT
user_id
FROM
conversions The shape
INTERSECT returns only rows that appear in both result sets, after deduplication. The output is the list of users who show up in sessions and also show up in conversions — the conversion team's session-plus-conversion overlap, computed directly as a set intersection.
Clause by clause
SELECT user_id FROM sessionsis the left input. Every row insessionsprojects its user. A user with many sessions contributes many rows here, butINTERSECTwill deduplicate before comparing, so the multiplicity on each side does not change the output.INTERSECTis the set operator. PostgreSQL evaluates each query independently, deduplicates each side, then returns rows that are present in both. The comparison is on the full row across both inputs — for a single-column query like this one, that means theuser_idvalues themselves.SELECT user_id FROM conversionsis the right input. Same column, same type. A user with multiple conversions contributes multiple rows here too, but they collapse to one before the intersection runs.- The prompt asks for each qualifying user exactly once, which is the default shape
INTERSECTproduces. No extra step is needed to remove duplicates.
Why this and not UNION
UNION would combine the two queries and return every user who appears in either — anyone with a session, anyone with a conversion, deduplicated. That's the wrong question. The conversion team wants the and, not the or: users who are in sessions and in conversions simultaneously. UNION answers "appears in at least one"; INTERSECT answers "appears in both." The difference between the operators is the difference between the two questions, and picking the wrong one produces a much larger result set that looks plausible until someone checks the row count against either input.
You practiced INTERSECT for set-intersection logic. The recurring rule: INTERSECT returns rows present in both inputs after deduplication — equivalent to a JOIN on the matching columns followed by a DISTINCT, but more readable when set-membership is the question.