Streamhub's analytics team is building a unified activity feed combining session starts and conversion events.
Write a query to return each user ID alongside a label indicating which source the row came from:
'session'for rows from thesessionstable.'conversion'for rows from theconversionstable.
Assumptions:
- Both queries must produce two columns — the user ID and a string-literal source label — in the same column positions.
- Duplicates are preserved (a user with three sessions and two conversions contributes five rows).
Output:
- One row per source record across both tables, with columns
user_idandactivity_type.
Schema · analytics 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
user_id,
'session' AS activity_type
FROM
sessions
UNION ALL
SELECT
user_id,
'conversion' AS activity_type
FROM
conversions The shape
A string literal in each query's SELECT list tags every row with the table it came from. After UNION ALL stacks the two sides, every row in the combined feed carries 'session' or 'conversion' in the activity_type column, so the analytics team can read the row's origin without going back to the source tables.
Clause by clause
SELECT user_id, 'session' AS activity_type FROM sessionsis the left input. The literal'session'doesn't depend on the row, so the column is effectively a constant tag on every session row. TheAS activity_typealias names the column; the output column name comes from the left query, so this is the alias that survives into the combined result.UNION ALLstacks the two result sets without deduplication. Every session row appears, then every conversion row, and rows that share the sameuser_idacross the two sides both appear — which is what an activity feed needs.SELECT user_id, 'conversion' AS activity_type FROM conversionsis the right input. The literal'conversion'tags every row on this side. The alias here is ignored — only the column position matters for the set operation.- The two queries project the same column count (two) and compatible types (integer + text on both sides), which is what allows the set operation to run.
Why this and not run two queries
The team could run SELECT user_id FROM sessions and SELECT user_id FROM conversions separately and add the source label in the reporting layer. That pushes the provenance logic out of the query and into application code, where it has to be re-applied every time the feed is rebuilt. Tagging at the source makes the combined result self-describing: any consumer can read activity_type and know which table the row came from.
You practiced adding a constant column to each side of a UNION ALL to tag source provenance. The recurring shape any time multiple structurally similar tables feed a unified report — the literal labels make the row's origin explicit.