Activity Feed with Source Labels — UNION in SQL

The problem

Streamhub's analytics team is building a unified activity feed combining session starts and conversion events.

Write a query to return each user ID alongside a label indicating which source the row came from:

'session' for rows from the sessions table.
'conversion' for rows from the conversions table.

Assumptions:

Both queries must produce two columns — the user ID and a string-literal source label — in the same column positions.
Duplicates are preserved (a user with three sessions and two conversions contributes five rows).

Output:

One row per source record across both tables, with columns user_id and activity_type.

Schema · analytics 5 tables

users

id integer

name text

email text

country text

plan text

signed_up_at timestamptz

is_active boolean

conversions

id integer

user_id integer

converted_at timestamptz

plan text

amount numeric

sessions

id integer

user_id integer

started_at timestamptz

ended_at? timestamptz

event_count integer

events

id integer

user_id integer

session_id? integer

event_type text

occurred_at timestamptz

properties? jsonb

periods

id integer

name text

start_month integer

end_month integer

Check answerShift Ctrl ↵

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first

Solution query

SELECT
  user_id,
  'session' AS activity_type
FROM
  sessions
UNION ALL
SELECT
  user_id,
  'conversion' AS activity_type
FROM
  conversions

The shape

A string literal in each query's SELECT list tags every row with the table it came from. After UNION ALL stacks the two sides, every row in the combined feed carries 'session' or 'conversion' in the activity_type column, so the analytics team can read the row's origin without going back to the source tables.

Clause by clause

SELECT user_id, 'session' AS activity_type FROM sessions is the left input. The literal 'session' doesn't depend on the row, so the column is effectively a constant tag on every session row. The AS activity_type alias names the column; the output column name comes from the left query, so this is the alias that survives into the combined result.
UNION ALL stacks the two result sets without deduplication. Every session row appears, then every conversion row, and rows that share the same user_id across the two sides both appear — which is what an activity feed needs.
SELECT user_id, 'conversion' AS activity_type FROM conversions is the right input. The literal 'conversion' tags every row on this side. The alias here is ignored — only the column position matters for the set operation.
The two queries project the same column count (two) and compatible types (integer + text on both sides), which is what allows the set operation to run.

Why this and not run two queries

The team could run SELECT user_id FROM sessions and SELECT user_id FROM conversions separately and add the source label in the reporting layer. That pushes the provenance logic out of the query and into application code, where it has to be re-applied every time the feed is rebuilt. Tagging at the source makes the combined result self-describing: any consumer can read activity_type and know which table the row came from.

You practiced adding a constant column to each side of a UNION ALL to tag source provenance. The recurring shape any time multiple structurally similar tables feed a unified report — the literal labels make the row's origin explicit.

- `'session'` for rows from the `sessions` table. - `'conversion'` for rows from the `conversions` table

The shape

Clause by clause

Why this and not run two queries

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.