Brightlane's data completeness audit needs every product represented in a per-category export, regardless of whether a 'color' attribute is recorded.
Write a query to return every category ID and a JSON array of objects — one per product in that category — where each object contains the product's id under the key 'id' and the product's color attribute under the key 'color'.
Assumptions:
- Each
category_idwith at least one product should appear once. - For each category, the array contains one object per product (no de-duplication). Each object has exactly two keys,
'id'and'color'. Products with a'color'key inattributespopulate that key with the text color value; products without one populate that key with JSON null. - The order of objects within each array is unspecified.
Output:
- One row per category, with columns
category_idandproduct_colors.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
category_id,
JSONB_AGG(
JSON_BUILD_OBJECT('id', id, 'color', attributes ->> 'color')
) AS product_colors
FROM
products
GROUP BY
category_id The shape
The audit needs every product represented, even ones with no 'color' on record. Wrapping the per-row build in json_build_object('id', id, 'color', attributes->>'color') is what makes that work: json_build_object writes JSON null for any NULL argument but always emits the row's object, so every product produces one element in the outer jsonb_agg array.
Clause by clause
SELECT category_id, jsonb_agg(json_build_object('id', id, 'color', attributes->>'color')) AS product_colorsbuilds a two-key object for each product row, then aggregates the objects per category. The'id'value is always present. The'color'value comes fromattributes->>'color', which returns NULL when the key is absent —json_build_objectwrites that out as JSON null, keeping the key in place. The full object is never NULL, sojsonb_aggnever drops a row.FROM productsreads the product records.GROUP BY category_idpartitions by category. Category4returns six objects, each with the product'sidandcolor: null, because none of the books have a'color'key. Category5returns six objects, each with a real color value.
The trap
The same audit written as jsonb_agg(attributes->>'color') would drop every key from the output and surface only the color values, which is the wrong shape. But the more dangerous near-miss is jsonb_agg(CASE WHEN attributes ? 'color' THEN json_build_object('id', id, 'color', attributes->>'color') END) — wrapping the build in a CASE that returns NULL when the color key is missing. The CASE returns NULL, the outer aggregate sees NULL, and jsonb_agg skips that row entirely. Whole products vanish from the audit. The fix is the one shown in the canonical query: keep json_build_object outside any NULL-producing branch, let it emit color: null for the missing rows, and rely on jsonb_agg keeping the row because the object itself is not NULL.
You practiced jsonb_agg(json_build_object(..., 'col', maybe_null)) — json_build_object writes JSON null for null arguments, and jsonb_agg includes the row, so every record appears in the array regardless of whether the JSONB-extracted value is missing.