Brightlane's product analytics pipeline caches per-category metrics in a temp table to support multiple catalog reports in the same session. The query that populates the temp table needs to return the product count and average price for each category_id.
Write a query to return the category ID, product count, and average price for each category_id value.
Assumptions:
- The
productstable has one row per product with acategory_idand aprice. - Each
category_idvalue present inproductsshould appear once in the result. - For each category, the product count is the number of products in that
category_id. The average price is the average ofpriceacross those products.
Output:
- One row per category, with columns
category_id,product_count, andavg_price.
Schema · ecommerce 5 tables
Run previews · Check grades
Write a query, then run it to see results here.
Worked solution Try it yourself first
SELECT
category_id,
COUNT(*) AS product_count,
AVG(price) AS avg_price
FROM
products
GROUP BY
category_id The shape
A per-category aggregation: group the products table by category_id, then count the rows and average the prices inside each group. The result is one row per category with two metrics already computed, which is the shape worth caching in a temp table when multiple reports need the same numbers.
Clause by clause
SELECT category_id, COUNT(*) AS product_count, AVG(price) AS avg_pricereturns three columns.category_idlabels each group;COUNT(*)counts the products inside that group;AVG(price)averages the prices in that group. So category 8 reports five products at an average price of 72.99.FROM productsreads every product in the catalog. No filter is applied; every row contributes to some group.GROUP BY category_idpartitions the rows by category. The aggregates run once per partition rather than once across the entire table. Thecategory_id = NULLrows form their own group, becauseGROUP BYtreatsNULLas a value for grouping purposes even though it is not equal to itself.
The trap
AVG is not the same as SUM / COUNT once NULL enters the picture. AVG(price) skips rows where price is NULL entirely. If half of a category's products had a missing price, the average would be over the populated half, not divided by the full product count. On this data every price is recorded, so the two formulas agree. The moment a NULL price exists, AVG(price) and SUM(price) / COUNT(*) diverge.
You practiced a per-category aggregation as the body of a CTAS — the materialized result becomes a queryable table with three columns inferred directly from the SELECT list.