N060-H3 Tier 5 · Expert · hard hr · Helix Systems

Return the actual count of current salary records on file

Part of Reading EXPLAIN Output in SQL

The problem

Scenario: Helix Systems' data analyst ran EXPLAIN on a salary report that needed only current salary records and saw the planner estimating 8 rows for the restriction on end_date. The analyst suspects the estimate is significantly off because the planner often underestimates how many rows have a missing value in a given column.

Task: Write a query to return the actual count of current salary records on file.

Assumptions:

  • The salaries table holds one row per salary record, with end_date storing when the record was superseded.
  • A current salary record has end_date recorded as a missing value.

Output:

  • One row, holding the count of current salary records.
  • Columns in this order: current_salary_count.
Schema · hr 4 tables
departments
id integer
name text
location text
budget numeric
salaries
id integer
employee_id integer
amount numeric
effective_date date
end_date? date
employees
id integer
name text
email text
department_id integer
manager_id? integer
hire_date date
title text
is_active boolean
job_history
id integer
employee_id integer
title text
department_id integer
start_date date
end_date? date

Run previews · Check grades

Write a query, then run it to see results here.

Worked solution Try it yourself first
Solution query
SELECT
  COUNT(*) AS current_salary_count
FROM
  salaries
WHERE
  end_date IS NULL

The shape

The planner estimated 8 rows for the end_date restriction; the actual count is 60. IS NULL selectivity is one of the categories the planner most often misjudges, because PostgreSQL's statistics track NULL frequency through a single null_frac value rather than a histogram — and when that fraction goes stale, the row-estimate gap can be an order of magnitude.

Clause by clause

  • SELECT COUNT(*) AS current_salary_count returns the total count of matching rows. COUNT(*) counts every row that survives the WHERE filter, regardless of which columns on that row are NULL.
  • FROM salaries reads the salary records — one row per salary period for each employee.
  • WHERE end_date IS NULL keeps only the rows whose end_date is missing. IS NULL is the correct test; end_date = NULL evaluates to unknown for every row and would return zero matches even on a table full of current salaries.

Why this and not WHERE end_date = NULL

= doesn't compare against NULL meaningfully. A missing value can't be tested for equality with anything — every column = NULL comparison evaluates to unknown, and WHERE keeps only rows where the condition is true. The result is silently zero, with no error to signal the mistake. The planner's estimate would still come back at 8 because the planner reads null_frac to guess NULL-match selectivity, not the literal predicate text — but the actual count would read 0, and the analyst would conclude the planner was wildly over-estimating instead of under-estimating. IS NULL is the only operator that actually compares against the absence of a value.

The trap

NULL selectivity is the single most common stale-statistics failure mode for HR-style tables where "current" is encoded as missing-end-date. The table has a clear bimodal shape: roughly half the rows (one per employee) have NULL end-dates and the other half have concrete dates — but the planner only knows the null_frac value sampled at the last ANALYZE. If that ran when the company was a third its current size, the absolute count is way off, even though the fraction is the same. The plan node will report rows=8 (or whatever the old sample suggested) and the runtime will process 60 instead. ANALYZE salaries refreshes null_frac against the current table size, and the next plan estimate moves into the right neighborhood. The query itself doesn't need to change; the statistic underneath it does.

You practiced verifying a planner estimate for missing-value matching — IS NULL selectivity is one of the categories the planner most often misjudges.

How you actually get good at SQL

Reading explains SQL. Writing it, over and over with instant feedback, is what makes you fluent.

That's the whole SQLMaxx loop: 600+ real problems, instant AI feedback, mastery you can actually see, and spaced review that won't let you forget.

A stack of SQL practice problem cards, the top card showing an employees table.
615 problems · 66 concepts

Real problems. Not toy examples.

615 hand-built problems spanning all 66 concepts, from basic SELECTs to window functions, built on real schemas and real business questions, the kind you'll actually get asked on the job. Enough reps to make SQL automatic.

A retro computer showing a SQL query marked correct with a green checkmark.
Instant AI feedback

Write a query. Know if it's right in one second.

No copying an answer and hoping it clicked. The AI grader checks your real query against real data, catches exactly what's wrong, and explains the fix in plain English, like a senior analyst reading over your shoulder on every problem.

A circular mastery progress dial filling from blue to green, the SQLMaxx diamond at its center.
Mastery tracking

Stop guessing whether you actually know it.

SQLMaxx tracks every concept and shows you what you've mastered and what's still shaky. Your skills fill in one concept at a time, so 'I think I get joins' becomes something you can prove.

A SQL query editor circled by a blue return arrow with a clock, scheduled to come back for review.
Spaced review

Learn it once. Keep it for good.

Most of what you learn this week fades by next week. So when a concept comes due for review, SQLMaxx hands you a fresh problem to solve from a blank editor, not a flashcard to re-read. A research-backed spaced-repetition algorithm (FSRS) times each return for right before you'd forget, so your SQL is still there months later, when the interview or the job actually needs it.

Practice, feedback, mastery, review. That's the loop that turns reading into real skill.

Start free

No account, no credit card. Start solving in under a minute.