NULL: the trap that costs every team an outage
NULL is not zero, not empty string, and not false. Master this in 15 minutes and you avoid a lifetime of subtle, expensive bugs.
The bug that cost a team a Friday
A payments team I knew shipped a fraud rule: block any transaction where country != 'US' and amount > 10000. They tested it. It worked. They shipped it.
The next Friday the fraud team asked why a $50k transaction from a new customer in unknown country had gone through. The customer's country field was NULL because their KYC step had silently failed.
NULL != 'US' is not TRUE. It is not FALSE either. It is UNKNOWN. And WHERE keeps a row only when the predicate is TRUE — so the fraud rule did not match the unknown-country row, and the transaction sailed through.
Three-valued logic is the most expensive thing in SQL that nobody bothers to teach in 15 minutes. Today, we do.
NULL means unknown
It is not zero, not the empty string, and not false. The single rule that explains all the weird behavior:
Any comparison that involves NULL returns NULL, not true and not false.
This is three-valued logic: TRUE, FALSE, UNKNOWN. SQL's WHERE keeps a row only when the predicate evaluates to TRUE — so UNKNOWN rows are dropped silently. Every != and = filter you write is implicitly "and the value is known".
Why? Because two unknown values are not equal — they might or might not be the same, and we don't know. SQL refuses to guess.
The truth tables, in full
Keep these on a sticky note for a week:
AND | TRUE | FALSE | UNKNOWN
---------+----------+----------+--------
TRUE | TRUE | FALSE | UNKNOWN
FALSE | FALSE | FALSE | FALSE
UNKNOWN | UNKNOWN | FALSE | UNKNOWN
OR | TRUE | FALSE | UNKNOWN
---------+----------+----------+--------
TRUE | TRUE | TRUE | TRUE
FALSE | TRUE | FALSE | UNKNOWN
UNKNOWN | TRUE | UNKNOWN | UNKNOWN
NOT(TRUE) = FALSE
NOT(FALSE) = TRUE
NOT(UNKNOWN) = UNKNOWN ← the trap from the fraud-rule story
Three facts these tables make obvious that prose hides:
FALSE AND UNKNOWN = FALSE— false short-circuits anything. WhyWHERE never_set_column = NULL AND status = 'paid'is fine: the second clause never gets a chance to be UNKNOWN.TRUE OR UNKNOWN = TRUE— same reason, the other way.NOT UNKNOWN = UNKNOWN— the line that explains the fraud-rule bug.WHERE NOT is_subscribeddoes not keep the NULL rows.
-- Both of these return zero rows even when there ARE NULL values.
SELECT * FROM customers WHERE country = NULL;
SELECT * FROM customers WHERE country != 'US';
-- The second one drops every customer whose country is NULL.
-- That's the fraud-rule bug from the story above.
-- The right way:
SELECT * FROM customers WHERE country IS NULL;
SELECT * FROM customers WHERE country IS DISTINCT FROM 'US';
-- IS DISTINCT FROM treats NULL as a comparable value (NULL is "distinct from" 'US').
-- Coalesce when you genuinely want a default:
SELECT name, COALESCE(country, 'unknown') AS country FROM customers;
-- Joke buried in the spec: NULL = NULL is NULL (not TRUE).
-- That means SELECT * FROM t WHERE x = x quietly drops every row where x IS NULL.
-- A query that compares a column to itself is somehow not the identity function. Welcome to SQL.Aggregations skip NULL silently. AVG(score) is the average of non-NULL scores, not of all rows. If a third of your scores are NULL because the form was optional, your "average score" is biased toward the kind of user who filled in the form. Either COALESCE(score, 0) first, or report the count of non-nulls alongside the average so the bias is visible.
Here is a 5-row inline table with one **NULL** status — the case the fraud-rule team forgot. Return every row whose status is **not** `'cancelled'`, **including the NULL one**. Return `order_id` and `status`, ordered by `order_id`. Hint: a naive `!=` drops 4 rows out of 5. The right operator keeps 4 of them.
A boolean column `is_subscribed` has values `TRUE` (300 rows), `FALSE` (200 rows), and `NULL` (100 rows). What does `SELECT COUNT(*) FROM users WHERE NOT is_subscribed` return?
Takeaway: every comparison with NULL returns NULL, every WHERE with NULL silently drops the row. When the value might be unknown, use IS NULL, IS DISTINCT FROM, or COALESCE — and read your filters out loud as "AND the value is known" to feel where the trap is.