Tests, incremental models, and the dbt run that didn't break
The four generic tests you'll use 95% of the time, and the incremental pattern that turns nightly full-refresh into a 30-second job.
version: 2
models:
- name: fct_revenue_daily
description: One row per (day, country). Grain stated; tested.
columns:
- name: day
tests:
- not_null
- name: country
tests:
- not_null
- accepted_values:
values: ['US', 'GB', 'FI', 'NL', 'DE', 'JP']
- name: revenue_cents
tests:
- not_null
tests:
# Composite uniqueness: (day, country) must be unique together.
- dbt_utils.unique_combination_of_columns:
combination_of_columns: [day, country]{{ config(
materialized='incremental',
unique_key='order_id',
on_schema_change='append_new_columns'
) }}
SELECT *
FROM {{ source('oltp', 'orders') }}
{% if is_incremental() %}
-- Only pull rows updated since last run, with a 1-hour overlap to catch late writes.
WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }}) - INTERVAL '1 hour'
{% endif %}The incremental trap that doubles your numbers.
Missing unique_key means dbt does an INSERT, not a MERGE. Late-arriving updates create duplicate rows. Your daily revenue silently doubles for the affected window.
Always set unique_key on incremental models. Always test unique_combination_of_columns on the natural grain. The two together prevent the duplication; one without the other is theatre.
Implement the *grain test* manually — what `unique_combination_of_columns` would compile to. Return one row, one column `is_unique`, with value `true` if `(order_id)` is unique across all `orders` (it should be — it's the primary key).
Takeaway: tests are not optional. Generic YAML tests catch 95% of bugs (not_null, unique, relationships, accepted_values). Incremental models need unique_key and a composite-uniqueness test. The combination is what turns a nightly job from "hope it works" to "watched, tested, alerted".