Missed updates on the boundary: overlap window?
by Lin (MLE) · 5/6/2026, 4:20:05 PM
The source stamps updated_at in UTC, and our pipeline's last watermark was 12:00:00.000. Two updates committed at 11:59:59.997 reached us only at 12:00:01. They were missed because their updated_at is earlier than the watermark.
Ada (senior DE) · 5/6/2026, 4:20:05 PM
Add an overlap window. Pull updated_at >= last_watermark - INTERVAL '1 hour' and rely on an upsert (unique_key in dbt, or ON CONFLICT … DO UPDATE in raw SQL) to deduplicate the rows you re-read. The cost is a small re-scan; the benefit is correctness, as long as the overlap is longer than the worst commit delay you expect.
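Here is a minimal sketch of the overlap-plus-upsert idea, using Python's built-in sqlite3 so it runs anywhere. The table names (source, target), the columns (id, payload), the calendar date, and the helper names are all hypothetical. In a real warehouse the same logic would usually live in the incremental query itself.

```python
import sqlite3
from datetime import datetime, timedelta

# Assumption: one hour comfortably exceeds the worst commit-to-visibility lag.
OVERLAP = timedelta(hours=1)


def ts(dt):
    """Format a datetime as fixed-width text so string comparison matches time order."""
    return dt.isoformat(sep=" ", timespec="milliseconds")


def sync(conn, last_watermark):
    """Re-read everything updated since (watermark - OVERLAP) and upsert it by id."""
    lower_bound = ts(last_watermark - OVERLAP)
    rows = conn.execute(
        "SELECT id, payload, updated_at FROM source WHERE updated_at >= ?",
        (lower_bound,),
    ).fetchall()
    # Rows already loaded on a previous run are overwritten, not duplicated.
    conn.executemany(
        """
        INSERT INTO target (id, payload, updated_at) VALUES (?, ?, ?)
        ON CONFLICT (id) DO UPDATE SET
            payload = excluded.payload,
            updated_at = excluded.updated_at
        """,
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE source (id INTEGER PRIMARY KEY, payload TEXT, updated_at TEXT);
    CREATE TABLE target (id INTEGER PRIMARY KEY, payload TEXT, updated_at TEXT);
    """
)

watermark = datetime(2026, 5, 6, 12, 0, 0)         # last watermark (date is made up)
late = datetime(2026, 5, 6, 11, 59, 59, 997000)    # committed just before the watermark

# The target still holds stale versions of rows 1 and 2.
conn.executemany(
    "INSERT INTO target VALUES (?, ?, ?)",
    [(1, "old", ts(datetime(2026, 5, 6, 11, 0, 0))),
     (2, "old", ts(datetime(2026, 5, 6, 11, 0, 0)))],
)
# The source now exposes the late updates, plus one row far outside the window.
conn.executemany(
    "INSERT INTO source VALUES (?, ?, ?)",
    [(1, "new", ts(late)),
     (2, "new", ts(late)),
     (3, "ancient", ts(datetime(2026, 5, 6, 9, 0, 0)))],
)

scanned = sync(conn, watermark)
result = conn.execute("SELECT id, payload FROM target ORDER BY id").fetchall()
```

A strict updated_at > watermark filter would skip rows 1 and 2 here. With the overlap they are picked up, and because the write is keyed on id, running sync again leaves the target unchanged.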