Deep SQL & Production Data Engineering — forum
Deep SQL — General
2 threads
Mindset of a Data Engineer — What a Data Engineer actually does
0 threads
No threads yet.
Mindset of a Data Engineer — ETL vs ELT, lakehouses and the modern data stack
0 threads
No threads yet.
Mindset of a Data Engineer — Spin up Postgres locally — in five minutes
0 threads
No threads yet.
SQL Core and the relational model — Tables, keys and grain — or how I shipped a 30% revenue bug
1 thread
SQL Core and the relational model — NULL: the trap that costs every team an outage
1 thread
SQL Core and the relational model — The SELECT pipeline: order of operations
1 thread
JOINs and join algorithms — JOIN types — and the 47 customers nobody noticed were missing
1 thread
JOINs and join algorithms — The Black Friday +800% revenue bug — and the diagnostic playbook
1 thread
JOINs and join algorithms — How JOINs run: nested loop, hash, merge — and the ANALYZE you forgot
0 threads
No threads yet.
Aggregations, window functions and analytical SQL — GROUP BY: COUNT, FILTER, and the dashboard that lied for a year
1 thread
Aggregations, window functions and analytical SQL — Window functions: ROW_NUMBER, LAG, and the running total that took 9 hours
1 thread
Aggregations, window functions and analytical SQL — Retention cohorts and conversion funnels — pure SQL
0 threads
No threads yet.
Schema design, DDL and the migrations that almost killed us — Choosing types: the $0.07 that broke a finance team
0 threads
No threads yet.
Schema design, DDL and the migrations that almost killed us — Online migrations: adding NOT NULL to a 200M-row table without taking the site down
0 threads
No threads yet.
Indexes and physical storage — when fewer indexes is the right answer — B-tree, GIN, BRIN and the field guide to picking one
1 thread
Query optimizer and reading plans — guessing is how regressions ship — EXPLAIN ANALYZE in 15 minutes — the only intro you'll ever need
1 thread
Transactions, MVCC, locks — and the write skew that paid the same bonus twice — Isolation levels — the bonus that was paid twice
1 thread
Transactions, MVCC, locks — and the write skew that paid the same bonus twice — Idempotent UPSERT and the load that you can safely run twice
0 threads
No threads yet.
Data modeling for analytics — the star schema and the dimension that didn't change — Star schema in one lesson — and why dashboards run 100× faster on it
0 threads
No threads yet.
Data modeling for analytics — the star schema and the dimension that didn't change — Slowly Changing Dimensions: SCD2 — keeping history without overwriting it
0 threads
No threads yet.
dbt and SQL as engineering code — Project structure: the staging→intermediate→marts pattern
0 threads
No threads yet.
dbt and SQL as engineering code — Tests, incremental models, and the dbt run that didn't break
0 threads
No threads yet.
Data quality and data contracts — the column rename that didn't take down prod — The six quality dimensions — vocabulary that ends taste arguments
0 threads
No threads yet.
Data quality and data contracts — the column rename that didn't take down prod — Data contracts — the column rename that didn't take down prod
0 threads
No threads yet.
Airflow and pipeline orchestration — the backfill that didn't double the metrics — Your first DAG: extract → load → dbt → publish → notify
0 threads
No threads yet.
Database Engineering Lab I: Postgres, DuckDB, ClickHouse — Three engines, one dataset — and the wrong choice that cost a quarter
0 threads
No threads yet.
Cloud warehouses: BigQuery and Snowflake — the bills that bite — BigQuery vs Snowflake — and the SELECT * that filled a quarterly bill
0 threads
No threads yet.
Spark SQL, Databricks and Delta Lake — when one node isn't enough — Shuffle, skew, and the small-files problem
0 threads
No threads yet.
Apache Iceberg and Trino — open table formats and federated SQL — Iceberg in one lesson — the year someone changed history
0 threads
No threads yet.
Non-relational sources: Mongo, Elasticsearch, Vector DB — Ingesting semi-structured data without losing your sanity
0 threads
No threads yet.
Incremental processing, CDC and late data — and the watermark that missed an update — Watermarks, overlap windows, and the missed updates that ate a Tuesday
1 thread
Data observability, lineage and governance — knowing before Slack tells you — Designing alerts that fire when they should — and only when they should
1 thread
Security, privacy and access — least privilege as the default — RBAC and a masked-view pattern — and the breach that wasn't
1 thread
SQL style, maintainability and code review — the 15-question checklist — The 15-question SQL review checklist — and the smells it catches
0 threads
No threads yet.
Performance engineering: from query to system — the eight levers — The eight levers of analytical performance — ranked by ROI
0 threads
No threads yet.
Capstone: production-grade data platform — Capstone brief: what to build, how to defend it
0 threads
No threads yet.