Codabra

Deep SQL & Production Data Engineering — forum

Deep SQL — General

2 threads

Mindset of a Data Engineer — What a Data Engineer actually does

0 threads

No threads yet.

Mindset of a Data Engineer — ETL vs ELT, lakehouses and the modern data stack

0 threads

No threads yet.

Mindset of a Data Engineer — Spin up Postgres locally — in five minutes

0 threads

No threads yet.

SQL Core and the relational model — Tables, keys and grain — or how I shipped a 30% revenue bug

1 thread

SQL Core and the relational model — NULL: the trap that costs every team an outage

1 thread

SQL Core and the relational model — The SELECT pipeline: order of operations

1 thread

JOINs and join algorithms — JOIN types — and the 47 customers nobody noticed were missing

1 thread

JOINs and join algorithms — The Black Friday +800% revenue bug — and the diagnostic playbook

1 thread

JOINs and join algorithms — How JOINs run: nested loop, hash, merge — and the ANALYZE you forgot

0 threads

No threads yet.

Aggregations, window functions and analytical SQL — GROUP BY: COUNT, FILTER, and the dashboard that lied for a year

1 thread

Aggregations, window functions and analytical SQL — Window functions: ROW_NUMBER, LAG, and the running total that took 9 hours

1 thread

Aggregations, window functions and analytical SQL — Retention cohorts and conversion funnels — pure SQL

0 threads

No threads yet.

Schema design, DDL and the migrations that almost killed us — Choosing types: the $0.07 that broke a finance team

0 threads

No threads yet.

Schema design, DDL and the migrations that almost killed us — Online migrations: adding NOT NULL to a 200M-row table without taking the site down

0 threads

No threads yet.

Indexes and physical storage — when fewer indexes is the right answer — B-tree, GIN, BRIN and the field guide to picking one

1 thread

Query optimizer and reading plans — guessing is how regressions ship — EXPLAIN ANALYZE in 15 minutes — the only intro you'll ever need

1 thread

Transactions, MVCC, locks — and the write skew that paid the same bonus twice — Isolation levels — the bonus that was paid twice

1 thread

Transactions, MVCC, locks — and the write skew that paid the same bonus twice — Idempotent UPSERT and the load that you can safely run twice

0 threads

No threads yet.

Data modeling for analytics — the star schema and the dimension that didn't change — Star schema in one lesson — and why dashboards run 100× faster on it

0 threads

No threads yet.

Data modeling for analytics — the star schema and the dimension that didn't change — Slowly Changing Dimensions: SCD2 — keeping history without overwriting it

0 threads

No threads yet.

dbt and SQL as engineering code — Project structure: the staging→intermediate→marts pattern

0 threads

No threads yet.

dbt and SQL as engineering code — Tests, incremental models, and the dbt run that didn't break

0 threads

No threads yet.

Data quality and data contracts — the column rename that didn't take down prod — The six quality dimensions — vocabulary that ends taste arguments

0 threads

No threads yet.

Data quality and data contracts — the column rename that didn't take down prod — Data contracts — the column rename that didn't take down prod

0 threads

No threads yet.

Airflow and pipeline orchestration — the backfill that didn't double the metrics — Your first DAG: extract → load → dbt → publish → notify

0 threads

No threads yet.

Database Engineering Lab I: Postgres, DuckDB, ClickHouse — Three engines, one dataset — and the wrong choice that cost a quarter

0 threads

No threads yet.

Cloud warehouses: BigQuery and Snowflake — the bills that bite — BigQuery vs Snowflake — and the SELECT * that filled a quarterly bill

0 threads

No threads yet.

Spark SQL, Databricks and Delta Lake — when one node isn't enough — Shuffle, skew, and the small-files problem

0 threads

No threads yet.

Apache Iceberg and Trino — open table formats and federated SQL — Iceberg in one lesson — the year someone changed history

0 threads

No threads yet.

Non-relational sources: Mongo, Elasticsearch, Vector DB — Ingesting semi-structured data without losing your sanity

0 threads

No threads yet.

Incremental processing, CDC and late data — and the watermark that missed an update — Watermarks, overlap windows, and the missed updates that ate a Tuesday

1 thread

Data observability, lineage and governance — knowing before Slack tells you — Designing alerts that fire when they should — and only when they should

1 thread

Security, privacy and access — least privilege as the default — RBAC and a masked-view pattern — and the breach that wasn't

1 thread

SQL style, maintainability and code review — the 15-question checklist — The 15-question SQL review checklist — and the smells it catches

0 threads

No threads yet.

Performance engineering: from query to system — the eight levers — The eight levers of analytical performance — ranked by ROI

0 threads

No threads yet.

Capstone: production-grade data platform — Capstone brief: what to build, how to defend it

0 threads

No threads yet.