Senior data engineer (batch ETL + warehouse migration)
A senior data engineer whose work is mostly batch: ingestion, transformation, and a warehouse migration shipped end to end. The bullets that land all carry the same three things: a throughput or cost number, the baseline it beat, and the SLA it held. The bullet that slips is the parts-list bullet, which gets rewritten in front of you.
Priya Nair
Education
Experience
- Led the analytics warehouse migration from Redshift to Snowflake across 340 production tables and 1,100 downstream models; cut median cost per TB scanned from $4.10 to $1.35 and p95 dashboard load from 14s to 3.2s, with zero data-parity defects across a 3-week dual-run reconciliation.
- Rebuilt the nightly ETL on Spark and Airflow, processing 2.4 TB/day across 90 sources; cut end-to-end runtime from 6h10m to 1h40m by repartitioning on the join key and converting 12 wide shuffles to broadcast joins, holding the 6am freshness SLA at 99.7% over four quarters.
- Designed the incremental-load framework (Debezium CDC plus merge-on-read) that replaced 40 full-table reloads; cut warehouse compute spend by $32k/month and reduced source-to-warehouse latency from 24h to 35 min.
- Built scalable data pipelines using Spark, Airflow, and Snowflake.
- Authored the data-quality suite (Great Expectations, 280 checks across 60 critical tables) wired into the Airflow DAGs; gated 100% of warehouse promotions and caught a revenue-reporting drift that had understated bookings by 4.2% for two weeks.
- Owned the data-platform on-call rotation across three quarters and 52 incidents; median time to detect 5 min via freshness and volume anomaly alerts, median time to mitigate 28 min; wrote the pipeline-recovery runbook now used by two adjacent teams.
- Built the company's first dbt project (180 models, 3 marts) on BigQuery; cut the analytics team's median time-to-new-metric from 5 days to 6 hours and removed 14k lines of copy-pasted SQL from the BI layer.
- Worked closely with analysts and stakeholders to deliver data.
- Replaced a hand-rolled cron ingestion with Airflow; cut failed-load recovery from a 2h manual fix to an 8 min automated retry and raised pipeline success rate from 91% to 99.4% across 60 daily loads.
Technical Skills
Batch data engineer resumes do not win on the tool stack. They win on volume processed, cost per TB, the freshness SLA held, and one migration shipped with a reconciliation behind it. Naming Spark and Airflow without a number is the most skippable line on the page.
