Senior data scientist at a fintech (credit risk)
A mid-level data scientist who built and deployed a credit risk scorecard and an uplift model for offer targeting. The bullets that land all share the same texture: Gini with a named baseline, a validation protocol that caught leakage before it shipped, and an A/B with a pre-registered hypothesis. The weak bullets are the ones that appear on every data science resume and say nothing about the work.
Maya Chen
Education
Experience
- Built a LightGBM credit risk scorecard on 38M loan applications (180 engineered features, 3-year look-back window); Gini improved from 0.61 (prior logistic regression) to 0.74 on a 6-month out-of-time holdout; deployed to daily scoring across 2.1M active accounts; approved loan volume rose 18% at the same observed default rate over a 90-day post-deployment window.
- Designed the team's first time-series holdout protocol: 18-month training window, 30-day exclusion gap, 6-month OOT period; the protocol surfaced target leakage in a prior model that had inflated Gini by 9 points, and the fix shipped before that model reached production.
- Used Python and machine learning to build credit models.
- Built an S-learner uplift model for loan offer targeting on 14k labeled response observations; net uplift in the persuadable decile was 8.4 pp over the no-offer control in a 6-week A/B (n=46k, sized for 80% power, p<0.001, pre-registered primary metric); CAC dropped 34% in the treatment cohort.
- Worked with stakeholders to understand business requirements and define model success criteria.
- Engineered the feature store for real-time credit scoring: 180 behavioral features computed in a nightly batch and served via Redis; feature-fetch p99 dropped from 88ms to 11ms; removed 3 redundant external API calls from the real-time scoring path.
- Built a fraud detection ensemble (isolation forest + XGBoost, 22M transactions/month); precision at recall=0.90 improved from 0.43 (prior rule-based system) to 0.71 on a 3-month temporal holdout; false-positive rate cut from 2.8% to 0.9%, saving an estimated $1.1M/year in manual review cost.
- Ran the team's first A/B-validated feature selection experiment across 40 candidate features for the fraud model: SHAP importance plus pairwise correlation filter selected 18 features; OOT Gini improved 2.4 points on the leaner set with no increase in model complexity.
- Performed EDA and feature engineering for the credit and fraud model pipelines.
- Set up the team's MLflow experiment tracker and model registry; eliminated 'which version is in production' ambiguity and tracked 14 months of active experiments across 6 concurrent models.
Technical Skills
Credit risk bullets live or die on the baseline. Gini 0.74 is a number; Gini 0.74 vs 0.61 for the prior logistic regression on a 6-month out-of-time holdout is a defensible claim. The second form is what a senior reviewer looks for on every resume and almost never finds.
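For readers who want to see what stands behind a "Gini vs. baseline on an out-of-time holdout" claim, here is a minimal sketch in plain Python. It assumes Gini is computed as 2*AUC - 1 and that the holdout follows the 30-day gap and 6-month OOT period mentioned in the resume. The function names (`gini`, `out_of_time_split`), the row layout, and the toy rows are all hypothetical and made up for illustration; none of this is the candidate's actual code or data.

```python
from datetime import date, timedelta


def gini(labels, scores):
    """Gini = 2*AUC - 1, with AUC from the rank-sum form; ties share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to cover the block of tied scores starting at i.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("Gini needs both defaulted and non-defaulted rows")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    auc = (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
    return 2 * auc - 1


def out_of_time_split(rows, train_end, gap_days=30, oot_days=180):
    """Split rows (date first) into train and OOT sets, dropping the exclusion gap."""
    oot_start = train_end + timedelta(days=gap_days)
    oot_end = oot_start + timedelta(days=oot_days)
    train = [r for r in rows if r[0] <= train_end]
    oot = [r for r in rows if oot_start < r[0] <= oot_end]
    return train, oot


# Toy rows (made up): (application_date, defaulted, baseline_score, candidate_score)
rows = [
    (date(2023, 5, 1), 0, 0.20, 0.10),    # training window
    (date(2023, 7, 10), 1, 0.60, 0.90),   # falls in the exclusion gap, dropped
    (date(2023, 8, 15), 0, 0.10, 0.10),   # OOT
    (date(2023, 9, 1), 0, 0.40, 0.20),    # OOT
    (date(2023, 10, 5), 1, 0.35, 0.70),   # OOT
    (date(2023, 11, 20), 1, 0.80, 0.90),  # OOT
    (date(2024, 6, 1), 1, 0.50, 0.50),    # after the OOT period, dropped
]

train, oot = out_of_time_split(rows, train_end=date(2023, 6, 30))
labels = [r[1] for r in oot]
baseline_gini = gini(labels, [r[2] for r in oot])
candidate_gini = gini(labels, [r[3] for r in oot])
print(f"baseline={baseline_gini:.2f} candidate={candidate_gini:.2f} "
      f"delta={candidate_gini - baseline_gini:+.2f} on {len(oot)} OOT rows")
```

The point of the sketch is the last few lines: both models are scored on the same out-of-time rows, so the delta is a like-for-like comparison rather than a standalone number.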
