| Decision Area | Recommendation | Rationale |
|---|---|---|
| Serving mode | Batch inference | Churn scores are consumed nightly by the CRM, so there is no real-time SLO. Batch is cheaper, easier to debug, and matches the business consumption pattern. |
| Feature store | Shared offline store | No real-time serving is required, so an offline store (Feast / Hive) is sufficient. Defer an online store until a use case requires feature lookups in <100 ms. |
| Retraining cadence | Weekly scheduled | Churn signal drifts on a monthly timescale, so weekly retraining provides a safety margin. Also trigger retraining on a drift alert (Population Stability Index, PSI > 0.2 on key features). |
| Model registry | MLflow | Already in the tech stack. Enables staged promotion, artifact lineage, and A/B experiment tracking without new tooling. |
| Shadow mode | Required first | Run the new model in shadow mode for 2 weeks before canary. Compare score distributions, not just offline metrics. |
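
The drift trigger in the retraining row (PSI > 0.2 on key features) can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the common form PSI = Σ (qᵢ − pᵢ) · ln(qᵢ / pᵢ) over bins, with bin edges taken from baseline quantiles and 10 bins by default. The names `psi`, `drifted_features`, `n_bins`, and `eps` are hypothetical and do not come from the table; only the 0.2 threshold does.

```python
import math
from bisect import bisect_right

PSI_THRESHOLD = 0.2  # drift-alert threshold taken from the table


def psi(baseline, current, n_bins=10, eps=1e-6):
    """Population Stability Index of `current` relative to `baseline`.

    Assumption: bins are cut at baseline quantiles; empty bins are
    floored at `eps` so the logarithm stays defined.
    """
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    ordered = sorted(baseline)
    # Interior cut points at baseline quantiles; duplicate cuts collapse.
    cuts = sorted({ordered[(len(ordered) * i) // n_bins] for i in range(1, n_bins)})

    def bin_shares(values):
        counts = [0] * (len(cuts) + 1)
        for v in values:
            counts[bisect_right(cuts, v)] += 1
        return [max(c / len(values), eps) for c in counts]

    p = bin_shares(baseline)  # expected share per bin
    q = bin_shares(current)   # observed share per bin
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def drifted_features(baseline_by_feature, current_by_feature, threshold=PSI_THRESHOLD):
    """Return {feature: PSI} for features whose PSI exceeds the threshold."""
    scores = {
        name: psi(baseline_by_feature[name], current_by_feature[name])
        for name in baseline_by_feature
    }
    return {name: score for name, score in scores.items() if score > threshold}
```

A scheduler could call `drifted_features` on the key features after each nightly batch and start an off-cycle retraining run whenever the result is non-empty. The same `psi` function can also be applied to the score distributions of the shadow and production models during the shadow period.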