How Vipra Software built a Databricks Customer 360 platform for a retail chain with 8M customers — unifying fragmented customer data across 5 channels to power ML-driven recommendations that delivered an 18% revenue lift in the first year.
A national retail chain with 8 million loyalty programme members had a customer data problem that prevented it from competing with digitally-native retailers on personalisation. Customer interactions existed across 5 distinct channels — physical stores (POS), e-commerce, mobile app, email marketing, and a third-party loyalty programme partner — each with its own customer identifier, transaction history, and behavioural data. A customer who shopped in-store, browsed online, and clicked an email promotion was recorded as three different entities across the chain's systems.
The commercial consequence was directly quantifiable. Personalisation capability was limited to generic RFM (Recency, Frequency, Monetary) segmentation applied to loyalty card transaction history — a blunt instrument that grouped 500,000 customers into 12 segments, applied the same promotion to each segment, and hoped for conversion. Email click-through rates had declined 40% over three years as customers increasingly ignored promotions that bore no relationship to their actual preferences or purchase history.
The data science team had ambition — they had prototyped collaborative filtering recommendation models, customer lifetime value prediction, and churn propensity scoring — but all prototypes were built on samples of loyalty data only. Without a unified customer profile incorporating browsing behaviour, app interactions, and cross-channel purchase patterns, the models couldn't reach the accuracy required for production deployment.
Vipra Software's Customer 360 architecture centred on Delta Lake as the unified customer data foundation, with Databricks providing both the data engineering (ETL, identity resolution) and machine learning (feature engineering, model training, serving) workloads on a single platform — eliminating the fragmentation between data engineering and data science tooling that had slowed previous initiatives.
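The identity-resolution workload can be pictured as linking channel-specific records that share a stable identifier. The sketch below is a minimal illustration under stated assumptions, not the production pipeline: it assumes deterministic matching on normalised email addresses and phone numbers, and the function name `resolve_identities`, the record fields, and the sample identifiers are all hypothetical.

```python
from collections import defaultdict


def resolve_identities(records):
    """Group channel records that share any match key, using union-find.

    `records` is a list of dicts with hypothetical fields 'channel', 'id'
    and 'keys' (a set of normalised emails or phone numbers).
    Returns a list of sets of (channel, id) tuples, one set per customer.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    key_owner = {}  # first record seen for each match key
    for rec in records:
        node = (rec["channel"], rec["id"])
        find(node)  # register the record even if it has no keys
        for key in rec["keys"]:
            if key in key_owner:
                union(key_owner[key], node)
            else:
                key_owner[key] = node

    clusters = defaultdict(set)
    for rec in records:
        node = (rec["channel"], rec["id"])
        clusters[find(node)].add(node)
    return list(clusters.values())


# Illustrative records only: the email record links the POS and web records.
records = [
    {"channel": "pos", "id": "L-1001", "keys": {"phone:07700900001"}},
    {"channel": "web", "id": "W-88", "keys": {"email:a@example.com"}},
    {"channel": "email", "id": "E-5",
     "keys": {"email:a@example.com", "phone:07700900001"}},
    {"channel": "app", "id": "A-9", "keys": {"email:b@example.com"}},
]
profiles = resolve_identities(records)
```

In this toy input the in-store, web, and email records collapse into one profile because the email record shares a key with each of the other two, while the app record stays separate. Fuzzy or probabilistic matching would replace the exact key comparison and is out of scope for this sketch.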
Governance for the whole architecture runs through Databricks Unity Catalog, which provides a single governance layer for both data engineering assets (Delta tables, notebooks, pipelines) and ML assets (experiments, models, feature tables). This unified governance model ensures that the model serving layer has the same access controls and audit trail as the underlying data, a requirement for GDPR compliance on customer data used in automated decision-making.
The recommendation model uses a two-tower neural architecture: a customer tower encodes the 180-feature customer embedding, a product tower encodes product attributes and historical interaction patterns, and the dot product of the two embeddings produces a relevance score for each customer-product pair. The model is served in real time from Databricks Model Serving, with a Redis cache layer storing the top-100 recommendations per customer so that e-commerce page loads do not incur model inference latency on every request.
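The scoring and caching pattern described above can be sketched in a few lines. This is an illustration only: the embeddings are hand-written toy vectors rather than outputs of trained towers, `RecommendationCache` is a hypothetical in-memory stand-in for the Redis layer, and none of the names come from the actual implementation.

```python
import heapq


def dot(u, v):
    """Dot product of two equal-length embedding vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k_recommendations(customer_embedding, product_embeddings, k=100):
    """Rank products by dot-product relevance and return the k best ids."""
    scored = ((dot(customer_embedding, emb), pid)
              for pid, emb in product_embeddings.items())
    return [pid for _, pid in heapq.nlargest(k, scored)]


class RecommendationCache:
    """In-memory stand-in for the cache layer (hypothetical, not Redis)."""

    def __init__(self, compute):
        self._compute = compute   # called only on a cache miss
        self._store = {}
        self.misses = 0

    def get(self, customer_id):
        if customer_id not in self._store:
            self.misses += 1
            self._store[customer_id] = self._compute(customer_id)
        return self._store[customer_id]


# Toy two-dimensional embeddings, purely illustrative.
product_embeddings = {
    "boots": [0.9, 0.1],
    "tent": [0.2, 0.8],
    "socks": [0.5, 0.5],
}
customer_embeddings = {"c1": [1.0, 0.0]}

cache = RecommendationCache(
    lambda cid: top_k_recommendations(
        customer_embeddings[cid], product_embeddings, k=2)
)
first = cache.get("c1")    # computed via the scoring function
second = cache.get("c1")   # served from the cache, no rescoring
```

The design point is that ranking happens once per customer and page loads read the stored list. A real deployment would also need an expiry or refresh policy so that cached lists track new browsing behaviour; the source does not describe that policy.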
MLflow provides the experimentation and model lifecycle management layer, tracking 340+ model training runs across the 4 production models. Every production model deployment is linked to its training run, feature set version, and evaluation metrics — providing complete model provenance for the regulatory audit trail required under GDPR's automated decision-making provisions.
The 18% revenue lift was measured in a controlled A/B test that compared the personalised recommendation experience against the previous generic RFM-segmented promotions. Each test cell contained 1M customers, and the test ran for a full quarter to capture seasonal effects. The client's own analytics team designed and analysed the test independently of the implementation to ensure unbiased measurement.
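For readers who want the arithmetic behind a lift figure of this kind, the sketch below computes relative lift and a two-sample z statistic for the difference in mean revenue per customer. The per-customer means and standard deviations are invented for illustration and are not the client's data; only the 1M cell size comes from the text. The source does not say which statistical test the analytics team used, so the z-test here is an assumption.

```python
import math


def relative_lift(control_mean, treatment_mean):
    """Relative lift of treatment over control, e.g. 0.18 for +18%."""
    return (treatment_mean - control_mean) / control_mean


def two_sample_z(mean_c, std_c, n_c, mean_t, std_t, n_t):
    """z statistic for the difference in means of two independent cells."""
    standard_error = math.sqrt(std_c ** 2 / n_c + std_t ** 2 / n_t)
    return (mean_t - mean_c) / standard_error


# Hypothetical revenue per customer over the quarter, in pounds.
control_mean, treatment_mean = 100.0, 118.0
# Hypothetical standard deviations; retail spend is typically highly skewed.
control_std, treatment_std = 250.0, 250.0
cell_size = 1_000_000  # cell size stated in the case study

lift = relative_lift(control_mean, treatment_mean)
z = two_sample_z(control_mean, control_std, cell_size,
                 treatment_mean, treatment_std, cell_size)
significant = abs(z) > 1.96  # two-sided test at the 5% level
```

With cells this large the standard error is small, so even modest differences in mean revenue clear the significance threshold. The practical question then becomes whether the lift is commercially meaningful rather than whether it is statistically detectable.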
Email click-through rates recovered from a 3-year declining trend: the first personalised email campaign sent to the unified customer profiles achieved a 340% higher CTR than the previous generic campaign sent to the equivalent cohort. The difference was attributed to next-best-offer model accuracy — customers received promotions for categories they had browsed recently rather than the chain's highest-margin products regardless of relevance.
Churn prevention became an operational capability for the first time. The churn propensity model identified 280,000 customers in the high-risk cohort in the first month of operation. A targeted win-back campaign — offering personalised rewards based on the CLV model's estimate of each customer's potential future value — achieved a 23% reactivation rate among churned customers who had not transacted in 90+ days. The incremental revenue from reactivated high-CLV customers in the first quarter was £2.4M — exceeding the full programme investment within a single trading period.
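The win-back targeting logic combines the two model outputs: churn propensity decides who is contacted, and estimated CLV decides how generous the reward is. The sketch below shows one way that combination might look. The 90-day inactivity window comes from the text, but the churn threshold, reward rate, reward cap, and all names are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    churn_probability: float       # output of the churn propensity model
    predicted_clv: float           # CLV model estimate, in pounds
    days_since_last_purchase: int


def select_winback_targets(customers, churn_threshold=0.7,
                           inactivity_days=90, reward_rate=0.05,
                           reward_cap=50.0):
    """Return (customer_id, reward) pairs, highest reward first.

    A customer qualifies if both high-risk and inactive; the reward is a
    capped fraction of predicted CLV (all parameters are assumptions).
    """
    targets = []
    for c in customers:
        high_risk = c.churn_probability >= churn_threshold
        inactive = c.days_since_last_purchase >= inactivity_days
        if high_risk and inactive:
            reward = min(c.predicted_clv * reward_rate, reward_cap)
            targets.append((c.customer_id, round(reward, 2)))
    targets.sort(key=lambda t: t[1], reverse=True)
    return targets


# Illustrative customers only.
customers = [
    Customer("A", 0.90, 2000.0, 120),  # qualifies, reward hits the cap
    Customer("B", 0.80, 400.0, 95),    # qualifies, reward below the cap
    Customer("C", 0.30, 3000.0, 200),  # inactive but low churn risk
    Customer("D", 0.95, 500.0, 30),    # high risk but recently active
]
targets = select_winback_targets(customers)
```

Capping the reward keeps the cost per reactivation bounded for very high-CLV customers, which is one plausible way to protect the campaign's return; the source does not state how rewards were actually sized.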