The client is a global media, creative, and technology partner that specializes in growing brands in all retail environments. From consulting and creative content to media buying and analytics, the client helps brands uncover insights into their customers’ shopping behaviours.
The goal was to design a scalable solution that lets advertisers build audiences from offline behavior and target them through online ad platforms.
The data was large in both volume and velocity. Every visit by a participating mobile user to a POI (Point of Interest) was recorded, which translated to billions of rows of data per week.
End-to-end data discovery sessions were held with stakeholders to understand the data challenges and business requirements.
Databricks has embedded and optimized Spark as part of a larger platform designed not only for data processing but also for data science, machine learning, and business analytics.
US Census data for the geographies of retailer locations was brought in to augment the visitation data.
An exhaustive audit of all visitation data for ~50 retailers of interest was conducted.
Extensive EDA was done on user, retailer, time, and geo-based features to identify the features with the best predictive power.
Developed 20 principal features that indicated associations between users' store-visit patterns and the associated brands.
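The source does not list the 20 features, so the following is only a minimal sketch of what visit-pattern features for a user and retailer pair could look like. The function name `visit_features`, the two feature names, and the 30-day window are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from datetime import date


def visit_features(visits, as_of, window_days=30):
    """Build illustrative per-(user, retailer) features from visit records.

    visits: iterable of (user_id, retailer, visit_date) tuples.
    as_of:  the date on which features are computed; later visits are ignored
            so that no future information leaks into the features.
    """
    grouped = defaultdict(list)
    for user_id, retailer, visit_date in visits:
        if visit_date <= as_of:
            grouped[(user_id, retailer)].append(visit_date)

    features = {}
    for key, dates in grouped.items():
        recent = [d for d in dates if (as_of - d).days < window_days]
        features[key] = {
            # Hypothetical feature: visit frequency in the recent window.
            "visits_in_window": len(recent),
            # Hypothetical feature: recency of the latest visit.
            "days_since_last_visit": (as_of - max(dates)).days,
        }
    return features


# Made-up sample records, for illustration only.
sample_visits = [
    ("u1", "retailer_a", date(2020, 3, 1)),
    ("u1", "retailer_a", date(2020, 3, 20)),
    ("u2", "retailer_a", date(2020, 1, 5)),
]
example_features = visit_features(sample_visits, as_of=date(2020, 3, 25))
```

At the data volumes described here, the same grouping logic would more plausibly be expressed as Spark aggregations on Databricks; plain Python is used only to keep the sketch self-contained.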
Three different classification techniques were evaluated, keeping in mind future scalability to different states. The winning algorithm was validated out of sample for the years 2019 and 2020 to date.
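Out-of-sample validation over later years is commonly done with a time-based split, so that the model is scored only on records dated after everything it was trained on. The sketch below assumes a cutoff of 1 January 2019, inferred from the validation years mentioned above; the training period, the `temporal_split` helper, and the row layout are assumptions, not details from the project.

```python
from datetime import date


def temporal_split(rows, cutoff):
    """Split rows into (train, test) by date.

    Rows dated before `cutoff` go to train; rows on or after it go to test.
    Each row is a dict with at least a "date" key.
    """
    train = [row for row in rows if row["date"] < cutoff]
    test = [row for row in rows if row["date"] >= cutoff]
    return train, test


# Made-up rows, for illustration only.
rows = [
    {"user_id": "u1", "date": date(2018, 6, 1), "visited": 1},
    {"user_id": "u2", "date": date(2018, 11, 15), "visited": 0},
    {"user_id": "u1", "date": date(2019, 2, 3), "visited": 1},
    {"user_id": "u3", "date": date(2020, 1, 20), "visited": 0},
]
train_rows, test_rows = temporal_split(rows, cutoff=date(2019, 1, 1))
```

Splitting by date rather than at random keeps future visits out of the training set, which is what makes the reported accuracy a fair estimate for a forward-looking prediction.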
On average, the model is correct ~72% of the time when it predicts that a user will visit a retailer within the next 10 days.
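The statement above reads like the precision of the positive class: of all users the model flags as likely visitors, the share who actually visit. The source does not name the metric, so that reading is an assumption. A minimal sketch with made-up labels:

```python
def precision(y_true, y_pred):
    """Share of positive predictions that were correct: TP / (TP + FP).

    y_true, y_pred: equal-length sequences of 0/1 labels, where 1 means
    "visited within the prediction window".
    """
    outcomes_for_flagged = [t for t, p in zip(y_true, y_pred) if p == 1]
    if not outcomes_for_flagged:
        return 0.0  # no positive predictions were made
    return sum(outcomes_for_flagged) / len(outcomes_for_flagged)


# Made-up example: 4 users flagged as likely visitors, 3 of them did visit.
actual = [1, 0, 1, 1, 0, 1, 0]
predicted = [1, 1, 1, 0, 0, 1, 0]
example_precision = precision(actual, predicted)  # 3 / 4 = 0.75
```

Precision is a natural metric for audience building because every flagged user ends up in a paid segment, so false positives translate directly into wasted ad spend.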
Weekly jobs ingest visitation data into Azure storage. The data is then fed to the classification model, which runs on Databricks and generates an audience list of probable visitors for each retailer.
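The last step, turning model scores into one audience list per retailer, can be sketched as a threshold followed by a group-by. The `build_audiences` helper, the tuple layout, and the 0.5 threshold are hypothetical; the source does not say how the cut-off was chosen.

```python
from collections import defaultdict


def build_audiences(scores, threshold=0.5):
    """Group users whose predicted visit probability meets the threshold.

    scores: iterable of (user_id, retailer, probability) tuples.
    Returns a dict mapping each retailer to a sorted list of user IDs.
    Retailers with no qualifying users are omitted.
    """
    audiences = defaultdict(list)
    for user_id, retailer, probability in scores:
        if probability >= threshold:
            audiences[retailer].append(user_id)
    return {retailer: sorted(users) for retailer, users in audiences.items()}


# Made-up model output, for illustration only.
weekly_scores = [
    ("u1", "retailer_a", 0.81),
    ("u2", "retailer_a", 0.35),
    ("u3", "retailer_a", 0.66),
    ("u1", "retailer_b", 0.52),
    ("u2", "retailer_b", 0.10),
]
audience_lists = build_audiences(weekly_scores)
```

Raising the threshold would trade audience size for precision, which is one lever for balancing campaign reach against relevance.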
AppNexus provides an Audience Data Service to integrate first-, second-, and third-party data via an API. This service is called Batch Server-Side Segmentation; it automatically matches audience lists with AppNexus IDs and allows advanced audience segments to be built. These segments are further augmented with look-alike audiences to scale campaigns across a wider spectrum.
This API enables seamless automation of the machine learning output: visitor IDs are pushed to AppNexus, and ready-to-use audience segments become available to the client's programmatic buying team.
Generated audiences for brands that were used to deliver highly relevant, geo-specific advertisements.
Clusters of retailers based on their geographic location were created with relevance to brands and audiences. Using these clusters, the client was able to allocate online advertising budget to different geo locations.
Created custom segments that can be used alongside a retailer’s first-party data.