Sentrana Research Provides New Way to Fill In Data Gaps to Drive Penetration for Foodservice Distributors
Dr. Wang’s talk, “Learning from Only Positive and Unlabeled Data Using Generalized Stochastic Model,” focuses on combining classification from only positive and unlabeled data with the stochastic frontier model to solve a specific business challenge: predicting whether a customer needs items they have not purchased from a given distributor when only that distributor’s data is available.
Many customers in the foodservice industry are served by multiple distributors and vendors, so any one distributor can see only what its customers purchase from that distributor. To identify which items are good candidates for coupons that induce existing customers to purchase new items, Dr. Wang developed a method to predict which items customers purchase from other vendors. The work extends the stochastic frontier model to address classification with only positive and unlabeled data.
Within the data, a customer clearly needs an item (a positive label) if they have purchased it before. If they have not purchased the item from the distributor, there are two possibilities: they do not need it (a negative label), or they get it elsewhere (a positive label). Any item not purchased before is therefore marked as unlabeled. Dr. Wang predicted the true purchase probability of every unlabeled item in order to assign it a positive or negative label.
This addresses the marketing problem of predicting which items a customer needs but is not currently purchasing from the distributor whose data is being used. Using this technique, Sentrana has improved customer penetration by allowing clients to target their penetration offers at items their customers need, increasing the clients’ wallet share among those customers.
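The labeling scheme described above can be illustrated with a minimal Python sketch. It does not reproduce Dr. Wang’s stochastic frontier extension. Instead it swaps in a simpler, well-known correction for positive and unlabeled data (the Elkan and Noto weighting) purely to show how an unlabeled item can receive an estimated probability of being needed. The customer names, item names, and classifier scores are all hypothetical, and the scores are assumed to come from some separate model of “purchased from this distributor.”

```python
from statistics import mean


def pu_labels(purchases, customers, items):
    """Label each (customer, item) pair: 1 if purchased (positive), None if unlabeled."""
    bought = set(purchases)
    return {
        (cust, item): (1 if (cust, item) in bought else None)
        for cust in customers
        for item in items
    }


def unlabeled_positive_weight(score, c):
    """Elkan-Noto weight for an unlabeled pair.

    score: estimated P(purchased from this distributor | features)
    c:     estimated P(purchased from this distributor | customer needs item)
    Returns an estimate of P(customer needs item | not purchased here).
    """
    if not 0.0 < c <= 1.0:
        raise ValueError("c must be in (0, 1]")
    if not 0.0 <= score < 1.0:
        raise ValueError("score must be in [0, 1)")
    weight = (1.0 - c) / c * score / (1.0 - score)
    return min(1.0, weight)  # clip, since estimated scores can overshoot


# Hypothetical purchase records from a single distributor.
customers = ["cafe", "diner"]
items = ["flour", "butter", "napkins"]
purchases = [("cafe", "flour"), ("cafe", "butter"), ("diner", "flour")]

labels = pu_labels(purchases, customers, items)

# Hypothetical classifier scores; in practice these come from a fitted model.
scores = {
    ("cafe", "flour"): 0.6, ("cafe", "butter"): 0.4, ("cafe", "napkins"): 0.05,
    ("diner", "flour"): 0.5, ("diner", "butter"): 0.2, ("diner", "napkins"): 0.02,
}

# Estimate c as the mean score over known positives. A real analysis would
# use held-out positives; the same pairs are reused here only for brevity.
c = mean(scores[pair] for pair, label in labels.items() if label == 1)

# Estimated probability that each unlabeled item is actually needed.
weights = {
    pair: unlabeled_positive_weight(scores[pair], c)
    for pair, label in labels.items()
    if label is None
}

# Rank unlabeled items as candidates for a penetration offer.
for pair, weight in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, round(weight, 3))
```

Under these assumed scores, the pairs with the highest weights are the unlabeled items most likely to be purchased elsewhere, which is the group a targeted offer would aim at.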
“Even in a world that has more and more data available every day, there are limitations to the amount of data that is accessible to solve any one problem. This recent development in modeling enables us to fill in the crucial information needed by our clients to extend their business,” said Dr. Wang. “Without this sophisticated analysis, our clients would only be able to guess at new items their customers need and waste time and money with blanket penetration attempts. We are able to greatly increase their penetration success using this method of extending the stochastic frontier model to non-Gaussian data.”
This new method is well suited to situations with limited data and can be applied to similar challenges in any distribution- or vendor-centered industry.
To learn more, please download Dr. Wang’s presentation here: http://sentrana.com/downloads/Sentrana_LWang_PU_Classification_08_02_2012.pdf
About the Joint Statistical Meetings
The Joint Statistical Meetings (JSM) are the largest gathering of statisticians held in North America. These meetings provide an opportunity for more than 5,000 leading scientists, researchers, and interested individuals to come together for a comprehensive program that addresses the application of statistics to various real-world problems. The JSM is held jointly with the American Statistical Association, the International Biometric Society (ENAR and WNAR), the Institute of Mathematical Statistics, the Statistical Society of Canada, the International Chinese Statistical Association, and the International Indian Statistical Association. The meetings include oral presentations, panel sessions, poster presentations, continuing education courses, an exhibit hall, a career placement service, society and section business meetings, committee meetings, social activities, and networking opportunities.