ICML-2014 Workshop on Customer Life-Time Value Optimization in Digital Marketing

June 26, 2014, Beijing, China

In many marketing applications, a company or organization uses technology to interact with its end customers and make recommendations. For example, a department store might offer customers discount coupons or promotions; an online store might serve targeted "on sale now" offers; or a bank might email suitable customers new loan or mortgage offers. Today, these marketing decisions are made mainly in a myopic fashion (taking the best opportunity right now), optimizing short-term gains. In this workshop we will explore new forms of marketing interaction that optimize the lifetime value (LTV) of customers. LTV captures long-term objectives such as revenue, customer satisfaction, or customer loyalty, which can be represented as the sum of an appropriate reward function. These sums of rewards accumulate over a stream of interactions between the company and each customer, including both actions by the company (e.g., promotions, advertisements, or emails) and actions by the customer (e.g., purchases, clicks on a website, or signing up for a newsletter).

We will explore technology for computing interactive company strategies that maximize this sum of rewards. In particular, we will focus on reinforcement learning (RL) and Markov decision processes (MDPs), powerful paradigms for sequential decision-making under uncertainty. In the RL formulation of marketing, the agent is an algorithm that takes actions such as showing an ad or offering a promotion; the environment comprises features of customer demographics, web content, and customer behavior, such as recency (the last time the webpage was visited), frequency (how often the page has been visited), and monetary value (how much has been spent so far); the reward is the price of the products a customer purchases in response to an action taken by the marketing algorithm; and the goal of the marketing agent is to maximize its long-term revenue.
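To make the formulation concrete, here is a minimal tabular Q-learning sketch of the marketing MDP described above. The states, actions, transition probabilities, and reward amounts are all illustrative assumptions (a toy simulator of customer behavior), not part of the workshop description.

```python
import random

# Hypothetical toy formulation: states are coarse customer-engagement levels
# (a stand-in for recency/frequency/monetary features), actions are marketing
# interventions, and rewards are purchase amounts. All names and numbers are
# illustrative assumptions.
STATES = ["new", "engaged", "loyal"]
ACTIONS = ["email", "discount", "no_action"]

def simulate_customer(state, action):
    """Toy environment: returns (next_state, reward). Purely illustrative."""
    purchase_prob = {"new": 0.1, "engaged": 0.3, "loyal": 0.5}[state]
    if action == "discount":
        purchase_prob += 0.2  # discounts raise purchase probability...
    if random.random() < purchase_prob:
        reward = 15.0 if action == "discount" else 20.0  # ...but cut margin
        next_state = {"new": "engaged", "engaged": "loyal", "loyal": "loyal"}[state]
    else:
        reward = 0.0
        next_state = state
    return next_state, reward

def q_learning(episodes=5000, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Learn action values that maximize long-term (discounted) revenue."""
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        state = "new"
        for _ in range(10):  # ten interactions per customer episode
            if random.random() < epsilon:
                action = random.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state, reward = simulate_customer(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next
                                           - q[(state, action)])
            state = next_state
    return q
```

The learned Q-values rank interventions per customer state by their long-term value rather than their immediate payoff, which is the contrast with myopic optimization drawn above.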

Using MDPs and RL to develop algorithms for LTV marketing is still in its infancy. Most related work has used toy examples and appeared in marketing venues. In this workshop we will discuss the major research challenges of this problem, including:

  1.     Evaluating a policy offline (without interacting with the real system, using only historical data generated by a different policy). 

  2.     Policy visualizations.

  3.     Scaling up the computation of the LTV strategies to high dimensional "big data".

  4.     On-line versus batch algorithms.

  5.     Modeling progressively more engaging interactions, using hierarchical techniques and elicitation of the sales funnel process.

  6.     Model selection and validation from batch data.

  7.     Uncertainty estimation.
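As an illustration of the first challenge, the following is a minimal sketch of offline policy evaluation via ordinary importance sampling: logged trajectories from a behavior (logging) policy are reweighted to estimate the expected return of a different target policy. The policies, action probabilities, and data format are hypothetical assumptions for exposition.

```python
# Illustrative sketch of offline (off-policy) evaluation with ordinary
# importance sampling. Policies and probabilities below are assumptions.
ACTIONS = ["email", "discount", "no_action"]

def behavior_prob(action, state):
    """Logging policy: acted uniformly at random (assumed known)."""
    return 1.0 / len(ACTIONS)

def target_prob(action, state):
    """Hypothetical target policy we wish to evaluate without deploying it."""
    preferred = "discount" if state == "engaged" else "email"
    return 0.9 if action == preferred else 0.05

def importance_sampling_estimate(trajectories, gamma=0.9):
    """Estimate the target policy's expected return from logged data.

    Each trajectory is a list of (state, action, reward) tuples generated
    by the behavior policy.
    """
    estimates = []
    for traj in trajectories:
        weight, ret = 1.0, 0.0
        for t, (state, action, reward) in enumerate(traj):
            # Reweight by how much more (or less) likely the target policy
            # is to take the logged action than the behavior policy was.
            weight *= target_prob(action, state) / behavior_prob(action, state)
            ret += (gamma ** t) * reward
        estimates.append(weight * ret)
    return sum(estimates) / len(estimates)
```

The estimator is unbiased when the behavior policy's probabilities are known and nonzero wherever the target policy acts, but its variance grows with trajectory length; controlling that variance is part of what makes challenge 1 hard in practice.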

Solutions to these challenges will benefit not only marketing problems but also the RL community at large, since the same challenges arise in many real-world RL applications, from clinical trials to energy consumption. 

Invited Speakers

  1.     Esteban Arcaute (Walmart Labs)

  2.     Craig Boutilier (University of Toronto)

  3.     John Langford (Microsoft Research)

  4.     Shie Mannor (Technion)


Organizers

  1.     Shie Mannor (Technion)

  2.     Georgios Theocharous (Adobe Research)

  3.     Mohammad Ghavamzadeh (Adobe Research & INRIA Lille)