Data Science Thesis at Akanoo

Hello!
I'm Tillmann Radmer (26). I study Information Systems at Humboldt Universität Berlin and previously completed my bachelor's degree in industrial engineering in Hamburg. In my master's thesis I research and compare novel approaches to optimizing uplift models for interactive visitor targeting.
With more sophisticated models, Akanoo can target visitors more precisely and improve the conversion rate for its clients. Here I would like to give a short introduction to uplift modeling for targeted visitor engagement.

Uplift Modeling

An important question in direct marketing is often which customers to target. Some customers will not react to the campaign but still incur the variable costs of marketing. Worse, some customers might react negatively to receiving an ad. Generally, there are four types of customers in direct marketing:

  1. Customers who will respond without treatment,
  2. customers who will only respond after receiving a treatment,
  3. customers who will not respond because of a treatment, and
  4. customers who will not respond regardless.

The goal of uplift modeling is to differentiate customers in group 2 from those in the other groups.

Start with A/B-testing

To build an uplift model one needs the results of an A/B-test experiment. Two groups (treatment and control) and a binary outcome (returned or not returned) give four possible combinations, as shown below.

Figure 1: Four possibilities in an A/B-test experiment. (Shaar2016)

Given the result one can calculate the uplift as

P(return|treatment)-P(return|control).

Unfortunately, this term is not defined for a single customer: each customer is either treated or not, so we never observe both outcomes for the same person, and we cannot simply build a model that maximizes it. Instead there are several approaches to maximize it indirectly.
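At the level of whole groups, however, the uplift can be estimated directly from the A/B-test results. The following is a minimal sketch in Python; the function names and the outcome lists are made up for illustration and are not Akanoo data.

```python
def conversion_rate(outcomes):
    """Share of customers who returned (outcomes are 0/1 flags)."""
    return sum(outcomes) / len(outcomes)


def average_uplift(treatment_outcomes, control_outcomes):
    """Estimate P(return|treatment) - P(return|control) from group averages."""
    return conversion_rate(treatment_outcomes) - conversion_rate(control_outcomes)


# Hypothetical A/B-test results: 1 = returned, 0 = did not return.
treatment = [1, 0, 0, 1, 0, 1, 0, 0, 0, 0]  # 3 of 10 returned
control = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]    # 2 of 10 returned

uplift = average_uplift(treatment, control)
print(f"Estimated average uplift: {uplift:.2f}")
```

This gives one number for the whole experiment. It says nothing about which individual customers are responsible for the difference, which is the question the models below try to answer.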

Building a model

The simplest approach ignores the control group and models the probability that a customer will return given that she received the treatment and given some customer-specific information.

P(return|treatment,information)

Throwing away half the data isn't optimal. In particular, this model doesn't predict which customers will return only when given some treatment, as we required above. Instead it also selects all customers who would return regardless of treatment.

A simple extension is to build two models, one for each term in the uplift equation. Formally,

P(return|treatment,information)-P(return|control,information).

While this approach directly models the uplift effect, research suggests that it doesn't always perform well in practice. One reason is that the treatment effect is usually much smaller than the main effect. For example, if the return rate is 1.0% in the control group (the main effect) and 1.1% in the treatment group (main effect plus treatment effect), then the uplift effect is only 0.1 percentage points (= 1.1% - 1.0%). Consequently each individual model will mostly focus on the main effect within its group. (Radcliffe, 2011)
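The two-model approach can be sketched as follows. This is a minimal illustration, not the implementation used at Akanoo: a per-segment return rate stands in for any classifier that outputs probabilities, and the "segment" feature and all data rows are hypothetical.

```python
from collections import defaultdict


def fit_rate_model(rows):
    """Estimate P(return | segment) as the observed return rate per segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [returns, customers]
    for segment, returned in rows:
        counts[segment][0] += returned
        counts[segment][1] += 1
    return {segment: r / n for segment, (r, n) in counts.items()}


def two_model_uplift(treatment_rows, control_rows):
    """Fit one model per group and subtract the predicted probabilities."""
    model_t = fit_rate_model(treatment_rows)  # P(return|treatment,information)
    model_c = fit_rate_model(control_rows)    # P(return|control,information)
    shared = model_t.keys() & model_c.keys()
    return {s: model_t[s] - model_c[s] for s in shared}


# Hypothetical rows of (segment, returned) for each group.
treatment_rows = [("new", 1), ("new", 1), ("new", 0), ("new", 0),
                  ("loyal", 1), ("loyal", 1), ("loyal", 1), ("loyal", 0)]
control_rows = [("new", 1), ("new", 0), ("new", 0), ("new", 0),
                ("loyal", 1), ("loyal", 1), ("loyal", 1), ("loyal", 0)]

uplift_by_segment = two_model_uplift(treatment_rows, control_rows)
print(uplift_by_segment)
```

In this made-up data the "loyal" segment returns at the same rate in both groups, so its estimated uplift is zero even though its return rate is high. A model trained on the treatment group alone would still rank these customers first.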

An easy way to arrive at a single-model approach is to apply a class variable transformation to the result of the A/B test. Label the four outcomes TR (treatment, returned), TNR (treatment, not returned), CR (control, returned) and CNR (control, not returned). Both the TR and the CNR groups are assigned to the positive class (treat), while TNR and CR are assigned to the negative class (do not treat).

Table 1:  Example of the class variable transformation.

Customer   Group (1=Treatment, 0=Control)   Returned (1=Returned, 0=Not Returned)   Transformed
1          0                                1                                       0
2          1                                0                                       0
3          0                                0                                       1
4          1                                0                                       0
5          1                                1                                       1
6          0                                0                                       1

The idea is that, of the four outcomes, we would definitely like to treat the customers in the TR group. And because we don't know whether the customers in the CNR group would have responded if they had received a treatment, we would also like to treat them. It can be shown that under some assumptions modeling the conditional probability of the transformed class variable is equivalent to modeling the conditional probabilities of the two original variables from the uplift equation above.
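The transformation in Table 1 amounts to checking whether the group flag and the return flag agree. The sketch below reproduces the table in Python; the function names are made up for illustration. The second helper relies on one commonly stated form of the equivalence, which assumes that customers are assigned to treatment and control with equal probability. That assumption is added here and is not stated in the text above.

```python
def transform_class(group, returned):
    """Return 1 for TR (treated, returned) and CNR (control, not returned), else 0."""
    return 1 if group == returned else 0


def uplift_from_transformed(p_z):
    """Uplift implied by P(Z=1|x), assuming a 50/50 treatment/control split."""
    return 2 * p_z - 1


# (customer, group, returned) rows taken from Table 1.
customers = [(1, 0, 1), (2, 1, 0), (3, 0, 0), (4, 1, 0), (5, 1, 1), (6, 0, 0)]

transformed = [transform_class(group, returned) for _, group, returned in customers]
print(transformed)
```

Any standard binary classifier can then be trained on the transformed label, and its predicted probability converted to an uplift estimate with the second helper if the equal-split assumption holds.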

There are more elaborate techniques that incorporate the maximization of the uplift equation into the training algorithm. However, their straightforward implementation and applicability to any standard machine-learning algorithm make the two-model approach and the class variable transformation approach very attractive first choices.

Business Opportunities in Click Stream Data Mining – from Cart Abandonment Prevention to Upselling

When we started Akanoo 2½ years ago, none of us would have believed that the click-stream of online shop visitors reveals so many applications of statistical analysis. Our goal was simple: use JavaScript to display a voucher to people on the first pages if you are certain they otherwise won’t buy. A logistic regression on the target variable “will buy” and a web service written in Groovy for real-time prediction did the job.

This led to roughly 10-20% incremental revenue compared to control groups. But only for a small group of people: those we were sure would not buy.

Starting with cart abandonment prevention, we have identified different strategies to make additional revenue with statistical models. What will be the next strategy? We’re working on it.

So, we took a deeper look at the behavior of the millions of online shop visitors that we had tracked so far. And we quickly realized: there is a business opportunity way bigger than that. We saw visitors responding negatively to vouchers and some of them becoming less profitable in the long run compared to the control group. So we had to rethink our initial solution.

Let’s draw an analogy here: most online retailers offer the same assistance as a supermarket, namely none. However, there are millions of visitors out there who would prefer the shopping experience they know from an offline fashion outlet or shoe retailer, where staff are around to guide you through the offering. How can we deliver both at the same time?

The solution is: create algorithms that have a lot more freedom. We started working on algorithms that are able to predict the intention and next steps of visitors and pick the right way of interacting with these visitors from an array of different incentives and pieces of information. With the ultimate goal: converting visitors into happy customers.

Now we’ve expanded the audience of our JavaScript targeting to a wide array of scenarios besides cart abandonment prevention: up-selling, inspiration, return reduction, loyalty. (And we started using Spark, Hadoop, Docker, Redis and Akka to handle the data…)

What’s next? The data we track and analyze on a daily basis is growing steadily. Just for the up-selling models, we look at the data of 11M users every night. Other kinds of transaction-based websites are coming into play, as is the modeling of more complex decision-making processes.

Who makes this possible? The data science team behind Akanoo.
Yours, Fabian