## Meet our new Data Scientist Torge

Torge has been supporting the Data Science team since December. To give you an impression of him and his responsibilities, we asked him to answer the Akanoo team member questionnaire:

What have you done before joining us?

I worked for a few years as a research assistant at Hamburg University of Technology, where I wrote my PhD thesis in mathematics.

What are your responsibilities at Akanoo?

I am a Data Scientist and as such it is my responsibility to answer data-related questions and develop/improve our statistical model.

Besides having great colleagues I enjoy being able to apply my knowledge as a mathematician.

And what do you usually do after work?

After work I usually take a long trip to Stade, where I currently live.

Which colleague would you take with you to a lonely island and why?

Sven, since he is the oldest and wisest of us all.

Which attraction/part of Hamburg do you think you should definitely visit and why?

The new Elbphilharmonie, since I am anxious to see what the acoustics are like.

Which apps could you never do without and why?

Threema and Netflix. I use Threema to communicate and Netflix as a remote for the TV.

These blogs/websites belong to my daily reading:

Currently Reddit.

## Christmas shopper behaviour differs across online shop industries

The closer Christmas gets, the more people shop online. It is no secret that Christmas is the most lucrative time of the year. We asked how many additional visitors and how much additional revenue online shops really generate during the Christmas season.

### Traffic increases in all industries

Our Akanoo Data Insights tool provides relevant benchmarks that let online retailers evaluate the performance of their shops, and it enables us to dive deep into the Christmas season and the performance of different industries. Comparing the yearly average with the December figures clearly shows that many online shops have significantly more visitors. On average, online shops show a traffic increase of 123.0%.

However, a closer look at individual industries such as fashion, DIY and consumer goods shows that the traffic increase differs between segments. In the fashion segment, traffic climbs by 97.1%. DIY shops show a rise of 112.2% in visits. Compared with fashion and DIY, shops selling consumer goods increase their visits by a remarkable 169.4%.

### Conversion increase differs massively

While higher traffic often goes hand in hand with weaker conversion rates, the additional Christmas traffic actually converts better. On average, the conversion rate increases by 15.4% during the Christmas season. Comparing fashion, DIY and consumer goods, we see the following: fashion online shops generate 4% more conversions in the Christmas season. Shops that offer consumer goods show an increase of 8%. DIY shops are among the segments that achieve double-digit growth, with 14.3% more conversions.

Our benchmark shows the enormous potential for online shops to generate additional revenue. To make the best use of the Christmas traffic, it is necessary to continuously evaluate shop performance and to compare it with industry and product benchmarks.

## Data Science Thesis at Akanoo

Hello!
I am Tillmann Radmer (26). I study Information Systems at Humboldt University of Berlin and previously completed my bachelor’s degree in industrial engineering in Hamburg. In my master’s thesis I research and compare novel approaches to optimizing uplift models for interactive visitor targeting.
With more sophisticated models, Akanoo can address visitors in a more targeted way and improve the conversion rate for its customers. Here I would like to give a short introduction to uplift modeling for targeted visitor addressing.

### Uplift Modeling

An important question in direct marketing is often which customers to target. Some customers will not react to the campaign but still incur the variable costs of marketing. Worse, some customers might react negatively to receiving an ad. Generally, there are four types of customers in direct marketing:

1. Customers who will respond without treatment,
2. customers who will only respond after receiving a treatment,
3. customers who will not respond because of a treatment, and
4. customers who will not respond regardless.

The goal of uplift modeling is to differentiate customers in group 2 from those in the other groups.

To build an uplift model one needs the result from an A/B-testing experiment. Using two groups and a binary outcome will give four possible combinations as shown below.

Figure 1: Four possibilities in an A/B-test experiment. (Shaar2016)

Given the result one can calculate the uplift as

P(return|treatment)-P(return|control).

Unfortunately, this term is not defined for a single customer, because each customer is either in the treatment group or in the control group, never both. So we cannot simply build a model that maximizes it. Instead, there are several approaches to maximize it indirectly.
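At the level of the whole experiment, however, the uplift can be estimated directly from the counts in the four fields of the A/B test. The following is a minimal sketch; the function names and the counts are hypothetical and only serve as an illustration.

```python
def response_rate(responders: int, total: int) -> float:
    """Share of customers in a group who returned."""
    if total <= 0:
        raise ValueError("group must contain at least one customer")
    return responders / total


def uplift(treated_returned: int, treated_total: int,
           control_returned: int, control_total: int) -> float:
    """Estimate P(return|treatment) - P(return|control) from A/B counts."""
    return (response_rate(treated_returned, treated_total)
            - response_rate(control_returned, control_total))


# Hypothetical experiment with 10,000 customers per group.
estimate = uplift(treated_returned=110, treated_total=10_000,
                  control_returned=100, control_total=10_000)
print(f"estimated uplift: {estimate:.4f}")
```

This gives one number for the whole population. The approaches in the rest of this post try to estimate the same difference per customer, conditional on what is known about that customer.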

### Building a model

The simplest one neglects the control group and models the probability that a customer will return given that she received treatment and given some customer specific information.

P(return|treatment,information)

Throwing away half the data isn’t optimal. For instance, this model doesn’t predict which customers will only return when given some treatment, as we required above. Instead, it will pick out every customer who is likely to return, including those who would have returned without treatment.

A simple extension is to build two models, one for each term in the uplift equation. Formally,

P(return|treatment,information)-P(return|control,information).

While this approach directly models the uplift effect, research suggests that it doesn’t always perform well in practice. One reason is that the treatment effect is usually much smaller than the main effect. For example, if the main effect is 1.0% in the control group and the combined effect is 1.1% in the treatment group, then the uplift effect is only 0.1 percentage points (= 1.1% - 1.0%). Consequently, each individual model will mostly focus on the main effect within its group. (Radcliffe, 2011)
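The two-model approach can be sketched with a deliberately simple stand-in for a real classifier: a lookup table that stores the return rate per customer segment. Everything here is an assumption made for illustration (the `segment` feature, the function names and the data); in practice each lookup table would be replaced by any standard machine-learning model that outputs a probability.

```python
from collections import defaultdict


def fit_rate_model(rows):
    """Fit a lookup-table "model": the return rate per segment.

    rows is an iterable of (segment, returned) pairs, returned in {0, 1}.
    """
    counts = defaultdict(lambda: [0, 0])  # segment -> [returned, total]
    for segment, returned in rows:
        counts[segment][0] += returned
        counts[segment][1] += 1
    return {segment: r / n for segment, (r, n) in counts.items()}


def two_model_uplift(data):
    """Fit one model per group and subtract the predictions.

    data is an iterable of (segment, treated, returned) triples.
    """
    data = list(data)
    treatment_model = fit_rate_model((s, y) for s, t, y in data if t == 1)
    control_model = fit_rate_model((s, y) for s, t, y in data if t == 0)
    # Only segments seen in both groups get an uplift estimate.
    segments = treatment_model.keys() & control_model.keys()
    return {s: treatment_model[s] - control_model[s] for s in segments}


# Hypothetical data: (segment, treated, returned)
data = [
    ("new", 1, 1), ("new", 1, 1), ("new", 1, 0), ("new", 1, 0),
    ("new", 0, 1), ("new", 0, 0), ("new", 0, 0), ("new", 0, 0),
    ("loyal", 1, 1), ("loyal", 1, 1), ("loyal", 1, 1), ("loyal", 1, 0),
    ("loyal", 0, 1), ("loyal", 0, 1), ("loyal", 0, 1), ("loyal", 0, 0),
]
uplift_by_segment = two_model_uplift(data)
print(uplift_by_segment)
```

In this toy data the "loyal" segment returns more often than the "new" segment in both groups, yet its estimated uplift is zero. A model that looks only at the treatment group would rank it first, which is exactly the weakness of the single-group model described earlier.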

An easy way to arrive at a single-model approach is to apply a class variable transformation to the result of the A/B test. Customers in the TR (treatment, returned) and CNR (control, not returned) groups are assigned to the positive class, i.e. customers we would like to treat, while TNR (treatment, not returned) and CR (control, returned) are assigned to the negative class.

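Table 1: Example of the class variable transformation.

| Customer | Group (1 = treatment, 0 = control) | Returned (1 = returned, 0 = not returned) | Transformed |
|---|---|---|---|
| 1 | 0 | 1 | 0 |
| 2 | 1 | 0 | 0 |
| 3 | 0 | 0 | 1 |
| 4 | 1 | 0 | 0 |
| 5 | 1 | 1 | 1 |
| 6 | 0 | 0 | 1 |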

The idea is that, of the four fields, we would definitely like to treat the customers with the TR outcome. And because we don’t know whether the customers in the CNR group would have responded if they had received a treatment, we would also like to treat them. It can be shown that, under some assumptions, modeling the conditional probability of the transformed class variable is equivalent to modeling the conditional probabilities of the two original variables from the uplift equation above.
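The transformation itself is a one-liner: the transformed class is 1 exactly when the group flag and the return flag agree. The sketch below applies it to the six customers of Table 1. The helper `uplift_from_transformed` illustrates one commonly used assumption under which the equivalence holds, namely that customers are assigned to treatment and control at random with equal probability; in that case the uplift equals 2·P(transformed = 1|information) - 1. The function names are hypothetical.

```python
def transform(group: int, returned: int) -> int:
    """Return 1 for TR and CNR customers, 0 for TNR and CR customers."""
    return 1 if group == returned else 0


def uplift_from_transformed(p_transformed: float) -> float:
    """Convert P(transformed = 1 | information) into an uplift estimate.

    Assumes a 50/50 random split between treatment and control.
    """
    return 2.0 * p_transformed - 1.0


# Rows of Table 1: (customer, group, returned)
table_1 = [(1, 0, 1), (2, 1, 0), (3, 0, 0), (4, 1, 0), (5, 1, 1), (6, 0, 0)]
transformed = [transform(group, returned) for _, group, returned in table_1]
print(transformed)  # [0, 0, 1, 0, 1, 1]
```

A single standard classifier can then be trained on the transformed column, and its predicted probability is mapped back to an uplift score with `uplift_from_transformed`.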

There are more elaborate techniques that incorporate the maximization of the uplift equation into the training algorithm. However, the straightforward implementation and applicability to any standard machine-learning algorithm make the two-model approach and the class variable transformation approach very attractive first choices.

## Meet Our Data Scientist Gundula

Gundula works as a data scientist at Akanoo. She implements new features for our predictive models. Furthermore, she spends a lot of time preparing and evaluating experiments and monitoring the nightly training cycles.

How long have you been working for Akanoo and what have you done before that?

I started working with Akanoo in November 2014. Before that, I did a PhD in computational neuroscience. My research was about the processing of auditory information of calling songs in the brain of grasshoppers.

I enjoy the vast amount of data that we analyze.

And what do you usually do after work?

On my way back home, I enjoy the Außenalster!

Which colleague would you take with you to a lonely island and why?

I would take Carole, because she is a great swimmer.

Which attraction/part of Hamburg do you think you should definitely visit and why?

This summer, I discovered Wilhelmsburger Inselpark, though it looks quite artificial. Altes Land is lovely, too, especially during spring when the apple trees bloom.

Which apps could you never do without and why?

Spotify.

These blogs/websites belong to my daily reading:

On the web you can find me here:

## Lift analysis – A data scientist’s secret weapon

Whenever I read articles about data science I feel like there is some important aspect missing: evaluating the performance and quality of a machine learning model.

There is always a neat problem at hand that gets solved, and the process of data acquisition, handling and model creation is discussed, but the evaluation is too often covered only briefly. I truly believe it’s the most important aspect of building a new model. Consequently, the first post on this blog will deal with a pretty useful evaluation technique: lift analysis.

Machine learning covers a wide variety of problems like regression and clustering. Lift analysis, however, is used for classification tasks. Therefore, the remainder of this article will concentrate on this kind of model.

### The reason behind lift charts

When evaluating machine learning models there is a plethora of possible metrics to assess performance. There are things like accuracy, precision-recall, ROC curve and so on. All of them can be useful, but they can also be misleading or fail to answer the question at hand.

Accuracy¹, for example, might be a useful metric for balanced classes (that is, each label has about the same number of occurrences), but it’s totally misleading for imbalanced classes. The problem is that data scientists have to deal with imbalanced classes all the time, e.g. when predicting if a user will buy something in an online shop. If only 2 out of 100 customers buy anyway, it’s easy for the model to predict everyone as not buying and it would still achieve an accuracy of 98%! That’s absolutely not useful when trying to assess the model’s quality.
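The 98% figure is easy to reproduce. The sketch below builds the 100 customers from the example, lets a trivial "model" predict that nobody buys, and computes accuracy next to recall. The variable names are chosen for illustration only.

```python
# 2 of 100 customers buy (label 1), 98 do not (label 0).
labels = [1] * 2 + [0] * 98

# A useless model that predicts "does not buy" for everyone.
predictions = [0] * len(labels)

correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall: share of actual buyers that the model identified.
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / sum(labels)

print(f"accuracy: {accuracy:.2f}, recall: {recall:.2f}")
```

The accuracy looks excellent, while the recall of zero shows that the model does not find a single buyer.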

Of course, other metrics like precision and recall give you important information about your model as well. But I want to dig a bit deeper into another valuable evaluation technique, generally referred to as lift analysis.

To illustrate the idea, we’ll consider a simple churn model: we want to predict if a customer of an online service will cancel their subscription or not. This is a binary classification problem: the user either cancels the subscription (churn=1) or keeps it (churn=0).

The basic idea of lift analysis is as follows:

1. Group the data by predicted churn probability (a value between 0.0 and 1.0). Typically, you look at deciles, so you’d have 10 groups: 0.0 – 0.1, 0.1 – 0.2, …, 0.9 – 1.0
2. Calculate the true churn rate per group. That is, you count how many people in each group churned and divide this by the total number of customers in that group.
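The two steps above can be sketched in a few lines. This is a minimal illustration that uses ten equal-width score intervals as listed in step 1; the function name and the sample data are hypothetical.

```python
def churn_rate_by_bucket(scores, churned, n_buckets=10):
    """Group customers by predicted score and return the actual churn rate per bucket.

    Entry i of the result covers scores in [i / n_buckets, (i + 1) / n_buckets);
    a score of exactly 1.0 is placed in the last bucket.
    Buckets without any customers are reported as None.
    """
    totals = [0] * n_buckets
    churners = [0] * n_buckets
    for score, label in zip(scores, churned):
        index = min(int(score * n_buckets), n_buckets - 1)
        totals[index] += 1
        churners[index] += label
    return [c / t if t else None for c, t in zip(churners, totals)]


# Hypothetical predictions and observed outcomes (1 = churned).
scores = [0.05, 0.08, 0.15, 0.55, 0.85, 0.88, 0.92, 0.95, 0.97, 0.99]
churned = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

rates = churn_rate_by_bucket(scores, churned)
for i, rate in enumerate(rates):
    if rate is not None:
        print(f"{i / 10:.1f} - {(i + 1) / 10:.1f}: {rate:.2f}")
```

Plotting these per-bucket rates as a bar chart gives the lift chart discussed in the next section.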

### Why is this useful?

The purpose of our model is to estimate how likely it is that a customer will cancel their subscription. This means our predicted (churn) probability should be directly proportional to the true churn probability, i.e. a high predicted score should correlate with a high actual churn rate. Vice versa, if the model predicts that a customer won’t churn, then we want to be sure that it’s really unlikely that this customer will churn.

But as always, a picture is worth a thousand words. So let’s see what an ideal lift chart would look like:

Here you can see that the churn rate in the rightmost bucket is highest, just as expected. For scores below 0.5, the actual churn rate in the buckets is almost zero. You can use this lift chart to verify that your model is doing what you expect from it.

Let’s say there were a spike in the lower-scored groups; then you would know right away that your model has some flaw and doesn’t reflect reality properly. If it did, the true churn rate could only decrease with decreasing score. Of course, lift analysis can help you only so far. It’s up to you to identify the cause of this problem and to fix it, if necessary². After improving the model, you can simply come back to the lift chart and see if the quality improved.

Additionally, I drew a black line for the hypothetical average churn rate (20%). This is useful to define a targeting threshold: scores below the threshold will be set to 0, scores above to 1. In our example, you might want to try to keep customers from cancelling their subscription by giving them a discount. Then you would target all users with a score between 0.8 and 1.0, because this is the range where the churn rates are higher than the average churn rate. You don’t want to pour money down the drain for customers who have a below-average churn probability.
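Picking the buckets worth targeting then comes down to comparing each bucket’s actual churn rate with the average. The sketch below uses the 20% average from the text; the per-bucket rates and the function name are hypothetical and only chosen so that the two highest buckets lie above the average.

```python
def buckets_to_target(bucket_rates, average_rate):
    """Return the indices of buckets whose churn rate exceeds the average."""
    return [
        i for i, rate in enumerate(bucket_rates)
        if rate is not None and rate > average_rate
    ]


# Hypothetical actual churn rates for the score buckets 0.0-0.1, ..., 0.9-1.0.
bucket_rates = [0.01, 0.01, 0.02, 0.02, 0.03, 0.05, 0.08, 0.15, 0.36, 0.97]
average_churn_rate = 0.20

targeted = buckets_to_target(bucket_rates, average_churn_rate)
threshold = min(targeted) / len(bucket_rates)
print(f"target buckets {targeted}, i.e. scores of {threshold:.1f} and above")
```

With these numbers only the two highest buckets are selected, which corresponds to the score range 0.8 to 1.0.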

### But what is lift exactly?

Until now, we only looked at nice charts. But usually you’re interested in the lift score as well. The definition is pretty simple:

$\displaystyle \text{lift} = \frac{\text{predicted rate}}{\text{average rate}}$

Here, rate refers to the churn rate, but it might as well be a conversion rate, response rate, etc.

Looking back at our example chart, the highest group would have a lift of 0.97 / 0.2 = 4.85 and the second-highest group a lift of 1.8. That means, if you only target users with a score higher than 0.9, you can expect to catch nearly five times as many churning users as you would by targeting the same number of people randomly.
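The arithmetic can be checked with a small helper. The rates 0.97 and 0.20 come from the example; the rate of 0.36 for the second-highest group is an assumption derived from its stated lift of 1.8, and the function name is hypothetical.

```python
def lift(group_rate: float, average_rate: float) -> float:
    """Ratio of a group's rate to the average rate across all customers."""
    if average_rate <= 0:
        raise ValueError("average_rate must be positive")
    return group_rate / average_rate


average_churn_rate = 0.20
top_group_lift = lift(0.97, average_churn_rate)
second_group_lift = lift(0.36, average_churn_rate)  # 0.36 is an assumed rate

print(f"top group: {top_group_lift:.2f}, second group: {second_group_lift:.2f}")
```

A lift of 1.0 means the group is no better than random targeting, and values below 1.0 mean the group churns less often than average.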

### Conclusion

Just like every other evaluation metric, lift charts aren’t a one-size-fits-all solution. But they help you get a better picture of the overall performance of your model. You can quickly spot flaws if the lift chart is not monotonic. Additionally, it helps you to set a threshold for which users are worth targeting. Last but not least, you get an estimate of how much better you can target users compared to random targeting.

I hope this first blog post gave you some new insights or you enjoyed it as a refresher. If you have any questions or feedback, just leave a comment or shoot me a tweet.

1. Ratio of correctly labeled observations to total number of observations.
2. There might be cases where this does not matter, e.g. when your main goal is to target everyone who churns and it doesn’t matter if you also target some people who won’t churn.

(First published on datalifebalance.com)

## Business Opportunities in Click Stream Data Mining – from Cart Abandonment Prevention to Upselling

When we started Akanoo 2½ years ago, none of us would have believed that the click stream of online shop visitors would offer so many applications for statistical analysis. Our goal was simple: use JavaScript to display a voucher to visitors on the first pages if we are certain they otherwise won’t buy. A logistic regression on the target variable “will buy” and a web service written in Groovy for real-time prediction did the job.

This led to roughly 10-20% of incremental revenue compared to control groups. But only on a small group of people: those that we knew for sure wouldn’t buy.

So, we took a deeper look at the behavior of the millions of online shop visitors that we had tracked so far. And we quickly realized: there is a business opportunity way bigger than that. We saw visitors responding negatively to vouchers and some of them becoming less profitable in the long run compared to the control group. So we had to rethink our initial solution.

Let’s draw an analogy here: most online retailers offer the same assistance as a supermarket, which is none. However, there are millions of visitors out there who would rather have the shopping experience of a fashion outlet or shoe retailer that they know from the offline world, where staff guide you through the offering. How can we deliver both at the same time?

The solution is: create algorithms that have a lot more freedom. We started working on algorithms that are able to predict the intention and next steps of visitors and pick the right way of interacting with these visitors from an array of different incentives and pieces of information. With the ultimate goal: converting visitors into happy customers.

Now we’ve expanded the audience of our JavaScript targeting to a wide array of scenarios besides cart abandonment prevention: up-selling, inspiration, return reduction, loyalty. (And we started using Spark, Hadoop, Docker, Redis and Akka to handle the data…)

What’s next? The data we analyze and track on a daily basis is growing steadily. Just for the up-selling models, we look at the data of 11M users every night. Other kinds of transaction-based websites are coming into play, as is the modeling of more complex decision-making processes.

Who makes this possible? The data science team behind Akanoo.
Yours, Fabian