Growth Hacking and the Bandit Problem

“Growth Hacking and the Bandit Problem” is a recent talk by the awesome Noel introducing multi-armed bandits as a superior way of A/B testing. In case you missed it, we decided to write it up as a blog post.

Our Hero

We begin by introducing our main character and hero of the story, the growth hacker. He is driven by one thing and one thing alone: pushing growth ever upwards and ever rightwards. Swoon.

To do this he follows 3 simple steps: Build, Measure, and Learn, a process handed down by the prophet of growth, Eric Ries. These 3 steps give our hero what he needs: a structured process to drive growth. Starting at the top, he gets something built, let’s say a new sign up page. He then takes some time to collect data and measure its effectiveness. Once this is over he sits down with his data and he learns, making a decision based on those results and informing the next iteration of the cycle. Round he goes again!

Our growth hacker will use his wide range of skills at each stage of the cycle but his main objective is always achieving rapid growth. The speed at which he can get round this cycle will determine how fast and how far his metrics, and ultimately the business, can grow.

Faster is Better

Driven by a need for speed, our growth hacker takes a look at each step in his engine of growth to see where he can go faster. He starts off with build. Hmm there doesn’t seem to be much he can do here. Our growth hacker’s already pretty agile on the dev front.

Learning already seems to happen pretty fast, once he’s got all his data together. But measure? Now measure seems like a place where he might be able to speed up. At the moment he’s using A/B testing. Collecting all the data he needs to make a sound statistical decision takes a long time.


What if measuring and learning could happen together? What if we could turn our 3 step process into a 2 step process and speed it up dramatically? We could change our metrics chart to look like the green line instead of the orange, allowing our growth hacker to iterate and optimise as fast as he can! Well, surprise surprise, you guessed it: our hero the growth hacker has just discovered the multi-armed bandit, a way to drive growth faster than ever before!


The Multi Armed Bandit (Growth Hackers’ Secret Sauce)

“Woah there,” I hear you say. “Let’s just hold on a minute here and have a bit of background into this multi-armed bandit. Where is it from and what’s it all about?” Well, like all good secret powers, the multi-armed bandit started off as a problem: the Bandit Problem.

Imagine walking into a casino. You head straight for a room full of slot machines or, as they’re called in the US, one armed bandits. You’re a clever egg, so no doubt you’re thinking that some of these machines are going to pay out more than others. You want to make sure you maximise your reward by finding and playing the bandit that pays out most. This is the bandit problem.

After a long hard think, and a lot of maths, you come up with a formula that helps you to find the machine that pays out most as soon as possible. Hurray! This means you don’t waste your money trying other machines that pay less often. The formula is called a bandit algorithm, and with this in hand our growth hacker receives the mighty powers of the multi-armed bandit!

Now Back to the Story

With his trusty multi-armed bandit (MAB) at his side, our growth hacker can now set to the task in hand. Today he’s increasing conversions on a signup page. With 3 variants of the web page to choose from, the MAB shows one of them to each visitor, and its reward comes when a visitor clicks on the orange button and converts. The scores are totted up and the process is merrily repeated each time a new visitor comes to the site.

Two key Ingredients to the Secret Sauce

So far so good, but doesn’t this sound familiar? Up to this point our MAB has been purely exploring: trying variants at random and totting up their scores, which is the main feature of A/B testing. But remember, the goal of a MAB is to maximise the total reward, so it brings another element into the mix: exploitation, which means showing the variants that have worked best in the past. This delicate balancing act between exploration and exploitation plays out for the duration of the test as the multi-armed bandit happily goes about measuring and learning, at all times working to maximise total reward.
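To make the balancing act concrete, here is a minimal sketch in Python of one simple way to mix exploration and exploitation, usually called epsilon-greedy. It is an illustration under stated assumptions, not Myna’s implementation: the class name, the `suggest` and `record` methods, the 10% exploration rate and the made-up conversion rates are all hypothetical.

```python
import random


class EpsilonGreedy:
    """Hypothetical epsilon-greedy bandit over a fixed set of page variants."""

    def __init__(self, variants, epsilon=0.1, rng=None):
        self.variants = list(variants)
        self.epsilon = epsilon              # fraction of visitors used to explore
        self.rng = rng or random.Random()
        self.shows = {v: 0 for v in self.variants}      # times each variant was shown
        self.rewards = {v: 0.0 for v in self.variants}  # total reward per variant

    def mean_reward(self, variant):
        """Average reward so far (0.0 for a variant that has never been shown)."""
        shows = self.shows[variant]
        return self.rewards[variant] / shows if shows else 0.0

    def suggest(self):
        """Pick the variant to show to the next visitor."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.variants)        # explore: pick at random
        return max(self.variants, key=self.mean_reward)  # exploit: best so far

    def record(self, variant, reward):
        """Tot up the score once we know whether the visitor converted."""
        self.shows[variant] += 1
        self.rewards[variant] += reward


def simulate(true_rates, visitors=5000, epsilon=0.1, seed=0):
    """Run the bandit against simulated visitors with assumed conversion rates."""
    rng = random.Random(seed)
    bandit = EpsilonGreedy(true_rates, epsilon, rng)
    for _ in range(visitors):
        variant = bandit.suggest()
        converted = rng.random() < true_rates[variant]
        bandit.record(variant, 1.0 if converted else 0.0)
    return bandit


if __name__ == "__main__":
    # Made-up conversion rates for three signup pages.
    bandit = simulate({"A": 0.05, "B": 0.10, "C": 0.50})
    for variant in bandit.variants:
        print(variant, bandit.shows[variant], round(bandit.mean_reward(variant), 3))
```

With `epsilon` set to 1.0 this behaves like a plain A/B test, showing every variant equally often on average. Lowering it sends more visitors to whichever variant is currently in the lead, which is where the extra reward comes from.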

What are My Options?

We’ve just sketched an algorithm known as E-Greedy. When it comes to MABs there are lots to choose from, such as E-Greedy, Thompson Sampling, UCB-1 or Myna. Not all will perform in the same way or deliver the same results, as you can see in the chart below: (Ooh look, Myna’s the most successful. Who would have guessed?!)
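For the curious, here is a rough Python sketch of how two of the named alternatives, UCB-1 and Thompson Sampling, choose which variant to show. These are the textbook versions for yes/no rewards such as conversions, written under our own assumptions: the function names, the `stats` layout and the simulated conversion rates are hypothetical, and none of this is Myna’s own algorithm.

```python
import math
import random


def ucb1_choose(stats, rng=None):
    """UCB-1: pick the variant with the highest mean plus uncertainty bonus.

    stats maps variant -> [conversions, shows]. rng is unused; it is accepted
    only so both choosers share one signature.
    """
    for variant, (_, shows) in stats.items():
        if shows == 0:
            return variant  # show every variant once before comparing
    total = sum(shows for _, shows in stats.values())

    def index(variant):
        conversions, shows = stats[variant]
        return conversions / shows + math.sqrt(2 * math.log(total) / shows)

    return max(stats, key=index)


def thompson_choose(stats, rng):
    """Thompson Sampling: draw a plausible conversion rate for each variant
    from a Beta posterior (uniform prior) and show the highest draw."""

    def draw(variant):
        conversions, shows = stats[variant]
        return rng.betavariate(conversions + 1, shows - conversions + 1)

    return max(stats, key=draw)


def run(choose, true_rates, visitors=5000, seed=0):
    """Simulate visitors against assumed conversion rates; return the stats."""
    rng = random.Random(seed)
    stats = {variant: [0, 0] for variant in true_rates}
    for _ in range(visitors):
        variant = choose(stats, rng)
        stats[variant][1] += 1                      # one more show
        if rng.random() < true_rates[variant]:
            stats[variant][0] += 1                  # one more conversion
    return stats


if __name__ == "__main__":
    rates = {"A": 0.05, "B": 0.10, "C": 0.50}  # made-up conversion rates
    for name, choose in [("UCB-1", ucb1_choose), ("Thompson", thompson_choose)]:
        print(name, run(choose, rates))
```

Both take the same counts as input and differ only in how they turn them into a choice: UCB-1 adds a bonus that shrinks as a variant is shown more often, while Thompson Sampling lets chance do the exploring by sampling from what it currently believes about each variant.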

Real Life

Now I know what you’re thinking: fancy graphs based on simulated data are all well and good but I want to hear about some real life results. Look no further than one of our customers: Vizify, a startup working to create beautiful online portfolios. In order to improve their user engagement they decided to deploy Myna to optimise their email subject lines. Because Myna is so efficient with data, in just a few days Vizify had received a 500% increase in clickthroughs (pretty impressive for a startup with small amounts of traffic).

With Great Power comes Great Responsibility

When using MABs there are a few things to bear in mind:

Your workflow will change dramatically. It’s going to become simpler, as Myna is going to do all the work for you. It will be faster: because it’s so efficient with data, you’ll get results at lightning speed. It’s also way more flexible. You will wave goodbye to setting parameters in advance (experiment length and p-value), and can add and remove variants at any time, testing almost anything you set your mind to!
Defining rewards
The ideal reward measure for any A/B test is most likely customer lifetime value, but you probably can’t measure this very quickly. You need to have a fairly fast feedback cycle so the algorithm can adapt in a reasonable time period. Using a simple measure like conversions is fine, but with any test you should check that it correlates with your true performance metrics.
Stable Preferences
The algorithms we’ve discussed only work when users have stable preferences. We don’t mean that all users act in the same way, but rather that their behaviour is similar in aggregate and stable over time. Broadly speaking, we assume what works today will continue to work tomorrow. For UI elements this is generally the case, but it is not, for example, true for news items, where the value of the story is strongly time-dependent.

And that’s it. Our growth hacker’s secret sauce, the multi-armed bandit, has been transformed into an almighty T-Rex. Jump on its back, ride off into the sunset and do what growth hackers do best: grow, fast! RAWWWRR!

A/B testing and the parable of the missing keys

My wife misplaced her keys yesterday. I politely enquired why she couldn’t put her damn keys in the same place every time she came in. She opined that if I wanted to be useful I should do less work with my mouth and more with my eyes. And so we set to work finding them.

As we searched, my mind naturally turned to A/B testing. It was clear from the start that we had two different strategies for finding the keys. She exploited her knowledge of where she had put her keys in the past, and her actions immediately prior to losing the keys. I explored more or less at random, arguing that her approach was proving unsuccessful and we should abandon our prior assumptions. Either approach on its own is inefficient, but together we were able to cover a large portion of the house in a relatively short period of time.

The exploration-exploitation dilemma lies at the heart of Myna. Myna constantly balances exploiting the variants that have worked well in the past against exploring other variants to see if they are in fact better. Myna can make an optimal tradeoff due to the power of the algorithms, and the relatively simple structure of the A/B testing problem.

Designing A/B tests involves a similar balancing act. We can exploit our knowledge of prior tests and best practices (such as these) to guide us when creating our own experiments. However, we must be cautious not to rely on those common tests too heavily. What has worked before, or for others’ customers, might not work now or for ours. Similarly, exploring any and every idea that pops into our minds may be very interesting, and potentially bring dramatic results, but this has to be balanced against the risk of confusion or wasted time.

As you can see, once you start looking for it, you’ll find the exploration-exploitation dilemma everywhere.

My keys

No prizes for guessing who found the keys. (PS: it wasn’t me.)

Why we created Myna

The Myna story really begins way way back in 1994, when I was in my second year of Engineering at UWA. One day I logged onto one of the School’s Sun workstations and saw a system message:

For Mosaic type xmosaic

So I typed xmosaic and discovered the web.

In 1994 Yahoo had only just been created, it would be a year before Amazon was online, and the research project that led to Google wouldn’t start for another two years. Yet despite the blink tags and “Under Construction” GIFs one thing was clear: the web was, and would be, something amazing. I was most struck by its essential equality. In those days anyone could create a web page and stand on equal footing with the rest of the world.

Fast forward 16 years and things have changed. The web is now big industry and ads, SEO, and other techniques are all used by businesses to give themselves an advantage. The Internet is dominated by large corporations, and it isn’t so easy for the little guy to be heard.

I happened to pick a field, machine learning, that has become one of the key differences between the big and small players. The big Internet properties gain a substantial advantage through their use of intelligent algorithms to optimise their sites, product recommendations, and so on. It’s also clear that the small players can’t easily replicate this. Simply put, they don’t have the expertise to develop these systems in-house, and Google have already hired all the available PhD graduates.

This is where Myna comes in. We want to rebalance the Internet by democratizing access to the technology the big companies are using. Of course paying the bills is important, but fundamentally if we can push forward the industry we’ll have achieved something important.

If you’re not Google, Amazon, Yahoo!, or Microsoft (or even if you are) we hope you’ll give Myna a try. We’re just starting out on what we hope will be a long and eventful journey, and we look forward to growing alongside you.