Choosing Goals for A/B Testing

One of the most important decisions when designing an A/B test is choosing the goal of the test. After all, if you don’t have the right goal the results of the test won’t be of any use. Choosing well is particularly important when using Myna, as Myna dynamically changes the proportions in which variants are displayed to maximise the goal.

So how should we choose the goal? Let’s look at the theory, which tells us how to choose the goal in a perfect world, and then see how that theory can inform practice in a decidedly imperfect world.

Customer Lifetime Value

For most businesses the goal is to increase customer lifetime value (CLV). What is CLV? It’s simply the sum of all the money we’ll receive in the future from the customer. (This is sometimes known as predictive customer lifetime value as we’re interested in the revenue we’ll receive in future, not any revenue we might have received in the past.)

If you can accurately predict CLV it is a natural goal to use for A/B tests. The performance of each variant under test can be measured by how much it increases CLV on average. Here’s a simple example. Say you’re testing calls-to-action on your landing page. The lifetime values of interest here are the CLV of a user arriving at your landing page who hasn’t signed up, and the CLV of a user who has just signed up. If you have good statistics on your funnel you can work these numbers out. Say an engaged user has a CLV of $50, 50% of sign-ups go on to become engaged, and 10% of visitors sign up. Then the lifetime values are:

  • for sign-ups $50 * 0.5 = $25; and
  • for visitors $25 * 0.1 = $2.50.
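
To make the arithmetic concrete, here is the same funnel calculation as a minimal TypeScript sketch. The figures come from the example above; the variable names are mine:

```typescript
// Figures from the example above.
const engagedClv = 50;       // an engaged user is worth $50
const engagementRate = 0.5;  // 50% of sign-ups become engaged
const signupRate = 0.1;      // 10% of visitors sign up

// Expected CLV at each stage is the downstream CLV multiplied by
// the probability of reaching that stage.
const signupClv = engagedClv * engagementRate; // $25
const visitorClv = signupClv * signupRate;     // $2.50

console.log(`sign-up CLV: $${signupClv}, visitor CLV: $${visitorClv}`);
```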

The great thing with CLV is you don’t have to worry about any other measures such as click-through, time on site, or what have you – that’s all accounted for in lifetime value.

Theory Meets Practice

Accurately predicting CLV is the number one problem with using it in practice. A lot of people just don’t have the infrastructure to do these calculations. For those that do there are other issues that make predicting CLV difficult. You might have a complicated business that necessitates customer segmentation to produce meaningful lifetime values. You might have very long-term customers making prediction hard. I don’t need to go on; I’m sure you can think of your own reasons.

This doesn’t mean that CLV is useless, as it gives us a framework for evaluating other goals such as click-through and sign-up. For most people using a simple-to-measure goal such as click-through is a reasonable decision. These goals are usually highly correlated with CLV, and it is better to do testing with a slightly imperfect goal than to not do it at all out of concern about accurately measuring lifetime value. I do recommend checking from time to time that these simpler goals are correlated with CLV, but it shouldn’t be needed for every test.

CLV is very useful when the user can choose between many actions. Returning to our landing page example, imagine the visitor could sign up for a newsletter as well as signing up to use our product. Presumably visitors who just sign up for the newsletter have a lower CLV than those who sign up for the product, but a higher CLV than those who fail to take any action. Even if we can’t predict CLV precisely, using the framework at least forces us to directly face the problem of quantifying the value of different actions.

This approach pays off particularly well for companies with very low conversion rates, or a long sales cycle. Here A/B testing can be a challenge, but we can use the model of CLV to create useful intermediate goals that can guide us. If it takes six months to convert a visitor into a paying customer, look for other intermediate goals and then try to estimate their CLV. This could be downloading a white paper, signing up for a newsletter, or even something like a repeat visit. Again it isn’t essential to accurately predict CLV, just to assign some value that is in the right ballpark.

Applying CLV to Myna

So far everything I’ve said applies to general A/B testing. Now I want to talk about some details specific to Myna. When using Myna you need to specify a reward. For simple cases like a click-through or sign-up, the reward is simply 1 if the goal is achieved and 0 otherwise. For more complicated cases Myna allows very flexible rewards that can handle most situations. Let’s quickly review how Myna’s rewards work, and then how to use them in more complicated scenarios.

Rewards occur after a variant has been viewed. The idea is to indicate to Myna the quality of any action coming from viewing a variant. There are some simple rules for rewards:

  • any reward must be a number between 0 and 1. The higher the reward, the better it is;
  • a default reward of 0 is assumed if you don’t specify anything else; and
  • you can send multiple rewards for a single view of a variant, but the total of all rewards for a view must be no more than 1.
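
These rules are easy to encode. The sketch below is not Myna’s client library – just a hypothetical helper showing how the constraints fit together for a single view:

```typescript
// Hypothetical tracker for the rewards attached to one view of a variant.
// Encodes the rules above: each reward is in [0, 1], the default is 0,
// and the total across multiple rewards must not exceed 1.
class ViewRewards {
  private total = 0; // the default reward if nothing is ever recorded

  add(reward: number): void {
    if (reward < 0 || reward > 1) {
      throw new Error("Each reward must be between 0 and 1");
    }
    if (this.total + reward > 1) {
      throw new Error("Total rewards for a view must not exceed 1");
    }
    this.total += reward;
  }

  get value(): number {
    return this.total;
  }
}

const view = new ViewRewards();
view.add(0.3); // e.g. newsletter sign-up
view.add(0.7); // e.g. product sign-up
console.log(view.value); // 1.0
```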

Now that we know about CLV, the correct way to set rewards is obvious: rewards should be proportional to CLV. How do we convert CLV to a number between 0 and 1? We recommend using the logistic function to guarantee the output is always in the correct range. However, if you don’t know your CLV just choose some numbers that have the correct ranking and roughly correct magnitude. So for the newsletter / sign-up example we might go with 0.3 and 0.7 respectively. This way if someone performs both actions they get a reward of 1.0.
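
If you do have CLV estimates, the logistic mapping might look like the sketch below. The scale parameter is an assumption you would tune yourself, not a value Myna prescribes:

```typescript
// Map a CLV in dollars to a reward in (0, 1) using the logistic function.
// `scale` is an assumed tuning parameter: pick it so your typical CLVs
// don't all collapse to rewards near 0.5 or saturate near 1.
function clvToReward(clv: number, scale: number): number {
  return 1 / (1 + Math.exp(-clv / scale));
}

console.log(clvToReward(25, 25));  // sign-up from earlier: ~0.73
console.log(clvToReward(2.5, 25)); // visitor from earlier: ~0.52
```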

That’s really all there is to CLV. It’s a simple concept, but one with wide ramifications in testing.

New Features Across the Board

Today we're announcing the next version of Myna. This brings a lot of improvements, some of the highlights being:

  • you can associate arbitrary JSON data with an experiment. You could use this, for example, to store text or styling information for your web page. This allows you to change an experiment from the dashboard and have the changes appear on your site without redeploying code;

  • Myna is much more flexible in accepting rewards and views. This enables experiments that involve online and offline components, such as mobile applications;

  • we have a completely new dashboard, which is faster and easier to use than its predecessor.

If you want to get started right away, log in to Myna and click the "v2 Beta" button on your dashboard. This will take you to the new dashboard, where you can create and edit experiments. Then take a look at our new API, part of an all-new help site.

Alternatively, read on for more details.

The New API

The changes start with our new API. The whole model of interaction with the API has changed. The old model was to ask Myna for a single suggestion, and send a single reward back to the server. There were numerous problems with this:

  • Latency. It took two round trips to use Myna (one to download the client from our CDN, one to get a suggestion from our servers).
  • Rigidity. Myna entirely controlled which suggestions were made, and only these suggestions could be rewarded.
  • Offline use. Myna's model didn't allow offline use, essential for mobile applications.

The new API solves all these issues.

Instead of asking Myna for a suggestion, clients download experiment information that contains weights for each variant. These weights are Myna's best estimate for the proportion in which variants should be suggested, but clients are free to display any variant they wish. The client can store this information to use offline or to make batches of suggestions.
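
For example, a client that has downloaded the weights can pick a variant locally, with no further round trip. This is an illustrative sketch, not the actual Myna client; the data shape is an assumption:

```typescript
// Assumed shape of the downloaded experiment data: each variant name
// mapped to Myna's suggested display proportion (the weights sum to 1).
type Weights = Record<string, number>;

// Sample a variant in proportion to the downloaded weights.
function chooseVariant(weights: Weights): string {
  let r = Math.random();
  let last = "";
  for (const [variant, weight] of Object.entries(weights)) {
    last = variant;
    r -= weight;
    if (r <= 0) return variant;
  }
  return last; // guard against floating-point rounding
}

console.log(chooseVariant({ red: 0.6, green: 0.3, blue: 0.1 }));
```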

Views and rewards can be sent to Myna individually or in batches, and there are very few restrictions on what can be sent. If you want to send multiple rewards for a single view, that can be done. There are no restrictions on the delay between views and rewards, so those of you with long funnels can use Myna.
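
One way a client might take advantage of this is to queue events locally and send them in one go. The record shape and endpoint below are illustrative assumptions, not Myna’s actual API:

```typescript
// Hypothetical local queue of view and reward events.
interface TrackingEvent {
  experiment: string;
  variant: string;
  kind: "view" | "reward";
  amount?: number;   // reward in [0, 1]; omitted for views
  timestamp: number; // rewards may arrive long after their view
}

const queue: TrackingEvent[] = [];

// Record events locally; nothing is sent yet, so this works offline.
function record(event: TrackingEvent): void {
  queue.push(event);
}

// When back online, flush the whole batch in a single request.
async function flush(endpoint: string): Promise<void> {
  if (queue.length === 0) return;
  await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(queue.splice(0)),
  });
}
```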

Since you don't have to contact Myna's servers to get a suggestion, all data can be served from a CDN. This means only a single round trip, to a fast CDN, to use Myna.

These features combine to make Myna faster for existing uses on websites, and also to allow new uses, such as mobile applications that work offline.

Another major change is to give you more control over experiments from your dashboard. To this end you can associate arbitrary JSON data with your experiments. You can use this data to set, say, text or style information in your experiments. Then any changes you make on your dashboard, including adding new variants, will be automatically reflected in your experiments without deploying new code.
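
For instance, you might store per-variant button copy in the experiment's JSON data and apply it client-side. The layout below is entirely up to you; nothing about it is fixed by Myna:

```typescript
// Example JSON data you might attach to an experiment: per-variant copy
// and styling.
const experimentData = {
  red:   { text: "Buy now",          color: "#c0392b" },
  green: { text: "Start free trial", color: "#27ae60" },
} as const;

// Apply the chosen variant's data to the page, so copy and style changes
// made on the dashboard show up without redeploying any code.
function applyVariant(variant: keyof typeof experimentData): void {
  const button = document.querySelector<HTMLButtonElement>("#cta");
  if (!button) return;
  button.textContent = experimentData[variant].text;
  button.style.backgroundColor = experimentData[variant].color;
}

applyVariant("green");
```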

We have also improved the deployment process. Instead of pulling experiments into a page one-by-one, we provide a single CDN-hosted file that contains all your active experiments and the Myna for HTML client.

Finally, we've updated the algorithm Myna uses. It behaves in a more intuitive fashion without sacrificing performance.

The new API is live and is being used in production right now.

New Dashboard

The old dashboard wasn't up to scratch. It was difficult to use and wasn't able to support the new features we're adding to the API. As a result we've created a completely new dashboard. Click the "v2 Beta" tab to access it.

The dashboard is still in development, so there are some rough edges. However it's usable enough that we're releasing it now.

New Clients

Along with the new API we are also developing new clients. Most of you integrate with Myna in the browser, and here we have new versions of Myna for Javascript and Myna for HTML.

Possibly the most exciting new feature is the inspector, which allows you to preview your experiments in the page. Here's a demo. To enable the inspector, just add #preview to the end of the URL of any experiment that uses the new version of Myna for HTML or Myna for Javascript.

Documentation is still in progress for these clients. Look here for Myna for HTML, and here for Myna for Javascript.

What's Next?

There is still a lot of work to do. In addition to finishing the dashboard and documentation we are working on iOS and Android clients. Beyond that we have lots of exciting features in development, which you'll hear more about as they near completion.

Five things you could be A/B testing in addition to the colour of your buttons

One of the most popular questions for new A/B testers is “What should I test?”. There are many blog posts on the Internet that describe testing strategies, effective tests for landing pages, the psychology of testing, and so on. Rather than describe the usual candidates (hero image, feature list, call to action – you can test them all), I thought I’d provide a fresh perspective on some creative tests that you may not have considered.

1. Signup process

How simple should your signup process be? If you remove distractions and provide a clean, simple interface that focuses visitors’ attention, will you see an increase in signups? Take away too much, or force visitors to sign up before they know what you do or why you need their information and you may scare them away.

2. Information capture

You can test anything along your pipeline from customer acquisition to post-purchase follow-up. Most people like to start with the front-end of the pipeline: SEO to make their web site more visible, and web site optimization to make their landing pages more effective.

Collecting information from potential customers is a tricky business. Collect too little and you’re missing out on potentially useful insights. Ask for too much and you risk putting people off filling in your form. In addition to testing the visual aspects of form design (label positioning, immediate feedback on potential errors, and so on), one of the most useful things you can do is test the amount and format of data you’re trying to capture. After all, all the data in the world is no use to you if it never leaves your customers’ heads!

3. Email content

Communication with your customers is critically important, and it’s not just your website that matters. Emails form a vital part of many businesses’ communication strategy. Take your business’s voice as an example. Do your users prefer a more personable, friendly style? Or do they want reassurance that they’re dealing with professionals? Test the style, language and layout of your email content, and measure the response rate or traffic generated over a number of campaigns.

4. Features and pricing

Features and pricing are the two main things that affect your value proposition to customers. Fortunately, you can test both of these to see where your value sweet spot lies.

One great way of testing pricing is to test discounts. For example, you could offer one group of customers three months of free use of your product, and another group 25% off for a year. These offers are financially equivalent (25% off for twelve months adds up to the same three months of revenue) but offer different value in the short term.

The charitable promotions web site Humble Bundle provides another great example of price testing. They use A/B testing to prompt customers with different suggested donations, and to determine the best times to add value to their deals.

Instead of testing pricing, you might consider testing the features of your online product. What would happen if you gave users twice the space to upload photos? Would it lead to a financially beneficial increase in sign-ups? Why not run an A/B test to find out?

5. Checkout flow

Many of the tests we explored above for signup processes can also be applied to checkout flow. How much should you focus customers’ attention on completing checkout and paying for the product? In what order should you collect payment and delivery information? Is there a good opportunity for additional information capture here?

You can be a lot more flexible in tests by realising that, with the right testing tools, conversion goals don’t need to be “yes/no” affairs. Some tools let you assign secondary goals or, even better, numerical goals that let you stipulate how good the outcome is. For example, in a checkout flow you might instruct your testing tool to put more weight on a conversion if the user has more items in their basket, returns to a basket after saving it, chooses a faster delivery option, and so on.
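
As a sketch, a numerical checkout goal could be computed like this. The individual weightings are invented for illustration; the only real constraint (in Myna's case) is that the final reward stays between 0 and 1:

```typescript
// Turn a checkout outcome into a numerical goal between 0 and 1.
interface Checkout {
  completed: boolean;
  itemCount: number;
  fastDelivery: boolean;
}

function checkoutReward(c: Checkout): number {
  if (!c.completed) return 0;
  let reward = 0.5;                          // base reward for converting
  reward += Math.min(c.itemCount, 5) * 0.08; // more items, more value
  if (c.fastDelivery) reward += 0.1;         // premium delivery chosen
  return Math.min(reward, 1);                // cap at 1
}

// A three-item basket with fast delivery scores 0.5 + 0.24 + 0.1 = 0.84.
console.log(checkoutReward({ completed: true, itemCount: 3, fastDelivery: true }));
```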

Hopefully these five crazy off-the-wall examples will inspire you to use A/B testing to improve your product or site in new and unconventional ways. We’ve seen all of these and more at Myna, and we’re constantly being surprised by the resourcefulness and creativity of our users. This isn’t to say that we don’t advocate conventional testing of strap lines, hero images, and button colours – these tests are and always will be perfectly valid. No matter how much you’ve tested your landing pages, though, it’s important to realise that there are always new optimizations to make. Who knows – your next test may just bust you out of a local maximum and take you to a whole new level of success.