The 3 Biggest Mistakes Marketers Make When A/B Testing on Mobile

A/B tests are a marketer’s bread and butter, and for good reason. They are an incredibly powerful tool for making data-driven product decisions…but only when they are set up and analyzed correctly. We’ve identified three major mistakes that we frequently see marketers make when A/B testing their mobile apps. These mistakes can render your test useless, and if they’re not caught, you run the risk of making decisions based on unreliable data.

The good news is, they’re really easy to avoid with the right mobile marketing optimization tools.

1. The Sample Size is Too Small

Bigger is usually better when running an A/B test, but at the very least you need to make sure you have enough test subjects for the results to be statistically significant. This seems obvious in theory, but in practice it can be hard to get a large enough sample size when your app is just starting out or when you want to test a specific demographic.

Most marketers know this, but some quiet their sample-size worries by claiming they only want a basic idea of the outcome (what if they test 20 users and a whopping 15 of them prefer option A?!). The major problem is that they then build a framework for further decisions on a test that is inconclusive at best (what if they test 20 more users and 15 of them prefer option B?). This can cause major product issues down the line.
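To see why 20 users proves very little, you can run the standard two-proportion sample-size calculation yourself. Below is a minimal sketch in Python using SciPy (not Leanplum’s internal math); the baseline rate, expected lift, and thresholds are illustrative assumptions:

```python
# Standard sample-size formula for comparing two conversion rates.
from scipy.stats import norm

def sample_size_per_variant(p1: float, p2: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Users needed in EACH variant to detect a change from p1 to p2."""
    z_alpha = norm.ppf(1 - alpha / 2)          # two-sided test
    z_beta = norm.ppf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return int(n) + 1                          # round up

# Detecting a lift from a 10% to a 12% conversion rate takes far more
# than 20 users:
print(sample_size_per_variant(0.10, 0.12))    # ≈ 3,839 users per variant
```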

The Leanplum dashboard automatically shows you a visual display of the significance of your test over time at your chosen confidence level, so be sure you don’t jump the gun and ignore it!
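If you ever want to sanity-check significance outside the dashboard, the usual tool is a two-proportion z-test. Here is a short sketch using statsmodels, with made-up conversion counts:

```python
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

conversions = np.array([180, 210])    # variant A, variant B (illustrative)
exposures = np.array([2000, 2000])    # users who saw each variant

z_stat, p_value = proportions_ztest(conversions, exposures)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")   # p ≈ 0.11 here: not significant

# Only call a winner once p is below your chosen alpha (e.g. 0.05) AND the
# sample size you computed up front has actually been reached.
```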

2. The Test is Stopped Too Early

Though most adults no longer punctuate long car trips with “are-we-there-yets,” this impatience manifests itself in other ways that are less annoying but worse for business.

Make sure you carefully calculate how long an A/B test should run (remember to factor in typical user behavior: do they log in daily? Weekly? Monthly?) and do NOT shut down your A/B test before the set deadline.
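Once you know the required sample size, converting it into a test length is simple arithmetic. Here is a back-of-the-envelope sketch, with illustrative traffic figures:

```python
import math

required_per_variant = 3839    # from the power calculation above
daily_active_users = 1200      # users entering the app per day (assumption)
eligible_share = 0.5           # fraction who actually reach the tested screen
variants = 2

per_variant_per_day = daily_active_users * eligible_share / variants
days_needed = math.ceil(required_per_variant / per_variant_per_day)
days_needed = math.ceil(days_needed / 7) * 7   # round up to whole weeks to
                                               # cover weekly usage cycles
print(f"Run the test for at least {days_needed} days")   # 14 days here
```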

The most frequent scenario we see: a diligent marketer, anxious to see how a test is doing, checks it every day until, all of a sudden, one of the variants skyrockets above the rest! He immediately turns off the test and implements the “winning” variant in his app.


The fact is, that initial spike may have little bearing on the final results of the test.

It’s dangerous to cut a test short just because one variant looks like it’s winning; there is still plenty of room for things to change drastically (usage on that particular day might be skewed by some external factor). Let your test run its course, and resist jumping to conclusions too early.
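If you want to convince yourself that peeking really is dangerous, a small simulation makes the point. The sketch below runs an A/A test (both variants identical, so any “winner” is a false positive) and peeks at the p-value once a day, stopping at the first significant reading; all parameters are illustrative:

```python
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

rng = np.random.default_rng(0)
DAYS, USERS_PER_DAY, TRUE_RATE, RUNS = 14, 300, 0.10, 2000

false_positives = 0
for _ in range(RUNS):
    conversions = np.zeros(2)
    exposures = np.zeros(2)
    for _day in range(DAYS):
        conversions += rng.binomial(USERS_PER_DAY, TRUE_RATE, size=2)
        exposures += USERS_PER_DAY
        _, p = proportions_ztest(conversions, exposures)
        if p < 0.05:              # the impatient marketer "calls it" here
            false_positives += 1
            break

print(f"False positive rate with daily peeking: {false_positives / RUNS:.0%}")
# Well above the nominal 5% you would get from a single, fixed-horizon test.
```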

A handful of tools advertise “rules of thumb” for determining the length of your test, e.g. “wait at least a week.” Leanplum, however, uses the power of statistics to tell you exactly how many days your test should span in order to reach statistically significant results.

3. People Forget to Look at Everything Else!

It can be very exciting when you run an A/B test comparing clickthrough rates for two different buttons and the result shows an obvious winner by a long shot. The decision to implement the winning button seems like a simple one…but don’t take action yet!

Always remember to look at macro-level metrics. A button may be getting 100% more clicks while your overall revenue plummets. Maybe the winning button says “Click Me and I Will Bring You Cake!” and people click on it, but are so disappointed about not actually receiving any cake that they never purchase from you again. That’s a silly example, but it drives the point home. You want to make sure you’re targeting the right users and that they continue down your funnel. The only way to really see that is by looking at your overall analytics.
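As a toy illustration of that kind of guardrail check, you can compare each macro metric across variants and flag large regressions. The numbers and threshold below are invented, not real Leanplum output:

```python
GUARDRAIL_DROP = 0.05   # flag any macro metric that drops more than 5%

metrics = {
    #                     control  "winner" by clicks
    "clickthrough_rate": (0.08,    0.16),   # the button doubled its clicks...
    "revenue_per_user":  (1.40,    1.10),   # ...while revenue per user fell
    "d7_retention":      (0.32,    0.31),
}

for name, (control, variant) in metrics.items():
    change = (variant - control) / control
    flag = "  <-- RED FLAG" if change < -GUARDRAIL_DROP else ""
    print(f"{name:18s} {change:+.1%}{flag}")
```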

The Leanplum analytics dashboard automatically analyzes all of the key metrics for you. Just make sure you always look beyond your goal metrics and favorite metrics to the Significant Changes section and check for any red flags. These flags will alert you if your test variant has resulted in undesirable outcomes that could impact your decision.

And with that, happy testing!
