
Advice On Failed A/B Testing – What The Experts Say


The mantra of the conversion optimiser is: test, test, test. Opinions aren’t enough; in fact, they count for very little, as we all have our own biases. No optimisation can simply be rolled out, because we don’t know for sure that it will benefit conversion or the customer lifecycle. Our ideas for optimisation need to come from data, and the promised improvements need to be backed by evidence.

The process is hardly cut and dried, though, as we discovered when we applied these principles to our own site. We’re currently testing a page with quite significant design changes. We’re not tweaking the button copy or trying a different image; we’ve overhauled the page. You can see screenshots of the pages we’re testing in the folder Pricing Page Experiment.

The results are, as yet, inconclusive, but with so much thought and effort having gone into the new design, we’re rooting for it. Guiltily.


We’re looking at more than just the conversion rate of the two designs; we’re also looking at average order value and the customer lifecycle. Ours is not a busy ecommerce site, though, so gathering the traffic needed to draw conclusive results from the test is slow. The wait is agonising, even saddening.

So, I turned to the Inbound community and some of my favourite LinkedIn groups to ask, “Has this happened to you?”

Instead of wallowing in sad tales of failed tests, I got advice from the brilliant Ed Fry.

“Most A/B tests are likely to fail. Marie Polli shared a great deck on innovative vs. iterative testing which might be particularly helpful for you.” – @edfryed

Motivated by Ed and pointed towards some useful resources, I continued to explore the subject of failed A/B testing. Here’s what I learned:

Most A/B tests are inconclusive

Just because you test doesn’t mean you’ll get a usable result. AppSumo shared their own experiences of running tests and revealed,

“Only 1 out of 8 A/B tests have driven significant change.” – Noah Kagan, @noahkagan

That statistic is a little staggering, but AppSumo’s experience isn’t unique. Justin Megahan of Mixpanel addressed the scale of unusable results in his post, Why most A/B tests give you bullshit results.

“…many new testers walk into A/B testing thinking it’ll be quick and easy to get results. […] Then they start running tests on their apps or sites, and reality suddenly sets in. Tests are inconclusive.” – @justinrmegahan

Megahan identifies meaningless tests as the main cause of inconclusive results. Testing variants that will never produce significantly different results in the first place leads many new optimisers to create poor experiments.

“We waste our time testing variants without any real meaningful differences.” – @justinrmegahan
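To see what that looks like in numbers, here’s a quick sketch I put together (it’s not from Megahan’s post, and the traffic figures are made up): two variants whose true conversion rates are virtually identical will rarely produce a significant result, no matter how tidy the test setup is.

```python
# A minimal, hypothetical sketch: simulate an A/B test between two variants
# whose true conversion rates barely differ and check the result with a
# standard two-proportion z-test. All numbers here are assumptions.
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

rng = np.random.default_rng(42)
visitors = 5_000                      # visitors per variant (assumed)
true_rates = (0.030, 0.031)           # a trivial, "meaningless" difference

conversions = [rng.binomial(visitors, p) for p in true_rates]
stat, p_value = proportions_ztest(conversions, [visitors, visitors])

print(f"Conversions A/B: {conversions}, p-value: {p_value:.2f}")
# With a difference this small the p-value will usually land well above 0.05:
# the test "fails" because there was nothing meaningful to detect.
```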

Alex Turnbull, the founder of Groove, feels that publishing positive A/B testing results can cause a skewed image of the process. It looks easy when you only talk about your success. Positive case studies are also tempting to try to replicate – which rarely works.

To add balance to the case studies available to read, he shared 6 A/B tests that did nothing at all for Groove. All 6 tests tweaked small elements, seeking incremental uplifts from changes to copy, colour and social proof.

Groove’s pricing page copy variations

All 6 had inconclusive results. However, rather than giving up on testing, the message is: do more tests.

“…long-term results don’t come from tactical tests; they come from doing the hard work of research, customer development and making big shifts in messaging, positioning and strategy.

Then A/B testing can help you optimize your already validated approach.” – Alex Turnbull, @alexmturnbull

Design overhauls are risky but necessary

Going back to our own experience for a minute, we knew the pricing page needed an extreme facelift. It was missing so many good things that we felt we were almost starting from scratch.

This is known as radical innovation and it’s much harder to test.

Marie Polli’s presentation at ConversionXL Live 2016 discussed When, Why and How To Do Innovative Testing.


Polli defines innovative testing as including any of the following:

  • Navigation changes
  • Radical redesign
  • New functionalities and features
  • Combination of multiple changes

In general, optimisers prefer iterative tests, where changing just one thing and measuring the result gives a clear indication of the effect of that change. Testing multiple changes at once is much harder. It’s harder to plan for, too, as you’re hoping to create a new experience that you have no data for. Sounds risky, right?

So, when is it appropriate to take the risk and go for innovation?

Polli answers:

  1. When an iteration won’t suffice.
  2. When your testing potential is small.
  3. When you don’t have much traffic.

We felt we fit the brief. As mentioned previously, we don’t have the high-volume traffic really needed for A/B or multivariate testing, and we had already tried iterative changes on the page. To compound it all, the original page only had two elements: a sparse banner at the top and the account options below. Not a great deal to play around with.

And what are the payoffs?

Iterative changes yield, on average, less than a 10% uplift; successful innovative changes yield more than 10%. We had to try.
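To put that in perspective, here’s a rough, back-of-the-envelope sketch of how much traffic a standard two-proportion test needs per variant, depending on the size of the uplift you hope to detect. The 3% baseline conversion rate and the uplift figures below are assumptions for illustration, not numbers from our own test.

```python
# A rough, hypothetical illustration of why small (iterative) uplifts demand
# far more traffic than large (innovative) ones. Uses the standard
# two-proportion sample-size approximation; every input is an assumed value.
from scipy.stats import norm

def visitors_per_variant(baseline, relative_uplift, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant for a two-sided test."""
    p1 = baseline
    p2 = baseline * (1 + relative_uplift)
    z_alpha = norm.ppf(1 - alpha / 2)   # significance threshold
    z_beta = norm.ppf(power)            # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2

for uplift in (0.05, 0.10, 0.30):
    n = visitors_per_variant(baseline=0.03, relative_uplift=uplift)
    print(f"Detecting a {uplift:.0%} uplift on a 3% baseline: ~{n:,.0f} visitors per variant")
```

With assumptions like these, a big 30% swing needs a few thousand visitors per variant, while a 5% tweak needs a couple of hundred thousand, which is exactly why low-traffic sites end up leaning towards innovative changes.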

Testing the results of innovative design changes

A big motivation for us was the fact that we have low traffic, making it hard to test. However, we want proof that our new design is better. What kind of data can tell us that?

Peep Laja, the ultimate fountain of knowledge on all things optimisation (and the founder of ConversionXL), says on the subject of How To Do Conversion Optimisation With Very Little Traffic,

“If you don’t have enough traffic to test, just roll out the changes (all at once) and observe the impact on your KPIs.” – @peeplaja

For the greatest chance of success on that score, it’s all in the planning. Peep recommends doing a teardown of your site, assessing based on the following:

  • Clarity: Is the product or service offering clear?
  • Friction: How can this be reduced to create a smooth customer journey?
  • Distraction: What’s competing for the visitor’s attention?
  • Urgency: Is there anything that makes your offer immediately compelling?
  • Where are visitors coming from, and is their experience cohesive?
  • Do you know what buying phase each visitor is in? Get Csaba Zajdó’s tips on differentiating between hot and cold prospects.
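Coming back to Peep’s suggestion of rolling the change out and watching your KPIs, here’s a minimal sketch of what that before/after check might look like. It’s my own illustration, not Peep’s code, and the visitor and signup figures are invented.

```python
# A hypothetical before/after check for a "roll out and observe" change.
# All figures are made up; this is an observational comparison, not an A/B test.
from statsmodels.stats.proportion import proportions_ztest, proportion_confint

# Four weeks before the redesign vs. four weeks after (assumed numbers)
before = {"visitors": 4_200, "signups": 126}   # 3.0% conversion
after = {"visitors": 4_350, "signups": 165}    # ~3.8% conversion

counts = [before["signups"], after["signups"]]
nobs = [before["visitors"], after["visitors"]]

stat, p_value = proportions_ztest(counts, nobs)
low, high = proportion_confint(after["signups"], after["visitors"])

print(f"Before: {counts[0] / nobs[0]:.1%}  After: {counts[1] / nobs[1]:.1%}  p = {p_value:.2f}")
print(f"95% interval for the new rate: {low:.1%} to {high:.1%}")
# Caveat: seasonality, campaigns and traffic-mix changes can all masquerade as
# an "uplift" in a before/after comparison, so keep watching the KPI over time
# rather than declaring victory on a single snapshot.
```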

There are some optimisations you can feel confident about

Although we find it difficult to run tests on page variations, we know our site could be better and that’s why we keep trying to improve it. Low traffic shouldn’t be a blocker for you either.

Aaron Orendorff proposes that there are 9 safe optimisations you can make when you can’t test. These, for the most part, address the issues you’re likely to find during heuristic analysis:

  1. Speed up your site.
  2. Cut all competing calls to action.
  3. Reduce visual clutter.
  4. Clarify your message everywhere, from keywords to PPC ads to the landing page or web page the visitor hits.
  5. Know who you’re talking to and address them.
  6. Ensure attention by appealing to emotion.
  7. Fulfil your users’ desires.
  8. Use fear as a motivator.
  9. Optimise for trust.

The lesson? Optimisation needs to be applied in different ways for different sites. You need high traffic to test variations, but if you don’t have it, there are alternatives. The real measure of success is improving your business objectives (KPIs), not increasing your conversion rate.