In the previous post, What Factors Impact The Duration Of Your Multivariate Test?, I mentioned two fundamental factors that affect it: Traffic Volume and Test Combinations. I also illustrated how the number of combinations affects the traffic volume, to give a better idea of its impact on the test. However, there are situations when an MVT fails to produce a definite winner. The reason: the results are not statistically significant. Does it seem like the doors to a happy ending have suddenly closed?
Tests like MVT that are based on quantitative outputs have to achieve statistical significance before they can be used for making any decision. Loosely speaking, a result reported at an XY% confidence level means there is an XY% chance that the observed outcome reflects a real difference, and a (100-XY)% chance that it arose by chance alone.
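To make the idea of a confidence level concrete, here is a minimal sketch of one common way such a figure can be approximated: one minus the one-sided p-value of a pooled two-proportion z-test. This is an assumption for illustration only; testing tools use their own methods, and the function name and the visitor counts below are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def confidence_to_beat_control(control_visitors, control_conversions,
                               variant_visitors, variant_conversions):
    """Return 1 minus the one-sided p-value of a pooled two-proportion z-test.

    Values near 1.0 suggest the variant's higher conversion rate is unlikely
    to be a fluke; 0.5 means the two rates are indistinguishable.
    """
    rate_control = control_conversions / control_visitors
    rate_variant = variant_conversions / variant_visitors

    # Pooled conversion rate under the assumption of "no real difference".
    pooled = ((control_conversions + variant_conversions)
              / (control_visitors + variant_visitors))
    std_error = sqrt(pooled * (1 - pooled)
                     * (1 / control_visitors + 1 / variant_visitors))
    if std_error == 0:
        return 0.5  # no conversions (or all conversions): nothing to compare

    z_score = (rate_variant - rate_control) / std_error
    return NormalDist().cdf(z_score)


# Hypothetical counts, for illustration only.
confidence = confidence_to_beat_control(1000, 50, 1000, 55)
print(f"Confidence: {confidence:.0%}")
print("Significant at 95%" if confidence >= 0.95 else "Not significant yet")
```

With only a small gap between the two rates and 1,000 visitors each, the confidence stays well below 95%, which is exactly the situation this post is about.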
But before drilling further, evaluate, if possible, why the results weren’t statistically significant. There are a few basics and best practices of MVT that you should retrace and check again. Out of those, I would like to highlight the following three:
- Revalidate whether the page elements you have selected for testing have a strong impact on the results
- Why do you need to check this? If a page element selected for testing does not meaningfully influence end customers’ decisions, drop that element. Choose only those page elements that are worth testing rather than testing everything possible.
- Ensure that the alternatives of each page element being tested are notably different from each other
- Why do you need to check this? Tests are executed to find what the end customers prefer most. If the alternatives differ only in minor ways that make no significant impact on the browsing behavior or decisions of the target customers, testing that hypothesis would be a waste of time and effort.
- Check if the elements being tested are independent of each other
- Why do you need to check this? When running an MVT, if the page elements being tested are related to one another, the test results will not be able to isolate the effect of each element.
Statistical significance still reigns
Despite following these best practices, even a sufficiently long-running Multivariate Test can produce results that are not statistically significant. Here is an example for reference. Let’s take a website, www.myappliance.com, that sells custom appliances and takes orders online but requires visitors to pick up their orders from its stores.
After an MVT, the results were found to be as follows:
In this example, although Alternative 3 has a 9.34% Conversion Rate, the result tells us there is only an 81% chance that this version would really deliver such a wonderful conversion rate. 81% is good, but a confidence level of at least 95% would leave only a small 5% chance that it would not occur. Alternative 4, in that case, could be better than Alternative 3, right? Wrong! The 7.44% conversion from Alternative 4 still looks good, but we would still not want to leave 8% to chance. In this scenario, in fact, we don’t have a statistically significant result. What do we do in such a case?
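The reasoning above boils down to a simple decision rule, sketched here under stated assumptions: the 81% figure comes from the example, the 92% figure for Alternative 4 is inferred from the 8% left to chance, and the function name is hypothetical.

```python
REQUIRED_CONFIDENCE = 0.95  # the 95% threshold discussed in this post


def significant_winners(confidence_by_alternative, required=REQUIRED_CONFIDENCE):
    """Return the alternatives whose confidence level meets the threshold."""
    return [name
            for name, confidence in confidence_by_alternative.items()
            if confidence >= required]


# Confidence levels from the example; 0.92 is inferred from "8% to chance".
observed = {"Alternative 3": 0.81, "Alternative 4": 0.92}

winners = significant_winners(observed)
print(winners or "No statistically significant winner yet")
```

A high conversion rate alone never enters the rule: an alternative only counts as a winner once its confidence level clears the threshold.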
Tips to arrive at statistically significant results
Here are 5 tips that could help you arrive at the much-craved statistically significant result from MVTs:
- Refer to the three basic points highlighted under the best practices. If any of those needs a fix, it is probably too late to fix it in the middle of the test. Consider changing Alternatives only if absolutely required.
- If best practices have been followed, you may continue running the test as it is until the Confidence Level reaches 95% or above, and narrow down the options based on the Confidence Interval.
- Alternatively, if best practices have been followed, modify the test marginally and continue running it until the Confidence Level reaches 95% or above, and narrow down the options based on the Confidence Interval.
- Review the test sample size and monitor whether all or some of the Alternatives are producing better conversions. Even though your test does not have a statistically significant confidence level yet, the impact of the test on the website’s overall traffic and conversion can be kept under control by increasing the test sample size.
- Review the visitor segments that are arriving at the Test Alternatives if you are running a targeted test. The visitor segment definition and configuration may have to be revised to get the right visitors in sufficient volume.
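When deciding whether to keep a test running or to review its sample size, a rough estimate of how many visitors each alternative needs can help. The sketch below uses the standard normal-approximation formula for comparing two conversion rates. The 80% power setting, the two-sided test, and the function name are assumptions for illustration, and the two rates are borrowed from the example above purely as sample inputs.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_alternative(baseline_rate, expected_rate,
                             confidence=0.95, power=0.80):
    """Approximate visitors needed in EACH alternative to detect the
    difference between two conversion rates (two-sided normal approximation).
    """
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_power = NormalDist().inv_cdf(power)

    # Sum of the binomial variances of the two conversion rates.
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate

    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Sample inputs only: the two conversion rates quoted in the example.
needed = visitors_per_alternative(0.0744, 0.0934)
print(f"Roughly {needed} visitors per alternative")
```

Because the difference between the rates is squared in the denominator, halving the difference you want to detect roughly quadruples the visitors required, and that number applies to every combination in the test.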
In addition to working through these tips, the marketers have to decide collectively which option best serves the online objectives. Tip #2 could be confusing, but it is advisable to follow it when the choice comes down to “do it” or “kill it”. MVTs with many variations are notoriously time-consuming. It is easier for large websites with ample traffic volume to wait for the test to produce a statistically significant winner. The wait can be short, depending on the number of variations being tested, but there is still no guarantee that it will lead to a significant winner. Hence, it is advisable to have an exit strategy for your test ready.
So, what’s your experience with MVTs that you have run? What measures did you take when your MVT didn’t return statistically significant results?