Apply testing logic to hypothesis creation

1. Use analytics data, expertise, intuition, or third-party advice to form a hypothesis and write it down.

For example, a hypothesis could be, “The higher conversion rate we see on mobile versus desktop is due to the prominent floating CTA on our mobile product listing pages.” A possible test idea stemming from this claim would be to add a floating CTA to product detail pages and the homepage and determine whether it leads to better mobile conversion rates there as well.

2. Note any possible causal paths that could result in the hypothesized effect.

For example, if the hypothesis is that a floating CTA on mobile product listing pages improves conversion rate, there are three obvious causal pathways for this to happen:

- Users’ interactions with the CTA lead directly to more conversions.
- Mere exposure to the CTA and its surrounding text makes users more likely to purchase, even without interacting with it.
- The CTA pushes content further down the page, bringing into view information that users previously skipped because it sat too high on the screen (uncommon, but possible).

Focus on finding the causal paths, not on which contributes the most or whether they operate together or individually.
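
One lightweight way to keep the enumerated paths usable in the later steps is to write them down in a machine-readable form. Below is a minimal Python sketch; the class, path names, and predicted metric directions are illustrative assumptions, not part of any analytics tool’s API.

```python
from dataclasses import dataclass, field

@dataclass
class CausalPath:
    """One hypothesized mechanism and the metric movements it implies."""
    name: str
    description: str
    # Metric name -> expected direction ("up", "down", "flat") if this path is real.
    predicted_effects: dict = field(default_factory=dict)

# The three pathways enumerated above, with illustrative predictions.
paths = [
    CausalPath(
        name="interaction",
        description="Clicks on the floating CTA lead directly to more purchases.",
        predicted_effects={"cta_clicks": "up", "cta_click_to_purchase_rate": "up"},
    ),
    CausalPath(
        name="mere_exposure",
        description="Seeing the CTA and its text nudges purchases without clicks.",
        predicted_effects={"purchase_rate": "up", "cta_clicks": "flat"},
    ),
    CausalPath(
        name="layout_shift",
        description="The CTA pushes previously skipped content into view.",
        predicted_effects={"purchase_rate": "up"},
    ),
]

for path in paths:
    print(path.name, "->", path.predicted_effects)
```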

3. Use analytics tools like Google Analytics or Adobe Analytics, together with the causal paths, to predict the effect the hypothesis would have on relevant metrics if it were true.

For example, following the earlier hypothesis example, predict the expected effect on CTA clicks, CTA click-to-purchase rate, add-to-cart rate, and purchase completion rate based on the causal path in which people’s interactions with the CTA drive the conversions. An example prediction would be that, for the observed superiority of mobile over desktop to be due to the CTA, a minimum number of CTA clicks and a correspondingly high click-to-purchase rate would need to be present. Predicting just the direction is fine, but being more specific about the expected effect size makes for an even stronger hypothesis.
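
To make the effect-size prediction concrete, here is a back-of-the-envelope sketch in Python. Every number in it is an illustrative assumption (session counts, conversion rates, the assumed click-to-purchase rate), not real data; the point is the arithmetic linking the mobile/desktop gap to the minimum CTA click volume the interaction path implies.

```python
# Illustrative inputs: replace with figures from your analytics tool.
mobile_sessions = 100_000
mobile_conversion_rate = 0.035   # assumed
desktop_conversion_rate = 0.025  # assumed

# Incremental conversions the CTA would have to account for if it
# explains the entire mobile-over-desktop gap.
gap_conversions = mobile_sessions * (mobile_conversion_rate - desktop_conversion_rate)

# Assuming a plausible click-to-purchase rate for CTA clicks, derive the
# minimum click volume the interaction path implies.
assumed_click_to_purchase_rate = 0.10
min_cta_clicks = gap_conversions / assumed_click_to_purchase_rate

print(f"The CTA must account for ~{gap_conversions:.0f} extra conversions.")
print(f"At a {assumed_click_to_purchase_rate:.0%} click-to-purchase rate, "
      f"that requires at least {min_cta_clicks:.0f} CTA clicks.")
```

If the analytics data shows far fewer clicks than this lower bound, the interaction path alone cannot explain the gap.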

4. Use your analytics tools to look for evidence contradicting the effects you predicted.

For example, you might observe that the CTA is not driving a meaningful number of clicks, or that the click-to-purchase rate is very low despite a high number of clicks. Such evidence should count against the main claim to the extent that no alternative mechanism exists. For instance, in this case it could be that while the number of CTA clicks is low, the CTA is still highly effective because it draws users’ attention to the text accompanying it.
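
Here is a minimal sketch of such a contradiction check, with placeholder observed numbers standing in for whatever your analytics tool reports:

```python
# Step-3 prediction (illustrative): conversions the interaction path must explain.
required_conversions = 1_000

# Placeholder observations from the analytics report.
observed_cta_clicks = 1_200
observed_click_purchases = 60   # purchases attributed to CTA clicks

shortfall = required_conversions - observed_click_purchases
if shortfall > 0:
    rate = observed_click_purchases / observed_cta_clicks
    print(f"CTA clicks convert at {rate:.1%} and explain only "
          f"{observed_click_purchases} of the required {required_conversions} "
          f"conversions ({shortfall} short). This counts against the "
          f"interaction path unless another mechanism, such as mere "
          f"exposure, plausibly covers the rest.")
else:
    print("Observed CTA activity is consistent with the prediction.")
```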

5. Use your analytics tools to predict how key user subsegments would be affected by all of the causal paths.

Use more than one metric where applicable. For example, if the hypothesized causal pathway works through a reduced bounce rate, predict the effect on subsegment bounce rates as well as conversion rates. Continuing with the CTA example, a predicted effect could be that the mobile advantage in conversion rate would be approximately the same across all product listing categories.
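
Writing the per-segment predictions down in a machine-checkable form makes the contradiction search in the next step mechanical. A minimal sketch, with hypothetical segment names:

```python
# Predicted per-segment effects if the CTA hypothesis is true.
# Segment names and metrics are illustrative.
segment_predictions = {
    "segments": ["shoes", "apparel", "accessories", "outdoor"],
    # Expected in every segment if the hypothesis holds:
    "expected_direction": {
        "conversion_rate": "mobile > desktop",
        "bounce_rate": "mobile < desktop",
    },
    # The mobile lift should not be concentrated in one or two segments.
    "expected_uniformity": True,
}

print(segment_predictions["expected_direction"])
```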

6. Use your analytics tools to look for evidence contradicting the effects predicted in the previous step.

Use statistical estimation (p-values, confidence intervals) with the relevant multiple testing corrections (Bonferroni or Šidák) to separate meaningful differences from random noise. Any evidence that just a few of the subsegments drive most or all of the results is a reason for concern unless there is a plausible alternative explanation. The evidence against the initial claim is even stronger if the direction of the effect is reversed for a number of the subsegments. Continuing with the CTA example, analyze the observed superiority of mobile conversion rates to find out whether it holds across all product listing categories or is more pronounced in just one or two of them. Do the same for any other relevant metric.
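
Here is a sketch of such a check using statsmodels, which provides both two-proportion z-tests and multiple-testing corrections. The per-category counts are made up for illustration:

```python
import numpy as np
from statsmodels.stats.proportion import proportions_ztest
from statsmodels.stats.multitest import multipletests

# Illustrative per-category conversion counts and session totals.
categories   = ["shoes", "apparel", "accessories", "outdoor"]
mobile_conv  = np.array([420, 510, 130, 95])
mobile_n     = np.array([12_000, 15_000, 4_000, 3_000])
desktop_conv = np.array([380, 505, 90, 88])
desktop_n    = np.array([13_000, 16_000, 3_800, 3_100])

# Two-proportion z-test per category: is mobile's conversion rate higher?
p_values = [
    proportions_ztest([mc, dc], [mn, dn], alternative="larger")[1]
    for mc, mn, dc, dn in zip(mobile_conv, mobile_n, desktop_conv, desktop_n)
]

# Correct for testing several categories at once ("sidak" also works here).
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="bonferroni")

for cat, p, r in zip(categories, p_adjusted, reject):
    print(f"{cat:12s} adjusted p = {p:.3f}  mobile significantly higher: {r}")
```

If only one or two categories survive the correction, the claim that the CTA drives a sitewide mobile advantage is in trouble.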

7. Use analytics tools to predict what the absence of all hypothesized causal paths would mean for user segments not affected by them.

Continuing with the CTA example: if the observed superiority of mobile conversion rates is due to an element (the floating CTA) that is present only on product listing pages, and there is no equivalent on product detail pages or the homepage, then the mobile advantage over desktop should be smaller or absent on those pages, in proportion to the hypothesized effect size.
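
A quick way to run this cross-check is to compute the mobile-over-desktop lift per page type and compare the pages with and without the CTA. The conversion rates below are illustrative placeholders:

```python
# Page type -> (mobile conversion rate, desktop conversion rate); illustrative.
pages = {
    "product_listing": (0.035, 0.025),  # floating CTA present on mobile
    "product_detail":  (0.031, 0.029),  # no floating CTA
    "homepage":        (0.012, 0.011),  # no floating CTA
}

for page, (mobile_rate, desktop_rate) in pages.items():
    lift = (mobile_rate - desktop_rate) / desktop_rate
    print(f"{page:16s} mobile lift over desktop: {lift:+.1%}")

# Under the hypothesis, the lift on product_listing should clearly exceed
# the lift on the CTA-free pages; comparable lifts everywhere contradict it.
```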

8. Combine the results from the above steps to look for evidence contradicting the prediction in the previous step.

Continuing the CTA example, if it turns out that the homepage and product detail pages also show higher mobile conversion rates than their desktop equivalents, then either the effect is not due to the hypothesized mechanism, or the mechanism’s contribution is smaller than claimed and should be revisited.

9. Draw a conclusion: either the hypothesis has passed all tests and is therefore more likely to be true, or it has failed one or more crucial tests and has to be abandoned.

If you found no instances in which the predictions based on the hypothesized causal paths contradicted the data, then the hypothesis has survived the screening and controlled experiments are more likely to be successful. If the evidence contradicted the hypothesized effect(s) in one or more cases and no plausible alternative explanation exists, then abandon the hypothesis and find an alternative explanation for the observations on which you based the initial claim.