How to Choose Between One-Sided and Two-Sided Tests: A Complete Explanation
By Atticus Li, Head of Conversion Rate Optimization & UX at NRG Energy
Understanding the "Default Test Type" Mistake
Many experimenters fall into the trap of defaulting to one-sided tests because they reach statistical significance faster, requiring roughly 20% less sample size at the conventional α = 0.05 with 80% power. Used as a default rather than a deliberate choice, this approach is statistically inappropriate and can lead to serious business mistakes.
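To see where that roughly-20% figure comes from: for a z-test, the required sample size per arm scales with (z_α + z_β)², so the one-sided/two-sided ratio depends only on the critical values. A quick check in Python, assuming α = 0.05 and 80% power:

```python
from scipy.stats import norm

alpha, power = 0.05, 0.80
z_beta = norm.ppf(power)  # 0.8416 for 80% power

# Required n per arm scales with (z_alpha + z_beta)^2 for a z-test,
# so the one-/two-sided ratio depends only on the critical values.
n_two = (norm.ppf(1 - alpha / 2) + z_beta) ** 2  # z = 1.9600 (two-sided)
n_one = (norm.ppf(1 - alpha) + z_beta) ** 2      # z = 1.6449 (one-sided)

print(f"one-sided needs {1 - n_one / n_two:.0%} less sample")  # ~21%
```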
Let me elaborate on how you should select test types based on your hypothesis structure and risk tolerance:
Hypothesis-Based Selection
Your test type should directly match the nature of your hypothesis:
Directional Hypotheses
A directional hypothesis makes a specific prediction about the direction of effect:
"Adding testimonials will increase conversion rate"
"Simplifying the checkout flow will reduce abandonment"
"The new pricing display will improve average order value"
For truly directional hypotheses, one-sided tests are appropriate (see the code sketch after this list) because:
You're only interested in detecting effects in one specific direction
You've already established a theoretical or evidence-based reason to expect change in that direction
You would only implement the change if it shows improvement in that direction
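In practice, the choice usually shows up as the `alternative` argument of your test function. A minimal sketch with statsmodels, using invented conversion counts for the testimonials hypothesis above:

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical counts: [variant with testimonials, control]
conversions = [620, 550]
visitors = [12_000, 12_000]

# One-sided alternative: variant converts *better* than control
stat, p_one_sided = proportions_ztest(conversions, visitors,
                                      alternative='larger')
print(f"z = {stat:.2f}, one-sided p = {p_one_sided:.4f}")
```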
Non-Directional Hypotheses
A non-directional hypothesis simply predicts a difference without specifying direction:
"Changing the navigation menu will affect user engagement"
"The new product recommendation algorithm will impact conversion rate"
"Redesigning the dashboard will change user retention"
For non-directional hypotheses, two-sided tests are required (again, see the sketch after this list) because:
You're genuinely uncertain about which direction the effect might go
You need to detect significant effects in either direction
The implementation decision might depend on detecting effects in either direction
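The two-sided version is the same kind of call with a different alternative. Here is a sketch on a hypothetical engagement metric for the navigation-menu example, where the effect could plausibly go either way:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)
# Hypothetical per-user engagement (e.g., pages per session)
control = rng.normal(4.00, 1.5, 5_000)
variant = rng.normal(3.94, 1.5, 5_000)  # direction unknown a priori

# Two-sided alternative: means differ in *either* direction
stat, p_two_sided = ttest_ind(variant, control, alternative='two-sided')
print(f"t = {stat:.2f}, two-sided p = {p_two_sided:.4f}")
```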
Risk Tolerance as a Decision Factor
Your organization's risk tolerance should heavily influence your test type selection:
Low Risk Tolerance Scenarios
Use two-sided tests when:
Testing changes to core business functionality where negative impacts would be costly
Evaluating features that could affect brand perception or user trust
Running tests on high-traffic/high-value pages where mistakes have large consequences
Testing with small sample sizes where you need maximum confidence in results
Moderate Risk Tolerance Scenarios
The decision becomes more nuanced:
For incremental changes with strong directional evidence: one-sided may be appropriate
For more significant changes where you'd still want to know about negative effects: two-sided
Higher Risk Tolerance Scenarios
One-sided tests might be appropriate when:
Running exploratory tests where you're only looking for potential wins
Testing minor UI changes that are unlikely to harm the experience
Operating with very limited traffic, where maximizing statistical power matters most
Following up on previously successful tests with refinements
Practical Decision Framework
Here's a practical four-question framework to decide which test to use (encoded as a code sketch after the list):
1. Start with your hypothesis - Is it genuinely directional with strong prior evidence?
If YES → continue to question 2
If NO → use a two-sided test
2. Consider implementation criteria - Would you only implement if there's improvement?
If YES → continue to question 3
If NO → use a two-sided test
3. Evaluate downside risk - How important is it to detect negative impacts?
If VERY IMPORTANT → use a two-sided test
If LESS CRITICAL → continue to question 4
4. Assess communication context - Will you need to defend results to skeptical stakeholders?
If YES → use a two-sided test (more conservative)
If NO → a one-sided test may be appropriate
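To make the flow explicit, here is the same framework as a small Python function. The parameter names are my own shorthand for the four questions, not standard terminology:

```python
def choose_test_type(directional_with_prior_evidence: bool,
                     implement_only_if_improved: bool,
                     must_detect_negative_impact: bool,
                     skeptical_stakeholders: bool) -> str:
    """Encode the four-question framework above (illustrative only)."""
    if not directional_with_prior_evidence:  # question 1
        return "two-sided"
    if not implement_only_if_improved:       # question 2
        return "two-sided"
    if must_detect_negative_impact:          # question 3
        return "two-sided"
    if skeptical_stakeholders:               # question 4
        return "two-sided"
    return "one-sided"

print(choose_test_type(True, True, False, False))  # -> one-sided
```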
Real-World Example
Scenario: You're testing a new product recommendation algorithm.
Hypothesis Analysis:
If your hypothesis is "The new algorithm will increase conversion rate" based on strong prior data → potentially one-sided
If your hypothesis is "The new algorithm will affect user behavior" without strong directional evidence → two-sided
Risk Assessment:
If a potential decrease in conversions would be extremely costly → two-sided
If you're primarily exploring new approaches and would only implement with positive results → potentially one-sided
Implementation Decision:
If you would only implement with a conversion increase → potentially one-sided
If you might implement even with mixed results (e.g., conversions slightly down but AOV up) → two-sided
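To see why the choice has to be locked in beforehand, consider invented numbers for this scenario where the one-sided and two-sided tests reach different conclusions at α = 0.05:

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical counts: [new algorithm, current algorithm]
conversions, visitors = [540, 480], [10_000, 10_000]

_, p_one = proportions_ztest(conversions, visitors, alternative='larger')
_, p_two = proportions_ztest(conversions, visitors, alternative='two-sided')
print(f"one-sided p = {p_one:.3f}, two-sided p = {p_two:.3f}")
# ~0.027 vs ~0.054: "significant" only under the one-sided test,
# which is exactly why the test type must be chosen before the data arrive.
```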
Best Practice for Documentation
Always document your test type decision before running the experiment:
Explicitly state your hypothesis and its direction
Document your reasoning for choosing one-sided or two-sided
Specify the metrics that will determine success
Note your required significance level (typically α = 0.05)
This documentation protects against the temptation to change test types after seeing initial results, which is a form of p-hacking and invalidates your statistical inference.
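One lightweight way to make that pre-commitment concrete is to check the plan into version control before launch. A minimal sketch; the field names are illustrative, not a standard schema:

```python
# Pre-registration record, committed before the experiment starts.
test_plan = {
    "hypothesis": "Adding testimonials will increase conversion rate",
    "direction": "increase",          # or None for non-directional
    "test_type": "one-sided",
    "rationale": "Two prior tests showed lifts; change ships only "
                 "if conversion improves.",
    "primary_metric": "conversion_rate",
    "alpha": 0.05,
}
```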
Remember: The goal isn't to reach statistical significance faster—it's to make the right business decisions based on valid statistical inference.