How many times have you heard a colleague say, “Let’s test it!”?
If you work anywhere in tech, you probably lost count at 50.
But is that a bad thing? Tests are scientific miracle workers, right?
Observe a problem. Make a hypothesis. Run a test. Review results. Launch the winner.
It checks all the right boxes:
- Low-risk implementation
- Data-driven decision making
- Culture of learning
Plus, it sounds smart and safeguards reputation if things go wrong.
But here’s the kicker: those same checkmarks can be anti-growth for startups. They can slow progress towards product/market fit, delightful UX, and sustainable business growth.
Disclaimer: This perspective is aimed at high-growth product teams at $1-30M SaaS startups. Speed, scope, and risk tolerance for other teams, company sizes, and business models may differ.
Here are the top 3 reasons why you shouldn’t run an A/B test:
- The cost of running > cost of failing
- You’re seeking data answers to non-data questions
- It’s the de-facto company culture
1. The cost of running > cost of failing
Tests are like marathons – we significantly underestimate the cost to run them.
The cost of a marathon is $200. Plus, the cost to get to/from the race. Plus the cost of shoes and a fancy water vest. Plus the cost of incremental food from 4 months of extra calorie burning. Plus the opportunity cost of not training for something else. Hmm…
Did I say $200? I meant $2,000 and a 4-month training regimen with a lot less Netflix.
It’s the same with A/B tests. The costs we consider are only the tip of the iceberg.
When considering all costs incremental to creating real user value, each product test may cost $5-20k in salaries and resources to plan, execute, evaluate, and scale. And that doesn’t even include the most costly part: opportunity cost.
Opportunity cost, particularly in the form of distraction, can steer an entire startup off-course: not talking to customers, not focusing on real value creation, and not effectively managing multiple cohort experiences at once.
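To make the iceberg a bit more concrete, here’s a minimal back-of-envelope sketch in Python. Every figure in it (hours, blended hourly rate, tooling overhead) is a hypothetical placeholder, not a benchmark:

```python
# Rough, hypothetical cost model for a single A/B test.
# All numbers are illustrative placeholders: plug in your own.

BLENDED_HOURLY_RATE = 120  # assumed loaded cost per person-hour ($)

# Estimated person-hours per phase (hypothetical)
hours = {
    "plan (hypothesis, design, instrumentation)": 30,
    "execute (build variants, QA, launch)": 60,
    "evaluate (monitoring, analysis, debate)": 25,
    "scale or roll back the winner": 20,
}

tooling_and_overhead = 1_500  # experimentation platform, extra infra, etc.

labor_cost = sum(hours.values()) * BLENDED_HOURLY_RATE
total_visible_cost = labor_cost + tooling_and_overhead

print(f"Person-hours: {sum(hours.values())}")
print(f"Visible cost of one test: ~${total_visible_cost:,.0f}")
# Roughly $17,700 here, and opportunity cost isn't even on the list.
```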
Now, you definitely shouldn’t stop running tests. Nor should you shy away from marathons.
But next time a colleague proposes a new test, simply ask yourself: Is an A/B test the right path forward?
It’s a non-obvious choice, especially because the other side of the equation is just as elusive.
We significantly overestimate the cost of failing, because it’s akin to personal failure.
And fear makes us think irrationally.
“Failure could cost me my reputation.”
“Failure can’t be reverted.”
“Failure is a bad thing.”
All of these are incorrect (except in extreme cases).
Instead, the cost of failing should be considered through these 3 questions:
- Financial cost – Can we tolerate the worst-case $ scenario until it can be addressed next?
- User cost – Can we tolerate the worst-case UX scenario until it can be addressed next?
- Irreversible cost – Can we efficiently recognize and revert failure without an A/B test?
To better evaluate these costs, invest in forecast modeling and conversations with customers. These two areas can tighten your confidence interval on estimations and decrease your reliance on tests to make decisions.
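As a hedged illustration of that kind of forecast modeling, here’s a minimal sketch of the “financial cost” question: given assumed traffic and an assumed worst-case conversion drop, how many dollars are actually at risk before the next chance to fix it? All inputs are hypothetical placeholders:

```python
# Minimal worst-case model: what does shipping without a test risk
# before we could address it? All inputs are hypothetical placeholders.

weekly_signups = 400          # assumed current weekly signups
baseline_conversion = 0.06    # assumed signup -> paid conversion
worst_case_drop = 0.20        # assumed worst case: 20% relative drop
revenue_per_customer = 900    # assumed first-year revenue per customer ($)
weeks_until_next_fix = 2      # assumed time until we could revert or iterate

lost_customers = (
    weekly_signups * baseline_conversion * worst_case_drop * weeks_until_next_fix
)
worst_case_cost = lost_customers * revenue_per_customer

print(f"Worst-case customers lost: {lost_customers:.1f}")
print(f"Worst-case financial cost: ~${worst_case_cost:,.0f}")
# If this number is comfortably below the cost of running the test,
# shipping and watching closely may be the cheaper way to learn.
```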
2. You’re seeking data answers to non-data questions
Tests give number answers.
Numbers are good for scale and comparisons. Numbers are not good for sentiment and strategy.
If you’re trying to answer questions like:
- Is this redesigned homepage the best first impression for visitors?
- Should this feature be included in the middle or upper tier package?
…these are typically not good test candidates. They require deep customer understanding and strategic positioning that can’t be evaluated properly through numbers.
Take the first question: a redesigned homepage vs. the existing homepage.
There’s a lot to learn in a change like this: emotional response, persona identification, value prop understanding, cognitive load, SEO impact, etc. If all you looked at was two weeks of bounce rate, dwell time, scrolls, click-throughs, and conversion, you’d only know part of the story.
For example, a rise in bounce rate and drop in conversion may be a good thing if driven by low quality leads – the whole point of redesigning the homepage in the first place.
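Here’s a hedged sketch of how that can play out: split the same topline numbers by lead quality (every figure below is made up) and the “worse” topline can hide a better outcome:

```python
# Hypothetical illustration: topline conversion drops, but qualified
# leads actually convert better. All numbers are made up.

old_page = {"low_quality": {"visits": 8_000, "conversions": 400},
            "qualified":   {"visits": 2_000, "conversions": 160}}
new_page = {"low_quality": {"visits": 4_000, "conversions": 80},   # bounced away
            "qualified":   {"visits": 2_000, "conversions": 200}}  # converted better

def topline_rate(segments):
    visits = sum(s["visits"] for s in segments.values())
    conversions = sum(s["conversions"] for s in segments.values())
    return conversions / visits

print(f"Old topline conversion: {topline_rate(old_page):.1%}")  # 5.6%
print(f"New topline conversion: {topline_rate(new_page):.1%}")  # 4.7%
print(f"Qualified-lead conversion, old: "
      f"{old_page['qualified']['conversions'] / old_page['qualified']['visits']:.1%}")  # 8.0%
print(f"Qualified-lead conversion, new: "
      f"{new_page['qualified']['conversions'] / new_page['qualified']['visits']:.1%}")  # 10.0%
# The topline got "worse" while the leads you actually want got better.
```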
By contrast, questions like:
- How elastic is demand at different price points?
- Where should this feature live to get the most clicks?
…these are typically good test candidates. Scale and comparisons are the exact answers you’re looking for.
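When a question really is about scale and comparison, the analysis can be that mechanical. Here’s a minimal sketch of a standard two-proportion z-test for the placement question, assuming a simple fixed-horizon test (no peeking or sequential corrections) and made-up counts:

```python
from math import sqrt, erf

# Hypothetical results: clicks on a feature in two placements.
visitors_a, clicks_a = 5_000, 450   # placement A (assumed counts)
visitors_b, clicks_b = 5_000, 520   # placement B (assumed counts)

p_a, p_b = clicks_a / visitors_a, clicks_b / visitors_b
p_pool = (clicks_a + clicks_b) / (visitors_a + visitors_b)

# Standard two-proportion z-test (fixed horizon, two-sided)
se = sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))
z = (p_b - p_a) / se
p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

print(f"CTR A: {p_a:.1%}, CTR B: {p_b:.1%}, z = {z:.2f}, p = {p_value:.3f}")
```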
3. It’s the de-facto company culture
The rise in testing has generally been a great way for product teams to become more data-informed. But when testing tips from an active choice into a de-facto process, it can do more long-term harm than good.
Three questions you can ask to evaluate your position:
Question 1: Is my motivation for testing rooted in fear or efficient learning?
Before answering this, have you ever thought to yourself: “Normally I wouldn’t test something like this, but if it goes wrong I don’t want it to blow up on me or the team.”
If you have, you’re not alone. And you should address the cultural reality of this first. Move at the speed of trust, not at the speed of tests.
Question 2: Am I testing for innovation or optimization?
9 times out of 10, tests are run for optimization – improving a metric or experience through incremental change.
This isn’t a bad thing, but consider it in the context of your entire product team’s priorities.
For SaaS companies <$10M, a 90/10 split between innovation and optimization is often necessary to give you a fighting chance for survival. For SaaS companies $10-30M, a slightly less aggressive 75/25 split is a common baseline to keep your innovation edge strong.
While these splits vary across industries and ambitions, they can be useful markers for spotting whether your team is pushing the company’s optimization efforts past the desired balance.
Question 3: Is my team prepared to run tests effectively?
The biggest hidden cost of testing is misinterpretation.
Misinterpreting test results, or misinterpreting the requirements needed to run an effective test.
A subscription to Optimizely or Amplitude does not qualify a team to run tests. That’s like giving a hammer to a friend and declaring them ready to build furniture. There are a lot more tools and skills needed to be successful.
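One concrete example of such a requirement, sketched with hypothetical numbers: the minimum sample size a test needs before it can even detect the lift you care about (a standard two-proportion power calculation, assuming a fixed-horizon test):

```python
from statistics import NormalDist

# Standard sample-size estimate for detecting a lift in a conversion rate.
# All inputs are hypothetical placeholders.
baseline = 0.05           # assumed current conversion rate
lift = 0.10               # smallest relative lift worth detecting (10%)
alpha, power = 0.05, 0.8  # conventional significance level and power

p1, p2 = baseline, baseline * (1 + lift)
z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
z_beta = NormalDist().inv_cdf(power)

n_per_variant = ((z_alpha + z_beta) ** 2 *
                 (p1 * (1 - p1) + p2 * (1 - p2))) / (p1 - p2) ** 2

print(f"Visitors needed per variant: ~{n_per_variant:,.0f}")
# At a 5% baseline and a 10% relative lift, that's about 31,000 visitors
# per variant, which is more traffic than many early-stage funnels see in weeks.
```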
Testing is a muscle. With repeated, intentional training, a team’s testing muscle gets stronger. It’s okay to start early and feel weak in some areas (that’s how all muscle building starts), but it’s also important to be self-aware and team-aware along the way.
To help build awareness, talk openly about gaps, see how other companies’ growth teams approach similar problems, and don’t default to something just because it’s normal.
Main Takeaway
A/B tests are powerful, but they can be anti-growth if you don’t consider the cost that goes into them and the type of learning you want out of them.
They aren’t free, they don’t solve for customer empathy, and they should not be the de-facto path for learning.