Discussion about this post

Swag Valance:

Who doesn't love a good paper prototyping story? :D

If I may offer some extra color for your Core Theorem... In experimentation parlance (call it "A/B test" wisdom for simplicity): the acceptable cost of learning should be proportional to the risks of failure.

The traditional A/B test, with its reliance on statistical power and significance thresholds, is an extremely robust measure of assurance. This is why we use it for medical interventions such as vaccines. But that level of proof can be too onerous depending on the situation. It makes sense for service features you expect to last many months, but what if it's merely weekly editorial content that vanishes in 7 days?
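To make that "investment in proof" concrete, here's a minimal sketch of the standard sample-size formula for a two-proportion z-test, using only the Python standard library. The baseline rate, lift, and defaults below are illustrative numbers, not anything from the post.

```python
import math
from statistics import NormalDist

def sample_size_per_arm(p_base, p_variant, alpha=0.05, power=0.80):
    """Approximate sample size per arm for a two-sided two-proportion
    z-test, using the normal-approximation formula:
        n = (z_{1-a/2} + z_{power})^2 * (p1*q1 + p2*q2) / (p2 - p1)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. ~1.96 for alpha=0.05
    z_beta = NormalDist().inv_cdf(power)           # e.g. ~0.84 for 80% power
    variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    n = ((z_alpha + z_beta) ** 2 * variance) / (p_variant - p_base) ** 2
    return math.ceil(n)

# Illustrative: detecting a lift from 10% to 12% conversion
# needs a few thousand users per arm -- a real investment for
# a feature that only lives for a week.
n = sample_size_per_arm(0.10, 0.12)
```

Note how the required sample size blows up as the detectable effect shrinks; that traffic-times-time cost is exactly what can exceed a short-lived change's lifetime.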

Contrast A/B tests with Multi-Armed Bandits, or MABs, where a machine learning algorithm optimizes for whatever is the "winner" in the moment. In MAB tests, statistical significance is thrown out the window; instead, the algorithm continually shifts traffic toward whichever variant currently looks best, settling for a locally optimal choice rather than a proven one.

This is a reasonable approach when a) the time needed to reach statistical significance ("statsig") exceeds the anticipated lifetime of the change, and b) the risk of being wrong is acceptable and capped due to the shorter planned lifetime of the change.
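For readers who haven't seen a bandit in action, here's a minimal sketch of Thompson sampling, one common MAB strategy, again using only the standard library. The two variants, their conversion rates, and the round count are made-up illustrative values.

```python
import random

def thompson_sampling(true_rates, rounds, seed=42):
    """Bernoulli Thompson sampling: each round, draw a win-rate
    estimate from each arm's Beta posterior, play the arm with the
    highest draw, and update that arm's posterior with the reward."""
    random.seed(seed)
    n_arms = len(true_rates)
    successes = [1] * n_arms  # Beta(1, 1) uniform prior
    failures = [1] * n_arms
    pulls = [0] * n_arms
    for _ in range(rounds):
        draws = [random.betavariate(successes[i], failures[i])
                 for i in range(n_arms)]
        arm = draws.index(max(draws))
        reward = 1 if random.random() < true_rates[arm] else 0
        pulls[arm] += 1
        successes[arm] += reward
        failures[arm] += 1 - reward
    return pulls

# Two hypothetical variants converting at 5% and 10%: the bandit
# steadily funnels most traffic to the stronger arm without ever
# running a significance test.
pulls = thompson_sampling([0.05, 0.10], rounds=5000)
```

The appeal for short-lived changes is visible in the loop: exploration cost is bounded because weak arms get starved of traffic quickly, which is exactly the "capped risk" trade-off described above.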
