Hold-Out Groups: Gold Standard for Testing—or False Idol?
You regularly run A/B tests on the design of a pop-up. You have a process, implement it correctly, find statistically significant winners, and roll out winning versions sitewide.
Your tests answer every question except one: Is the winning version still better than never having shown a pop-up?