Lies Your Optimization Guru Told You

Before you get out your pitchforks, I want to stress that this article does not represent Peep’s views.

The easiest lies to believe are the ones we want to be true, and nothing speaks to us more than validation of the work we are doing or of what we already believe. Because of this, we become naturally defensive when someone challenges that worldview.

The “truth” is that there is no single state of truth, and all actions, disciplines, and behaviors can and should be evaluated for growth opportunities. It doesn’t matter whether we are designers, optimizers, product managers, marketers, executives, or engineers; we all come from our own disciplines and will naturally defend them to the death if we feel threatened, even in the face of overwhelming evidence.

Is how you are doing things the optimal way? What if it’s not?

It is important that we challenge many commonly held beliefs and ask whether what we have been doing is of actual value or just easy to do. Are the results we are getting really good? Or are we just not getting our faults pointed out to us?

These questions hold no matter what discipline we are discussing, be it design or optimization. We can all get better, and we all need to disconnect from the stories we have told ourselves and each other if we really want to improve.

With that in mind I want to tackle some of the most common “half-truths” and flat-out lies that permeate the optimization discipline. Because I have spent my career fighting many of these myths and working to turn around programs that have fallen prey to the most common forms of bad advice, this list could be nearly infinite. Instead, I want to point out the absurdity of the most common lies that get used to justify past results and future actions.

The Many Lies of Optimization Gurus

You may now gather your pitchforks and torches…

It is OK that Only X Percent of Your Tests Have a Clear Winner

No, No, No, No, No.

Yes, you can get value from the things that didn’t work; in fact, you often get far more value from the patterns of things that don’t work than from those that do. That statement, however, is about individual options, not the larger tests themselves.

You should never accept a non-winning test as a good thing, and you should be doing everything to make sure you are shooting for 100% of your tests producing a clear, actionable, valid winner. If you are doing the things that really matter with testing, it is actually hard to have failed tests, as the entire system is designed to maximize outcomes and not just ideas.

I can tell you that in over 5 years I have had exactly 6 tests that did not provide a clear winner, and I am still pissed to this day about those 6. I do things differently in that I focus on avoiding opinions and maximizing the number of things I can compare, which means that I probably have more failed experiences than anyone else, but in the end the tests provide a positive outcome; what won or didn’t win is of the least importance, only the fact that something did. I have broken those 6 down and tried different ways to tackle those problems. I think about them far more than I do the thousands of successful tests, and that is a good thing.

Accepting failing tests is part of what allows people to pretend they are running a successful testing program when they are in fact wasting everyone’s time and resources.

Every test you run that does not have a clear and meaningful winner for your organization’s bottom line screams that you have allowed biases to filter what you test and how you test. It is a sign that you are just spinning your wheels and making no effort to tackle the real problems that plague a program.

There is never a time when a failed test is acceptable, and there is never a time when you should be OK with a 25%, a 50%, or a 12.5% (the industry average) success rate on your tests.

Every test, positive or negative, is a chance for optimization.  Not just of the things on your site, but for your own practices and for your organization.

You Can Figure Out Why Something Won

Nothing vexes marketers, designers, executives, and human nature in general more quickly than the question of why something won. In many ways people will never believe that something won until they know why it was better.

People will turn to user feedback, surveys, user observations, user labs, or just their own intuition to divine the great reason why something happened. This constant need to figure out why sends people on great snipe hunts which result in wasted resources and almost always false conclusions. They do all of this despite overwhelming evidence that people have no clue why they do the things they do. In fact, people act first and rationalize afterwards.

Even worse is when people pretend that this augury provides additional value for the future. In fact, all it is doing is creating false constraints going forward.

Your evidence is probably not evidence

You are creating rules without any actual data to back them up, otherwise known as the exact same mistake that designers often make, and for the exact same reason: your own ego. All those things that you are taking as evidence have so much error associated with them that they represent the opposite of actual knowledge, and yet we hold onto them like they are our life rafts.

The issue is not whether you are right or wrong; you may very well be right. The issue is that the evidence used to arrive at that conclusion is faulty, and there is no evidence to support any conjecture you make. You are creating a situation where you believe something to be true with no evidence, and you are asking someone else to prove its nonexistence.

That mental model you are creating based on what you think you know might be correct, but by pretending you have far more insight than you really do, you are allowing yourself and others to eliminate options that you have no actual data to get rid of. You have the exact same amount of information for those conclusions as you do for pretending that all your users will give your organization 5 million dollars as a donation 3 years in the future. The only real difference is that one feels good and one does not.

One of the most important rules when working with organizations, and definitely the one that takes the most enforcing, is “no storytelling”. It is not for me to say whether you are right or wrong, but the very act of storytelling allows the perception of understanding when there is no evidential reason for it.

In the end it just doesn’t matter why, and all effort spent on that fruitless endeavor distracts from the discipline of acting on data and going where the data tells you to. You don’t need to understand why to act; you just need to act.

“I Don’t Agree Because I Often See…”

Let’s ignore the problem with black swans and first-person perspectives. Let’s ignore the issues of self-serving observer-expectancy bias. Let’s ignore all the problems with false validation of hypotheses and stories (more on this to come). Let’s take your stories at face value, despite the overwhelming evidence that astrology has as much basis for providing value and context to whatever you are defending as the stories you believe are true.

How many times did you purposely test to break whatever rule or heuristic you are trying to defend? How often did you purposely test the opposite of the feedback or your belief to see what the best answer was, and not just to validate your opinion? It is easy to validate what you already believe when you refuse to look for evidence that might contradict it.

Challenge your ideas and cherished notions

So many of the things that optimizers hold dear persist because they never took the effort to validate the things they most wanted to be true. Even when they attempt to challenge an idea, the data is only meaningful when they establish a large number of examples and know the limitations and error rate; otherwise they are providing false knowledge to themselves and others.

That user feedback you are following seems to have provided you with a great idea for a test? Did you test it against the exact opposite and other possibilities across a large range of outcomes? What is the pattern of outcomes: was it just a one-time thing, or can you establish that the change works in a large number of cases and is consistently the best, not just better than other tactics? Have you taken the time to disprove alternative hypotheses?

What are the assumptions that went into that action as opposed to this one, and how often did you see that outcome? It is easy to validate an opinion as long as we never have intellectual curiosity; too bad this can only limit your results and negatively impact your organization.

Pay attention to the patterns

It is possible to derive patterns, but only from very large data sets and only in the broadest terms. Patterns I usually see: real estate changes usually have a much higher beta and a longer half-life than copy changes; spatial changes tend to be better than contextual changes for long-term monetary value. Even in those cases, you had better design your efforts to see if that holds true in this case and going forward.

In my current job I have seen copy have a much larger impact than at most other places, much to the delight of our copywriter. We only know this by constantly building tests which include multiple executions of these concepts and by measuring the long-term impact and the cost of implementation, not from a single test or by validating his or my own beliefs going in. It is only by looking at the impact of multiple forms of copy changes in comparison to all other types of changes (real estate, function, or presentation), and by comparing that with other sites and other changes, that we can say this.

I Have Run X Tests / Tested for X Years, Therefore I am an Optimization Expert

I have been driving for 18 years and have driven well over 250,000 miles and you know what all that experience tells me about driving F-1 cars?  Absolutely nothing.

That 10,000-hour rule sure sounds good in theory, but in reality it is just a cover for common practice and complacency. You can get really good at executing tests, but if you are not spending each moment actively trying to improve your program, then there is no incremental gain from each test. If you are not looking at how you can change your own behavior and your own beliefs in order to improve results, then no amount of time spent testing will do any good. The number of tests you run says nothing about how much time you spent actually trying to get better, or about the much more complicated and uncomfortable work of changing what people want to do and getting them to accept that when they are wrong, and they always are, they make so much more money.

So little of what makes a program successful has to do with how you run a test or even what specific test ideas you have.  It is all about changing how people think about their actions, about challenging assumptions, and about building the environment where rational decisions are made.

Keep growing, keep challenging yourself

Trying to impress people with years or numbers of tests is as much of a non sequitur as claiming that the number of Call of Duty games someone has played is a measure of their ability to handle a war.

In almost all cases, if that is what you are hiding behind, or even worse some false award, then it is a glaring flag that you have never done anything to improve your actual optimization efforts.

Real optimization is the act of optimizing your own actions, and it is about the efficiency of improving all actions. There is never a time when you have “made it” or are an “expert”; instead it is about constantly improving and challenging yourself and others to get more than they would have otherwise.

My Test was Successful Because We Got X Result

To be fair, sometimes this statement is of actual value, but only when it is given within the context of the site and the larger test. A result by itself sounds good but is of absolutely no value in telling you whether a test was successful, or the real impact of the program and the person executing that specific action.

While I am loath to tell stories, here is a perfect example from a recent test we ran. We were optimizing the second most trafficked part of our site, and we ran a large number of versions of the page to see what mattered and to inform future actions. As part of this, the perceived favorite did grant an 8.5% increase to RPV (revenue per visitor), which is not bad. Most groups would take that and run…

Another experience, one that was chosen because it was similar to another site (not a competitor), had a 43% increase to RPV. Looking at the 8.5% increase next to the 43% increase makes that 8.5% look sad and pathetic compared to what we should be getting for the same amount of effort. It is about building the context of your own situation and about measuring efficiency in that context.
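To make the comparison concrete, here is a minimal sketch of the arithmetic, written in Python with a hypothetical baseline RPV (I cannot share the real figures); only the lift percentages match the story above.

```python
# Minimal sketch of the lift arithmetic; the $10.00 baseline is hypothetical.
control_rpv = 10.00                  # baseline revenue per visitor, $
favorite_rpv = control_rpv * 1.085   # the perceived favorite: +8.5% lift
borrowed_rpv = control_rpv * 1.43    # experience borrowed from another site: +43% lift

# Shipping the favorite would quietly forgo most of the available gain.
forgone_points = (borrowed_rpv - favorite_rpv) / control_rpv * 100
print(f"Going with the favorite leaves {forgone_points:.1f} points of RPV lift on the table")
```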

Never stop looking for a better alternative

Now let’s compare that to another experience in our test, one that was built off the pattern of previous results and designed to challenge many commonly held beliefs about user actions. That experience produced well over a 700% increase to user RPV.

Obviously you should be incredulous whenever you see a lift above 50% on an established site, and very large improvements have their own world of unique issues that most groups will not have to worry about very often, but in this case we retested multiple times to confirm the scale of impact. That outcome makes the other two outcomes absolutely awful for the business as a whole, as we would have been bleeding revenue by just going with them. In fact, future testing of that option showed an even greater result and has helped shape many other business opportunities going forward.

Each answer is an outcome, but its actual value can only be judged in context, not against some arbitrary number. What would have happened if I had gotten a 7000% increase? What did I fail to do that might have generated that outcome?

The specific outcome of an experience is irrelevant. You only know the real impact of your efforts when you look at a large variety of possible outcomes and maximize the range of things you compare. In some cases getting an 8% lift is amazing, and sometimes it would be a devastating loss to the organization. You will only ever know what that lift means when you establish the context.

An outcome is great.  An outcome in light of the range of feasible outcomes and in light of resources is even better.

You Need a Hypothesis

I could leave this at everything associated with how most people understand the concept of a hypothesis, or even the scientific method. No single item is trumpeted more than the hypothesis, and no single item causes more destruction in terms of getting results than the use of these vile things.

Now, “hypothesis” has many definitions, but the most common way people think of it is in terms of, “If I do X, then I expect to see Y result”.

In reality, the concept of a hypothesis is neither positive nor negative, but the way people hold onto hypotheses as a way to validate their actions is destructive.

Once you get past 6th-grade science, you will understand that a hypothesis is just a small part of a much larger and richer scientific discipline. It is designed for use in certain cases and to allow for repeatable conclusions which, and this is the very important part, validate that outcome against ALL OTHER POSSIBLE ALTERNATIVE HYPOTHESES. Hypothesis testing has also come under consistent and growing attack as time has gone on.

Not only is it not designed for efficiency or for actual business problems, but it is meant to be used in conjunction with many other techniques and controls, all of which are commonly ignored by the false cognoscenti of the optimization world.
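To make one of those ignored controls concrete: when you compare many variants against a single control, the chance of a false “winner” grows with every comparison unless you correct for it. Below is a minimal sketch, with hypothetical conversion counts, of a two-proportion z-test combined with a Bonferroni correction, one of the simplest such controls.

```python
# A minimal sketch (hypothetical numbers): comparing several variants to a
# control requires a multiple-comparison control such as Bonferroni.
import math

def two_proportion_pvalue(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a two-proportion z-test."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Convert |z| to a two-sided p-value via the standard normal CDF.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

control = (500, 10_000)  # conversions, visitors (hypothetical)
variants = {"B": (540, 10_000), "C": (555, 10_000), "D": (510, 10_000)}

alpha = 0.05
corrected_alpha = alpha / len(variants)  # Bonferroni: control family-wise error
for name, (conv, n) in variants.items():
    p = two_proportion_pvalue(*control, conv, n)
    verdict = "significant" if p < corrected_alpha else "not significant"
    print(f"{name}: p={p:.4f} -> {verdict} at corrected alpha={corrected_alpha:.4f}")
```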

Don’t confuse discovery with validation

Even worse is when the direct pursuit of the hypothesis limits the scope of testing, or allows people to believe that they have discovered an answer when they have not. These are the times when optimization moves away from discovery and value and instead enters the realm of validation, otherwise known as the mortal enemy of results.

While it is easy to think in terms of what you expect to happen, I want to challenge you to think differently. Instead of worrying at all about what you think will happen, start by simply listing all the feasible ways you could interact with the page/element/experience that you are trying to optimize.

Think about all the ways you can execute those ideas, and then choose the largest range you can from that list; include your favorite if you like, and then make that your test. Even better is when you get others to add their input, as they might think a different tactic is “better”. Testing should be the most inclusive thing you can do, since all ideas go through the same process and you are trying to maximize the number of things you compare.

It is perfectly OK to have a belief about how things work. Everyone has their own mental model when it comes to what the problems are and how to fix them. The issue arises when you allow some false structure or that belief in what will work to limit what you test or what conclusions you derive from outcomes. Don’t allow an opinion, yours or anyone else’s, to be the limiter on what you test or even how you test.

Conclusion

The list of all the common lies could go on forever. Optimizers in many ways are the most at risk of bias because they have so much access to data and they control the way that people look at data. While we don’t have time to dive into the other common pitfalls here, rest assured that they are just as damaging to actual outcomes as the items that I did address directly.

Always keep going

In the end there is no good or bad, only the efficiency of different tactics and the consistent, continued attempt to become more and more efficient. No matter what tactics you use, you are always going to get a result; the question is whether that’s the best result you could and should have gotten. It is only by challenging every part of how you think and act on optimization that you can start to get new and improved results. Every day offers an opportunity to do just that, as well as one to run back to that which is most familiar and comfortable. Each of these items, and so many more, limits the efficiency of not just specific actions but entire programs. The largest problem is that the more comfortable something seems, the less likely it is to be of high value.

The goal here is to open up for debate the most commonly repeated pieces of advice in the industry.  I can guarantee you that every single part of my programs can and should be fixed and improved, just as I can guarantee you that every part of your program is the same.

That is the best news possible, as it means that every day and every action is a chance to learn something new and to gain better information about the world around us. The only way to accomplish that, however, is to let go of so many past beliefs and to purposely challenge everything you hold dear. Just like the act of testing, if you aren’t challenging what you hold true, how can you expect others to follow suit?

Occam’s Razor should tell you that the more people are talking about a practice, the less likely it is to be valuable and the more likely it is to be merely comfortable. Skipping game theory, basic human nature tells us we gravitate towards that which already matches our own thoughts and patterns, and that we look for confirmation instead of conflict with prior beliefs. Fight that urge, and your program and efforts can have magnitudes greater outcomes.

“Whenever you find yourself on the side of the majority, it is time to pause and reflect.” – Mark Twain

Join the Conversation

  1. Hey Andrew, no doubt that there exists a lot of misinformation and downright quackery in this business. Based on the premise of this article and what I picked up from it, I want to agree with you.

    However, I’m having a difficult time following your points and figuring out exactly what you’re saying. As just one example of several, I can’t make sense of this line: “In reality the actual concept of a hypothesis is not positive or negative, but the way that people hold onto them as a way to validate their actions is destructive.”

    I think some real-world examples and/or illustrations would have really helped get your point(s) across.

    1. Grigoriy,
      Think about it this way: a hypothesis is just a statement that you believe there is a specific cause and effect for a certain action. It doesn’t mean there is that cause and effect; it doesn’t mean that even if the effect happens, it happened for the specific reason you believed it would. In fact, your belief in that statement has no bearing on the real-world situation; it is simply a statement you make about a belief you have. In that way a hypothesis is just a thing; it has no positive or negative connotation, and it certainly has no power over what will actually happen.

      What does have power over what can be observed and even more so over what people will take away from what is observed is the act of putting too much faith in that belief and by not allowing for an outcome outside of your own specific and limited world view. When it becomes the filter by which you judge your own actions, then it can be incredibly dangerous.

      Real world example. Me saying “I can fly” is just a meaningless statement right up until the moment that I jump off the side of the building…

    2. Fair – but then what do you call your treatments instead? Assumptions of what might work better? Alternatives?

      You ARE testing stuff that you think will perform better, not worse. So isn’t that a hypothesis no matter what?

    3. While I try to get as many inputs as possible from others about what they think will perform better, I try my best to make sure that is not what determines what gets into a test. The goal is to have the highest quality and highest beta of options. I try to ensure that there is a large enough range and that things are included that challenge where everyone else or the current site goes. I include those not because I think they will win, but because I know my opinion, like everyone else’s, has no bearing on the actual outcome.

      It is shocking the number of times that the option that was added specifically for this reason will win.

      So everything is a variant or an option, nothing more and nothing less. I have a thought on what is going on and what I hope will win, but the key is to not let that get in the way of the test.

    4. Peep Laja

      How many variations do you typically test against Control (granted that you have a lot of traffic on that site)?

    5. Every test is different and every site is different, but I try to maximize things as much as possible. Here is an example. Next week we will be launching a test on our most trafficked page. We are doing a full page, so we started with 14 ideas of what was possible. We did quick mocks of those ideas and then did a review. We killed 3, I added 1 more for the reasons stated above, and then we built those. We have since killed 1 other, so that means we have 11 plus control. A few are similar to each other, but the spread is as large as I could get with the resources I had.

      In the case of the example test there were 9 experiences in that test (which was the second in the series). The original had 10.

  2. Are the editors on Spring Break?

  3. I have no issue with any of the points made – BUT – the article is just super meta stuff.

    It lacks just a few (or many) “for practical application, this is how you diagnose you have issues 1, 2, 3, and here’s some remedies to fix them”.

    I also wasn’t a huge fan of a random test example boasting a 700% boost, with no specifics, details or screenshots provided. Talk about a useless brag!
    Can’t get any more vague than that.

    The whole article is like a rant trying to deliver some points, but lacks any action-ability. Is it that the writer is unable to communicate, or does he just rather preach without being helpful?

    1. Jackson,
      I very much appreciate your feedback. As for your two points:

      1) I really wish I could just show screenshots and full details of the test, but the problem is that my organization is OK with me talking in generalities but not specifics. The test itself was attacking our second most trafficked page, where there was an automated interaction before we started. We tested a number of different forms of interaction and ended up on a type of interstitial that did not negatively impact the user’s ability to accomplish the main task. From there we tested different ways of interacting with users and arrived at a specific type of available action which blew everything out of the water. Because of the scale of this page (around 150K users a day), this resulted in a massive spike in revenue.

      I sincerely apologize for not being able to give more details than that, or the specific amount of lift. If you knew me you would realize how much I despise the act of bragging, but I had to include something that illustrated the point, and that is a great example of the scale of possible outcomes. It is not normal, but seeing a 2%, a 4%, an 8%, a 12%, and a 20% increase in a single test is not that uncommon. I just did not have a good example that I could either talk about or that would illustrate the point as quickly.

      2) The two possible outcomes on the writing: A little bit of both I guess.

      Because of the number of topics covered I can’t get super specific, and the point of the article was to call out the commonly held beliefs. I attempted to link in each point either a blog post I had written specific to it, with direct details on dealing with it, and/or multiple sources on the subject to help someone get the details they need.

      If you have a specific question on any of the points please either leave a comment or reach out directly to me or Peep and we will answer to the best of our ability.

  4. Hey Andrew, thanks for an awesome – and really *thoughtful* post. We run a lot of experiments, and recently had a very surprising result, where we had an “apparent” winner that on deeper analysis was a false positive. I wrote a short case study about it here (blog.popcornmetrics.com/the-no1-mistake-in-website-conversion-funnel-optimization/), and it might give your readers a real-life example of where it’s easy to take the wrong result from an optimization experiment, and how to avoid that danger. Curious on your views. Cheers, Paul

  5. I usually love articles that tell me a bit about what best practices are, what I should be doing, how I may be getting some of my practices wrong.

    This doesn’t do that for me. This is like … a bit of a rant mixed with some info, mixed with more rant. It feels like Conversion industry catharsis and comes off as “if you aren’t measuring what I’m measuring, you’re wrong. If you aren’t doing it how I’m doing it, you’re wrong. If you say it this way or that way, you’re wrong.”

    Pick one of these. That’s your blog post. ANY one of these is a full post itself. If I had to do a one-line summary it would be “You other conversion people aren’t doing it my way and you’re very, very wrong.”

    Not sure that helps much in the end. Maybe it helps full time CRO people – I’ll ask a couple – but for me, I’m glad this is a guest post.

    1. Matt,
      Thank you for your feedback. I agree that each of these is at least one article, and in some cases many, many articles. The goal of the article was to highlight many top-level things that a lot of the CRO community keeps claiming are positive practices, but that in truth only limit or severely hurt the organizations who follow that advice.

      I was hoping that it would be more valuable for those who are not full-time CRO practitioners, highlighting that what they are hearing may not actually be good for them, instead of just going with the crowd. The truth is that CRO is about challenging assumptions, and getting the most consistent and positive results requires people to think about problems in a way that is not familiar to just about any other discipline. Way too many groups end up just running tests based on their existing disciplines and never realize just how far the gap is between that and high-level optimization.

      I call myself the most passionate person in the world about AB testing, and that can come through in both positive and negative ways. The writing and the subject matter may turn as many people off as they open up some people’s eyes to the large ignored issues. I apologize to you and to any who may be turned off, but honestly, if even 1 person stops propagating any of these practices, I consider the article a success.

  6. Andrew, I have a question that just came up with my boss that I think falls into the “Challenge your ideas and cherished notions” in this post.

    He says every web page should be written for SEO. My feeling is that if you are building a page designed to convert traffic driven by a source such as referral links, it should be built to convert humans, not spiders. Am I missing something? Would greatly appreciate your thoughts. Thank you.

    1. Kevin,
      The first question is: what does designing for SEO mean? I can easily code the page for SEO, and that has little to nothing to do with the actual user experience.

      The second thing is that both of you are arguing about a mental model. Just because someone thinks a page is an “SEO” page or a page designed for referral sources, it doesn’t actually make it so. That is just the thought process the person was focusing on when they designed the page. Getting into a discussion about one concept versus the other is damaging to both outcomes and your testing. Just accept it as one idea versus another and see what the data tells you.

      Here is how I would deal with the situation:

      1) Choose your single success metric (hint: Revenue per Visitor)
      2) Design 2 different “SEO” pages, as different from each other as possible
      3) Design 2 different referral pages, as different from each other as possible
      4) Ask a random person outside of the conversation what they think would work, design that page
      5) Look at the 5 new pages and the control and figure out what the common characteristics are. Design a page that challenges those.
      6) Test all 7 pages (6 new + control) against only your single success metric.

      This will allow you to maximize outcomes and avoid a my idea vs. yours situation.
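      If it helps, here is a minimal sketch in Python of what the final comparison looks like when all 7 pages are judged against RPV alone. The traffic and revenue numbers are entirely hypothetical and the page names are placeholders.

```python
# Hypothetical data: rank all seven pages on revenue per visitor (RPV).
import random

random.seed(42)

def simulate(visitors, buy_rate, avg_order):
    """Per-visitor revenue samples; most visitors spend nothing."""
    return [avg_order if random.random() < buy_rate else 0.0
            for _ in range(visitors)]

pages = {
    "control":    simulate(5000, 0.050, 40.0),
    "seo_a":      simulate(5000, 0.052, 41.0),
    "seo_b":      simulate(5000, 0.048, 44.0),
    "referral_a": simulate(5000, 0.056, 40.0),
    "referral_b": simulate(5000, 0.051, 39.0),
    "outsider":   simulate(5000, 0.054, 42.0),
    "challenger": simulate(5000, 0.060, 43.0),
}

rpv = {name: sum(s) / len(s) for name, s in pages.items()}
base = rpv["control"]
for name, value in sorted(rpv.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:10s} RPV=${value:.2f} lift={(value - base) / base * 100:+.1f}%")
```

      The point is that “SEO page” vs. “referral page” never enters into the decision; only the single success metric does.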

  7. Thank you for your well-thought-out answer, Andrew. In using the term “designed” I mean “written for”; I assume your answer would still apply. My boss believes that if you write for SEO, the page will naturally perform better, but as I see it, writing copy for SEO on a conversion page will not guarantee a better-performing page. I think your answer of testing the two based on the desired metric would still be appropriate.

    1. Kevin,
      The only thing I would add to my answer based on your feedback would be that I would strongly suggest that you don’t get caught up on copy, as in most cases it is actually the least influential type of change you can make. I would suggest you challenge that assumption as you build out your test.

      Another note is that you need to avoid a one-idea-vs.-another scenario, not just politically but because it serves no purpose other than ego fulfillment. It is never about one idea vs. another, and any 2-experience test should be viewed as a waste of time and resources. The question is how many experiences you can make, not focused on one person’s opinion versus another (as I outlined above) but created to cover as large a range as possible. It doesn’t matter who was “right”; it only matters what performs the best.

  8. Well…you wanted to get people’s attention. You did it Andrew. The only problem is, this is the type of click-bait that makes our jobs an uphill battle everyday because we have to try to talk our clients down from the ledge after they read this crap. We industry people can read through your alarmist double talk, but the people who come to us for guidance and help with their sites only read, “You’re being lied to everyday!!”. Challenging assumptions and accountability for results is absolutely necessary, but you are certainly not doing anything to help instill confidence in our discipline with this article.

    I look forward to the “double-talk shuffle” in your forthcoming reply to this comment.

    1. Ron,
      Thank you for your thoughts. I have no need to do the “double-talk shuffle”; you obviously believe one way, I believe another, and that is what is great about putting one’s thoughts out there.

      I will say this: most organizations would be better off not doing anything than following at a minimum 75%-90% of the stuff put out there in the optimization space (and often the most repeated and liked stuff). If this type of article makes it more difficult for people to continue to push these practices, then I could not be happier. If people are not going to think about what I am proposing and instead dismiss it because it makes their lives harder, then that says far more about them than about the content.

    2. Hey Andrew-

      I do agree that debate is healthy and needed. Thanks for getting the conversation started.

      I agree with this: “…most organizations would be better off not doing anything than following at a minimum 75%-90% of the stuff put out there in the optimization space (and often the most repeated and liked stuff).” The problem being that 75%-90% of the solutions people devise are fads focused on slick graphics or functions that don’t address the actual problems and pain points that site visitors are trying to overcome.

      Again, while I may agree with some of your assertions, it’s hard for me to get past your alarmist delivery that erodes the clients’ trust in our discipline.

    3. Ron,
      Your note about the writing style is appreciated. I would say it appears where we differ most is that I think 75-90% of the problems come from people using optimization as a platform to validate their opinion vs. someone else’s opinion about what is a “good” user experience. Both sides don’t realize they are having a teapot argument (http://rationalwiki.org/wiki/Russell%27s_Teapot) where they both just use what is handy to try and rationalize their opinion.

      When people get past that issue and focus on maximizing outcomes, outside of random opinions, that is where real optimization happens. If my writing helps or hurts people to get to that point, then that is most definitely open to interpretation and I greatly appreciate everyone who has commented or added to that conversation.

  9. Part 1: ‘a test without a clear result is unacceptable’

    Well, good for you.

    It is acceptable as long as the net result of conversion optimization is positive. It is preferable to minimize the % of non-clear-winner tests to maximize that net result.

    If you can bring it down to 6 in several years then I suppose you’re making millions of dollars every month (or you’re just not waiting for significant results ;) )

    Part 2: ‘you can not figure out why something worked’

    “The issue is not that you are right or wrong, you may very well be right. The issue is that the evidence used to arrive at that conclusion is faulty and there is no evidence to support any conjecture you make. ”

    If no one can figure out why something works then building yourself a framework that allows you to be right some of the time and gets you real-life results is a good solution.

    Part 3: You have to test everything (I don’t agree because I often see)

    Test everything sure. But you can’t test all at the same time. So a consultant may say ‘I don’t agree, because I often see this doesn’t work (in your industry for your audience) so let’s test something that is more likely to give us results’.

    Sure, the guy (or girl) doesn’t have a 100,000 sample size on that particular situation. Guess what: real world vs. theory. Making the best decision possible given available information.

    (besides, how could you get a 100,000 sample size if you don’t do the experiments to see a pattern?)

    Part 4: You can only become a CRO expert if you constantly try to improve the way you work (not just by doing many experiments).

    Sherlock.

    Part 5: 43% improvement is a better result than 8.3% improvement.

    So you feel less happy than initially about your 8.3% improvement after you do an experiment that shows a 43% improvement. Oh really? Pity we can’t predict the future, then we would have done the 43% first.

    Part 6: You need a hypothesis.

    Sorry, 1st part is all over the place. I truly have no idea what you’re trying to say.

    2nd part says:

    “Even worse is when the direct pursuit of the hypothesis limits the scope of testing or allows people to believe that they have discovered an answer when they have not. These are the times when optimization moves away from discovery or value and instead enters the realm of validation,”

    If you use a test / (non)validated hypothesis to ‘discover’, then I don’t see how that squares with your ‘you can’t know why something worked’ statement.

    Conclusion (the only thing that seems to make sense in your post):

    – Conversion optimization people are people
    – statistics is the biggest lie as they are to be interpreted
    – (and we use qualitative insights and pattern recognition) to do it.

    Old news. I can’t see the value of this article in any way.

    1. Peter,
      Thank you for your well thought out response.

      It seems to me that a lot of your feedback (certainly all points in relation to 1, 2, 4, and most of 5 and 3) is tied to a belief that just because you got a positive result, it is a good thing. The question was never whether you could have gotten a result; if it is, then you are wasting time and resources for everyone. Any result or effort that is not given a sufficient and as unbiased as possible evaluation is self-serving ego fulfillment.

      The question is: did you maximize the result you could get given the resources available? Way too many people pretend that because someone was “happy”, or did not call them out on it, and they can point to a number, that is a valid argument for the value of their work. Sorry, I will never and can never buy that.

      The entire point is that if I have X resources, then I need to maximize Y, not just get a Y. This means that the focus is on discovery and exploitation of information (to simplify, think in terms of the N-armed bandit problem). That is why having an 8.3% return in and of itself is pointless, and why it is about maximizing both the likelihood of a result and the scale of each result. Just validating some pointless opinion or storytelling is of negative value, and pretending you have derived some inference about why something happened is irrelevant to what did happen. The question is always how to maximize outcomes, and to do that you must challenge all assumptions.
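      To make the discovery-and-exploitation framing concrete, here is a minimal epsilon-greedy sketch of the N-armed bandit idea. The payouts are entirely hypothetical, and this illustrates the trade-off rather than the specific method I use.

```python
# Epsilon-greedy sketch of the N-armed bandit: spend a small slice of traffic
# on discovery (exploration) and the rest exploiting the best known option.
import random

random.seed(7)

true_rpv = [0.8, 1.0, 1.2, 0.9, 1.5]  # unknown to the tester (hypothetical)
counts = [0] * len(true_rpv)
means = [0.0] * len(true_rpv)
epsilon = 0.1                          # fraction of traffic spent exploring

for _ in range(10_000):
    if random.random() < epsilon:
        arm = random.randrange(len(true_rpv))                    # explore
    else:
        arm = max(range(len(true_rpv)), key=lambda i: means[i])  # exploit
    reward = random.gauss(true_rpv[arm], 0.5)  # noisy observed revenue
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]  # running average

best = max(range(len(means)), key=lambda i: means[i])
print(f"estimated best option: {best} with estimated RPV {means[best]:.2f}")
print("traffic allocation:", counts)
```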

      This also gets to your point about people, and I hope it explains why you missed my point completely on the hypothesis part. More than anything, I think this is an ethical debate. Optimization can be a tool used for many different reasons, but it is almost always used as a form of self-validation; it is used to maximize someone’s ability to persuade others to their way of thinking. While your view on that may depend on whom you side with amongst Kant, Nietzsche, Heidegger, or your favorite philosopher, my view of it is very simple: any effort, no matter the tactic, used for validation is of the least utility to the whole, or to anyone but the self.

      The entire point of any effort is to maximize utility, but to do that you must attack the least efficient part of any system, which is the people, or more importantly the biases of those people, as well as their defense mechanisms. This is by far what I have spent my public writing career on (and the reason why my blog is titled Quantatitve Dissonance), and it is where I think the largest gains can be had by people in the industry or out of it when it comes to how they leverage optimization. Either it is used as a defense for pre-existing beliefs and disciplines, or it is used to maximize discovery and resource allocation in the direction of maximizing outcomes. On the second path, the best thing to understand is that there are only two outcomes: in by far the least likely, you are right and you get the result you expected; in all other cases, where you are wrong, you get a better result than you would have if you had been right.

      There are many different tactics to achieve that desired goal. I am certainly not one to think that any specific tactic I use is the best (in fact, I believe that WAY too much time is caught up in tactics and not the discipline behind them), and I welcome any discussion that is focused on how to maximize yield, as opposed to just simply getting a result.

      I hope that gave enough context to fill in what appear to be several gaps. To you and all: I understand that my writing may have led to a lack of understanding, but I hope that you and the many others who have shared their views do so in hopes of working together to achieve greater yield, and not to defend one tactic or another.

  10. Dramatic title with little substance.

  11. I am not an optimization guru… but these are the lies most of the experts I have worked with in the past tell… Thanks
