The confidence rating displayed in the A/B Test Center indicates whether or not your challenger variant has achieved statistically significant results. For those of you who are interested, this is a chi-square test for significance. In most cases, a confidence rating of 95% or better is sufficient to make a decision: either promote the variant to champion if it has the highest conversion rate, or discard it if its conversion rate is lower than your current champion's.
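For the curious, here is a minimal sketch of how a confidence rating like this could be computed from raw visitor and conversion counts. It assumes a plain Pearson chi-square test on a 2x2 table (converted vs. not converted, champion vs. challenger) with no continuity correction; the exact calculation used by the A/B Test Center is not documented here and may differ. The function name and the sample numbers are hypothetical.

```python
import math

def chi_square_confidence(champ_visitors, champ_conversions,
                          chall_visitors, chall_conversions):
    """Return a confidence value between 0 and 1 for a champion/challenger pair.

    Assumption: Pearson chi-square on a 2x2 table, 1 degree of freedom,
    no continuity correction.
    """
    # Observed counts: rows are variants, columns are converted / not converted.
    observed = [
        [champ_conversions, champ_visitors - champ_conversions],
        [chall_conversions, chall_visitors - chall_conversions],
    ]
    total = champ_visitors + chall_visitors
    row_totals = [sum(row) for row in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count we would expect if both variants converted at the same rate.
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the chi-square CDF equals erf(sqrt(chi2 / 2)).
    return math.erf(math.sqrt(chi2 / 2))

# Hypothetical numbers: champion converts 100 of 1,000, challenger 130 of 1,000.
confidence = chi_square_confidence(1000, 100, 1000, 130)
print(f"Confidence: {confidence:.1%}")
```

With these made-up numbers the confidence comes out a little over 96%, which would clear the usual 95% bar.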
Another way to think of the confidence rating is as a rough indication of how often you could expect a different result if you repeated the same experiment. If you achieve a 90% confidence rating, there's about a 1 in 10 chance that running the same test again would give you a different result. With 95% confidence your chances are 1 in 20, and with 99% confidence they're 1 in 100.
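The "1 in N" figures are just the leftover percentage turned into odds. A quick sketch of that arithmetic (the function name is made up for illustration):

```python
def one_in_n(confidence):
    """Convert a confidence rating (0.90, 0.95, ...) into the N of '1 in N'."""
    # The chance of a different result is whatever is left over after the
    # confidence rating; rounding absorbs floating-point noise.
    return round(1 / (1 - confidence))

for rating in (0.90, 0.95, 0.99):
    print(f"{rating:.0%} confidence -> 1 in {one_in_n(rating)}")
```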
To make sense of all this, think of your testing activities as managing an investment portfolio. Let's say you ran 10 tests, and the average conversion rate lift across those tests was 50%. That might mean taking a 10% conversion rate to 15%. If all 10 tests achieved about 90% significance, you could reasonably expect that in 1 of those 10 tests, you didn't actually find the best-performing page. Now, if you're getting 50% returns on your testing investment, you have a hefty margin to absorb the possibility of being wrong that 1 time out of 10. However, let's say you were only getting a 5% average lift. In that case, being wrong 1 out of 10 times would almost wipe out your overall portfolio returns.
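To put rough numbers on the portfolio idea, here is a small sketch. It needs one figure the example above doesn't give: how much a wrong decision costs you. The 40% loss used below is purely a hypothetical assumption chosen for illustration, as are the function and parameter names; plug in whatever a bad pick would realistically cost on your own pages.

```python
def portfolio_return(num_tests, avg_lift, confidence, loss_when_wrong):
    """Average return per test once the expected wrong calls are priced in.

    loss_when_wrong is a hypothetical cost of promoting the wrong page,
    expressed the same way as avg_lift (0.40 means a 40% drop).
    """
    expected_wrong = num_tests * (1 - confidence)  # e.g. 1 of 10 at 90%
    expected_right = num_tests - expected_wrong
    net = expected_right * avg_lift - expected_wrong * loss_when_wrong
    return net / num_tests

# Ten tests at 90% confidence, with an assumed 40% loss on the one wrong call.
big_wins = portfolio_return(10, avg_lift=0.50, confidence=0.90, loss_when_wrong=0.40)
small_wins = portfolio_return(10, avg_lift=0.05, confidence=0.90, loss_when_wrong=0.40)

print(f"50% average lift: {big_wins:.1%} net per test")
print(f" 5% average lift: {small_wins:.1%} net per test")
```

Under that assumed loss, the 50% portfolio still nets roughly 41% per test, while the 5% portfolio is left with about half a percent.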
So, be aware of how your overall testing activities are performing, and choose a significance level that's right for you.