Computing confidence for multivariant tests

#1
We'd like to compute an overall confidence value for a test with multiple variants (sometimes with no control).

We've used Student's and Welch's t-tests for A/B tests in the past, but they don't seem to fit these multi-way tests. What formula should we use to compute a confidence percentage for a 3-, 4-, or 5-way test?

Here's an example test we could run with no obvious control:
Grab every person walking by you on the street and offer them a single fruit. The fruit could be an apple, orange, or pear. If the person is wearing a hat and they would have been offered an orange, they are not allowed to participate in the test. Otherwise, record the fruit type and whether they accepted it.
 
#4
Okay. Put another way:
I'm looking to compute a statistically significant winner for a test with multiple variants and no clear control group. I suppose we could just pick a control, but it would be arbitrary.
 
#6
A control is required for T-Tests to determine statistical significance. I'm looking for another test or algorithm that can test for significance for multiple variants at once.

The fruit analogy is actually very close to the idea of one of our tests:

In a world where nobody is being offered free fruit, figure out which fruit people are most likely to take if given just one choice. There are three kinds of fruit: Apple, orange, and pear. Randomly pick a fruit out of a black box and offer it to a person on the street. Tally the fruit type and whether they accepted it. Use statistics to determine when you have a statistically significant winner.

Disregard the hat stuff in the first post; that's another level of complication we don't need to explore just yet.
 

CB

Super Moderator
#7
In a world where nobody is being offered free fruit, figure out which fruit people are most likely to take if given just one choice. There are three kinds of fruit: Apple, orange, and pear. Randomly pick a fruit out of a black box and offer it to a person on the street. Tally the fruit type and whether they accepted it. Use statistics to determine when you have a statistically significant winner.
It sounds like you have a categorical independent variable (fruit type) and a categorical dependent variable (accepted/not). So you could use a chi-square test or Fisher's exact test to see whether the acceptance proportion differs significantly across fruit types.

But that tests a null hypothesis that's obviously false (of course people like some types of fruit better than others).

Is this really the scenario you're interested in though? Explaining the actual problem and research question at hand instead of dealing in hypotheticals usually works better....
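
For reference, here is a minimal sketch of the chi-square test of homogeneity mentioned above, using only the Python standard library. The counts (200 offers per fruit) are made up for illustration, and the names `chi2_sf` and `chi_square_test` are hypothetical helpers, not part of any library. If SciPy is available, `scipy.stats.chi2_contingency` should give the same statistic for a table like this.

```python
import math

def chi2_sf(x, dof):
    """Upper-tail probability of a chi-square variable with `dof` degrees of freedom."""
    if x <= 0:
        return 1.0
    # Base cases: dof = 1 reduces to erfc, dof = 2 is a plain exponential.
    if dof % 2 == 1:
        tail, start = math.erfc(math.sqrt(x / 2)), 3
    else:
        tail, start = math.exp(-x / 2), 4
    # Recurrence: Q(x; n) = Q(x; n-2) + (x/2)^(n/2 - 1) * exp(-x/2) / Gamma(n/2)
    for n in range(start, dof + 1, 2):
        tail += (x / 2) ** (n / 2 - 1) * math.exp(-x / 2) / math.gamma(n / 2)
    return tail

def chi_square_test(tallies):
    """tallies maps variant -> (accepted, offered). Returns (chi2, dof, p_value)."""
    total_accepted = sum(a for a, _ in tallies.values())
    total_offered = sum(n for _, n in tallies.values())
    rate = total_accepted / total_offered  # pooled acceptance rate under the null
    chi2 = 0.0
    for accepted, offered in tallies.values():
        expected_acc = offered * rate
        expected_rej = offered * (1 - rate)
        chi2 += (accepted - expected_acc) ** 2 / expected_acc
        chi2 += ((offered - accepted) - expected_rej) ** 2 / expected_rej
    dof = len(tallies) - 1  # (variants - 1) * (outcomes - 1), with 2 outcomes
    return chi2, dof, chi2_sf(chi2, dof)

# Hypothetical tallies: (accepted, offered) per fruit.
tallies = {"apple": (30, 200), "orange": (60, 200), "pear": (40, 200)}
chi2, dof, p = chi_square_test(tallies)
print(f"chi2={chi2:.2f}, dof={dof}, p={p:.4f}")
```

A small p-value here only says the acceptance rates are not all equal; it does not say which variant is the winner.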
 

BGM

TS Contributor
#8
As suggested, the OP may be looking to test multinomial counts under the hypothesis \(p_i \geq p_j, \forall j \neq i\).
But let's wait for the OP to clarify this before we move on.
 
#9
You have correctly characterized our test.

To answer your question, here's the actual scenario:
Given a stream of people passing through an e-commerce funnel, offer them one of several potential upsells and see which one performs best.
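
A rough sketch of how such a test could be tallied, assuming each visitor is randomly assigned exactly one upsell. The variant names and the `true_rates` values are invented so the simulation has something to draw from; in a live funnel the accept/decline outcome would come from real traffic, and `run_upsell_test` is a hypothetical helper.

```python
import random
from collections import Counter

def run_upsell_test(visitors, true_rates, seed=0):
    """Simulate random assignment; return variant -> (accepted, offered)."""
    rng = random.Random(seed)
    variants = list(true_rates)
    offered, accepted = Counter(), Counter()
    for _ in range(visitors):
        variant = rng.choice(variants)          # uniform random assignment
        offered[variant] += 1
        if rng.random() < true_rates[variant]:  # stand-in for the visitor's decision
            accepted[variant] += 1
    return {v: (accepted[v], offered[v]) for v in variants}

# Hypothetical acceptance rates, unknown in a real test.
rates = {"upsell_a": 0.15, "upsell_b": 0.20, "upsell_c": 0.30}
tallies = run_upsell_test(3000, rates)
for variant, (acc, n) in tallies.items():
    print(f"{variant}: {acc}/{n} accepted")
```

The resulting table of accepted and offered counts per variant is the input a chi-square or Fisher's exact test would need.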
 

Dason

Ambassador to the humans
#10
As suggested, the OP may be looking to test multinomial counts under the hypothesis \(p_i \geq p_j, \forall j \neq i\).
But let's wait for the OP to clarify this before we move on.
With the stipulation that the \(i\) for that hypothesis is chosen based on the data itself. So it's more like
\(H_0: p_{(k)} = p_{(k-1)}\)
\(H_a: p_{(k)} > p_{(k-1)}\)
where \(p_{(m)}\) is the \(m\)th smallest of the \(k\) parameters \(p_i\), so \(p_{(k)}\) is the largest and \(p_{(k-1)}\) is the runner-up.
 

hlsmith

Less is more. Stay pure. Stay poor.
#11
Why wouldn't chi-square or Fisher's exact work? Say 15% of people offered an apple accepted, 20% accepted a pear, and 30% accepted an orange. That's the setup, right? If the overall test is significant, perform pairwise comparisons to determine which fruits differ.
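
The pairwise follow-up could look something like the sketch below. It assumes 200 offers per fruit (a made-up sample size; only the percentages come from the post), uses a pooled two-proportion z-test for each pair, and applies a Bonferroni correction, which is one common choice and not the only one. The helper names are hypothetical.

```python
import math
from itertools import combinations

def two_proportion_p(acc1, n1, acc2, n2):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (acc1 + acc2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (acc1 / n1 - acc2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))

def pairwise_bonferroni(tallies):
    """tallies maps variant -> (accepted, offered). Returns pair -> adjusted p-value."""
    pairs = list(combinations(tallies, 2))
    adjusted = {}
    for a, b in pairs:
        p = two_proportion_p(*tallies[a], *tallies[b])
        # Bonferroni: multiply by the number of comparisons, cap at 1.
        adjusted[(a, b)] = min(1.0, p * len(pairs))
    return adjusted

# 15%, 20%, and 30% acceptance with an assumed 200 offers per fruit.
tallies = {"apple": (30, 200), "pear": (40, 200), "orange": (60, 200)}
adjusted = pairwise_bonferroni(tallies)
for pair, p in adjusted.items():
    print(pair, f"adjusted p = {p:.4f}")
```

With these assumed counts, only apple vs. orange stays below 0.05 after correction, so the data would not yet separate orange from pear. For a 2x2 table the squared z statistic equals the uncorrected chi-square statistic, so this is the same family of test applied pair by pair.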