# Beauty Pageant Method of Analysis

#### JKru

##### New Member
So, I was watching a beauty pageant last night (Miss USA) and I thought about stats. (I know, what's wrong with me?)
What kind of analysis method would one use to determine if there is significance to the winner? That is, how would someone prove that it is more than just chance or "blind luck" to win? Furthermore, besides just the winner, would it be possible to show the significance of the top-x winners?
Here is what I was thinking: I can figure out the probability of the winner being selected from 51 contestants. I could also do some permutations to figure the probability for the top-x winners. But I was looking for more than just probability. I was thinking about using the results from the last 20 years (there have been 62 contests) to determine if there is more to the contest than just chance. But how would I analyze that? Use Kruskal-Wallis test or some other test?
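As a rough sketch of that chance-only baseline, the counts can be computed directly in Python. The 51 contestants come from the post; the top-10 cutoff is an assumption for illustration, and the variable names are hypothetical.

```python
from math import comb, perm

N_CONTESTANTS = 51  # from the post
TOP_X = 10          # assumed cutoff, for illustration only

# Chance that one particular contestant wins if the winner is drawn at random.
p_winner = 1 / N_CONTESTANTS

# Number of possible ordered top-x lists (order matters) and unordered top-x sets.
ordered_top_x = perm(N_CONTESTANTS, TOP_X)
unordered_top_x = comb(N_CONTESTANTS, TOP_X)

# Chance of one specific ordered top-x list under purely random placement.
p_exact_order = 1 / ordered_top_x

# Chance that one particular contestant lands anywhere in the top x.
p_in_top_x = TOP_X / N_CONTESTANTS

print(f"P(given contestant wins)        = {p_winner:.4f}")
print(f"P(given contestant in top {TOP_X})  = {p_in_top_x:.4f}")
print(f"Ordered top-{TOP_X} lists possible  = {ordered_top_x:,}")
print(f"Unordered top-{TOP_X} sets possible = {unordered_top_x:,}")
```

These numbers only describe what pure chance would look like; on their own they do not test whether real results depart from it.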
Oh, what if there was a (virtually) unlimited number of contestants? For instance, randomly select 10 people from the US and those are your top-10 Miss USA winners.
It would be great to have some insight into this situation as it is related to something else I am working on.
Thank you.

#### CB

##### Super Moderator
> What kind of analysis method would one use to determine if there is significance to the winner? That is, how would someone prove that it is more than just chance or "blind luck" to win?

We don't really "prove" things using data analysis. Proofs are in maths. Conclusions about the real world (even when based on statistical evidence) always carry some uncertainty.

A hypothesis that beauty contests are decided purely by random chance would imply that the rankings in different beauty contests are independent. So if you took a sample of women, each of whom was a contestant in two or more pageants, their rankings across pageants should be uncorrelated.
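One way to sketch that check, assuming you had placements for the same women in two pageants: compute a Spearman rank correlation and compare it against shuffled placements (a permutation test). The data below are made up for illustration, and the helper names are hypothetical, not from any library.

```python
import random

def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided p-value: how often shuffled placements correlate as strongly."""
    rng = random.Random(seed)
    observed = spearman(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(x, shuffled)) >= abs(observed) - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Made-up placements of the same eight women in two pageants.
pageant_a = [1, 2, 3, 4, 5, 6, 7, 8]
pageant_b = [2, 1, 4, 3, 6, 5, 8, 7]

rho, p_value = permutation_p_value(pageant_a, pageant_b)
print(f"Spearman rho = {rho:.3f}, permutation p-value = {p_value:.4f}")
```

A small p-value here would count as evidence against "placements are independent across pageants", not as proof of anything.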

Buutttt to be pragmatic I think it's best to spend time on testing hypotheses that are plausible. Beauty pageant prizes are not allocated randomly.

#### JKru

##### New Member
Thank you CowboyBear for the reply.

> Proofs are in maths.

I thought statistics was a math...
I was meaning "prove" in terms of a given level of significance.

> A hypothesis that beauty contests are decided purely by random chance would imply that the rankings in different beauty contests are independent. So if you took a sample of women, each of whom was a contestant in two or more pageants, their rankings across pageants should be uncorrelated.

That is where I am not sure how one would do any kind of statistical analysis to show they are uncorrelated. Contestants rarely compete in the same pageant multiple years. Or, I should say, at the Miss USA level, the contestants were selected at lower pageants, and although they may compete in those year after year, rarely does one come up from those multiple years. Only two this year competed previously.
What if we were to ignore the actual contestants and just focus on the state? For instance, Alabama has placed for the fourth consecutive year, three other states for their third consecutive year, and three for their second consecutive year. Connecticut, which won this year, has never previously won.

> Buutttt to be pragmatic I think it's best to spend time on testing hypotheses that are plausible. Beauty pageant prizes are not allocated randomly.

I am actually trying to be pragmatic and learn how to do a statistical analysis of rankings. There are many things that are ranked - the best songs, the greatest movies, the top cars, the best jobs, the prettiest woman in the world... How would a statistician show that a certain ranking system is (possibly) not random?
On this site, there are many threads of rankings where the population is set. Such as, of these 10 items, rank them from best to worst, or of these 10 items give it a value from 1 to 5. For these studies, the majority of experts suggest using Kruskal-Wallis, and I see how that would work.
However, I have not found anything where the population is quite large and one is selecting a limited number to rank. For instance, given 50 contestants, select the top 10 (and ignore the rest). If I have 5 people complete the ranking, I might have 5 completely different rankings. I'm stuck.
As a math teacher and someone who enjoys statistics, this is something that really has me intrigued and searching for answers. There must be some way that statistics can be used to show that any of these types of rankings are more than just chance.
Again thank you very much for your reply.

#### CB

##### Super Moderator
> I thought statistics was a math...
> I was meaning "prove" in terms of a given level of significance.

I guess statistics is a type of applied maths, yep. What I mean is that a mathematical proof is quite different to statistical evidence. A mathematical proof implies certain knowledge: given a set of starting axioms or assumptions, a proof uses deductive reasoning to show that a particular statement is definitely true.

In actual empirical research, we can't absolutely prove things about the world. E.g. a claim that a particular relationship between two variables is statistically "significant" means only that the relationship observed in the sample would be unlikely to occur if there was really no relationship between the variables in the population. So we might reject a null hypothesis of no relationship in the population. But there is always some chance that we could be wrong in doing so. It sounds like semantics, but it's actually an important distinction: empirical research does not absolutely prove hypotheses, it just provides evidence for or against them.
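To make "unlikely to occur if there was really no relationship" concrete, here is a small sketch that enumerates every possible ranking of five items and asks how often pure chance would produce a Spearman correlation of 0.9 or more with a fixed reference ranking. The choice of five items and the 0.9 threshold are assumptions for illustration only.

```python
from itertools import permutations

def spearman_rho(rank_a, rank_b):
    """Spearman correlation for two untied rankings of the same items."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

N_ITEMS = 5        # assumed, small enough to enumerate all orderings
THRESHOLD = 0.9    # assumed cutoff for a "strong" correlation

reference = tuple(range(1, N_ITEMS + 1))

# Every ordering is equally likely if the second ranking is pure chance.
null_rhos = [spearman_rho(reference, p) for p in permutations(reference)]

# Share of chance orderings that still look strongly correlated.
p_false_alarm = sum(r >= THRESHOLD - 1e-12 for r in null_rhos) / len(null_rhos)

print(f"{len(null_rhos)} possible orderings")
print(f"P(rho >= {THRESHOLD} by chance alone) = {p_false_alarm:.4f}")
```

Under these assumptions 5 of the 120 orderings reach the threshold, so chance alone would do it about 4% of the time: small, but not zero, which is exactly why rejecting the null is evidence rather than proof.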

> I am actually trying to be pragmatic and learn how to do a statistical analysis of rankings. There are many things that are ranked - the best songs, the greatest movies, the top cars, the best jobs, the prettiest woman in the world... How would a statistician show that a certain ranking system is (possibly) not random?

Ok, I see. Really, what you are talking about here seems to be reliability.

> On this site, there are many threads of rankings where the population is set. Such as, of these 10 items, rank them from best to worst, or of these 10 items give it a value from 1 to 5. For these studies, the majority of experts suggest using Kruskal-Wallis, and I see how that would work.

Whether a Kruskal-Wallis test is appropriate depends on what you're actually trying to find out. A Kruskal-Wallis test looks at the relationship between a nominal/categorical independent variable and an ordinal/ranking dependent variable. You don't really have an independent variable so it wouldn't be helpful.

> However, I have not found anything where the population is quite large and one is selecting a limited number to rank. For instance, given 50 contestants, select the top 10 (and ignore the rest). If I have 5 people complete the ranking, I might have 5 completely different rankings. I'm stuck.

Generating conclusions about a population based on a sample of data is arguably what stats is all about, so that aspect isn't necessarily a problem. If you were interested in how consistently a pair of observers rank a given set of objects (e.g., pageant contestants), you could use a weighted Cohen's kappa, which is a measure of inter-rater reliability. Evidence that kappa is greater than zero would imply that ratings are not made randomly.
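A minimal sketch of a linearly weighted Cohen's kappa for two observers, written in plain Python so the arithmetic is visible. The scores are made up, the 0 to 4 scale is an assumption, and `weighted_kappa` is a hypothetical helper rather than a library function.

```python
def weighted_kappa(rater_a, rater_b, n_categories):
    """Linearly weighted Cohen's kappa for ordinal codes 0..n_categories-1."""
    n = len(rater_a)
    k = n_categories

    # Observed joint proportions of (rater A score, rater B score).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Disagreement weight grows linearly with the distance between scores.
    observed_disagreement = 0.0
    chance_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed_disagreement += weight * observed[i][j]
            chance_disagreement += weight * row[i] * col[j]

    if chance_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1 - observed_disagreement / chance_disagreement

# Made-up scores from two judges for the same eight contestants (0 = lowest, 4 = highest).
judge_a = [0, 1, 2, 3, 4, 2, 1, 3]
judge_b = [0, 2, 2, 3, 3, 1, 1, 4]

kappa = weighted_kappa(judge_a, judge_b, n_categories=5)
print(f"Weighted kappa = {kappa:.3f}")
```

A value of 1 means perfect agreement, and values near 0 are what you would expect if the two judges scored independently of each other; a formal test would still need a standard error or a resampling step on top of this.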

If you wanted to extend this to the case of five observers, I'm not sure if it's possible to do that with a single statistic; maybe someone else knows of a measure of inter-rater reliability for 3+ observers and ordinal data?