descriptive statistics, easy question

WeeG

TS Contributor
#1
Hello, I have a small question. I think I know the answer and just want to confirm it with you guys:

A researcher is interested in comparing the mean, median and mode of the number of board members in private companies with those in public companies. He takes a sample of n private companies and m public ones. The mean (average) of both samples is 8, and so is the median of both. Choose the right answer:

a. The two modes are also equal to 8.
b. The two modes are equal, but their value is not 8.
c. One mode is 8, and the other is not.
d. The two modes are different from one another, and none of them is 8.
e. There is not enough information to answer the question.

I think it's answer e, right ?
 
#3
The fact that the means equal the medians tells you something about the distributions of the data, and knowing the distributions can tell you something about the modes. I think the answer is A.
 

terzi

TS Contributor
#4
If the mean and the median have the same value, all we know is that the distribution is symmetrical. I'd say e) is the correct one, since we can't tell anything about the mode of that distribution.

The normal distribution is not the only symmetrical one. Consider a uniform distribution: it is symmetrical, but probably has many modes.

But WeeG, why did you choose e), by the way?
 
#5
The mode is the value that occurs most frequently in a distribution. A uniform distribution has no such value. Some people prefer to say that every value in a uniform distribution is a mode, but others say that a uniform distribution has no mode.

I was taught the "no mode" way, and most software will not return a meaningful mode value for a uniform distribution. If you pick any one or more values from a uniform distribution and say, "these are the modes," and then try to do meaningful further math with them, the results may be hard to justify. Saying that a uniform distribution has “multiple modes” can lead to a mathematical dead end, which is the same as saying that it has no mode.

From the OP's problem statement I believe we can assume it is given that a meaningful mode exists. Because the coinciding mean and median tell us that the distribution is symmetric (about the median), the mode must also be symmetric about the median. I can’t think of any other possibility in real numbers.

Perhaps there could be a distribution with two or more identical humps in symmetry, but in that case there would again be no value that occurs most frequently--no meaningful mode value.

To me, the OP's problem strongly indicates a Gaussian (normal) distribution, and in Gaussian distributions the mean, median and mode are the same value.

I stick with answer A.
 

terzi

TS Contributor
#6
You are saying that the data are normally distributed based only on the fact that the distribution is symmetrical and has a mode? I guess this is just a difference of opinion. I don't consider the mode to be unique (and neither do many statisticians), so many different distributions could fit the problem we are discussing: in fact, any symmetrical, non-normal distribution (like a bimodal distribution).

There are many probability laws where mean and median can have the same value, but that does not give any valid proof about the mode. In fact, since the case states that a sample is being evaluated, there may be a unique mode, even if we are talking about a uniform distribution.

I was taught the "no mode" way, and most software will not return a meaningful mode value for a uniform distribution.
I didn't get this. I really don't know of any software that can guess the distribution from the data you have just entered.
 
#7
One question is what the mode should be when multiple values occur the same number of times.

For example, if the sample is: 7,7,8,9,9

Is the mode 7, 9, undefined, or both 7 and 9?

It seems to me that the way most statisticians define the mode is that it is not unique if multiple values occur the same number of times, but that it is still defined.

http://mathworld.wolfram.com/Mode.html
http://en.wikipedia.org/wiki/Mode_(statistics)

In Excel, if the function Mode is applied to the list 7,7,8,9,9 the result will be 7.
In Mathematica, if the function Mode is applied to the list 7,7,8,9,9 the result will be the set {7,9}.
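
The same two conventions can be seen side by side in Python's standard library. This is only an illustrative sketch and is not from the posts above; it assumes Python 3.8 or later, where `statistics.mode` returns the first of several tied values and `statistics.multimode` returns all of them.

```python
from statistics import mode, multimode

sample = [7, 7, 8, 9, 9]

# "Single mode" convention: with a tie, Python 3.8+ returns the first
# tied value it encounters in the data.
single = mode(sample)

# "Set of modes" convention: every tied value, in order of first appearance.
all_modes = multimode(sample)

print(single)     # 7
print(all_modes)  # [7, 9]
```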

Taking that as the definition consider three different possibilities:

8, 8: Mean=8, Median=8, Mode=8
7, 7, 8, 9, 9: Mean=8, Median=8, Mode={7,9}
1, 1, 8, 15, 15: Mean=8, Median=8, Mode={1,15}
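
The three samples above can be checked with a few lines of Python. This is a minimal sketch using the standard `statistics` module (Python 3.8+ assumed for `multimode`); the variable names are only illustrative.

```python
from statistics import mean, median, multimode

samples = [
    [8, 8],
    [7, 7, 8, 9, 9],
    [1, 1, 8, 15, 15],
]

# One (mean, median, list of modes) triple per sample.
summary = [(mean(s), median(s), multimode(s)) for s in samples]

for s, (m, med, modes) in zip(samples, summary):
    print(s, "mean =", m, "median =", med, "modes =", modes)
```

All three samples share mean 8 and median 8, yet their modes are [8], [7, 9] and [1, 15].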

a. The two modes are also equal to 8.

>No. The mode doesn't have to be 8.

b. The two modes are equal, but their value is not 8.

>No. The modes do not have to be equal.

c. One mode is 8, and the other is not.

>No. It is possible for neither of the modes to be 8.

d. The two modes are different from one another, and none of them is 8.

>No. It is possible that one of the modes is 8.

e. There is not enough information to answer the question.

>Correct. There is not enough information to answer the question.

Another question could be: If the mean is equal to the median is it possible to have a unique mode that is not also equal to the mean and the median?

I don't believe it is possible to have a situation like this.

David
 
#9
Also, PLEASE note that while symmetry implies mean=median=mode, it is NOT TRUE that mean=median=mode, or mean=median, imply symmetry. My example above should make this clear.

This is a very important lesson in math in general: X implies Y is not the same as Y implies X. So when you're remembering some result like "mean=median if the distribution is symmetric" be sure to check whether that's "if" or "if and only if." There's a big difference.
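
To illustrate the "if" versus "if and only if" point, here is a small sketch with a hypothetical sample (made up for this illustration, not the example referred to above) in which mean, median and mode all coincide although the data are not symmetric. The helper `is_symmetric` is likewise just an assumption for the sketch.

```python
from statistics import mean, median, multimode

def is_symmetric(sample, center):
    """Return True if reflecting the sample about `center` gives the same values."""
    reflected = sorted(2 * center - x for x in sample)
    return reflected == sorted(sample)

sample = [0, 3, 4, 4, 9]  # hypothetical data, not from the thread

center = mean(sample)
print(center, median(sample), multimode(sample))  # 4 4 [4]
print(is_symmetric(sample, center))               # False
```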
 
#10
Another question could be: If the mean is equal to the median is it possible to have a unique mode that is not also equal to the mean and the median?

I don't believe it is possible to have a situation like this.

David

It is possible.

Let S= {6, 6, 6, 8, 9, 10, 11}.

Mean = 8
Median = 8
Mode= 6

Let S' = { 5, 5, 5, 8, 10, 11, 12}
Mean = 8
Median = 8
Mode = 5

QED
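
For anyone who wants to verify the two counterexamples numerically, a minimal Python sketch (standard `statistics` module, Python 3.8+ assumed; `S_prime` stands in for S'):

```python
from statistics import mean, median, multimode

S = [6, 6, 6, 8, 9, 10, 11]
S_prime = [5, 5, 5, 8, 10, 11, 12]

# (mean, median, list of modes) for each sample.
results = {
    name: (mean(data), median(data), multimode(data))
    for name, data in (("S", S), ("S'", S_prime))
}

for name, (m, med, modes) in results.items():
    print(name, "mean =", m, "median =", med, "modes =", modes)
```

Both samples have mean 8 and median 8, with a single mode (6 and 5 respectively) that differs from both.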
Your example shows clearly that it is possible. Even if the mean is equal to the median, it is possible to have a unique mode that is not equal to the mean. I was wrong in thinking that it wasn't possible.

(Another example which confirms this: http://spreadsheets.google.com/ccc?key=0AvBdkDCT2pvMdGZ5WDB5d2pWTGVYX25UczIzeXVrZlE&hl=en)

Also, PLEASE note that while symmetry implies mean=median=mode,
No, I don't think that this is stated correctly. Symmetry doesn't mean that the mode has to be equal to the mean and the median.

The formal definitions of these terms (mean, median, and mode) will vary depending on what they are being applied to:

1. a data sample
2. a discrete distribution
3. a continuous distribution

But assuming that we are talking about a symmetric continuous distribution, it is possible to have a mode that is different from the center point of symmetry.

The mode for a continuous distribution is the value of x that maximizes f(x).

There could be a distribution like:

k1 * exp[ (-(x+1)^2) / 2] for -Infinity < x <= 0
k2 * exp[ (-(x-1)^2) / 2] for 0 <= x < Infinity

k1 and k2 are constants (equal to each other, so that the function is symmetric about 0) such that this function is a valid pdf
(Integral[ f[x] dx , -Infinity, Infinity ] = 1 )

This pdf has two maximum points (one at -1 and the other at +1), both of which are modes of the symmetric pdf, while the mean and median are both 0.
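
A rough numerical check of this pdf in plain Python is sketched below. It assumes k1 = k2 (so both branches collapse to exp(-(|x|-1)^2 / 2) up to a constant), truncates the real line to [-10, 10], and uses a simple grid; the grid size and names are choices made for this sketch only.

```python
import math

def unnormalized_pdf(x):
    # Both branches of the piecewise formula, assuming k1 == k2.
    return math.exp(-((abs(x) - 1.0) ** 2) / 2.0)

# Symmetric grid on [-10, 10]; the tails beyond that are negligible here.
step = 0.001
half = 10000
xs = [(i - half) * step for i in range(2 * half + 1)]
ws = [unnormalized_pdf(x) for x in xs]

# Trapezoid-rule area, used to scale the density so it integrates to 1.
area = step * (sum(ws) - 0.5 * (ws[0] + ws[-1]))
pdf = [w / area for w in ws]

# Mean: numerical integral of x * f(x).
mean = step * sum(x * p for x, p in zip(xs, pdf))

# Median: first grid point where the cumulative probability reaches 0.5.
cumulative = 0.0
median = None
for x, p in zip(xs, pdf):
    cumulative += p * step
    if cumulative >= 0.5:
        median = x
        break

# Modes: location of the maximum on each side of zero.
left_mode = max((x for x in xs if x <= 0), key=unnormalized_pdf)
right_mode = max((x for x in xs if x >= 0), key=unnormalized_pdf)

print("mean ~", round(mean, 6), "median ~", round(median, 3))
print("modes ~", round(left_mode, 3), round(right_mode, 3))
```

Up to grid error, the mean and median come out at 0 while the two maxima sit near -1 and +1.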

AtlasFrySmith, thanks for your reply, and catching my mistake. I didn't make too much progress on my other project today :eek:, but I figured out how to do different things in sage (it has the functionality there but it takes a while to get through the documentation) :)

David

p.s.



http://www.sagenb.org/home/pub/742/

(a sage worksheet which does some computations with a bimodal pdf)