skewness issue
I'm trying to find the skewness for this distribution using SPSS:
Bin      Freq
0-10     78
10-20    2589
20-30    922
30-40    11
So, all I know is how many values are in each bin (not what the values are themselves). Does that mean I have to type in the midpoint value of the 2nd bin 2589 times in order for SPSS to understand what I'm looking for? Anyone know of a shorter way? Thanks.
JohnM
02-11-2006, 11:14 PM
I don't think it can be done with "grouped" data in SPSS, but here's a way that will be pretty close (you could do this in Excel):
skewness = m3 / [ m2 * sqrt(m2) ]
where m2 and m3 are "moments" of a distribution
m2 = summation of [ f * (x - mean)^2 ] / n
m3 = summation of [ f * (x - mean)^3 ] / n
the x's would be the bin mid-points (5,15,25,35), f would be the frequency of each bin, the mean would be the mean of the grouped data (summation of x * p(x), with p(x) = f / n), and n would be the sum of the frequencies
for the mean, you should get 17.4 and n = 3600
Hope this helps.
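A minimal Python sketch of the calculation described above, using the four bins from the first post. It assumes each bin's term is weighted by its frequency (so a midpoint counts once per observation in its bin); the variable names are only illustrative.

```python
from math import sqrt

# Bin midpoints and frequencies from the first post (bins 0-10, 10-20, 20-30, 30-40).
midpoints = [5, 15, 25, 35]
freqs = [78, 2589, 922, 11]

n = sum(freqs)  # total number of observations
mean = sum(f * x for x, f in zip(midpoints, freqs)) / n

# Second and third central moments, each bin weighted by its frequency.
m2 = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / n
m3 = sum(f * (x - mean) ** 3 for x, f in zip(midpoints, freqs)) / n

skewness = m3 / (m2 * sqrt(m2))

print(f"n = {n}, mean = {mean:.1f}, skewness = {skewness:.2f}")
```

With these inputs the mean and n agree with the 17.4 and 3600 quoted above, and the skewness comes out at roughly 0.69, a mild right skew.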
Hi John, thanks for the reply. I tried that method and it worked well (I had never even heard of moments before). I have another distribution that has 12 bins. When I calculated the skewness for it using that method, I got a crazy large number that doesn't seem correct. For 12 bins, would I use the same method as you outlined, or do the 'moment' equations change?
JohnM
02-14-2006, 07:58 PM
It depends on how badly skewed the data is - I don't think the skewness index has an upper or lower limit.
Go ahead and post the 12-bin data set and I'll take a look. If you got a very high number then it should be obvious from looking at the frequency distribution.
Moments come from mechanics and physical masses or bodies - the first moment is associated with the center of gravity, which is the mean of a probability distribution. The second moment is associated with gyration, which is analogous to variation. The third moment is associated with skewness, and the fourth moment is associated with kurtosis.
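As a rough illustration of how the four moments relate, here is a short Python sketch on the 4-bin data from the first post. The helper name central_moment is hypothetical, and the terms are assumed to be frequency-weighted; the kurtosis shown is the plain m4 / m2^2 ratio, which is one common convention among several.

```python
def central_moment(midpoints, freqs, k):
    """k-th central moment of grouped data, each bin weighted by its frequency."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return sum(f * (x - mean) ** k for x, f in zip(midpoints, freqs)) / n

midpoints = [5, 15, 25, 35]
freqs = [78, 2589, 922, 11]

# First raw moment: the mean, i.e. the "center of gravity".
mean = sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)

m1 = central_moment(midpoints, freqs, 1)  # zero up to rounding: deviations cancel
m2 = central_moment(midpoints, freqs, 2)  # variance (spread)
m3 = central_moment(midpoints, freqs, 3)
m4 = central_moment(midpoints, freqs, 4)

skewness = m3 / m2 ** 1.5  # asymmetry
kurtosis = m4 / m2 ** 2    # tail weight; about 3 for a normal distribution

print(f"mean = {mean:.2f}, variance = {m2:.2f}")
print(f"skewness = {skewness:.2f}, kurtosis = {kurtosis:.2f}")
```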
Ok, I'm going to try it again by hand to see what kind of answer I get. I might've screwed up with the negatives.
Here's the distribution:
Midpt.   Freq
 2.75    216
 2.25    178
 1.75    383
 1.25    542
 0.75    682
 0.25    646
-0.25    508
-0.75    226
-1.25    116
-1.75    39
-2.25    17
-2.75    47
JohnM
02-14-2006, 09:52 PM
You should get -16.79.
Yep, that's what I got when I did it manually. However, when I typed all the values into Excel and used the SKEW function, it gave a negative number that was much, much smaller in magnitude.
I've found this equation for skewness too that would be interesting to try and compare:
Skewness = M3 / M2^(3/2)
where Mn = {(x - {x})^n}, i.e. the average of (x - {x})^n,
where {x} is the expected number.
Would the 'expected number' be the same as the mean?
JohnM
02-16-2006, 02:03 PM
Yes, the "expected number" is the mean.
When you say you typed in all the numbers - the individual data points or the bin frequencies?
I typed in all the individual data points (i.e. 216 cells with "2.75", 178 cells with "2.25", etc.). The skewness it reported was -0.26.
JohnM
02-16-2006, 03:30 PM
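Well, actually what you did was put in the midpoints of the bins and "weight" them by their frequencies. Excel's SKEW function uses a different, sample-adjusted formula, n / [ (n - 1) * (n - 2) ] * summation of [ ((x - mean) / s)^3 ] with s the sample standard deviation, but with n = 3600 that adjustment is tiny, so the frequency weighting is the more likely source of the difference.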
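To see where the two numbers in this thread may come from, here is a hedged Python sketch on the 12-bin data. It compares three things: the moment formula with every bin weighted by its frequency, the same formula summed over the 12 bare midpoints (still divided by n = 3600), and the sample-adjusted form that Excel documents for SKEW, applied to the expanded list of 3600 values. The helper name moment_skew is hypothetical.

```python
from math import sqrt

midpoints = [2.75, 2.25, 1.75, 1.25, 0.75, 0.25,
             -0.25, -0.75, -1.25, -1.75, -2.25, -2.75]
freqs = [216, 178, 383, 542, 682, 646, 508, 226, 116, 39, 17, 47]

n = sum(freqs)
mean = sum(f * x for x, f in zip(midpoints, freqs)) / n

def moment_skew(weights):
    """m3 / (m2 * sqrt(m2)), with each bin's term multiplied by its weight."""
    m2 = sum(w * (x - mean) ** 2 for x, w in zip(midpoints, weights)) / n
    m3 = sum(w * (x - mean) ** 3 for x, w in zip(midpoints, weights)) / n
    return m3 / (m2 * sqrt(m2))

weighted = moment_skew(freqs)                   # each bin counted f times
unweighted = moment_skew([1] * len(midpoints))  # each bin counted once

# Sample-adjusted skewness on the expanded data (one entry per observation).
data = [x for x, f in zip(midpoints, freqs) for _ in range(f)]
s = sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))  # sample std. deviation
adjusted = n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)

print(f"weighted:   {weighted:.2f}")
print(f"unweighted: {unweighted:.2f}")
print(f"adjusted:   {adjusted:.2f}")
```

Under these assumptions the unweighted sum gives about -16.79, while the frequency-weighted moments and the sample-adjusted formula both give about -0.26. That suggests the gap comes from leaving out the frequency weights rather than from Excel's choice of formula.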