Raw Data vs Frequency Distribution

Anum

New Member
#1
Hi,

I am new here. Please discuss the question given below.

“Why an average computed from a frequency distribution is not exactly the same as computed from the raw data? Give the reason”
 

maartenbuis

TS Contributor
#4
You can compute all three without bias, assuming the frequency distribution is really the frequency distribution and not the frequency distribution of a binned version of the original data. So the underlying assumption behind the original question seems to be false or you are not telling us everything we need to know.
 

Anum

New Member
#5
You can compute all three without bias, assuming the frequency distribution is really the frequency distribution and not the frequency distribution of a binned version of the original data. So the underlying assumption behind the original question seems to be false or you are not telling us everything we need to know.
Actually, this is a puzzle given by my instructor. I don't know how I should answer it. I copied and pasted the instructor's question exactly as it was given. If you can help me with this, please do. Thanks.
 

hlsmith

Omega Contributor
#6
I agree with maartenbuis. I was thinking that if you did not have the raw data and used weights from a binned frequency distribution, then the bins could introduce a slight error into the calculation.
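
To illustrate the point about bins, here is a minimal Python sketch. The raw values and the class width of 10 are made up for illustration and are not from this thread. It assumes the usual textbook approach of treating every observation in a class as if it sat at the class midpoint.

```python
# Hypothetical raw data (invented for illustration, not from the thread)
raw = [12, 15, 17, 21, 22, 24, 28, 31, 35, 39]

# Exact mean from the raw data
raw_mean = sum(raw) / len(raw)

# Bin the data into classes of width 10: [10,20), [20,30), [30,40)
bin_width = 10  # assumed class width
freq = {}
for x in raw:
    lower = (x // bin_width) * bin_width  # lower limit of the class
    freq[lower] = freq.get(lower, 0) + 1

# Grouped mean: every observation is replaced by its class midpoint
grouped_mean = sum((lower + bin_width / 2) * n for lower, n in freq.items()) / len(raw)

print(raw_mean)      # 24.4
print(grouped_mean)  # 25.0
```

The two means differ because the midpoint (15, 25, 35) stands in for the actual values inside each class. With an unbinned frequency distribution no information is lost, so no such discrepancy arises.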
 

maartenbuis

TS Contributor
#7
Actually, this is a puzzle given by my instructor. I don't know how I should answer it. I copied and pasted the instructor's question exactly as it was given.
Either your instructor is just wrong, or this question is part of a larger exercise that gives relevant information you are not telling us. Context is important!

If this is an isolated question, then I would just give a counter-example, for example:

Code:
. // open some example data:
. sysuse auto
(1978 Automobile Data)

. 
. // compute mean and median using raw data
. sum rep78, detail

                     Repair Record 1978
-------------------------------------------------------------
      Percentiles      Smallest
 1%            1              1
 5%            2              1
10%            2              2       Obs                  69
25%            3              2       Sum of Wgt.          69

50%            3                      Mean           3.405797
                        Largest       Std. Dev.      .9899323
75%            4              5
90%            5              5       Variance       .9799659
95%            5              5       Skewness      -.0570331
99%            5              5       Kurtosis       2.678086

. 
. // using the frequency distribution
. tab rep78

     Repair |
Record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          2        2.90        2.90
          2 |          8       11.59       14.49
          3 |         30       43.48       57.97
          4 |         18       26.09       84.06
          5 |         11       15.94      100.00
------------+-----------------------------------
      Total |         69      100.00

. 
. // the cumulative percentage passes 50 at
. // rep78=3, so the median is 3
. 
. // the mean is:
. di (2*1+8*2+30*3+18*4+11*5)/69
3.4057971
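
The same counter-example can be sketched in Python, using only the frequencies shown in the tab rep78 output. This is an illustrative translation and not part of the original Stata session; the variable names are arbitrary.

```python
# Frequency table copied from the `tab rep78` output: value -> count
freq = {1: 2, 2: 8, 3: 30, 4: 18, 5: 11}
n = sum(freq.values())  # 69 observations

# Mean: weight each value by its frequency
mean = sum(value * count for value, count in freq.items()) / n

# Median: first value at which the cumulative count reaches half of n.
# Simplification: if the cumulative count equals exactly n/2 (possible
# only for even n), the median would be the average of this value and
# the next one; that case is not handled here.
cumulative = 0
median = None
for value in sorted(freq):
    cumulative += freq[value]
    if 2 * cumulative >= n:
        median = value
        break

print(round(mean, 7))  # 3.4057971
print(median)          # 3
```

Both results match what sum rep78, detail reports from the raw data, because the table lists the actual values and not classes of values.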