Thread: Reporting significance of diff between means in a table

  1. #1
    I have about 10 variables in a data set of about 100K records. I am reporting the means for each variable by country, which is no problem.

    I also separately report means for each variable, split out for women and men, by country. This, too, is OK.

    It would be useful if I could generate a table that reports the *significance of the difference in means* between women and men, for each country - ideally in a table with countries for rows, the variables for columns, and the difference in mean, with significance, in each cell.

    So for example, if the mean happiness for men in Andorra is 1.3 and the mean happiness for women is 1.1, and that has a t-stat of 2.43, the cell would contain

    (var)       Hapdif    Otherdif   ......
    Andorra     0.20*     1.72
    (signif)    (2.43)    (1.12)
    Australia   0.45      3.32*
    (signif)    (0.32)    (2.11)

    and so on. I can do the calculations individually, but there are about 500 of them (50 countries, 10 variables to compare), which is why I'd like to generate a table.

    I'm sure this must be asked all the time (and sorry if it is), but I can't find a reference. Can anyone get me started please, or point me in the right direction?

    Thank you.

  2. #2
    bukharin

    I would probably construct this using -statsby-, for example (pretend that rep78 is country and foreign is sex):
    Code: 
    sysuse auto, clear
    set more off
    
    tempfile results // temporary file to store cumulative results
    foreach var of varlist price mpg headroom weight length turn displacement gear_ratio {
    	preserve
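    	statsby, by(rep78) clear: ttest `var', by(foreign) // one t-test per rep78 group; results replace the data in memory
    	gen variable="`var'" // label these rows with the variable tested
    	capture append using `results' // capture: the tempfile does not exist yet on the first pass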
    	save `results', replace
    	restore
    }
    
    use `results', clear
    order variable // just to make it easier to read output
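    To get from these stacked results to the layout asked for (one row per country, one column per variable), a rough sketch along the following lines may help. It assumes the r() results that -ttest- leaves behind (r(mu_1), r(mu_2), r(t), r(p)) and names them explicitly in -statsby-. The new variable names diff, t, p, cell and tcell, the 5% cut-off for the star, and the three variables chosen are only illustrative.

    ```stata
    sysuse auto, clear
    set more off

    tempfile results
    local first = 1
    foreach var of varlist price mpg weight {
    	preserve
    	* keep only the difference in means, the t-statistic and the p-value
    	statsby diff=(r(mu_1)-r(mu_2)) t=r(t) p=r(p), by(rep78) clear: ttest `var', by(foreign)
    	gen variable = "`var'"
    	if !`first' append using `results'
    	save `results', replace
    	restore
    	local first = 0
    }

    use `results', clear
    * groups where the t-test could not be run have missing results; leave those cells blank
    gen cell  = string(diff, "%9.2f") + cond(p < 0.05, "*", "") if !missing(diff)
    gen tcell = "(" + string(t, "%9.2f") + ")" if !missing(t)
    keep rep78 variable cell tcell
    * one row per rep78 ("country"), one pair of columns per variable
    reshape wide cell tcell, i(rep78) j(variable) string
    list, noobs
    ```

    Groups that contain only one level of foreign (only men or only women, in the real data) cannot be tested, so their cells are left empty rather than filled with ".".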
    Here's an alternative method which is probably a little faster:
    Code: 
    sysuse auto, clear
    set more off
    
    tempfile results // temporary file to store cumulative results
    foreach var of varlist price mpg headroom weight length turn displacement gear_ratio {
    	preserve
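    	collapse (mean) mean=`var' (sd) sd=`var' (count) n=`var', by(rep78 foreign) // mean, SD and N per rep78-by-foreign cell
    	reshape wide mean sd n, i(rep78) j(foreign) // one row per rep78: mean0 sd0 n0 mean1 sd1 n1
    	gen variable="`var'" // label these rows with the variable summarised
    	capture append using `results' // capture: the tempfile does not exist yet on the first pass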
    	save `results', replace
    	restore
    }
    
    use `results', clear
    order variable // just to make it easier to read output
    You can then calculate the t-statistic, p-value, etc. yourself from the means, standard deviations and counts.
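
    A minimal sketch of that calculation for a single variable, assuming the pooled (equal-variance) t-test that -ttest- uses by default, and that -reshape- has produced mean0/sd0/n0 for foreign==0 and mean1/sd1/n1 for foreign==1. The names diff, df, sp, t, p and cell and the 5% cut-off are illustrative, not part of the method above.

    ```stata
    sysuse auto, clear
    drop if missing(rep78)
    collapse (mean) mean=price (sd) sd=price (count) n=price, by(rep78 foreign)
    reshape wide mean sd n, i(rep78) j(foreign)

    gen diff = mean0 - mean1                                 // same sign convention as ttest, by(foreign)
    gen df   = n0 + n1 - 2                                   // degrees of freedom, equal variances assumed
    gen sp   = sqrt(((n0-1)*sd0^2 + (n1-1)*sd1^2) / df)      // pooled standard deviation
    gen t    = diff / (sp * sqrt(1/n0 + 1/n1))               // two-sample t-statistic
    gen p    = 2*ttail(df, abs(t))                           // two-sided p-value
    * rows with only one group present have missing diff, t and p; leave their cell blank
    gen cell = string(diff, "%9.2f") + cond(p < 0.05, "*", "") if !missing(diff)
    list rep78 diff t p cell, noobs
    ```

    If unequal variances are a concern, the standard error and degrees of freedom would need the Satterthwaite or Welch formulas instead of the pooled ones used here.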

  3. #3
    Thank you! Much appreciated. I'll get into it now.
