Overlaying two histograms/lines?

#1
Hi all!


I am going nuts trying to make a graph that looks to me like the most basic graph ever.
Basically I have two variables: the score that the children got at T1 and the score that the children got at T2.
I simply want the frequency distributions of these 2 variables on the same graph. This seems like it should be easy, but for some reason I can't manage to find the answer on the internet! Most of the solutions I found are for one variable, but I have two.

I hope you can help me...
thanks!
 
#2
Perhaps the following adapted from http://www.survey-design.com.au/Stata%20Graphs.html (with minor modifications) will help.

*** BEGIN STATA CODE ***

sysuse auto, clear
* Following must be on a single line:
twoway (histogram mpg if !foreign, start(10) width(2) freq bfcolor(none) blcolor(brown)) (histogram mpg if foreign, freq start(10) width(2) barw(1.8) bfcolor(none) blcolor(navy)), legend(order(1 "Domestic" 2 "Foreign") pos(2) ring(0) col(1)) scheme(s1color)

*** END STATA CODE ***
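
The example above splits one variable (mpg) by group. For two separate variables, as in the original question, the same pattern should carry over by naming a different variable in each histogram. Here is a minimal sketch, assuming hypothetical variables score_t1 and score_t2; they are simulated here only so the code runs on its own, and the start() and width() values are illustrative. It is written for a do-file, where /// continues a command across lines.

```stata
clear
set seed 12345
set obs 200

* Hypothetical stand-ins for the real T1 and T2 scores (0 to 40 points)
generate score_t1 = rbinomial(40, 0.5)
generate score_t2 = rbinomial(40, 0.6)

* Same start() and width() in both histograms so the bins line up;
* the slightly narrower barw() keeps both outlines visible
twoway (histogram score_t1, start(0) width(2) freq bfcolor(none) blcolor(brown)) ///
       (histogram score_t2, start(0) width(2) freq barw(1.8) bfcolor(none) blcolor(navy)), ///
       legend(order(1 "T1" 2 "T2") pos(2) ring(0) col(1)) xtitle("Score")
```

With real data, drop the simulation lines and set start() at or below the smallest observed score.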
 
#3
If you want to superimpose normal curves on the two histograms, it becomes a little more complicated. Although Stata can easily overlay a normal distribution on a free-standing histogram with the normal option, that option is not supported for overlaid histograms. Instead, we have to use function plots with normal density arguments.

Here's an example of some further modified code to do that.

*** BEGIN STATA CODE ***

sysuse auto, clear

quietly summarize mpg if foreign
local var1mean: display %6.2f r(mean)
local var1sd: display %6.2f r(sd)
local var1min: display %6.2f r(min)
local var1max: display %6.2f r(max)
quietly summarize mpg if !foreign
local var2mean : display %6.2f r(mean)
local var2sd: display %6.2f r(sd)
local var2min: display %6.2f r(min)
local var2max: display %6.2f r(max)

* density (not fraction) puts the bars on the same scale as normalden(); the twoway command must be on a single line:
twoway (histogram mpg if !foreign, start(10) width(2) density bfcolor(none) blcolor(brown)) (histogram mpg if foreign, density start(10) width(2) barw(1.8) bfcolor(none) blcolor(navy)) (function y=normalden(x, `var1mean', `var1sd') , range(`var1min' `var1max') lcolor(navy)) (function y=normalden(x, `var2mean', `var2sd') , range(`var2min' `var2max') lcolor(brown)) , xlabel(10(5)45, labsize(small)) ylabel(, angle(horizontal) labsize(vsmall) format(%3.2f)) legend(order(1 "Domestic" 2 "Foreign") pos(2) ring(0) col(1)) note(" " "MPG Domestic: {it:M} = `var2mean' {it:SD} = `var2sd'" "MPG Foreign: {it:M} = `var1mean' {it:SD} = `var1sd'", size(vsmall)) scheme(s1color)

*** END STATA CODE ***


If you just want the normal density lines overlaid without the histograms, you can use the following (after loading the data and creating the locals as above).

*** BEGIN STATA CODE ***

twoway (function y=normalden(x, `var2mean', `var2sd') , range(`var2min' `var2max') lcolor(brown)) (function y=normalden(x, `var1mean', `var1sd') , range(`var1min' `var1max') lcolor(navy)) , xlabel(10(5)45, labsize(small)) ylabel(, angle(horizontal) labsize(vsmall) format(%3.2f)) legend(order(1 "Domestic" 2 "Foreign") pos(2) ring(0) col(1)) note(" " "MPG Domestic: {it:M} = `var2mean' {it:SD} = `var2sd'" "MPG Foreign: {it:M} = `var1mean' {it:SD} = `var1sd'", size(vsmall)) ytitle("") xtitle("MPG") scheme(s1color)

*** END STATA CODE ***
 
#4
Thanks sooo much! it works perfectly and you can superimpose even more than 2 histograms/normal density lines!!!
Thank you so much, this is so coooool!
 

bukharin
#5
An alternative (and simpler) approach is to use a kernel density estimator, for example:
Code:
sysuse auto, clear
twoway kdensity mpg if !foreign || kdensity mpg if foreign
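
Since both curves are estimated from mpg, the default legend may not tell the two groups apart. A sketch of the same command with explicit legend labels and line colours follows; the labels, colours and axis titles are illustrative choices, not anything required by kdensity, and /// continues the command across lines in a do-file.

```stata
sysuse auto, clear

* Same two kernel density curves, with each line coloured and labelled by group
twoway (kdensity mpg if !foreign, lcolor(brown)) ///
       (kdensity mpg if foreign, lcolor(navy)), ///
       legend(order(1 "Domestic" 2 "Foreign")) xtitle("MPG") ytitle("Density")
```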
 
#6
Hi Bukharin,

Is there a way to have bars instead of lines, something like a regular histogram? I have a similar dataset with pre and post scores, but I also have a control group and an experimental group. I hope you can help me with this.
Thank you!
Marvin