I'll try to explain what my problem is.
I am cleaning data and I have one question where multiple ticks are allowed.
The question I am dealing with now is about the way in which water is purified.
There are 6 variables: one of them is "I don't do anything to purify the water", and the others are "I do purify water in this way...".
Since a high number of people do not purify the water and this is relevant information, I would like to create one variable where 1 stands for "I do not purify the water" and 2 for "I do purify the water".
However, since multiple ticks were allowed, if I simply do
gen x = .
replace x = 1 if (pur_a == 1 | pur_b == 2 | pur_c == 3 )
the frequency that I get is actually lower than the number of answers. So basically Stata assigns only one answer per person, not more than one, even when a person ticked several options. This is why I get a lower frequency.
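To make the mismatch concrete, here is a minimal sketch with made-up data (not my real survey). It assumes, as in the command above, that pur_a is 1, pur_b is 2 and pur_c is 3 when ticked, and that they are missing otherwise; the missing coding and the names id, any_pur and n_ticks are only for illustration.

```stata
* Toy data, invented for illustration: 4 respondents, 3 purification methods.
* Assumption: a ticked option holds its own code (1, 2 or 3), else missing.
clear
input id pur_a pur_b pur_c
1 1 2 .
2 1 . .
3 . . .
4 . 2 3
end

* One value per respondent: 1 if at least one method was ticked, else 0.
gen byte any_pur = (pur_a == 1 | pur_b == 2 | pur_c == 3)

* Number of ticks per respondent (counts non-missing values across the row).
egen n_ticks = rownonmiss(pur_a pur_b pur_c)

* Respondents who ticked at least one method: 3 in this toy data.
count if any_pur == 1

* Total number of ticks across all respondents: 5 in this toy data.
quietly summarize n_ticks
display r(sum)
```

So a single indicator counts people (3 here), while the number of answers counts ticks (5 here), and the two only agree if nobody ticks more than one option.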
Is there a way to work this out and get the actual frequency?
I hope I made the point clear... it is a bit of a headache indeed!
Hope to hear from you guys!
belfagor