# rearranging content in variable

##### New Member
hi,
I'm new to Stata and have run into the following problem. The data I was given are organized like this:
var1 cat1 cat2
1.2 0 1
5 0 1
...
7 1 0
2.3 1 0
...

OK, so I think the logic behind it is easy to grasp. Now I want to subtract the values of var1 where cat1==1 from the values of var1 where cat2==1.
I thought I could just generate two new variables, filtering with an if clause, but this doesn't work, because the contents of the variables then look like this:

gen test1=var1 if cat1==1
gen test2=var1 if cat2==1

test1:
.
.
...
7
2.3
...

and test2 is the complete opposite:
1.2
5
...
.
.
...

I tried deleting the missing values with drop, but then both variables were suddenly empty. How can I subtract them properly? My overall goal is to calculate the percentage change and the standard error of the difference. Any suggestions?
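
A minimal sketch of why the two attempts described above fail, using the four rows shown in the post. The exact drop command is not given in the thread, so `drop if missing(...)` is an assumption, and `rowdiff` is a hypothetical variable name. Stata works observation by observation, and no observation holds a value in both test1 and test2.

```stata
* Toy data: the four rows shown in the post.
clear all
input var1 cat1 cat2
1.2 0 1
5.0 0 1
7.0 1 0
2.3 1 0
end

gen test1 = var1 if cat1 == 1
gen test2 = var1 if cat2 == 1

* Row-wise subtraction is missing everywhere, because each
* observation has a value in only one of the two variables.
gen rowdiff = test1 - test2
count if !missing(rowdiff)

* Assumed form of the drop attempt: drop removes whole observations,
* so after both commands no observations are left.
drop if missing(test1)
drop if missing(test2)
count
```

Both count commands report 0 on this toy data, which matches the "both variables were empty" symptom.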

#### RedOwl

##### New Member
I don't believe I understand what you want, but I'll give it a try.

You don't need two variables cat1 and cat2, because one is
determined by the other. If cat1 equals 0, then cat2 must equal 1,
and vice versa. So just use a single cat variable with values 1 or 2
(or 0 and 1 if you prefer).

Code:
* Build toy data set with single cat variable 1/2.
clear all
input var1 cat
1.2 2
5.0 2
7.0 1
2.3 1
end

list, noobs

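* Total of var1 within each category, repeated on every row of that category.
egen totcat = total(var1), by(cat)
* r(max) and r(min) are the larger and the smaller of the two group totals.
summarize totcat, meanonly
gen diff = r(max) - r(min)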

list, noobs

##### New Member
No, that is not what I meant. The data were given to me and I can't change anything; I just have to deal with it, because that is how the author set it up. But that was also not my question. I said I wanted to calculate the mean and the standard error of the difference of var1, that is: (var1 if cat1==1) - (var1 if cat2==1)

#### RedOwl

##### New Member
OK, that's a poor data structure, but this may work. It just adds
some code to create a single category variable before proceeding.

If this is not what you want, I don't understand your question.

Code:
* Build toy data set.
clear all
input var1 cat1 cat2
1.2 0 1
5.0 0 1
7.0 1 0
2.3 1 0
end

list, noobs

* Create single category variable from cat1 and cat2.
* Test == 1 explicitly: a bare "if cat1" is also true for missing values.
gen category = 1 if cat1 == 1
replace category = 2 if cat2 == 1

list, noobs

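* Total of var1 within each category, repeated on every row of that category.
egen totcat = total(var1), by(category)
* r(max) and r(min) are the larger and the smaller of the two group totals.
summarize totcat, meanonly
gen diff = r(max) - r(min)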

list, noobs
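
The code above gives the gap between the two group totals, not the difference in means with a standard error that the question asks for. A minimal sketch of one way to get those, under two assumptions that are not stated in the thread: that a two-sample t test with unequal variances is acceptable, and that the cat2==1 mean is the baseline for the percentage change. The scalar names `d_mean`, `se_diff`, and `pct_chg` are hypothetical.

```stata
* Toy data: same four rows as in the thread.
clear all
input var1 cat1 cat2
1.2 0 1
5.0 0 1
7.0 1 0
2.3 1 0
end

* Two-sample t test on var1. by() orders groups by value, so
* group 1 is cat1==0 (that is, cat2==1) and group 2 is cat1==1.
ttest var1, by(cat1) unequal

* Difference in means, (cat1==1) minus (cat2==1), and the standard
* error of that difference as stored by ttest.
scalar d_mean  = r(mu_2) - r(mu_1)
scalar se_diff = r(se)

* Percentage change relative to the cat2==1 mean (assumed baseline).
scalar pct_chg = 100 * d_mean / r(mu_1)

display "diff = " d_mean "   se = " se_diff "   pct change = " pct_chg
```

This works directly on the data as given, so cat1 and cat2 do not need to be merged first. The standard error here is for the difference in means only; a standard error for the percentage change itself would need an extra step that this sketch does not cover.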