Dear everyone,
I have a HH dataset consiting of about 50 or so subsets which I am trying to merge together. The following problem occures. In some of these datasets there are more observations then individuals in the survey because data on jobs has been coded as job1, job2, job3 etc... each taking a seperate row.
So in my dataset there are three ID variables and the job variable and together they form a unique ID.
Now what I need is to have the additional (job) rows transformed to variables so that I end up with exactly the number of individuals as in the other data sets. Meaning all information that is now contained in a single job row (lets say for job3) such as days worked in this job per month or hours worked in this job per day should become variables (columns).
I've drawn a rudementary graph below. Hope it will help...
Thats what I have:
# ID1 ID2 ID3 JOBID day/month hours/day
---------------------------------------------------------------------|
1 201 1 1 JOB1 # #
2 201 1 2 JOB1 # #
3 201 1 2 JOB2 # #
4 201 1 3 JOB1 # #
5 201 1 4 JOB1 # #
6 201 1 5 JOB1 # #
7 201 2 5 JOB2 # #
8 202 2 5 JOB3 # #
9 202 3 6 JOB1 # #
10 202 3 7 JOB1 # #
Should become:
# ID1 ID2 ID3 JOB1 day/month hours/day JOB2 day/month hours/day
---------------------------------------------------------------------------------
1 201 1 1 JOB1 # # JOB2 # #
2 201 1 2 JOB1 # # JOB2 # #
4 201 1 3 JOB1 # # JOB2 # #
5 201 1 4 JOB1 # # JOB2 # #
6 201 1 5 JOB1 # # JOB2 # #
9 202 3 6 JOB1 # # JOB2 # #
10 202 3 7 JOB1 # # JOB2 # #
Hope that makes sense at all.
Btw I tried using reshape but I aways get a reshape error saying that some variables are not constant within the id variables I have described.
Well thank you ever so much to whoever can help me sort that out (and of course whoever gives it a shot) you really made my day.
Best, Florian
I have a HH dataset consiting of about 50 or so subsets which I am trying to merge together. The following problem occures. In some of these datasets there are more observations then individuals in the survey because data on jobs has been coded as job1, job2, job3 etc... each taking a seperate row.
So in my dataset there are three ID variables and the job variable and together they form a unique ID.
Now what I need is to have the additional (job) rows transformed to variables so that I end up with exactly the number of individuals as in the other data sets. Meaning all information that is now contained in a single job row (lets say for job3) such as days worked in this job per month or hours worked in this job per day should become variables (columns).
I've drawn a rudementary graph below. Hope it will help...
Thats what I have:
# ID1 ID2 ID3 JOBID day/month hours/day
---------------------------------------------------------------------|
1 201 1 1 JOB1 # #
2 201 1 2 JOB1 # #
3 201 1 2 JOB2 # #
4 201 1 3 JOB1 # #
5 201 1 4 JOB1 # #
6 201 1 5 JOB1 # #
7 201 2 5 JOB2 # #
8 202 2 5 JOB3 # #
9 202 3 6 JOB1 # #
10 202 3 7 JOB1 # #
Should become:
# ID1 ID2 ID3 JOB1 day/month hours/day JOB2 day/month hours/day
---------------------------------------------------------------------------------
1 201 1 1 JOB1 # # JOB2 # #
2 201 1 2 JOB1 # # JOB2 # #
4 201 1 3 JOB1 # # JOB2 # #
5 201 1 4 JOB1 # # JOB2 # #
6 201 1 5 JOB1 # # JOB2 # #
9 202 3 6 JOB1 # # JOB2 # #
10 202 3 7 JOB1 # # JOB2 # #
Hope that makes sense at all.
Btw I tried using reshape but I aways get a reshape error saying that some variables are not constant within the id variables I have described.
Well thank you ever so much to whoever can help me sort that out (and of course whoever gives it a shot) you really made my day.
Best, Florian