merging files in a challenging situation. *ideas?*

#1
:) hi all!
I'm working on my master's thesis and I think I need some expert advice here :)
what I need to do is basically match two data files.
from what I understand, you normally need to have either exactly the same variables or exactly the same cases to merge files. unfortunately that is not my situation.
the databases are from two years (2006 and 2007), contain different variables and share *some* of the cases. this is because some of the firms to which the questionnaire was sent replied both years, but some only replied either in 2006 or in 2007.
I now need to merge the data from the two years in order to observe what happened in 2007 to a firm that had certain values in its 2006 variables.
the problem is that the cases from 2006 and 2007 only overlap for about 1/3.
is there any way to merge the files?
I would love to hear your ideas :eek:
 
#2
Dear Diana

Only in an ideal world would you always have either exactly the same variables or exactly the same cases to merge files. ;)

Do I understand correctly: In dataset 1, you have the cases/firms A, B, C, D, E, F, G - and variables var-1-2006, var-2-2006, var-3-2006 and var-4-2006?
In dataset 2, you have the cases/firms D, E, F, G, H, I, J - and variables var-1-2007, var-2-2007, var-3-2007 and var-4-2007?

I guess it should not be so difficult. If you tell me what software you are using, I may be able to tell you more.

Have a nice Sunday
S.
 
#3
exactly! this is precisely my situation. I am using SPSS 17.
thanks a lot for the quick reply and have a nice Sunday ^_^

diana
 
#4
mh. just to be precise: in dataset 1 I have var12006, var22006, var32006 and var42006. but in dataset 2 I have var52007, var62007, var72007 etc. unfortunately most of the items in the questionnaire changed over the years :/
 
#5
Technically it is straightforward: Just use the ordinary MATCH FILES command to combine the datasets. Match by firm, or rather by the id of the firm (both files have to be sorted by that id first). SPSS should create a file with all firms A to J as cases/rows that contains all variables from both files. Those firms that appear in both original files will have entries on all variables; those that are only in dataset 1 will have system missings on all variables from dataset 2 and vice versa.
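
To make the result of such a match concrete, here is a small sketch of the logic in plain Python rather than SPSS syntax. The firm ids, variable names and values are made up for illustration, and `match_by_id` is a hypothetical helper, not an SPSS function. `None` stands in for a system-missing value.

```python
# Hypothetical data: firm id -> that year's variables (names loosely follow the thread).
data_2006 = {
    "A": {"var1_2006": 3, "var2_2006": 5},
    "D": {"var1_2006": 1, "var2_2006": 4},
    "E": {"var1_2006": 2, "var2_2006": 2},
}
data_2007 = {
    "D": {"var5_2007": 7, "var6_2007": 0},
    "E": {"var5_2007": 6, "var6_2007": 1},
    "H": {"var5_2007": 9, "var6_2007": 3},
}


def match_by_id(left, right):
    """Combine two id-keyed tables, keeping every id found in either one."""
    left_vars = sorted({v for row in left.values() for v in row})
    right_vars = sorted({v for row in right.values() for v in row})
    merged = {}
    for firm in sorted(set(left) | set(right)):
        # Start with every variable missing, then fill in what each file has.
        row = {v: None for v in left_vars + right_vars}
        row.update(left.get(firm, {}))
        row.update(right.get(firm, {}))
        merged[firm] = row
    return merged


merged = match_by_id(data_2006, data_2007)
for firm, row in merged.items():
    print(firm, row)
```

Firms D and E end up with values on all variables, while A is missing on the 2007 variables and H is missing on the 2006 ones.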

Depending on your research question and the sampling schemes, you might want to leave the firms that replied in only one year out of all analyses. In any case, rather not delete them, because you might want some information on the cases you had to leave out and whether they differ from those you could keep.
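
One way to keep those cases while still excluding them from the analyses is to add a flag and filter on it. The sketch below shows the idea in plain Python, not SPSS syntax; the rows, the variable names and the helper `add_panel_flag` are all hypothetical. It assumes that a firm which replied in a given year has at least one non-missing value for that year.

```python
# Hypothetical merged rows: None marks a value missing because the firm did not reply that year.
merged = {
    "A": {"var1_2006": 3, "var5_2007": None},
    "D": {"var1_2006": 1, "var5_2007": 7},
    "H": {"var1_2006": None, "var5_2007": 9},
}


def add_panel_flag(rows, vars_2006, vars_2007):
    """Mark each firm with whether it has any data in both years."""
    for row in rows.values():
        in_2006 = any(row[v] is not None for v in vars_2006)
        in_2007 = any(row[v] is not None for v in vars_2007)
        row["in_both_years"] = in_2006 and in_2007
    return rows


add_panel_flag(merged, ["var1_2006"], ["var5_2007"])

# Analyse only the flagged firms, but keep the full table for comparing dropouts.
panel = {firm: row for firm, row in merged.items() if row["in_both_years"]}
dropped = sorted(set(merged) - set(panel))
print(sorted(panel), dropped)
```

The full table stays intact, so the firms that were left out can still be compared with those that were kept.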

If you are not using the commands but the menu bar, search the menu "data" or similar - I only have the German SPSS, so I can't tell you where to find it. If you find an entry that says something like "add variables", it should be the right one.

You wrote that most of the items in the questionnaire changed over the years. I hope some remained the same or at least comparable, so your plan will work out!
 