Avoiding double counting from merged datasets

#1
Hi there,

I have merged two client datasets. The first contains demographic information about the individuals; the second contains those same individuals' stock purchases over time.

When I merge them on the unique person identifier, I naturally end up with several rows per client (e.g. one for each purchase of a specific stock or each addition to a position).

When I run simple cross-tabs, I count the same individuals more than once, e.g. with tab sex stockpurchase or tab economicstatus sex.

What is the best way to avoid this double counting?

Thanks for any help on this.

jamieb
 

bukharin

#2
There are various techniques. A simple one is to "tag" one row per person and restrict the tabulation to the tagged rows:
Code:
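* uniq is 1 on exactly one row per personid and 0 on all other rows
egen uniq = tag(personid)
* restrict the cross-tab to tagged rows so each person is counted once
tab sex stockpurchase if uniq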
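
To see what the tag does, here is a small sketch on made-up data. The toy values and the names first and pairtag are only for illustration; personid, sex, economicstatus and stockpurchase follow the names in your post. It also shows one of the other techniques (building the flag by hand with bysort), and a variant I am assuming may be closer to what you want when the variable changes within a person, as stockpurchase presumably does: tagging each person-stock pair once.

```stata
* Toy data: three people, several purchase rows each (values are made up)
clear
input personid sex economicstatus stockpurchase
1 1 2 10
1 1 2 11
1 1 2 10
2 2 1 10
3 1 3 12
3 1 3 12
end

* Technique 1: egen's tag() flags exactly one row per person
egen byte uniq = tag(personid)
tab economicstatus sex if uniq

* Technique 2: build the same kind of flag by hand
bysort personid: gen byte first = (_n == 1)
count if first
assert r(N) == 3

* Variant: flag each person-stock combination once, so a person who
* bought the same stock repeatedly counts once for that stock
egen byte pairtag = tag(personid stockpurchase)
tab sex stockpurchase if pairtag
```

Note that with a plain tag(personid), the row that gets tagged for each person depends on the sort order, so the value of stockpurchase on that row is essentially arbitrary. For person-level variables such as sex or economicstatus that does not matter.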