Deleting Identical Observations

#1
Hello All,

I have a data set containing how much each country exports to other countries. For example, I have how much Algeria exports to Venezuela, Algeria to Congo .... for every country pair. But, I want to find TOTAL exports for each country pair. For example, I want to have one observation which is the sum of Algeria's exports to Venezuela and Venezuelas exports to Argentina. Basically, I want to cut my observations in half.

I need to somehow make a country pair code for each pair of countries and drop the duplicates. Does anybody have any idea how I can go about this? Thank you very much for the help,

Nick
 

bukharin

RoboStataRaptor
#2
Here's one way to do it. Assume your variables are called -exporter-, -importer- and -amount-. We'll use fictional data for countries A, B, C and D. First create new country variables called -country1- and -country2- which are the same as -exporter- and -importer- but in alphabetical order. This means that every country pair will have the same combination of -country1- and -country2-. Then, calculate the total of -amount- for each country pair using -egen-. Finally, drop the old variables and delete duplicates.

Code:
. input str1 exporter str1 importer amount

      exporter   importer     amount
  1. "A" "B" 50
  2. "A" "C" 20
  3. "A" "D" 100
  4. "B" "A" 100
  5. "B" "C" 60
  6. "B" "D" 80
  7. "C" "A" 15
  8. "C" "B" 40
  9. "C" "D" 90
 10. "D" "A" 75
 11. "D" "B" 100
 12. "D" "C" 65
 13. end

. list, clean noobs

    exporter   importer   amount  
           A          B       50  
           A          C       20  
           A          D      100  
           B          A      100  
           B          C       60  
           B          D       80  
           C          A       15  
           C          B       40  
           C          D       90  
           D          A       75  
           D          B      100  
           D          C       65  

. gen country1=exporter 

. replace country1=importer if importer<exporter 
(6 real changes made)

. gen country2=exporter 

. replace country2=importer if importer>exporter
(6 real changes made)

. list, clean noobs

    exporter   importer   amount   country1   country2  
           A          B       50          A          B  
           A          C       20          A          C  
           A          D      100          A          D  
           B          A      100          A          B  
           B          C       60          B          C  
           B          D       80          B          D  
           C          A       15          A          C  
           C          B       40          B          C  
           C          D       90          C          D  
           D          A       75          A          D  
           D          B      100          B          D  
           D          C       65          C          D  

. egen trade=total(amount), by(country1 country2)

. list, clean noobs

    exporter   importer   amount   country1   country2   trade  
           A          B       50          A          B     150  
           A          C       20          A          C      35  
           A          D      100          A          D     175  
           B          A      100          A          B     150  
           B          C       60          B          C     100  
           B          D       80          B          D     180  
           C          A       15          A          C      35  
           C          B       40          B          C     100  
           C          D       90          C          D     155  
           D          A       75          A          D     175  
           D          B      100          B          D     180  
           D          C       65          C          D     155  

. drop exporter importer amount

. duplicates drop

Duplicates in terms of all variables

(6 observations deleted)

. list, clean noobs

    country1   country2   trade  
           A          B     150  
           A          C      35  
           A          D     175  
           B          C     100  
           B          D     180  
           C          D     155