I have a dataset of articles found for a literature review. For each article I have the TITLE, YEAR of PUBLICATION, AUTHOR, and other variables. I know that there are duplicates in my dataset, and I would like to delete them based on TITLE, YEAR, and AUTHOR.
My TITLE variable is a string and it's a little bit messy. The same article can appear with slightly different titles, e.g. "Breast cancer and pregnancy" or " breast cancer and Pregnancy" (with a space at the beginning). The same is true of my AUTHOR variable.
My variable YEAR is numeric so that's fine.
How can I find my duplicates and delete them?
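To make the matching rule concrete, here is a rough sketch of the logic I have in mind, written in Python only for illustration. The field names `title`, `year`, and `author`, the helper names `normalize` and `drop_duplicates`, and the sample rows are all made up for this example. It assumes two records are duplicates when their titles and authors match after trimming spaces and ignoring case, and their years are equal. It also assumes that keeping the first occurrence is acceptable.

```python
import re


def normalize(text):
    """Collapse runs of whitespace, trim both ends, and ignore case."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def drop_duplicates(records):
    """Keep the first record for each normalized (title, year, author) key."""
    seen = set()
    unique = []
    for rec in records:
        key = (normalize(rec["title"]), rec["year"], normalize(rec["author"]))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Made-up rows for illustration only (not taken from the attached subset)
articles = [
    {"title": "Breast cancer and pregnancy", "year": 2001, "author": "Smith J"},
    {"title": " breast cancer and Pregnancy", "year": 2001, "author": "smith j"},
    {"title": "Breast cancer and pregnancy", "year": 2005, "author": "Smith J"},
]

deduped = drop_duplicates(articles)
for rec in deduped:
    print(rec["year"], repr(rec["title"]), rec["author"])
```

This only catches differences in spacing and capitalization. Titles that differ by a typo or by punctuation would still count as distinct and would need some kind of fuzzy matching.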
Please, I really need help on this one!
Thank you in advance. I am attaching a subset of my dataset.
Marvin