I am working on a small research project that involves a list of articles (a few tens of thousands) that have keywords assigned to them.
I would like to know if Stata can do the following:
Search the list of all articles for a set of relevant keywords (between 1 and 20 of them) and identify the articles that contain at least n of those keywords?
For example, suppose the articles are:
Article1 - keyword1 - keyword2 - keyword3 - keyword4 - keyword5
Article2 - keyword1 - keyword2 - keyword4
Article3 - keyword3 - keyword4 - keyword5
Article4 - keyword1 - keyword5
The relevant keywords would then be: keyword1, keyword3, and keyword5.
The result should be every article that has at least 2 of the 3 relevant keywords, so Stata would return:
Article1, Article3, and Article4
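To make the logic I am after concrete, here is a minimal sketch in Python (not Stata, which is what I actually need). The data layout, the function name `articles_with_enough_matches`, and the threshold parameter `n` are only my assumptions for illustration; I do not know how this would best be expressed in Stata.

```python
# Toy data from the example above: article name -> set of assigned keywords.
articles = {
    "Article1": {"keyword1", "keyword2", "keyword3", "keyword4", "keyword5"},
    "Article2": {"keyword1", "keyword2", "keyword4"},
    "Article3": {"keyword3", "keyword4", "keyword5"},
    "Article4": {"keyword1", "keyword5"},
}

# The keywords I consider relevant (hypothetical choice from the example).
relevant = {"keyword1", "keyword3", "keyword5"}


def articles_with_enough_matches(articles, relevant, n):
    """Return the names of articles that have at least n relevant keywords."""
    return [
        name
        for name, keywords in articles.items()
        if len(keywords & relevant) >= n  # size of the set intersection
    ]


selected = articles_with_enough_matches(articles, relevant, n=2)
print(selected)  # ['Article1', 'Article3', 'Article4']
```

Article2 is dropped because it matches only keyword1, which is below the threshold of 2.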
Can this be done? Can Stata work with 'words' (text values) like this?
Can it create a new list of the articles that meet this kind of criterion?
Thank you for all your help.