Stata & 'word' variables



I am working on a small research project that involves a list of articles (a few tens of thousands), each of which has keywords assigned to it.
I would like to know if Stata can do the following:
Search the full list of articles for a set of relevant keywords (between 1 and 20) and identify the articles that contain at least n of them? For example, suppose the data look like this:


Article1 - keyword1 - keyword2 - keyword3 - keyword4 - keyword5
Article2 - keyword1 - keyword2 - keyword4
Article3 - keyword3 - keyword4 - keyword5
Article4 - keyword1 - keyword5
Article5 - keyword2 - keyword5

Suppose the relevant keywords are keyword1, keyword3, and keyword5.
The result should be the articles that contain at least 2 of the 3 relevant keywords, so Stata would return:
Article1, Article3, and Article4
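
To make the logic concrete, here is a minimal sketch of the kind of thing I have in mind. It assumes one row per article, with all of an article's keywords stored in a single space-separated string variable. The variable names (article, keywords, hits) and the locals (relevant, n) are made up for illustration, and the real data may well be laid out differently.

```stata
* Toy data mirroring the example above (hypothetical layout:
* one row per article, keywords separated by single spaces)
clear
input str10 article str60 keywords
"Article1" "keyword1 keyword2 keyword3 keyword4 keyword5"
"Article2" "keyword1 keyword2 keyword4"
"Article3" "keyword3 keyword4 keyword5"
"Article4" "keyword1 keyword5"
"Article5" "keyword2 keyword5"
end

* Keywords of interest and the minimum number of matches required
local relevant keyword1 keyword3 keyword5
local n 2

* Count how many relevant keywords each article contains.
* Padding both strings with spaces forces whole-word matches,
* so keyword1 would not also match something like keyword10.
generate byte hits = 0
foreach k of local relevant {
    replace hits = hits + (strpos(" " + keywords + " ", " `k' ") > 0)
}

* Show the articles that meet the threshold
list article hits if hits >= `n'
```

The last line could be swapped for keep if hits >= `n' to reduce the dataset to the matching articles only.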

Is this possible? Can Stata work with 'words' like this?
Can it create a new list based on criteria like these?

Thank you for all your help.