
Thread: Word list search

  1. #1

    Word list search



    Hi all,

    I am a SAS beginner, and my issue is the following: I have a long list of words with associated values (e.g., frequency in spoken language) in one dataset (Word_List) and an even longer dataset of phrases (Phrase_List). I have to check whether each word from Word_List appears in each phrase in Phrase_List. If a word is found, the phrase is assigned the value of that word.

    So far, I have only managed to do this for a single hard-coded word against the Phrase_List dataset, using INDEXW. I need tips on how to do this automatically for every word, reading the words from the other dataset.

    My code for single word search:

    data ref.phrase_score;
    set ref.phrase_list;
    score=0;
    found=indexw(phrase,"able");
    if found=0 then delete;
    else score+3.56;
    keep phrase_ID phrase found score;
    run;

    I would really appreciate tips on how to do this automatically for all words from the list and using these two separate files.

    Thanks,

    Dan

  2. #2

    Re: Word list search


    I use macros when the same thing has to be done repeatedly (even thousands of times).

    Code: 
    data _null_;
    set Word_List(keep=word value) end=last;
    *CALL SYMPUTX strips leading and trailing blanks from the stored values;
    call symputx(cats("word",_N_),word);
    call symputx(cats("value",_N_),value);
    if last then call symputx("NUMWORDS",_N_);
    run;
    
    *Check out the macro variables you just defined:;
    %put _user_;
    
    *In the data step we will want to do many IF statements, so set up a macro that will cycle through all the macro variables;
    %macro wordlistcheck;
    %local n;
    /* INDEXW matches whole words only, as in the single-word code above */
    %do n=1 %to &NUMWORDS;
    if indexw(phrase,"&&word&n") then score+&&value&n;
    %end;
    %mend wordlistcheck;
    
    *Put the above macro into your data step;
    data phrase_score;
    set phrase_list;
    score=0;
    %wordlistcheck;
    if score=0 then delete;
    run;
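
    As an alternative to generating one IF statement per word, the same matching can be sketched as a PROC SQL join. This is only a sketch under assumptions: it takes the dataset and variable names from the posts above (Word_List with word and value, Phrase_List with phrase_ID and phrase) and assumes that phrase_ID identifies each phrase uniquely. Every phrase is compared with every word, so it may be slow on very large tables.

    ```sas
    proc sql;
      /* Pair every phrase with every word, keep only the pairs where the  */
      /* word occurs in the phrase as a whole word, then add up the values */
      /* of all matching words for each phrase.                            */
      create table phrase_score as
      select p.phrase_ID,
             p.phrase,
             sum(w.value) as score
      from Phrase_List as p,
           Word_List as w
      where indexw(p.phrase, strip(w.word)) > 0
      group by p.phrase_ID, p.phrase;
    quit;
    ```

    Phrases that contain none of the words have no matching pair and so do not appear in the result, which plays the role of the score=0 deletion in the data step. By default INDEXW treats only blanks as word boundaries, so a word directly followed by punctuation would not be counted; its optional third argument accepts a list of delimiter characters if that matters for your phrases.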
