slicing and dicing a large dataset

I'm new to SAS (and programming in general) so I was hoping someone could help me out.

I have a ridiculously large data set that I am analyzing, and I want to pare it down a bit. I know what I need to do, just not how.

First, there is a subject ID variable. It is in one of the following formats:


I am only interested in the subjects with 8-digit codes, so how could I create a dataset that drops the other two types?

Second, there is a string of variables for a mental test. I only want subjects who actually took the test on all administrations of it. Is there some method to keep records that have data in all of the variables and drop the rest?

Thanks for any help!


Dark Knight
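1. I assume subjectid is a character variable.

You can use the SUBSTR function in an IF or WHERE condition, for example
if/where SUBSTR( subjectid, 9,1 ) ne ''
Note that this is true only when position 9 holds a non-blank character, so it keeps IDs longer than 8 characters. Adjust the position and the comparison to match your ID formats.

Or use the LENGTH function to test the length of the ID directly.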
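
As a rough sketch of the LENGTH approach: the data set names have and want are made up for illustration, and I am assuming subjectid is character and that "8 digit" means exactly eight characters, all of them digits. Adjust to whatever your real formats look like.

```sas
/* Sketch only: "have" and "want" are hypothetical data set names. */
data want;
    set have;
    /* STRIP removes leading and trailing blanks before measuring.  */
    /* NOTDIGIT returns 0 when every character is a digit.          */
    if length(strip(subjectid)) = 8
       and notdigit(strip(subjectid)) = 0;
run;
```

The IF with no THEN is a subsetting IF, so only observations that pass the test are written to want.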

2. How many string variables do you have? One way is to write out the condition for each variable separately (var1 ne '' and var2 ne '' ... etc.).
Or, if all your string variables are of the same type (all character or all numeric), you can use an array and simplify the conditions.
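
Something like the following might work for the array version. The names test1-test5 are placeholders since I don't know what your test variables are called, and have/want are again hypothetical data set names.

```sas
/* Sketch only: test1-test5 stand in for the real test variables. */
data want;
    set have;
    /* The SET statement comes first so the array picks up the     */
    /* existing variables with their existing type.                */
    array tests {*} test1-test5;
    do i = 1 to dim(tests);
        /* MISSING works for both character and numeric values.    */
        if missing(tests{i}) then delete;
    end;
    drop i;
run;
```

If you would rather skip the loop, I believe the CMISS function can do the same check in one line: if cmiss(of test1-test5) = 0;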