I have a messy file (does not line up in columns), where some of the columns are tab delimitated and some are comma.
My problem with the data set is reading the files with variable lengths
12 Stephen Cole, 33, Columbia, MO
5 Dave Anderson, 25*, Concord, OH
The first column is a ID (tab) the the name (comma, variable length) age (comma), active (presence of an asterisk after age), home (tab, variable length)
The * after the age indicates if they are inactive.
All the names start at column @19, but everything after that is variable lengths and column starts.
I want to read into a format where I finally get.
ID Name Age Active Home
12 Stephen Cole 33 Active Columbia, MO
5 Dave Anderson 25 Inactive Concord, OH
Thus far I have:
data marathon;
infile 'c:/file.txt' dlm=',' pad firstobs=12;
input @3 ID 3. @19 Name $CHAR13.;
Then I get stuck on how to read the rest. I am mostly thrown with how to read the asterisk next to the age as its own column. If I had that understood, I think I can handle the rest.
My problem with the data set is reading the files with variable lengths
12 Stephen Cole, 33, Columbia, MO
5 Dave Anderson, 25*, Concord, OH
The first column is a ID (tab) the the name (comma, variable length) age (comma), active (presence of an asterisk after age), home (tab, variable length)
The * after the age indicates if they are inactive.
All the names start at column @19, but everything after that is variable lengths and column starts.
I want to read into a format where I finally get.
ID Name Age Active Home
12 Stephen Cole 33 Active Columbia, MO
5 Dave Anderson 25 Inactive Concord, OH
Thus far I have:
data marathon;
infile 'c:/file.txt' dlm=',' pad firstobs=12;
input @3 ID 3. @19 Name $CHAR13.;
Then I get stuck on how to read the rest. I am mostly thrown with how to read the asterisk next to the age as its own column. If I had that understood, I think I can handle the rest.