Array problem

AKG

New Member
#1
Hello all! :wave: Thanks for any help you might have.

I've got a dataset with repeated measures for each medical ICD-9 code, and I need to make it such that the data is one row per person.

For example (this isn't actually my data):

What I have now:
Kelly 745.5
Kelly 747.0
Juanita 745.4
Lemar 744.2
Lemar 747
Lemar 755.02

What I want:
Kelly 745.5 747.0
Juanita 745.4
Lemar 744.2 747.0 755.02

I've written code for the array, but what I'm getting from that code is:
Kelly 745.5 747.0
Juanita 745.5 747.0 745.4
Lemar 745.5 747.0 745.4 744.2 747.0 755.02


The variables are dxicd9 (for the ICD-9 codes) and the max # of diagnoses per case is 116.

Here's my incorrect code:

proc import datafile='data.xls'
out=work.name
dbms=xls
replace;
getnames=yes;
run;

proc sort data=work.name;
by Person_ID;
run;

data work.singlerecord;
set work.name;
array dx_array(116) dxicd9_1-dxicd9_116;
do i = 1 to 116;
set work.singlerecord;
by person_id;
dx_array{i} = dxicd9;
if last.person_id then output;
end;
run;


It's making me a little crazy! I have NO IDEA how to even Google the problem that I'm having.

I wonder if anyone has any idea what's happening or how to fix it.

Thanks.

Amanda in Kentucky
 

AKG

New Member
#2
I forgot to mention that I got this version by tweaking my original, which gave me all missings for dxicd9 codes, which read like this:

data kbsr.singlerecord9803;
set kbsr.check;
by person_id;
retain dxicd9_1-dxicd9_116;
array dx_array(1:116) dxicd9_1-dxicd9_116;
if first.person_id then do;
do i = 1 to 116;
dx_array{i} = .;
end;
end;
dx_array = dxicd9;
if last.person_id then output;
run;
 

AKG

New Member
#3
and this code only gives me the first diagnosis in dxicd9_1 with missings for dxicd9_2-dxicd9_116!

Grrr.

data work.dataset;
set work.dataset1;
by person_id;
retain dxicd9_1-dxicd9_116;
array dxarray {1:116} dxicd9_1-dxicd9_116;
if first.person_id then do j = 1 to 116;
dxarray {j} = .;
if first.person_id then i = 0;
i + 1;
if dxicd9 ^= . then do;
dxarray {i} = dxicd9;
end;
end;
if last.person_id then output;


Does anyone have any idea what's up?

Thanks for any and all help!
 

SPR

New Member
#4
Hello Amanda,

If I understand you correctly you need the proc TRANSPOSE:
Code:
data i;
  input name $ ICD_9;
datalines;
Kelly 745.5
Kelly 747.0
Juanita 745.4
Lemar 744.2
Lemar 747
Lemar 755.02
run;
proc sort data=i;
  by name;
run;
proc transpose data=i out=r prefix=ICD_9_;
  var ICD_9;
  by name;
run;
Sincerely,
SPR
 

AKG

New Member
#5
Thank you SPR! It worked.

However, all the rest of my variables have been dropped from the data besides the person_id and the diagnoses... While today I'll just merge the diagnoses into the original rows & drop all but the first obs for each person, there is probably an easier way to retain the extra variables in the transpose, no? I don't want to list them in a keep= statement because there are 300+ variables.

(Thank you again! With 300+ vars and 65,000+ unique person_ids, your answer is a biiiig help.)

I'm going to try to figure it out how to keep the vars-- never transposed before so maybe it's no problem.

Anyway, thanks so much!!!

Amanda
 

SPR

New Member
#6
"While today I'll just merge the diagnoses into the original rows & drop all but the first obs for each person, there is probably an easier way to retain the extra variables in the transpose, no?"

Yes, it is possible solution.

SPR
 
#8
Cdps sas code from ucsd?

Is anyone working with this code that can answer some questions about the order of the code and why the code may run but no output data sets or errors? There is only the log giving a duplicate of the code and no errors