# Merge two staggered columns to one

#### ledzep

##### Point Mass at Zero
Dear all,

I have two columns of data. Some info is on one column and some on the other. I want to merge those two columns to make a new one.

An example data will make things more clear.
Code:
col1  col2
1      NA
1      NA
1       NA
NA     5
NA     5
NA     5
2       NA
2       NA
2       NA
NA    9
NA    9
I want to merge two columns to a new column and the end result will look like:

Code:
col1  col2   col3
1      NA     1
1      NA     1
1       NA    1
NA     5      5
NA     5      5
NA     5      5
2       NA    2
2       NA    2
2       NA    2
NA    9       9
NA    9       9

For some reason if(statement) {statement1} else {statment2} is not working.

I have requested R to perform this task:
if col1 is NA then please create col3 based on the values of col2, else based on Col1 itself.

My Code
Code:
if(d.f$col1=="NA") {d.f$col3<-(d.f$col2)} else {d.f$col3<-(d.f$col1)} Many Thanks. #### Dason ##### Ambassador to the humans Is the data always numeric? If so you could use rowsum with na.rm = T to create the new column. BTW it's better practice to use "is.na" then to compare to "NA". #### ledzep ##### Point Mass at Zero Thanks for your prompt reply. The data is always numeric. rowsum won't work for me as I have formula inside the statements in my if else. I am not looking to add them (although by the example I've presented makes it look like I want to add these two columns). I also tried converting NA to 99999 and then apply the if else argument, which didn't work. Any clue? #### Link ##### Ninja say what!?! are the columns just vectors, or are they columns in a dataframe or matrix? One option is to just take the max (or min) over the two columns: Code: col3 <- pmax(col1, col2, na.rm=TRUE); OR col3 <- pmin(col1, col2, na.rm=TRUE) #### Dason ##### Ambassador to the humans ... but why do you need the ifelse? Are there cases where there are values in both columns? Then I could understand the concern. But if that's not the case then something like this should do just fine... Code: > dat <- data.frame(col1 = c(1,2,3,NA,NA,NA,2,4,2), col2 = c(NA,NA,NA,6,2,8,NA,NA,NA)) > dat$col3 <- rowSums(dat, na.rm = T)
> dat
col1 col2 col3
1    1   NA    1
2    2   NA    2
3    3   NA    3
4   NA    6    6
5   NA    2    2
6   NA    8    8
7    2   NA    2
8    4   NA    4
9    2   NA    2

#### ledzep

##### Point Mass at Zero
Exactly, there are some cases which have values in both of those columns.

#### Dason

##### Ambassador to the humans
There are a few reasons that your if/else code doesn't work. You should really think about these types of problems as indexing problems instead of trying to cobble together an if/else type situation.

Code:
> dat <- data.frame(col1 = c(1,2,3,NA,NA,NA,2,4,2), col2 = c(NA,NA,NA,6,2,8,NA,NA,1))
> dat$col3 <- dat$col1
> idx <- is.na(dat$col1) > dat$col3[idx] <- dat\$col2[idx]
> dat
col1 col2 col3
1    1   NA    1
2    2   NA    2
3    3   NA    3
4   NA    6    6
5   NA    2    2
6   NA    8    8
7    2   NA    2
8    4   NA    4
9    2    1    2
On the topic of why your if/else doesn't work: 1) Comparing NA directly to "NA" isn't going to do what you want and 2) Does it make sense to say if(c(TRUE, FALSE, TRUE)){ ... }? Not really - it ignores everything except for the first value of the vector in the if() statement.

##### Ninja say what!?!
Here's a "clean" redo of your ifelse coding:

Code:
col3 <- col1
col3[is.na(col3)] <- col2[is.na(col3)]
I haven't tested it yet but it should work.

#### Dason

##### Ambassador to the humans
The ninja got ninja'd hard by an essentially identical solution.

#### ledzep

##### Point Mass at Zero
Thanks Link. That was a very neat code. You just saved my life here.
Many Thanks to both of you (Dason and Link).