# iterative data subsetting

#### cruzeconomics

##### New Member
I have 15 data sets that I want to take specific observations from and create a new data structure with them. Rather than doing all of this by hand, I am trying to make a function that will do it for me. This is what I have thus far:

Code:
function(){

coding <- 7

for (i in 7:99){

temp84<-subset(fars84, MAKE == coding) # Here I take the original data
temp85<-subset(fars85, MAKE == coding) # structure (farsxx) and grab the
temp86<-subset(fars86, MAKE == coding) # observations of interest.
temp87<-subset(fars87, MAKE == coding)
temp88<-subset(fars88, MAKE == coding)
temp89<-subset(fars89, MAKE == coding)
temp90<-subset(fars90, MAKE == coding)
temp91<-subset(fars91, MAKE == coding)
temp92<-subset(fars92, MAKE == coding)
temp93<-subset(fars93, MAKE == coding)
temp94<-subset(fars94, MAKE == coding)
temp95<-subset(fars95, MAKE == coding)
temp96<-subset(fars96, MAKE == coding)

ttemp84<-temp84[vars] #this is just formatting because each fars data
ttemp85<-temp85[vars] #structure has different variable dimensions
ttemp86<-temp86[vars]
ttemp87<-temp87[vars]
ttemp88<-temp88[vars]
ttemp89<-temp89[vars]
ttemp90<-temp90[vars]
ttemp91<-temp91[vars]
ttemp92<-temp92[vars]
ttemp93<-temp93[vars]
ttemp94<-temp94[vars]
ttemp95<-temp95[vars]
ttemp96<-temp96[vars]

make_[i] <<- rbind(ttemp84,ttemp85,ttemp86,ttemp87,ttemp88,ttemp89,ttemp90,ttemp91,ttemp92,ttemp93,ttemp94,ttemp95,ttemp96)

#Here is where I am having trouble, I want to make
#a new data structure for every iteration of the for
#loop and have it called make_i for each iteration of i

coding <- coding+1

}

}
As I wrote in the comments above, I am having trouble turning each iteration into a new data structure with a unique name. I know what I have is definitely not the way to do it, but I don't have the foggiest idea how to do it. For example, I want a data structure called make_8 that holds all of the observations for which MAKE == 8 across the fars84 through fars96 data sets, and similarly for make_9, etc.

Any help would be greatly appreciated.

Edit: problem solved using assign()
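
The final code was not posted, so the following is only a guess at how assign() might be used for this. The two small fars data frames and the vars vector are made-up stand-ins, not the real FARS data.

```r
# Hypothetical toy stand-ins for the yearly data sets
fars84 <- data.frame(MAKE = c(7, 8, 8), YEAR = 84, EXTRA = 1:3)
fars85 <- data.frame(MAKE = c(7, 7, 9), YEAR = 85)
vars <- c("MAKE", "YEAR")  # assumed: the columns every year has in common

for (i in 7:9) {
  # Keep only the rows for this make, restricted to the shared columns
  combined <- rbind(subset(fars84, MAKE == i)[vars],
                    subset(fars85, MAKE == i)[vars])
  # assign() creates a variable whose name is built at run time
  assign(paste0("make_", i), combined)
}

nrow(make_7)  # rows with MAKE == 7 across both toy years
```

Inside a function, assign() writes to that function's environment by default, so an envir argument (for example envir = .GlobalEnv) would be needed to get the effect the original `<<-` was aiming for.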

#### Dason

You mention that you solved your problem using assign. A better way to do it that avoids some of that complication would be to just use a list to store the data.
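
As a rough sketch of the list idea, each make's combined data frame can be stored as a named list element instead of a separate variable. The toy fars data frames, vars, and the name make_list are assumptions made up for illustration.

```r
# Hypothetical toy stand-ins for the yearly data sets
fars84 <- data.frame(MAKE = c(7, 8, 8), YEAR = 84, EXTRA = 1:3)
fars85 <- data.frame(MAKE = c(7, 7, 9), YEAR = 85)
vars <- c("MAKE", "YEAR")  # assumed: the columns every year has in common

make_list <- list()
for (i in 7:9) {
  # One list element per make, named make_7, make_8, ...
  make_list[[paste0("make_", i)]] <- rbind(
    fars84[fars84$MAKE == i, vars],
    fars85[fars85$MAKE == i, vars]
  )
}

names(make_list)
make_list[["make_8"]]  # all MAKE == 8 rows from both toy years
```

Everything then lives in one object, so it can be passed to functions or looped over without building variable names from strings.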

#### bryangoodrich

##### Probably A Mammal
Yeah, you need to learn about R lists. That long rbind command could be accomplished with a simple

Code:
do.call("rbind", myListObject)
Easy, right? You just need to create that list object, each element of which would be one of the data frames that make up the final data set. There are a number of ways to go about it, but it really depends on how your data is stored originally. One way, not necessarily a good one, is similar to your approach:

Code:
myListObject <- list()
myListObject$someframe <- ...
myListObject$someotherframe <- ...
...
df <- do.call("rbind", myListObject)
There's really no reason to create named list elements in this case, though. Instead, myListObject[[1]] <- ..., etc. would suffice (different story if you were cbinding stuff).
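
To make that concrete, here is a minimal sketch of filling the list with unnamed elements and binding them in one call. The three toy fars data frames, vars, and coding are assumptions standing in for the real data.

```r
# Hypothetical toy stand-ins for three of the yearly data sets
fars84 <- data.frame(MAKE = c(7, 8), YEAR = 84, EXTRA = c("a", "b"))
fars85 <- data.frame(MAKE = c(7, 7), YEAR = 85)
fars86 <- data.frame(MAKE = c(9, 7), YEAR = 86, OTHER = c(TRUE, FALSE))
vars <- c("MAKE", "YEAR")  # assumed: the columns every year has in common
coding <- 7

# Collect the yearly frames in a list; no element names are needed.
# mget(paste0("fars", 84:86)) would collect them by name instead.
years <- list(fars84, fars85, fars86)

# Subset every year the same way, giving a list of small data frames
myListObject <- lapply(years, function(d) d[d$MAKE == coding, vars])

# rbind all list elements at once, however many there are
df <- do.call("rbind", myListObject)
nrow(df)
```

Adding another year then only means adding one more element to the list, rather than another pair of temp lines and a longer rbind call.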

#### bryangoodrich

##### Probably A Mammal
A better approach is to use factors. Get all your data into one data frame with a factor column MAKE. Then, if you need to operate on the frame for a given make, you can temporarily subset it as you already know how. If you're applying the same operation to every make, store the data in a list using

Code:
myList <- split(myFrame, myFrame$MAKE) # Creates a list with data frame elements for each MAKE
Then you can use things like lapply to operate on it (avoid for loops here). Alternatively, you can work off the frame directly with by (or tapply, if you only need a single column), which do the splitting for you:

Code:
myResults <- by(myFrame, myFrame$MAKE, someFunctionToDoStuff) # by() hands each MAKE's rows to the function as a data frame; tapply() needs a single vector rather than a whole frame
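
Here is a small sketch of the split-and-apply idea on a made-up frame. The values in myFrame and the summary functions are hypothetical, chosen only to show the shape of each result.

```r
# Hypothetical combined frame with MAKE stored as a factor
myFrame <- data.frame(MAKE = factor(c(7, 7, 8, 9, 9, 9)),
                      YEAR = c(84, 85, 84, 84, 85, 86))

# One data frame per make, named after the factor levels
myList <- split(myFrame, myFrame$MAKE)
names(myList)

# Apply the same operation to every make without a for loop
rowsPerMake <- sapply(myList, nrow)

# by() splits the whole frame and passes each piece as a data frame
latestYear <- by(myFrame, myFrame$MAKE, function(d) max(d$YEAR))

# tapply() splits a single vector by the grouping factor
countPerMake <- tapply(myFrame$YEAR, myFrame$MAKE, length)
```

Keeping one frame with a MAKE factor means a per-make data set such as myList[["8"]] is always one split() away, so there is no need to maintain separate make_i objects.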