Thread: problem about set operation and computation after split

  1. #1
    Hi,
    I've run into some problems in R; please help me.
    1. How can I intersect several groups in one list without a loop statement? (I think the object is a list.)
    create data:
    Code: 
       myData <- data.frame(product = c(1,2,3,1,2,3,1,2,2), year=c(2009,2009,2009,2010,2010,2010,2011,2011,2011),value=c(1104,608,606,1504,508,1312,900,1100,800))
       mySplit<- split(myData,myData$year)  
       mySplit
    $`2009`
      product year value
    1       1 2009  1104
    2       2 2009   608
    3       3 2009   606
    
    $`2010`
      product year value
    4       1 2010  1504
    5       2 2010   508
    6       3 2010  1312
    
    $`2011`
      product year value
    7       1 2011   900
    8       2 2011  1100
    9       2 2011   800
    I want to get the intersection of product across all years. I know the basic method is:
    Code: 
        intersect(intersect(mySplit[[1]]$product, mySplit[[2]]$product),mySplit[[3]]$product)
    This gives the correct answer:
    Code: 
        [1] 1 2
    The code above is not reusable, so it should use a for loop:
    Code: 
        myIntersect<-mySplit[[1]]$product
        # seq_len() avoids relying on 1:length(x)-1, which parses as (1:length(x))-1
        for (i in seq_len(length(mySplit)-1)){ 
            myIntersect<-intersect(myIntersect,mySplit[[i+1]]$product)
        }
    That is correct too, but still too complex, so my question is:
    Can I do the same thing with a single intersect-like function (without for/repeat/while)?
    What is that function's name?

    2. How do I do a relative computation after the split (note: not before the split)?
    create data:
    Code: 
       myData1 <- data.frame(product = c(1,2,3,1,2,3), year=c(2009,2009,2009,2010,2010,2010),value=c(1104,608,606,1504,508,1312),relative=0)
       mySplit1<- split(myData1,myData1$year)  
       mySplit1
    $`2009`
      product year value relative
    1       1 2009  1104        0
    2       2 2009   608        0
    3       3 2009   606        0
    
    $`2010`
      product year value relative
    4       1 2010  1504        0
    5       2 2010   508        0
    6       3 2010  1312        0
    I want to compute the relative value within every group, that is, the difference from the previous row's value. The result should look like this:
    Code: 
    $`2009`
      product year value relative
    1       1 2009  1104        0
    2       2 2009   608     -496
    3       3 2009   606       -2
    
    $`2010`
      product year value relative
    4       1 2010  1504        0
    5       2 2010   508     -996
    6       3 2010  1312      804
    I think a loop may work, but is there no direct method for a list?

    3. How do I sort after the split? As in the question above, I want each group sorted by value:
    Code: 
       $`2009`
      product year value relative
    3       3 2009   606        0
    2       2 2009   608        0
    1       1 2009  1104        0
    $`2010`
      product year value relative
    5       2 2010   508        0
    6       3 2010  1312        0
    4       1 2010  1504        0
    4. How do I filter after the split? Again as in the question above, I want to keep only the rows whose value is more than 1000:
    Code: 
    $`2009`
      product year value relative
    1       1 2009  1104        0
    $`2010`
      product year value relative
    4       1 2010  1504        0
    6       3 2010  1312        0
    Last edited by bestbird7788; 06-06-2012 at 02:09 AM.

  2. #2
    can anyone help me?

  3. #3
    Dason

    You are asking a lot of questions that are each fairly simple. Try asking just one question at a time. I would also suggest visiting this thread and finding some good intro to R material.

    Here is some code that should help with your first two questions

    Code: 
    myData <- data.frame(product = c(1,2,3,1,2,3,1,2,2), year=c(2009,2009,2009,2010,2010,2010,2011,2011,2011),value=c(1104,608,606,1504,508,1312,900,1100,800))
    mySplit<- split(myData,myData$year)  
    
    # Only grab the products
    productList <- split(myData$product, myData$year)
    # Reduce repeatedly applies intersect across the list
    Reduce(intersect, productList)
    
    
    for(i in seq_along(mySplit)){
      mySplit[[i]]$relative <- c(0, diff(mySplit[[i]]$value))
    }
    mySplit
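
    Questions 3 and 4 from the first post can be handled the same way, one group at a time. This is only a minimal sketch, assuming the myData1 example from the first post; the names sortedSplit and filteredSplit are made up for illustration.

```r
# Example data from the first post, including the 'relative' column.
myData1 <- data.frame(
  product  = c(1, 2, 3, 1, 2, 3),
  year     = c(2009, 2009, 2009, 2010, 2010, 2010),
  value    = c(1104, 608, 606, 1504, 508, 1312),
  relative = 0
)
mySplit1 <- split(myData1, myData1$year)

# Question 3: sort every group by value (ascending).
# order() returns the row positions that put 'value' in increasing order.
sortedSplit <- lapply(mySplit1, function(d) d[order(d$value), ])

# Question 4: keep only the rows whose value is greater than 1000.
filteredSplit <- lapply(mySplit1, function(d) d[d$value > 1000, ])

sortedSplit
filteredSplit
```

    lapply still visits each group in turn internally; it just hides the loop bookkeeping and returns a new list with the same names.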
    "His programming is malfunctioning. It begins! Get your weapons, he's going to become a killbot!!!" - bryangoodrich

  4. #4
    Very grateful, thank you. I will follow your advice.
    Answer 1 is perfect. I had never known about the function Reduce; I need structural functions like Reduce or do.call.
    Answer 2 is, um... do we have to use a loop expression? I mean, the for expression is already very simple here, but is there some expression like Reduce that can edge out the loop?
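
    One possible loop-free form of answer 2 is sketched below. It assumes the same myData example as above; withRelative is a made-up name, and ave() is shown only as an alternative that skips the split entirely.

```r
# Example data from the first post.
myData <- data.frame(
  product = c(1, 2, 3, 1, 2, 3, 1, 2, 2),
  year    = c(2009, 2009, 2009, 2010, 2010, 2010, 2011, 2011, 2011),
  value   = c(1104, 608, 606, 1504, 508, 1312, 900, 1100, 800)
)
mySplit <- split(myData, myData$year)

# lapply calls the function once per group and collects the results in a list.
# Inside the function, d is one group's data frame.
withRelative <- lapply(mySplit, function(d) {
  d$relative <- c(0, diff(d$value))  # first row gets 0, the rest get value - previous value
  d                                  # return the modified group
})
withRelative

# Alternative without splitting: ave() applies FUN within each year
# and returns a vector in the original row order.
myData$relative <- ave(myData$value, myData$year,
                       FUN = function(v) c(0, diff(v)))
myData
```

    Both forms compute the same differences as the for loop; which one reads more clearly is a matter of taste.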

  5. #5
    Dason

    Sure, there are probably ways to get rid of the loop. But can I ask you a question: why?

    No offense but you don't seem to be the best with R at the moment and a loop is a very clear direct way to do what you want to do. Before you focus on optimizing the crap out of everything your focus should just be getting things done.

  6. #6
    OK, it seems I should tell you the secret: er... my small team and I are not good at programming, and statements such as while, for, and loop puzzle me.
    I want to avoid loops, but if statements and functions are acceptable.
    I'm good (I think) at Excel, but not VBA, and I know the business; that explains it all.


  7. #7
    trinker


    Dason's loop is likely to be pretty darn fast. You may have heard that loops are slow in R. That is true when you compare a loop to a vectorized solution, which I don't think applies in this case. We could use an lapply solution, but the loop is likely as fast if not faster (the slow loop issue was addressed a few releases of R ago). You could compile the loop later on to gain even more speed if you wish.

    Also, Reduce is pretty nice but it is also pretty slow. Its elegance comes with a price: speed.
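
    For anyone who wants to check this on their own machine, here is a rough sketch of how the loop, a compiled copy of it, and an lapply version could be timed. The function names are made up for illustration, and the timings will vary by machine and R version. Recent R versions (3.4.0 and later) byte-compile functions automatically, so cmpfun may make little visible difference there.

```r
library(compiler)  # ships with base R; provides cmpfun()

myData <- data.frame(
  product = c(1, 2, 3, 1, 2, 3, 1, 2, 2),
  year    = c(2009, 2009, 2009, 2010, 2010, 2010, 2011, 2011, 2011),
  value   = c(1104, 608, 606, 1504, 508, 1312, 900, 1100, 800)
)
mySplit <- split(myData, myData$year)

# The for-loop approach, wrapped in a function so it can be compiled.
addRelativeLoop <- function(groups) {
  for (i in seq_along(groups)) {
    groups[[i]]$relative <- c(0, diff(groups[[i]]$value))
  }
  groups
}

# Byte-compiled copy of the same function.
addRelativeCompiled <- cmpfun(addRelativeLoop)

# The lapply approach.
addRelativeApply <- function(groups) {
  lapply(groups, function(d) {
    d$relative <- c(0, diff(d$value))
    d
  })
}

# Repeat each call many times so the elapsed time is measurable.
system.time(for (k in 1:1000) addRelativeLoop(mySplit))
system.time(for (k in 1:1000) addRelativeCompiled(mySplit))
system.time(for (k in 1:1000) addRelativeApply(mySplit))
```

    On data this small the differences are likely to be negligible either way.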
    "If you torture the data long enough it will eventually confess."
    -Ronald Harry Coase -

  8. #8
    Thanks for your suggestion.
    I will pay attention to speed/efficiency problems later.
    At my current ability, it is more important to master the basic analysis methods.


  9. #9
    "Answer 2 is, um... do we have to use a loop expression? I mean, the for expression is already very simple here, but is there some expression like Reduce that can edge out the loop?"
    It's hard to write the code without a loop, but check this one:
    =mySplit1.(~.dup@t().derive(if(#==1,0,value-value[-1])))
    ~means current group, # means current row number, [-1] means last row.

  10. #10
    Dason

    Quote Originally Posted by datakeyword View Post
    It's hard to write the code without a loop, but check this one:
    =mySplit1.(~.dup@t().derive(if(#==1,0,value-value[-1])))
    ~means current group, # means current row number, [-1] means last row.
    I can't be the only person confused by this code. Is that supposed to be R code?

  11. #11
    Quote Originally Posted by datakeyword View Post
    It's hard to write the code without a loop, but check this one:
    =mySplit1.(~.dup@t().derive(if(#==1,0,value-value[-1])))
    ~means current group, # means current row number, [-1] means last row.
    Hi, datakeyword.
    That looks pretty, but I just can't run it in RStudio.
    Are you kidding? Or is it not R code?

  12. #12
    I googled "~.dup@t()" "derive" and found this: http://www.******.com/forum/index.php?topic=8674.0
    I think that is a language named ******, but I have no interest in another language, sorry.
    I hope that does not offend you.
    If you show me the whole code in ****** in your free time, I would not mind trying it.
