
Thread: Help on nested for loop please!!!

  1. #1

    Dear R experts,
    I am struggling with my raw data. I am trying to filter the data with a nested loop, but it has been running for days. I think my loop is not optimal.

    I have two data frames. One is a list of event dates for 100 companies over 10 years (6,796 obs), as in this photo:


    The other is the list of trade dates of those 100 firms (19,523 obs). In this data set there are two NA variables (e_windows and e_dates) that I want to fill in after filtering the data.


    My goal is to find every instrds$Trade.date that falls 1 to 40 days before an events$Date, then store the difference in days in instrds$e_windows and the event date that satisfies the condition in instrds$e_dates.

    The code I used for this is
    Code: 
    for (i in 1:nrow(events)) {
      for (j in 1:nrow(instrds)) {
        # days from the trade to the event (positive means the trade came first)
        gap <- as.numeric(events$Date[i] - instrds$Trade.date[j])
        # scalar test, so use && rather than the vectorised &
        if (gap > 0 && gap <= 40 && instrds$Company[j] == events$Ticker[i]) {
          instrds$e_windows[j] <- gap
          instrds$e_dates[j] <- events$Date[i]
        }
      }
    }
    However, it has been taking too long to finish.
    Could you please tell me if there is a solution for this?

    Thanks in advance,
    tobi
    Last edited by tobi; 09-07-2017 at 05:08 AM.

  2. #2

    From what you have posted, I cannot tell whether your algorithm could be shortened.

    Typical approaches to speed up R code are:
    • Avoid explicit R loops by using apply and its derivatives
    • Do not concatenate to data in a loop (define the whole result object before you start looping and, within the loop, assign to the already defined empty object instead)
    • Use vector or matrix data types instead of data.frame
    • Use cmpfun from package compiler (byte compilation)
    • Use a foreach loop (package foreach, with a parallel backend) to gain multi-core computing
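
    As a rough sketch of the second and fourth points (the function names grow, prealloc and prealloc_compiled are made up for illustration), compare growing a result inside the loop with filling a preallocated one, and byte-compiling the latter. On recent R versions functions are byte-compiled automatically, so the gain from cmpfun may be small.

```r
library(compiler)  # ships with base R

# Growing the result inside the loop: every c() call copies the whole vector
grow <- function(n) {
  out <- c()
  for (i in seq_len(n)) out <- c(out, i^2)
  out
}

# Preallocating the result and assigning into it by index
prealloc <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) out[i] <- i^2
  out
}

# Byte-compiled copy of the preallocating version
prealloc_compiled <- cmpfun(prealloc)

# All three give the same values; only the running time differs
same <- isTRUE(all.equal(grow(1000), prealloc(1000))) &&
  isTRUE(all.equal(prealloc(1000), prealloc_compiled(1000)))
```

    Wrapping each call in system.time() with a large n should show the difference, although for this toy case the fully vectorised seq_len(n)^2 is simpler than any of them.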

    Consuli
    Prediction is very difficult, especially about the future. (Niels Bohr)

  3. The Following User Says Thank You to consuli For This Useful Post:

    tobi (09-18-2017)

  4. #3
    bryangoodrich

    Your explanation needs more clarity, as I don't know precisely what you're trying to do. In any case, it doesn't look like you need for loops at all. R is a vectorized language that provides means to extract and assign values in data frames ("tables") by operating on entire columns (vectors) of data. There is no reason to then go row-by-row to check something and do something. You want to abstract to what you're doing to the entire column vector as a whole.
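
    To illustrate what operating on a whole column looks like, here is a minimal sketch. The data frame trades, the single event_date and all values are invented for the example; only the column names Company, Trade.date and e_windows come from the original post.

```r
# Made-up trade data with the column names from the first post
trades <- data.frame(
  Company    = c("AAA", "AAA", "BBB"),
  Trade.date = as.Date(c("2017-01-02", "2017-03-01", "2017-01-20")),
  stringsAsFactors = FALSE
)
event_date <- as.Date("2017-01-30")  # one hypothetical event

# One subtraction handles every row at once: days from each trade to the event
gap <- as.numeric(event_date - trades$Trade.date)

# Logical vector marking trades made 1 to 40 days before the event
in_window <- gap > 0 & gap <= 40

# Fill the new column for all rows in a single assignment
trades$e_windows <- ifelse(in_window, gap, NA)
```

    No row index appears anywhere: the comparison and the assignment each work on the full vector.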

    To help you, I suggest you shrink your problem set to something that demonstrates what you are trying to do. We don't have your data, but you can make a smaller version of your problem set. You can output it and provide that here by using the dput function (dput your data frame and provide us the output). Show us what a simplified version of your problem looks like and what you expect the output from that to look like. In doing that, you may yourself better understand your own problem, so it is good practice to do, regardless.
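
    A small sketch of the dput workflow, assuming a made-up two-row sample called small: dput prints R code that rebuilds the object, so whoever reads the post can paste it straight into their session.

```r
# A tiny, invented sample with the same shape as the real data
small <- data.frame(
  ticker    = c("AAA", "BBB"),
  tradedate = as.Date(c("2017-01-02", "2017-01-20")),
  stringsAsFactors = FALSE
)

# Prints a structure(...) call; this printed text is what you paste into the forum
dput(small)

# What a reader does with it: evaluate the pasted text to get the object back
pasted <- paste(capture.output(dput(small)), collapse = "\n")
copy <- eval(parse(text = pasted))
```

    The rebuilt copy should match small, including the Date class of tradedate, which a screenshot or an Excel paste would lose.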

  5. The Following User Says Thank You to bryangoodrich For This Useful Post:

    tobi (09-18-2017)

  6. #4

    Thank you guys for your detailed comments. I am sorry that my poor explanation made it hard to understand.
    I have found a solution myself, but it still uses loops. As you said, loops should be avoided to speed up R code, so I hope there is a better option for my data preparation.

    To make my problem clearer, I would like to describe it again (I created a small sample in Excel so it is easy to inspect):
    I have two datasets, Data1 and Data2. Data1 has 2 variables, 'ticker' and 'tradedate'. Data2 has 2 variables, 'Ticker' and 'Date'.
    I want to find all the dates in Data1$tradedate that fall 1 to 40 days before a Data2$Date (with the condition that the value in Data1$ticker matches Data2$Ticker). If these conditions are satisfied, fill in Data1$ewindows with the number of days between the two dates and Data1$edate with the date value from Data2$Date. The final data needed is the data frame Data1.
    I have 25,000 obs in Data1 and 8,000 obs in Data2. It would take time if I don't use loop. I thought

    This is my data


    I use my code:
    Code: 
    for (i in 1:nrow(Data2)) {
      # rows of Data1 that belong to this event's ticker
      rows <- which(Data1$ticker == Data2$Ticker[i])
      for (j in rows) {
        # days from the trade to the event (positive means the trade came first)
        gap <- as.numeric(Data2$Date[i] - Data1$tradedate[j])
        if (gap > 0 && gap <= 40) {
          Data1$ewindows[j] <- gap
          Data1$edate[j] <- Data2$Date[i]
        }
      }
    }
    This is my result after running the code.



    This might not be the optimal solution, so if you have any ideas, please help.
    Thank you so much.
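
    One possible loop-free version of the task described in this post, as a sketch only. The sample rows are invented, the helper column row is hypothetical, and one assumption goes beyond the post: when a trade falls in the window of several events, the closest upcoming event is kept, whereas the loop keeps whichever matching event comes last in Data2.

```r
# Invented sample data using the column names from the post
Data1 <- data.frame(
  ticker    = c("AAA", "AAA", "AAA", "BBB", "BBB"),
  tradedate = as.Date(c("2017-01-02", "2017-01-25", "2017-06-01",
                        "2017-02-01", "2017-02-20")),
  stringsAsFactors = FALSE
)
Data2 <- data.frame(
  Ticker = c("AAA", "BBB"),
  Date   = as.Date(c("2017-01-30", "2017-02-20")),
  stringsAsFactors = FALSE
)

Data1$row <- seq_len(nrow(Data1))  # hypothetical helper: original row position

# Every trade/event pair that shares a ticker
pairs <- merge(Data1, Data2, by.x = "ticker", by.y = "Ticker")
pairs$gap <- as.numeric(pairs$Date - pairs$tradedate)  # days from trade to event

# Keep pairs where the trade is 1 to 40 days before the event
pairs <- pairs[pairs$gap > 0 & pairs$gap <= 40, ]

# Assumption: if several events qualify, keep the one with the smallest gap
pairs <- pairs[order(pairs$row, pairs$gap), ]
pairs <- pairs[!duplicated(pairs$row), ]

# Write the results back into Data1 by row position
Data1$ewindows <- NA_real_
Data1$edate <- as.Date(NA)
Data1$ewindows[pairs$row] <- pairs$gap
Data1$edate[pairs$row] <- pairs$Date
Data1$row <- NULL  # drop the helper column
```

    The merge builds one row per trade/event pair within a ticker, so memory use grows with the number of such pairs; for the sizes mentioned in the post that should be a few million rows at most, but this has not been run on the real data.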
    Last edited by tobi; 09-19-2017 at 09:38 AM.

  7. #5


    Originally posted by bryangoodrich:
    There is no reason to then go row-by-row to check something and do something. You want to abstract to what you're doing to the entire column vector as a whole.
    True.
    And when you have several columns or an output list, you use apply() and its derivatives to iterate over them. In R, an explicit loop is usually only required when something has to be performed conditionally.
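
    A minimal sketch of iterating over columns with the apply family (the data frame df and its values are made up):

```r
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

# sapply runs the function on each column and simplifies to a named vector
col_means <- sapply(df, mean)

# lapply does the same but always returns a list, one element per column
scaled <- lapply(df, function(x) x / max(x))
```

    Here col_means holds one mean per column and scaled holds each column divided by its own maximum, with no explicit for loop over the columns.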

    Originally posted by bryangoodrich:
    Show us what a simplified version of your problem looks like and what you expect the output from that to look like. In doing that, you may yourself better understand your own problem, so it is good practice to do, regardless.
    Exactly.
