
Thread: Today I Learned: ____

  1. #31
    Probably A Mammal
    Points: 32,065, Level: 100
    Level completed: 0%, Points required for next Level: 0
    bryangoodrich's Avatar
    Location
    Sacramento, California, United States
    Posts
    2,567
    Thanks
    398
    Thanked 618 Times in 551 Posts

    Re: Today I Learned: ____




    I think TheEcologist's campaign is coming to a close. He will claim victory very soon ...

  2. #32
    TS Contributor
    Points: 13,936, Level: 76
    Level completed: 72%, Points required for next Level: 114
    jpkelley's Avatar
    Location
    Vancouver, BC, Canada
    Posts
    440
    Thanks
    17
    Thanked 90 Times in 84 Posts

    Re: Today I Learned: ____

    Who's left standing?

  3. #33
    Devorador de queso
    Points: 97,539, Level: 100
    Level completed: 0%, Points required for next Level: 0
    Awards:
    Posting AwardCommunity AwardDiscussion EnderFrequent PosterActivity Award
    Dason's Avatar
    Location
    Tampa, FL
    Posts
    12,987
    Thanks
    309
    Thanked 2,640 Times in 2,255 Posts

    Re: Today I Learned: ____

    I use ggplot for convenience. But I'm mainly of TE's mindset - I should know how to do anything I want using base. I just save time using ggplot2.

  4. #34
    ggplot2orBust
    Points: 72,900, Level: 100
    Level completed: 0%, Points required for next Level: 0
    Awards:
    User with most referrers
    trinker's Avatar
    Location
    Buffalo, NY
    Posts
    4,424
    Thanks
    1,815
    Thanked 931 Times in 812 Posts

    Re: Today I Learned: ____

    I love base and ggplot for different reasons. They're different tools for different jobs. I do construction work around the house and have a nice array of tools. I just bought a new oscillating tool that saws things off flush and used it last night to saw off some trim flush. I'm glad I have this tool. It's awesome and saves time. I won't be using it to pound in nails anytime soon though. I'm not an either/or kinda guy. I'm a pragmatist: best tool for the job. I'm going to eventually learn lattice as well because I believe under certain circumstances it is the best tool for the job. Why limit yourself to one visualization technique?

    I stand firm and tall; after all a velociraptor's gotta stand for something.


    I'm calling out the TE.

    Code: 
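    library(data.table)  # provides timetaken()
    begin.time <- Sys.time()  # start the clock before writing the plot code
    require(ggplot2)
    p <- ggplot(mtcars, aes(hp, mpg))
    p + geom_point(aes(colour=as.factor(gear))) + facet_grid(cyl~gear, margins=TRUE)
    timetaken(begin.time)  # time elapsed since begin.time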
    [Attached image: challenge.jpeg]
    Code: 
    > timetaken(begin.time)
    [1] "00:01:55"
    Challenge: gather the same visualization information in base in less than 1 min 55 sec
    "If you torture the data long enough it will eventually confess."
    -Ronald Harry Coase -

  5. #35
    Devorador de queso
    Points: 97,539, Level: 100
    Level completed: 0%, Points required for next Level: 0
    Awards:
    Posting AwardCommunity AwardDiscussion EnderFrequent PosterActivity Award
    Dason's Avatar
    Location
    Tampa, FL
    Posts
    12,987
    Thanks
    309
    Thanked 2,640 Times in 2,255 Posts

    Re: Today I Learned: ____

    It only took my computer about 4 seconds. Also you could do the timing without the "timetaken" function that I'm assuming is why you included data.table?

    Code: 
    start <- Sys.time()
    # code code
    Sys.sleep(1)
    end <- Sys.time()
    end - start
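
    Base R's system.time wraps the same start/stop bookkeeping in a single call. A minimal sketch, with Sys.sleep standing in for the code being timed (the name `timing` is only for illustration):

```r
# Time an expression with base R only; Sys.sleep(1) stands in for real work
timing <- system.time({
  Sys.sleep(1)
})

# "elapsed" is wall-clock seconds, comparable to end - start above
timing["elapsed"]
```

    Note this measures how long the code takes to run, not how long it takes to write.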

  6. #36
    Cookie Scientist
    Points: 13,806, Level: 76
    Level completed: 39%, Points required for next Level: 244
    Jake's Avatar
    Location
    Austin, TX
    Posts
    1,297
    Thanks
    66
    Thanked 584 Times in 438 Posts

    Re: Today I Learned: ____

    I have spent longer than 1:55 looking at the plot and I'm still not completely sure what is going on. Maybe this is not such a good challenge.
    “In God we trust. All others must bring data.”
    ~W. Edwards Deming

  7. #37
    ggplot2orBust
    Points: 72,900, Level: 100
    Level completed: 0%, Points required for next Level: 0
    Awards:
    User with most referrers
    trinker's Avatar
    Location
    Buffalo, NY
    Posts
    4,424
    Thanks
    1,815
    Thanked 931 Times in 812 Posts

    Re: Today I Learned: ____

    Quote Originally Posted by DASON
    It only took my computer about 4 seconds. Also you could do the timing without the "timetaken" function that I'm assuming is why you included data.table?
    Yes. The timer was there to see how long it took me to write the code and produce the plot.
    "If you torture the data long enough it will eventually confess."
    -Ronald Harry Coase -

  8. #38
    ggplot2orBust
    Points: 72,900, Level: 100
    Level completed: 0%, Points required for next Level: 0
    Awards:
    User with most referrers
    trinker's Avatar
    Location
    Buffalo, NY
    Posts
    4,424
    Thanks
    1,815
    Thanked 931 Times in 812 Posts

    Re: Today I Learned: ____

    Quote Originally Posted by Jake
    I have spent longer than 1:55 looking at the plot and I'm still not completely sure what is going on. Maybe this is not such a good challenge.
    Probably, partly because I made a mistake in using gear for color instead of carb.

    Jake I think a better way to understand the graph is to study the code:

    Code: 
    require(ggplot2)
    p <- ggplot(mtcars, aes(hp, mpg)) 
    p + geom_point() #just a plot of hp and mpg
    p + geom_point(aes(colour=as.factor(carb)))  # a plot colored by carb (I meant to do this in my plot above but used gear instead)
    p + geom_point(aes(colour=as.factor(carb))) + facet_grid(cyl~gear)  # facet plot of both cyl and gear
    p + geom_point(aes(colour=as.factor(carb))) + facet_grid(cyl~gear, margins=TRUE)  # same as before but with the margins
    I'm not sure if you're a user of ggplot but the faceting I used is pretty familiar to ggplot users. For me it's one of the most useful aspects of ggplot.

    The plot is a facet plot that works similarly to reshape. The graph is not publishable, but it lets me quickly look at plots of the data by group (margins included) and look for trends. If you're familiar with ggplot's faceting then the graph is pretty understandable, but again, this is a plot I use frequently for initial data exploration. It's not polished, but doing the same thing in base would take an exorbitant amount of time. As a researcher I have found this tool invaluable.
    "If you torture the data long enough it will eventually confess."
    -Ronald Harry Coase -

  9. #39
    Probably A Mammal
    Points: 32,065, Level: 100
    Level completed: 0%, Points required for next Level: 0
    bryangoodrich's Avatar
    Location
    Sacramento, California, United States
    Posts
    2,567
    Thanks
    398
    Thanked 618 Times in 551 Posts

    Re: Today I Learned: ____

    Don't think you need to as.factor your color aesthetic. I think it does it on-the-fly.

  10. #40
    Probably A Mammal
    Points: 32,065, Level: 100
    Level completed: 0%, Points required for next Level: 0
    bryangoodrich's Avatar
    Location
    Sacramento, California, United States
    Posts
    2,567
    Thanks
    398
    Thanked 618 Times in 551 Posts

    Re: Today I Learned: ____

    In Python there is a nice way to index that would have obvious conflicts in R: negative indexing. In Python, I can say something like someList[-2] to access the second-to-last element of the list, or someList[-2:] to grab the last two elements. This is like sequence indexing on a vector in R: someVector[2:4]. In Python, though, the empty part of the slice implies "to the end." In R, you have to be explicit:

    Code: 
    someList[2:]  # Python: third element to the end (indexing starts at 0)
    someVector[2:length(someVector)] # R: second element to the end (indexing starts at 1)
    In R, negatives are treated as "exclude" indicators. Thus, someVector[-2] means include all values except the second element. In Python, as explained, it would grab the element that distance from the end of the list.
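
    To make the contrast concrete, here is a small sketch of R's negative ("exclude") indexing next to the closest base R equivalents of Python's from-the-end indexing. The vector `v` is made up for illustration:

```r
v <- c(10, 20, 30, 40, 50)  # illustrative data

v[-2]             # drops the 2nd element: 10 30 40 50
v[-(1:2)]         # drops the first two:   30 40 50
v[length(v) - 1]  # second to last, like someList[-2] in Python: 40
v[2:length(v)]    # 2nd element to the end: 20 30 40 50
```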

    What I learned today is that head has some nice properties that can mimic this Python behavior! I know that you can do something like

    Code: 
    head(x, 20)  # Reveals 20 elements instead of the default 6
    What I did not realize is that the number can be a negative amount to exclude from the end.

    Code: 
    head(x, -5)  # All but last 5 elements
    Be creative. There are uses for this!
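
    tail accepts negative counts in the same way, so between the two you can cover most of Python's slice-from-the-end idioms. A quick sketch with a made-up vector `x`:

```r
x <- 1:10  # illustrative data

head(x, -5)  # all but the last 5:  1 2 3 4 5
tail(x, 2)   # the last 2, like x[-2:] in Python: 9 10
tail(x, -2)  # all but the first 2: 3 4 5 6 7 8 9 10
head(x, -1)  # all but the last element
```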

  11. The Following User Says Thank You to bryangoodrich For This Useful Post:

    trinker (01-11-2012)

  12. #41
    Devorador de queso
    Points: 97,539, Level: 100
    Level completed: 0%, Points required for next Level: 0
    Awards:
    Posting AwardCommunity AwardDiscussion EnderFrequent PosterActivity Award
    Dason's Avatar
    Location
    Tampa, FL
    Posts
    12,987
    Thanks
    309
    Thanked 2,640 Times in 2,255 Posts

    Re: Today I Learned: ____

    Oh I didn't know you could use negative indexing in head! Very nice.

    Also - TIL: Vectorize is awesome. Seriously. How many times have you written a function and realized later that you didn't write it to be vector friendly but you wanted to use it in a way that requires it to be vector friendly? Usually I just rewrite the function more intelligently (which isn't a bad thing to do...) but you could also (apparently) just toss it into Vectorize. Awesome.
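
    For anyone who hasn't tried it, a minimal sketch of what Vectorize does. The function `sign_label` is invented for this example; it uses a scalar `if`, so it only handles one value at a time until it is wrapped:

```r
# A scalar-only function: `if` needs a single TRUE/FALSE condition
sign_label <- function(x) {
  if (x < 0) "negative" else "non-negative"
}

# Vectorize wraps it (via mapply) so it is applied element by element
sign_label_v <- Vectorize(sign_label)

sign_label_v(c(-2, 0, 3))
```

    Vectorize is a convenience wrapper around mapply, so it doesn't make the function any faster. For large inputs, rewriting with something like ifelse is usually the quicker option.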

  13. The Following User Says Thank You to Dason For This Useful Post:

    trinker (01-11-2012)

  14. #42
    Probably A Mammal
    Points: 32,065, Level: 100
    Level completed: 0%, Points required for next Level: 0
    bryangoodrich's Avatar
    Location
    Sacramento, California, United States
    Posts
    2,567
    Thanks
    398
    Thanked 618 Times in 551 Posts

    Re: Today I Learned: ____

    I also learned how to take a vector of widths and create the start and end positions they imply (good for manipulating fixed-width data). Thinking back to my C programming days, you would do a simple for loop and keep a running total of the positions. So if you have the string "Ilikecheeseandpie" and want to break it up by the widths c(1, 4, 6, 3, 3), you need the positions in the string, not the width of each word: c(1, 2, 6, 12, 15, 18). To generate this, just think about what we need to do. Start with the first location (1). Then add the first width (1 + 1 = 2). Then add the next width (2 + 4 = 6), and so on. In the procedural way I used (very slow in R), you have to keep track of these positions yourself, and I would usually build two vectors: start and end. The example I was shown made me realize you only need to create the start positions. The end positions are just the start plus the widths, less one if you want the last character of each field (a vectorized operation, which adds a little speed). Creating the start vector, though, is simple with cumsum, a function whose usefulness I tend to overlook.

    Code: 
    cumsum(c(1, widths))
    generates all your start positions, plus one extra value at the end that you don't need. Thus, with head from above, we can tie it all together.

    Code: 
    head(cumsum(c(1, widths)), -1)
    Beautiful!
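
    Putting the pieces together on the example string, as a sketch. The inclusive end positions are taken as start + width - 1, and substring does the actual splitting; the variable names are only for illustration:

```r
s      <- "Ilikecheeseandpie"
widths <- c(1, 4, 6, 3, 3)

starts <- head(cumsum(c(1, widths)), -1)  # 1 2 6 12 15
ends   <- starts + widths - 1             # 1 5 11 14 17

substring(s, starts, ends)  # "I" "like" "cheese" "and" "pie"
```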

  15. #43
    TS Contributor
    Points: 13,936, Level: 76
    Level completed: 72%, Points required for next Level: 114
    jpkelley's Avatar
    Location
    Vancouver, BC, Canada
    Posts
    440
    Thanks
    17
    Thanked 90 Times in 84 Posts

    Re: Today I Learned: ____

    Quote Originally Posted by Dason
    Oh I didn't know you could use negative indexing in head!
    Neither did I. Clever.

  16. #44
    Devorador de queso
    Points: 97,539, Level: 100
    Level completed: 0%, Points required for next Level: 0
    Awards:
    Posting AwardCommunity AwardDiscussion EnderFrequent PosterActivity Award
    Dason's Avatar
    Location
    Tampa, FL
    Posts
    12,987
    Thanks
    309
    Thanked 2,640 Times in 2,255 Posts

    Re: Today I Learned: ____

    Better yet:
    Code: 
    > library(rbenchmark)  # provides benchmark()
    > x <- rnorm(100000)
    > benchmark(head(x, -1), x[-length(x)])
               test replications elapsed relative user.self sys.self user.child sys.child
    1   head(x, -1)          100    0.23 1.000000      0.24        0         NA        NA
    2 x[-length(x)]          100    0.62 2.695652      0.63        0         NA        NA
    I like the version using head more AND it appears to be faster. Double nice.

  17. #45
    Probably A Mammal
    Points: 32,065, Level: 100
    Level completed: 0%, Points required for next Level: 0
    bryangoodrich's Avatar
    Location
    Sacramento, California, United States
    Posts
    2,567
    Thanks
    398
    Thanked 618 Times in 551 Posts

    Re: Today I Learned: ____


    I got similar results, but I also tested excluding more than just the last element. Turns out the efficiency advantage is lost! The timings are pretty close regardless, and they don't change dramatically whether we're excluding 2 or 2000.

    Code: 
    > benchmark(head(z, -2), z[1:(length(z)-2)], replications=1000)
                      test replications elapsed relative user.self sys.self user.child sys.child
    1          head(z, -2)         1000    3.60 1.034483      2.98     0.61         NA        NA
    2 z[1:(length(z) - 2)]         1000    3.48 1.000000      2.97     0.52         NA        NA
    
    > benchmark(head(z, -1), z[-length(z)], replications=1000)
               test replications elapsed relative user.self sys.self user.child sys.child
    1   head(z, -1)         1000    3.50 1.000000      2.97     0.52         NA        NA
    2 z[-length(z)]         1000    3.77 1.077143      3.20     0.54         NA        NA
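
    Speed aside, the variants behave differently on short vectors, which may matter more than a few hundredths of a second. A small sketch (the vector `short` is made up) of what happens when you ask to drop as many elements as the vector has:

```r
short <- c(5L, 7L)  # illustrative two-element vector

head(short, -2)               # integer(0): nothing left, as expected
short[1:(length(short) - 2)]  # 1:0 counts down to c(1, 0), so this returns 5
```

    The 1:(length - n) form silently returns the first element here because 1:0 is c(1, 0) and a zero index is dropped, so head is the safer choice when the vector might be short.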
