Thread: generate n length proportion (sum to 1)

  1. #1
    trinker

    All right that title was terrible but I didn't know what to name it.

    Basically, I want a function that makes a length-n vector of random proportions that sum to 1 (100%). I can do it if I know n and fix the function at a certain vector length, but not if I let n be free. I'd really like an apply-family solution (or some other non-loop solution I can't foresee), but if that's not possible a loop is fine. I'd actually like to see both, as it'll help with the thinking:

    A forced n (works, but not what I want):
    Code: 
    p <- function(){
        v <- sample(seq(0, 1, by=.01), 1)
        w <- sample(seq(0, 1-v, by=.01), 1)
        x <- sample(seq(0, 1-(v + w), by=.01), 1)
        y <- sample(seq(0, 1-(v + w + x), by=.01), 1)
        z <- round(1-(v + w + x + y), 2)
        c(v, w, x, y, z)
    }
    
    p()
    An attempt to use global assignment to build the vector.
    I'm actually not sure why this approach doesn't work:
    Code: 
    n<-4
    y <- 0
    sapply(seq_len(n), function(i) {
            x <- sample(seq(0, 1-y, by=.01), 1)
            y <<- y + x
        }
    )
    What I'd like to get:
    Code: 
    p(n=4)
    [1] 0.24 0.01 0.50 0.25
    
    p(n=4)
    [1] 0.16 0.05 0.70 0.09
    
    p(n=5)
    [1] 0.30 0.49 0.15 0.01 0.05
    In my code I restricted the sampling to the hundredths place, but that doesn't have to be the case; it just seemed like an easy approach.
    "If you torture the data long enough it will eventually confess."
    -Ronald Harry Coase -

  2. #2
    trinker

    Ohhhhh.............................!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

    You dope trinker:

    Code: 
    n<-4
    y <- 0
    sapply(seq_len(n), function(i) {
            x <- sample(seq(0, 1-y, by=.01), 1)
            y <<- y + x
            return(x)  # I needed this guy
        }
    )
    Still interested in a loop solution. I may play with it myself to see if I can get it since it seems a loop isn't much different than what I did.
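Why the `return(x)` matters: without it, the last expression in the anonymous function is the assignment `y <<- y + x`, and an assignment evaluates (invisibly) to the value assigned. So `sapply` was collecting the running totals rather than the pieces. A small demonstration, using fixed pieces instead of random ones so the two results are easy to compare (the names `pieces`, `totals` and `kept` are just for this example):

```r
pieces <- c(0.2, 0.3, 0.1)

y <- 0
totals <- sapply(seq_along(pieces), function(i) {
    y <<- y + pieces[i]    # last expression: its value is the new y
})

y <- 0
kept <- sapply(seq_along(pieces), function(i) {
    y <<- y + pieces[i]
    pieces[i]              # last expression: the piece itself
})

totals   # running totals of the pieces
kept     # the pieces themselves
```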

  3. #3
    trinker

    Nope, not working the way I want it to. In mine I don't need to use the i, but I do in the loop and I don't know how.

    Code: 
    n<-4
    y <- 0
    for(i in 1:n){
            x <- sample(seq(0, 1-y, by=.01), 1)
            y <- y + x
    }
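One way to use `i` in the loop is as the index into a preallocated result vector, so each piece is stored instead of overwritten. A minimal sketch, assuming the same 0.01 grid as the code above; the name `p_loop` is made up here and does not come from the thread:

```r
p_loop <- function(n) {
    stopifnot(n >= 1)
    out <- numeric(n)          # preallocate; i indexes into this
    y <- 0                     # running total of what has been handed out
    for (i in seq_len(n - 1)) {
        # candidate values are 0, 0.01, ..., up to whatever is left
        out[i] <- sample(seq(0, 1 - y, by = .01), 1)
        y <- y + out[i]
    }
    out[n] <- round(1 - y, 2)  # last piece is whatever remains
    out
}

p_loop(4)
```

Using `seq_len(n - 1)` rather than `1:(n - 1)` means the loop body is simply skipped when `n` is 1, and the single piece is then 1.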

  4. #4
    trinker

    I thought I had it but I don't.

    Please help again. This is what I've got. It spits out the correct number of proportions, but they don't sum to 1. And for n=1 it gives the error shown after the code:

    Code: 
    p <- function(n){
        y <- 0
        z <- sapply(seq_len(n-1), function(i) {
                x <- sample(seq(0, 1-y, by=.01), 1)
                y <<- y + x
                return(x)
            }
        )
        w <- c(z ,sample(seq(0, 1-sum(z), by=.01), 1))
        return(w)
    }
    Code: 
    > p(1)
    Error in sum(z) : invalid 'type' (list) of argument
    Whereas I'd expect it to return 1.
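The error comes from `sapply` itself: given a zero-length input, which is what `seq_len(n-1)` is when `n` is 1, it returns an empty list rather than an empty numeric vector, and `sum()` refuses lists. `vapply` with a declared `numeric(1)` result returns `numeric(0)` in that case. A sketch of the same function with that change (the name `p_v` is only for illustration), which also takes the last piece as the remainder so the result sums to 1:

```r
p_v <- function(n) {
    y <- 0
    z <- vapply(seq_len(n - 1), function(i) {
        x <- sample(seq(0, 1 - y, by = .01), 1)
        y <<- y + x
        x
    }, numeric(1))             # numeric(0) when n is 1, never a list
    c(z, round(1 - sum(z), 2)) # last piece is the remainder
}

sapply(seq_len(0), function(i) 1)              # list()
vapply(seq_len(0), function(i) 1, numeric(1))  # numeric(0)
p_v(1)                                         # 1
```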

  5. #5
    trinker

    Duh again:

    Code: 
    p <- function(n){
        y <- 0
        z <- sapply(seq_len(n-1), function(i) {
                x <- sample(seq(0, 1-y, by=.01), 1)
                y <<- y + x
                return(x)
            }
        )
        w <- c(z , 1-sum(z))
        return(w)
    }
    Still, the zero-length case (n = 1) doesn't work.

  6. #6
    trinker

    Alright this is it:

    Code: 
    p <- function(n){
        if (n < 2) stop("n must be greater than 1")
        y <- 0
        z <- sapply(seq_len(n-1), function(i) {
                x <- sample(seq(0, 1-y, by=.01), 1)
                y <<- y + x
                return(x)
            }
        )
        w <- c(z , 1-sum(z))
        return(w)
    }
    Having length 1 is silly anyway. I could do an if/else, but it makes no sense for the purposes I want this for.

    Thanks for the help, everybody. Sorry for polluting TS with a thread I could have solved if I'd slowed down a bit, but maybe someone will learn from this. The for loop way would still interest me, as I want to learn looping better (I know I'll need it as I move to other languages).

  7. #7
    Dason

    Do you necessarily want the stick breaking method to be used to generate your proportions?

    Otherwise you could make your life a lot easier...

    Code: 
    n <- 5
    # or whatever random number generator you want that only gives positives
    tmp <- rgamma(n, 1, 1)
    tmp <- tmp/sum(tmp)
    # tmp now contains stuff that sums to 1
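Wrapped up as the `p(n)` the original post asked for, that might look like the following. This is a sketch: the input check and the `set.seed` call are additions for the example, not part of the post above.

```r
p <- function(n) {
    stopifnot(length(n) == 1, n >= 1)
    tmp <- rgamma(n, shape = 1, rate = 1)  # positive draws
    tmp / sum(tmp)                         # rescale so they sum to 1
}

set.seed(42)   # only to make the run repeatable
x <- p(5)
x
sum(x)
```

Unlike the stick-breaking versions, the values are not restricted to multiples of 0.01, and `n = 1` needs no special case.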
    Morte a tutti i raptors
    001100010010011110100001101101110011

  8. #8
    bryangoodrich

    ****, that was gonna be my answer. Just make random numbers, sum them to get a total and then treat each number as a proportion of that total as Dason aptly demonstrated with rgamma. Though, there may be the problem with rounding. I'm assuming the accuracy of this processing isn't that dire, however!
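The rounding problem is easy to reproduce without any random numbers: proportions that sum to 1 can stop doing so once each one is rounded separately. A tiny illustration:

```r
x <- c(1, 1, 1) / 3
sum(x)            # 1
r <- round(x, 2)
r                 # 0.33 0.33 0.33
sum(r)            # 0.99
```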

  9. #9
    Dason

    Also note that my algorithm (using rgamma) produces a special case of draws from a Dirichlet distribution.
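Concretely, normalizing independent gamma draws with shape 1 gives a draw from Dirichlet(1, ..., 1), which is flat over all vectors of n proportions. Other shape parameters give other Dirichlet distributions. A sketch, where `rdirichlet_one` and its `alpha` argument are names invented for this example:

```r
# One draw from Dirichlet(alpha) by normalizing independent gamma draws.
# alpha: vector of positive shape parameters, one per proportion.
rdirichlet_one <- function(alpha) {
    stopifnot(length(alpha) >= 1, all(alpha > 0))
    g <- rgamma(length(alpha), shape = alpha, rate = 1)
    g / sum(g)
}

set.seed(1)    # only to make the run repeatable
rdirichlet_one(rep(1, 4))      # same scheme as rgamma(n, 1, 1)
rdirichlet_one(c(10, 10, 10))  # larger shapes: draws cluster near 1/3 each
rdirichlet_one(c(8, 1, 1))     # unequal shapes: first proportion tends to be largest
```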

  10. #10
    trinker

    Where the heck were you when I was running around like a chicken with my head cut off?

    Nice solution

  11. #11
    bryangoodrich

    A few hours behind you? Some of us needed to catch up on sleep. Actually, I was walking through my presentation which turned out to be WAY longer than anticipated.

  12. #12

    A Dirichlet draw was exactly what I was going to suggest for this! Day late and a dollar short, story of my life...

  13. #13
    trinker

    To get it to be exactly 1 for rowSums I had to modify it in this way (because of rounding):

    Code: 
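    props2 <- function(nrow=10, ncol=5, var.names=NULL, digits=2){
        # one row: n proportions, rounded to `digits` places, that sum to 1
        p <- function(n, digits){
            tmp <- rgamma(n, 1, 1)
            X <- round(tmp/sum(tmp), digits=digits)
            o <- sum(X) - 1    # rounding error left over, e.g. 0.01 or -0.01
            # push the error onto the largest proportion, then re-round
            # to clear floating-point noise
            i <- which.max(X)
            X[i] <- round(X[i] - o, digits=digits)
            return(X)
        }
        # replicate() gives ncol x nrow, so transpose to get one draw per row
        DF <- data.frame(t(replicate(nrow, p(n=ncol, digits=digits))))
        if (!is.null(var.names)) colnames(DF) <- var.names
        return(DF)
    }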
