# generate n length proportion (sum to 1)

#### trinker

##### ggplot2orBust
All right that title was terrible but I didn't know what to name it.

Basically I want to make a function that makes an n length vector of random proportions that sum to 1 (100%). I can do it if I know n and fix the function at a certain length of the vector, but not if I allow n to be free. I'd really like to use an apply family solution (or a non-loop one, if there's some solution I can't foresee that uses neither a loop nor apply), but if that's not possible a loop is fine (I'd actually like to see both as it'll help with the thinking):

A forced n (works, but not what I want):
Code:
p <- function(){
v <- sample(seq(0, 1, by=.01), 1)
w <- sample(seq(0, 1-v, by=.01), 1)
x <- sample(seq(0, 1-(v + w), by=.01), 1)
y <- sample(seq(0, 1-(v + w + x), by=.01), 1)
z <- round(1-(v + w + x + y), 2)
c(v, w, x, y, z)
}

p()
An attempt to use global assignment to generate the function.
I'm actually not sure why this approach doesn't work:
Code:
n<-4
y <- 0
sapply(seq_len(n), function(i) {
x <- sample(seq(0, 1-y, by=.01), 1)
y <<- y + x
}
)
What I'd like to get:
Code:
p(n=4)
[1] 0.24 0.01 0.50 0.25

p(n=4)
[1] 0.16 0.05 0.70 0.09

p(n=5)
[1] 0.30 0.49 0.15 0.01 0.05
In my code I restricted the sampling to the hundredths place, but that doesn't have to be the case; it just seemed like an easy approach.

#### trinker

##### ggplot2orBust
Ohhhhh.............................!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

You dope trinker:

Code:
n<-4
y <- 0
sapply(seq_len(n), function(i) {
x <- sample(seq(0, 1-y, by=.01), 1)
y <<- y + x
return(x)  # I needed this guy
}
)
Still interested in a loop solution. I may play with it myself to see if I can get it since it seems a loop isn't much different than what I did.
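
For anyone reading along, a minimal sketch of why the first attempt misbehaved: a function returns its last evaluated expression, and an assignment such as `y <<- y + x` evaluates to the assigned value, so `sapply` was collecting the running total instead of the individual draws. The names `running` and `draws` are made up for this illustration.

```r
set.seed(42)  # arbitrary seed so both runs draw the same numbers

# Without an explicit return value: the last expression is the assignment,
# so each call yields the running total y.
y <- 0
running <- sapply(seq_len(4), function(i) {
  x <- sample(seq(0, 1 - y, by = 0.01), 1)
  y <<- y + x
})

set.seed(42)

# With an explicit return value: each call yields the draw x.
y <- 0
draws <- sapply(seq_len(4), function(i) {
  x <- sample(seq(0, 1 - y, by = 0.01), 1)
  y <<- y + x
  x
})

# The first result is the cumulative sum of the second.
all.equal(running, cumsum(draws))
```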

#### trinker

##### ggplot2orBust
Nope, not working the way I want it to. In mine I don't need to use the `i`, but I do in the loop and I don't know how.

Code:
n<-4
y <- 0
for(i in 1:n){
x <- sample(seq(0, 1-y, by=.01), 1)
y <- y + x
}
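
One way the loop could be written, as a sketch only: the loop above overwrites `x` on every pass, so only the last draw survives. Preallocating a result vector and indexing it with `i` keeps each draw. The function name `p_loop` and the variable `used` are made up for this example; the last element takes whatever is left so that the total is 1, as in the fixed-length version.

```r
# p_loop is a hypothetical name; it mirrors the stick-breaking idea from the thread.
p_loop <- function(n) {
  stopifnot(n >= 1)
  out <- numeric(n)   # preallocate one slot per proportion
  used <- 0           # running total of what has been handed out so far
  for (i in seq_len(n - 1)) {
    out[i] <- sample(seq(0, 1 - used, by = 0.01), 1)
    used <- used + out[i]
  }
  out[n] <- round(1 - used, 2)  # the remainder goes to the last slot
  out
}

set.seed(1)
props <- p_loop(5)
length(props)  # 5
sum(props)     # 1, up to floating-point error
```

Because `seq_len(n - 1)` is empty when `n` is 1, the loop body is skipped in that case and the single element is simply 1.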

#### trinker

##### ggplot2orBust
I thought I had it, but I don't.

Please help again. This is what I've got. It spits out the correct number of proportions, but they don't sum to 1. And for n=1 it gives the error shown below:

Code:
p <- function(n){
y <- 0
z <- sapply(seq_len(n-1), function(i) {
x <- sample(seq(0, 1-y, by=.01), 1)
y <<- y + x
return(x)
}
)
w <- c(z ,sample(seq(0, 1-sum(z), by=.01), 1))
return(w)
}
Code:
> p(1)
Error in sum(z) : invalid 'type' (list) of argument
Whereas I'd expect it to be 1.
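
A short sketch of what seems to be going on with `p(1)`: `seq_len(0)` is empty, and `sapply` over an empty input returns an empty list rather than an empty numeric vector, which `sum()` rejects. `vapply` with a declared return type gives `numeric(0)` instead. The names `z_sapply` and `z_vapply` are just for illustration.

```r
# sapply has nothing to simplify, so it hands back an empty list
z_sapply <- sapply(seq_len(0), function(i) 0.5)
class(z_sapply)  # "list"

# vapply is told the result type up front, so it returns numeric(0)
z_vapply <- vapply(seq_len(0), function(i) 0.5, numeric(1))
class(z_vapply)  # "numeric"
sum(z_vapply)    # 0, so 1 - sum(z) would come out as 1 for n = 1
```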

#### trinker

##### ggplot2orBust
Duh again:

Code:
p <- function(n){
y <- 0
z <- sapply(seq_len(n-1), function(i) {
x <- sample(seq(0, 1-y, by=.01), 1)
y <<- y + x
return(x)
}
)
w <- c(z , 1-sum(z))
return(w)
}
Still, the zero-length case (n = 1) doesn't work.

#### trinker

##### ggplot2orBust
Alright this is it:

Code:
p <- function(n){
if (n < 2) stop("n must be greater than 1")
y <- 0
z <- sapply(seq_len(n-1), function(i) {
x <- sample(seq(0, 1-y, by=.01), 1)
y <<- y + x
return(x)
}
)
w <- c(z , 1-sum(z))
return(w)
}
Having length 1 is silly anyway. I could do an if else but it makes no sense for the purposes I want this for.

Thanks for the help, everybody. Sorry for polluting TS with a thread I could have solved if I'd slowed down a bit, but maybe someone will learn from this. The for loop way would still interest me, as I want to learn looping better (I know I'll need it as I move to other languages).

#### Dason

Do you necessarily want the stick breaking method to be used to generate your proportions?

Otherwise you could make your life a lot easier...

Code:
n <- 5
# or whatever random number generator you want that only gives positives
tmp <- rgamma(n, 1, 1)
tmp <- tmp/sum(tmp)
# tmp now contains stuff that sums to 1

#### bryangoodrich

##### Probably A Mammal
****, that was gonna be my answer. Just make random numbers, sum them to get a total, and then treat each number as a proportion of that total, as Dason aptly demonstrated with rgamma. Though there may be a problem with rounding. I'm assuming the accuracy of this processing isn't that critical, however!

#### Dason

Also note that my algorithm (using rgamma) produces a special case of draws from a Dirichlet distribution.
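
To make the Dirichlet connection concrete, here is a hedged sketch: normalising independent Gamma(alpha_i, 1) draws gives one draw from a Dirichlet(alpha) distribution, and `rgamma(n, 1, 1)` corresponds to every alpha_i being 1, which is uniform over the simplex. The helper name `rdirichlet_one` is made up for this example and does not come from any package.

```r
# rdirichlet_one is a hypothetical helper: one Dirichlet(alpha) draw
# obtained by normalising independent gamma variates.
rdirichlet_one <- function(alpha) {
  stopifnot(all(alpha > 0))
  g <- rgamma(length(alpha), shape = alpha, rate = 1)
  g / sum(g)
}

set.seed(123)
rdirichlet_one(rep(1, 5))    # same scheme as rgamma(5, 1, 1) normalised
rdirichlet_one(c(10, 1, 1))  # a larger alpha tends to pull mass toward that component
```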

#### trinker

##### ggplot2orBust
Where the heck were you when I was running around like a chicken with my head cut off?

Nice solution.

#### bryangoodrich

##### Probably A Mammal
A few hours behind you? Some of us needed to catch up on sleep. Actually, I was walking through my presentation which turned out to be WAY longer than anticipated.

#### cruzeconomics

##### New Member
A Dirichlet draw was exactly what I was going to suggest for this! Day late and a dollar short, story of my life...

#### trinker

##### ggplot2orBust
To get it to be exactly 1 for rowSums I had to modify it in this way (because of rounding):

Code:
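props2 <- function(nrow=10, ncol=5, var.names=NULL, digits=2){
  # p() returns one row: n rounded proportions that sum to 1
  p <- function(n, digits){
    tmp <- rgamma(n, 1, 1)
    X <- round(tmp/sum(tmp), digits=digits)
    if (sum(X)!=1) {
      # o is the rounding surplus (or deficit); take it out of the largest entry
      o <- diff(c(1, sum(X)))
      X[which.max(X)] <- max(X)-o
    }
    return(X)
  }
  # one row per replicate, one column per proportion
  DF <- data.frame(t(replicate(nrow, p(n=ncol, digits=digits))))
  if (!is.null(var.names)) colnames(DF) <- var.names
  return(DF)
}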
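
Since `sum(X) != 1` compares floating-point numbers exactly, a variant that does the bookkeeping in whole units may be more robust. This is only a sketch: `p_int` is a hypothetical name, and handing the leftover units to the entries with the largest remainders is one choice among several.

```r
# p_int is a hypothetical alternative to the inner p() above.
p_int <- function(n, digits = 2) {
  total <- 10^digits                  # e.g. 100 units for two decimals
  tmp <- rgamma(n, 1, 1)
  raw <- tmp / sum(tmp) * total       # unrounded share of the units
  units <- floor(raw)                 # whole units everyone gets for sure
  short <- round(total - sum(units))  # units still to hand out
  if (short > 0) {
    # give one extra unit to the entries with the largest fractional parts
    idx <- order(raw - units, decreasing = TRUE)[seq_len(short)]
    units[idx] <- units[idx] + 1
  }
  units / total
}

set.seed(7)
x <- p_int(5)
sum(round(x * 100)) == 100  # TRUE: the integer units add up exactly
```

The proportions are still stored as doubles, so comparing `sum(x)` to 1 with `all.equal()` rather than `==` is the safer check.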