Increase variability

trinker

ggplot2orBust
#1
I need to take a variable that is normally distributed around 0, with a lower bound of -1 and an upper bound of 1, and apply a transformation that flattens the peak and widens the tails.

It just occurred to me that Fisher's z may be what I'm looking for. If so, I'll mark this thread as solved.
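For reference, Fisher's z is the inverse hyperbolic tangent, which base R provides as `atanh()`. The sketch below uses simulated data (`x_sim` is a made-up stand-in, not the real variable) to show what it does to values inside (-1, 1). Note that it mostly stretches values near the bounds and leaves values near 0 almost unchanged, so it may not flatten the peak much.

```r
# Simulated stand-in for the real variable: values inside (-1, 1), centered on 0
set.seed(1)
x_sim <- pmax(pmin(rnorm(1000, sd = 0.3), 0.99), -0.99)

# Fisher's z transformation is the inverse hyperbolic tangent
z <- atanh(x_sim)   # same as 0.5 * log((1 + x_sim) / (1 - x_sim))

# Values near 0 barely move; values near the bounds are pushed outward
range(x_sim)
range(z)
```

Values of exactly -1 or 1 map to `-Inf` and `Inf`, so they would need special handling.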
 

trinker

ggplot2orBust
#3
Ahh, the distribution is a bit different from what I had originally thought: it's bimodal. Here's a link to the data as an .RData file:

Code:
load(url("http://dl.dropbox.com/u/61803503/dist.RData"))
x
plot(x)
hist(x)
Here's what it looks like:


 

trinker

ggplot2orBust
#5
Yeah, sorry, to me it's clear. The deal is that I have the difference in proportions of word use between two people. I then want to color the words in a word cloud as a gradient based on who used each word more. The problem is that right now there's so little variability that if I use red and blue as the two gradient colors, all the words come out purple. I actually demo how to do this here: http://trinkerrstuff.wordpress.com/2012/11/13/gradient-word-clouds/

However, that solution only works for that one data set. The distribution of the proportion differences will change with new data.
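To make the problem concrete, here is a minimal sketch with made-up differences (`diffs` is hypothetical, not the real data). It rescales [-1, 1] linearly to [0, 1] and interpolates from red to blue with base R's `colorRamp()`.

```r
# Hypothetical proportion differences, all very close to 0
diffs <- c(-0.02, -0.01, 0, 0.01, 0.02)

# Linear rescale from [-1, 1] to [0, 1], then interpolate red -> blue
ramp <- colorRamp(c("red", "blue"))
rgb_vals <- ramp((diffs + 1) / 2)
cols <- rgb(rgb_vals, maxColorValue = 255)
cols   # every color is a nearly identical purple
```

Because the rescaled values all sit close to 0.5, the red and blue channels are almost equal for every word.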
 

Dason

Ambassador to the humans
#8
Or you could transform the data to be an 'ideal normal spread'.

Code:
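# Map ranks into (0, 1), then to standard normal quantiles
transToNorm <- function(x){
    qnorm( rank(x)/(length(x) + 1) )
}

j <- runif(1000)      # uniform sample for demonstration
hist(j)
k <- transToNorm(j)   # same ordering, roughly standard normal shape
hist(k)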
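For a color gradient, the same rank idea might be used without the `qnorm()` step, since the rank scores are already spread evenly over (0, 1). A rough sketch on simulated data follows; `rankToUnif` and `d` are hypothetical names introduced only for this illustration.

```r
# Hypothetical helper: rank-based scores spread evenly over (0, 1)
rankToUnif <- function(x) rank(x) / (length(x) + 1)

set.seed(2)
d <- rnorm(200, sd = 0.02)       # simulated differences bunched near 0
u <- rankToUnif(d)

# Interpolate red -> blue using the evenly spread scores
cols <- rgb(colorRamp(c("red", "blue"))(u), maxColorValue = 255)
head(cols)
```

This keeps the ordering of the original values but throws away the distances between them, which may or may not be acceptable for the plot.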
 

trinker

ggplot2orBust
#9
I think I was approaching the problem all wrong. Here's the final approach I took based on your rank comment:

Code:
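load(url("http://dl.dropbox.com/u/61803503/dist.RData"))
breaks <- 10
# Split at 0 so each side gets its own quantile-based bins
low <- x[x < 0]
high <- x[x > 0]
lcuts <- quantile(low, seq(0, 1, length.out = round(breaks/2)))
hcuts <- quantile(high, seq(0, 1, length.out = round(breaks/2)))
# Add the bounds -1 and 1 plus the midpoint 0, dropping duplicate breaks
cts <- as.numeric(unique(sort(c(-1, lcuts, 0, hcuts, 1))))
# include.lowest = TRUE keeps a value of exactly -1 from becoming NA
cut(x, breaks = cts, include.lowest = TRUE)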
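One way the resulting bins could be turned into word colors is sketched below. It runs on simulated bimodal data (`x_sim`) rather than the linked file, and `pal` and `word_cols` are names made up for this example.

```r
# Simulated stand-in for x: two clusters either side of 0, inside (-1, 1)
set.seed(3)
x_sim <- c(rnorm(100, -0.05, 0.01), rnorm(100, 0.05, 0.01))

breaks <- 10
half <- round(breaks / 2)
cts <- unique(sort(c(-1,
                     quantile(x_sim[x_sim < 0], seq(0, 1, length.out = half)),
                     0,
                     quantile(x_sim[x_sim > 0], seq(0, 1, length.out = half)),
                     1)))
bins <- cut(x_sim, breaks = as.numeric(cts), include.lowest = TRUE)

# One color per bin, running from red to blue, then look up each value's color
pal <- colorRampPalette(c("red", "blue"))(nlevels(bins))
word_cols <- pal[as.integer(bins)]
table(bins)
```

Since the breaks come from quantiles computed separately on each side of 0, the colors spread across the gradient even when the raw differences are tightly clustered.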