Hi,
Originally, I thought this might be appropriate for the R thread, but I think this may have more general relevance.
I'm trying to decide on a data visualization option, and the right and left sides of my brain are at war. In brief, I've modeled predator densities at two sites (in a zero-inflated GLMM framework). I then simulated 10000 random draws from the distributions implied by the parameter estimates and their standard errors. Two options at this point:
1) Boxplot with the points beyond 1.5 IQR from the box hidden (since these data come from a simulation). Note that 10000 random draws were enough to stabilize the whisker lengths.
2) Simple plot of the median with 95% "confidence" intervals (the 2.5% and 97.5% quantiles of the draws).
I like the 95% CIs better, but I also like that the boxplot shows the 25th/75th percentiles. Any preference? You can see the two plots almost align. Also, I'd rather not deviate too much from a classical Tukey-style boxplot, if that's the recommendation. I get twitchy around "modified" boxplots, since I've run across people who have a difficult time interpreting them.
So, again, any preference...or is this a case of splitting hairs on a frog?
Best,
Patrick
P.S. In case any of you want to play with the code used to produce these plots, here it is:
Code:
## load the plotting library (needed for ggplot)
library(ggplot2)
## simulate data
siteA <- exp(rnorm(10000, mean=-0.950701217, sd=0.698376574))
siteB <- exp(rnorm(10000, mean=0.169387235, sd=0.424403903))
df <- data.frame(site=rep(c("siteA", "siteB"), each=10000), counts=c(siteA, siteB))
## Calculate 95% "confidence" intervals (2.5%, 50%, and 97.5% quantiles of the draws)
siteA.qnt <- quantile(siteA, probs=c(0.025, 0.50, 0.975))
siteB.qnt <- quantile(siteB, probs=c(0.025, 0.50, 0.975))
qnt <- rbind(siteA.qnt, siteB.qnt)
df2 <- data.frame(site=c("siteA", "siteB"), qnt)
colnames(df2) <- c("site", "lower", "median", "upper")
## boxplot with outliers (>1.5 IQR) suppressed, since this is a visualization from a simulation
p <- ggplot(data=df, aes(x=site, y=counts))
p + geom_boxplot(outlier.colour=NA)
## zoom with coord_cartesian(): scale limits would drop draws >3 before the box statistics are computed
last_plot() + coord_cartesian(ylim=c(0, 3))
## median with 95% interval as error bars
p <- ggplot(df2, aes(x=site, y=median))
p + geom_point()
last_plot() + geom_errorbar(aes(ymax=upper, ymin=lower), width=0.2)
last_plot() + coord_cartesian(ylim=c(0, 3))
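
If showing the 25/75 percentiles and the 95% limits in a single figure is of interest, here is a minimal sketch of one way to do it. It is not one of the two options above: it assumes a box drawn with geom_boxplot(stat="identity"), with whiskers ending at the 2.5% and 97.5% quantiles instead of Tukey's 1.5 IQR rule, so it is a "modified" boxplot and would need a caption saying so. The names probs, qnt5, df3, and p3, and the seed, are made up for illustration.

```r
library(ggplot2)

set.seed(1)  ## assumed seed, only so the sketch is reproducible
siteA <- exp(rnorm(10000, mean=-0.950701217, sd=0.698376574))
siteB <- exp(rnorm(10000, mean=0.169387235, sd=0.424403903))

## five quantiles per site: 95% limits, quartiles, and the median
probs <- c(0.025, 0.25, 0.50, 0.75, 0.975)
qnt5 <- rbind(quantile(siteA, probs=probs), quantile(siteB, probs=probs))
df3 <- data.frame(site=c("siteA", "siteB"), qnt5)
colnames(df3) <- c("site", "q025", "q25", "q50", "q75", "q975")

## stat="identity" makes geom_boxplot draw the supplied values as-is:
## box = 25%-75%, middle line = median, whiskers = 2.5%-97.5%
p3 <- ggplot(df3, aes(x=site)) +
  geom_boxplot(aes(ymin=q025, lower=q25, middle=q50, upper=q75, ymax=q975),
               stat="identity") +
  ylab("counts")
print(p3)
```

Because the whiskers here are quantiles and not Tukey whiskers, the axis label or caption should state what the box and whiskers represent.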