# Recent content by randomcat

1. ### What are some data mining techniques for analyzing cause of disease

I have a dataset of 300 observations, of which 200 are normal, and the rest have the disease. I have the cognitive assessment scores of these 300 participants, and the assessment is divided into different sections: delusions, depression, anxiety, etc. I'm wondering what technique(s) would be...
2. ### Bagging predictions with binary response variable in R

I am trying to use the bagging technique to increase my model's predictive power. My response variable, status, is a binary variable where 0 indicates no disease and 1 indicates disease. The variable status is just a vector of repeating 0's and 1's (so its class is 'numeric' not 'factor'). Not...
3. ### How to view the resulting tree using the bagging function in R?

I constructed a tree with the rpart function. Then I can plot it to look at the tree visually and also look at what % of the observations were classified correctly using `table(predict(...), ...)`. `mytree=rpart(y~x1+x2+x3+x4, method="class")` `plot(mytree)` `text(mytree)`...
4. ### rpart function: how to know the % of correct classification at every terminal node?

I have a dataset with 277 observations. I have a binary response variable, i.e., 0 indicates no disease and 1 indicates disease. I know that 180 of the observations have no disease and 97 have the disease. I built a model and constructed a classification tree to see how well my model correctly...
5. ### How to use boxcox function in R

I ran the following code in R: `boxcox(data, lambda = seq(-2,2), interp=TRUE, plotit=TRUE)`, where `data` is a vector of integers, but I get the error `Error: $ operator is invalid for atomic vectors`. How can I fix this? Furthermore, how can I specify the increment for lambda?
6. ### How to calculate p-value of two-sample t-test

I have 2 independent data sets, and I know the following about each of them: mean, SD, and sample size. I calculated the t-statistic just fine: `my.t.test <- function(mu1, mu2, sd1, sd2, n1, n2) { t <- (mu1 - mu2) / sqrt(sd1^2/n1 + sd2^2/n2); return(t) }` I know that the degrees of freedom...
7. ### Which programming language/database to learn?

I'm interested in pursuing a career in biostatistics, and I'm wondering which language/database will be useful in the field of biostatistics/epidemiology. As of now I only have a basic knowledge of C and R. Thank you.
8. ### What courses to take as an undergrad if I want to pursue a Master's/PhD?

Hi, I'm currently a freshman at a large research university in California. My school offers 3 B.S. degree options: general statistics, applied statistics, and computational statistics. At first, I chose applied statistics because I hope to pursue a career in epidemiology or biostatistics in the...
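
For the two-sample t-test question in item 6, one common way to get a p-value from the mean, SD, and sample size alone is Welch's unequal-variance test. The sketch below is a guess at what the truncated question is after, not the poster's own solution. The function name `t_test_from_summary` and the example numbers are hypothetical.

```r
# Welch two-sample t-test from summary statistics.
# Hypothetical helper: the name and return layout are illustrative assumptions.
t_test_from_summary <- function(mu1, mu2, sd1, sd2, n1, n2) {
  v1 <- sd1^2 / n1                      # squared standard error, group 1
  v2 <- sd2^2 / n2                      # squared standard error, group 2
  t_stat <- (mu1 - mu2) / sqrt(v1 + v2) # same statistic as in the question
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  # Two-sided p-value from the t distribution
  p_value <- 2 * pt(-abs(t_stat), df)
  list(t = t_stat, df = df, p.value = p_value)
}

# Made-up example values, for illustration only
res <- t_test_from_summary(mu1 = 10, mu2 = 8, sd1 = 2, sd2 = 2, n1 = 20, n2 = 20)
print(res)
```

If the two groups can reasonably be assumed to share a common variance, the pooled-variance test with `n1 + n2 - 2` degrees of freedom is the usual alternative.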