R programming

  1. A

    Change Point Detection using Least Sum of Squared Residuals (Two Phase Linear Regression)

    I have cumulative-sum rainfall data and would like to detect the change point with the least sum of squared residuals (SSR) using a two-phase linear regression model. Here is the data: a<-structure(list(DAY=1:200,CUMSUM=c(0.4975167,0.4975167,0.4975167,0.4975167,0.4975167...
  2. S

    Getting p-value for R-squared bootstrap in R

    I've been trying to get the p-value of my samples from a bootstrap R-squared test in R. I'm not very good with statistics, so could someone please take a look at my code below and point me in the right direction on how I can extract the p-values per sample? # Bootstrap 95% CI for...
  3. rogojel

    Simulating a logistic regression - scary results

    Hi, I tried to work out the necessary sample size for a logistic regression by simulation and got some scary results. If anyone could check the code below, it would be a great help. I simulate a logistic regression with two normalized variables, one having a fixed odds ratio of 1.4, the other a...
  4. M

    Help: where is the error?

    I have the following code, which was not written by me: new_tum <- as.matrix(clinical[,ind_keep]) new_tum_collapsed <- c() for (i in 1:dim(new_tum)[1]){ if (sum(is.na(new_tum[i,])) < dim(new_tum)[2]){ m <- min(new_tum[i,],na.rm=T) new_tum_collapsed <- c(new_tum_collapsed,m) } else {...
  5. U

    Difference between estimating variance and standard deviation

    In a simulation study, is there any difference between (1) estimating the variance \sigma^2 1000 times and taking the average, and (2) estimating the standard deviation \sigma 1000 times and taking the average? Can I do either of these? Is there any preference of doing a...
  6. M

    Map SNPs into a ref gene file using R

    I have the following data set about the SNPs: ID POS ID 78599583 rs987435 33395779 rs345783 189807684 rs955894 33907909 rs6088791 75664046 rs11180435 218890658 rs17571465 127630276 rs17011450 90919465 rs6919430 and a gene...
  7. D

    Compiling error: fPortfolio R Package

    I am trying to build an efficient frontier using the fPortfolio R package (R 3.3.1 and fPortfolio 3011.81). But running frontier=portfolioFrontier(equityR.data,spec,constraints) always produces an error: Error in `colnames<-`(`*tmp*`, value = c("SEC_1", "SEC_2", "SEC_3", "SEC_4", ...
  8. L

    R - bootstrap confidence intervals, Create a matrix, Perform stepwise regressions

    I'm brand new to statistics and taking my first class in almost 40 years, so I'm quite a bit behind the times. On top of all that, I am not very computer savvy and have very little experience using any technical functions on computers, outside of checking email. I have no programming experience...
  9. D

    How to create .tar.gz file in RStudio-Windows

    I am trying to create an R package using "RStudio" for the first time ^_*. I wrote all the functions in several "R scripts" and the documentation using roxygen2. I loaded and built (Ctrl+Shift+B) and it works, "DONE(packageName)"; I used roxygen2::roxygenise() for documentation...
  10. K

    Looking for contributors to join our team

    Hi everyone, I hope you find this post useful. Recently we developed a website, DataScience+, to share some R tutorials with other R users. We are a small team and we are looking for other R enthusiasts to join us. If you use R for data science and research, please send us an email; we...
  11. S

    Finding the best way to prepare raster image for a function

    Hello all, I'd like to ask what is the best way to prepare big raster images in R to run a function on whole images. I have 30 tiff images with the size of ncol=12648, nrow=4144. I read them in R separately. Then, I stacked all 30 raster images into one. Now the image size is ncol=12648...
  12. M

    Time series analysis with external events in R

    I have a time series of daily website visitors and several different marketing events (some continuing for several days). I would like to determine what impact those marketing events had on website visitor dynamics. What approach would you suggest in terms of analysis? I'm working with R...
  13. E

    hclust function-cluster analysis-text/document-function creation

    Hi guys, I'm working on a text mining/clustering project and am trying to create a table with the number of clusters as rows and 6 columns representing the following 6 metrics: max.diameter, min.separation, average.within, average.between, avg.silwidth, dunn. I need to create the tables for 3...
  14. C

    Computing the mode of a density function in R

    I want to compute the mode of the following distribution in R, but I think my procedure for computing it is not correct.
    f <- function(x)(3/7)*x^2 #1<x<2
    x=seq(1,2,length=5000)
    y = f(x)
    d = data.frame(x,y)
    d$x[d$y==max(d$y)]
  15. C

    Chi-squared hypothesis test of homogeneity using R

    A survey of drivers was taken to see if they had been in an accident during the previous year and, if so, whether it was a minor or major accident. The results are tabulated by age group: \begin{array}{c|lcr} \text{Age} & \text{None} & \text{Minor} & \text{Major} \\ \hline \text{under} 18 &67 &...
  16. M

    Developing a new R package

    I want to develop a new R package. I'm not that amazing at programming, so if you can help me we can publish a paper in the Journal of Statistical Software together. Anyone interested? For more details, please send me your email and I will contact you. Marwah Soliman, PhD student at...
  17. M

    Write a table

    Hi everyone, I have a small question: how do I write the following table in R? \begin{tabular}{|l|cr|} \hline Time & Year & intensity \\ \hline 5min & 2 & 99.26252 \\ 10min & 2 & 71.37107 \\ 15min & 2 & 61.3334 \\ 5min & 5 & 130.77527\\ 10min& 5& 94.88575\\ 15min& 5& 78.30003\\...
  18. M

    Selecting data with a certain postal code

    Hi guys, I have the following data, for example, called ON: \begin{tabular}{|l|cr|} \hline Postal & Claims \\ \hline P0x123 & 180 \\ N0x123 & 169 \\ N0h245 & 125 \\ \hline \end{tabular} I want to select the rows whose postal code starts with "N". I have a massive data set, so I want to...
  19. M

    Data

    I have the following data: GRID1992040100 PCP 2.98E-02 2.35E-20 1.86E-02 2.35E-20 GRID1992040200 PCP 3.64E+01 9.65E+00 3.25E+00 1.14E+00 GRID1992040300 PCP 3.27E+00 1.10E+01 8.82E-19 1.77E+00 GRID1992040400 PCP 1.88E-18 3.44E-02 1.76E+00 1.47E-02...
  20. M

    Data

    I have the following data set GRID2002070800 PCP 1 2 2 0 2 7.273357510566711e-01 9.982312321662903e-01 2.894499301910400e-01 6.475514769554138e-01 GRID2002070900 PCP 1 2 2 0 2 1.724110126495361e+00...
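
For the postal-code question in thread 18, one possible approach can be sketched as follows. This is only a minimal sketch, assuming the data frame is named ON with columns Postal and Claims as in the excerpt; the three rows are the ones shown there, and the column types (character and numeric) are an assumption, since the post is truncated.

```r
# Rebuild the three rows shown in the excerpt (column types are assumed).
ON <- data.frame(
  Postal = c("P0x123", "N0x123", "N0h245"),
  Claims = c(180, 169, 125),
  stringsAsFactors = FALSE
)

# Keep only the rows whose postal code begins with "N".
# startsWith() is a base R function (R >= 3.3.0) that is vectorised
# over character vectors and returns a logical vector.
ON_N <- ON[startsWith(ON$Postal, "N"), ]

print(ON_N)
```

If Postal is stored as a factor rather than a character vector, it would need to be wrapped in as.character() first; grepl("^N", ON$Postal) is an equivalent base R alternative.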