large data sets

  1.

    I have a large distribution of paired percentages. Is there a best way, or any way, to test for significantly different pairs?

    These percentages are based on RNA sequencing count data for which replicate values have been merged and transformed to these percentages (which represent average lengths, not counts). But since the replicate data has been lost in the transformation to %, I'm not sure if I can still test for...
  2.

    Syntax error: Error in sqliteExecStatement

    Hi guys, I have the following problem: I am trying to do a multinomial logistic regression on a very large dataset. My dataset has 6 columns (5 independent variables and 1 dependent variable) and 176,483 rows (observations). The function in R which does exactly what I need is...
  3.

    Removing duplicates in SPSS when three of the columns match each other

    I have a large dataset in SPSS with approximately 60 columns and about 100,000 rows. I would like to remove cases where the values in three columns exactly match between two different cases. I have provided an example below to illustrate my request. Original Dataset: 03/29/2012 573033...
  4.

    Renaming all variables in a large dataset

    Hello, I have 3 large datasets with over 800 variables that I need to merge. The problem is that some of the datasets share variable names, so I need to rename all the variables in each dataset prior to merging. The renaming would be systematic as I would like all the variables in dataset1 to...
  5.

    Poisson distribution - goodness of fit

    I want to test goodness of fit to a Poisson distribution for a very large data set (~millions of records). Can anybody suggest a method? All the methods that are available are for small data sets.
  6.

    KS test with large data set

    Hello, I'm trying to use a KS test to determine whether two data sets differ. They are large and of slightly unequal sizes (x has 37345 samples and y has 36743 samples). I'm aware that using a KS test with such large data sets will produce a significant result even if the two aren't particularly...