large data sets

  1. Syntax error: Error in sqliteExecStatement

    Hi guys, I have the following problem: I am trying to do a multinomial logistic regression on a very large dataset. My dataset has 6 columns (5 independent variables and 1 dependent variable) and 176,483 rows (observations). The function in R which does exactly what I need is...
  2. Removing duplicates in SPSS when three of the columns match each other

    I have a large dataset in SPSS with approximately 60 columns and about 100,000 rows. I would like to remove cases where the values in three columns exactly match between two different cases. I have provided an example below to illustrate my request. Original Dataset: 03/29/2012 573033...
  3. Renaming all variables in a large dataset

    Hello, I have 3 large datasets with over 800 variables that I need to merge. The problem is that some of the datasets share variable names, so I need to rename all the variables in each dataset prior to merging. The renaming would be systematic as I would like all the variables in dataset1 to...
  4. Poisson distribution - goodness of fit

    I want to test goodness of fit to a Poisson distribution for a very large data set (millions of records). Can anybody suggest a method? All the methods I have found are intended for small data sets.
  5. KS test with large data set

    Hello, I'm trying to use a KS test to determine whether two data sets differ. They are large and of slightly unequal sizes (x has 37,345 samples and y has 36,743 samples). I'm aware that using a KS test with such large data sets will produce a significant result even if the two aren't particularly...