subset

1. Subgroup Statistical Testing

I am looking at my company's termination data and comparing it to the data collected in the termination survey (30% response rate). I am trying to determine how to decide whether the subset (survey data) is a statistically good representation of the overall terminated population. Is it...
2. How to create a subset with a normal distribution

Hello, I have a presumably easy question for you, but I am a real newbie in statistics, so please be patient. I have a data set containing N values that more or less follows a normal distribution. I need to select N/10 values among them, creating a subset with the same distribution. How can I...
3. Subset based on missing values in different variables

Suppose I have credit ratings for 5 years (credit1, credit2, credit3, credit4, credit5). I only want to keep observations which have at least three non-missing credit scores of any combination. Is there a more efficient way to do this than using rbind? Thank you!
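
One possible approach, sketched here under assumptions: count the non-missing ratings per row with rowSums() and filter on that count, which avoids building the result piece by piece with rbind. The data frame `ratings`, its `id` column, and the sample values are hypothetical; only the column names credit1 to credit5 come from the question.

```r
# Hypothetical example data: five yearly credit ratings with some missing values
ratings <- data.frame(
  id      = 1:4,
  credit1 = c(700, NA, 650, NA),
  credit2 = c(710, NA, NA, 600),
  credit3 = c(NA, 680, NA, 610),
  credit4 = c(720, NA, 640, 620),
  credit5 = c(NA, NA, NA, NA)
)

credit_cols <- paste0("credit", 1:5)

# Count the non-missing ratings in each row
n_present <- rowSums(!is.na(ratings[, credit_cols]))

# Keep only rows with at least three non-missing ratings, in any combination
kept <- ratings[n_present >= 3, ]
```

With this sample data, only the rows with id 1 and 4 have three or more ratings, so only those remain in `kept`.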
4. subset data

Why can't I subset data that has values from -9998 to 1000? I want to keep only the values greater than 0. I used the following code: ECC1 <- subset(ECC, value > 0, select = c(sensor, value)) The error says: In Ops.factor(value, 0): '>' not meaningful for factors. Any clue...
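
The error message suggests that the value column is stored as a factor rather than as a number, so the comparison with 0 cannot be evaluated. A minimal sketch of one common fix follows. The sample rows are hypothetical, and it assumes every level of the factor is a valid number.

```r
# Hypothetical data: 'value' ended up as a factor, as the error message suggests
ECC <- data.frame(
  sensor = c("a", "b", "c", "d"),
  value  = factor(c("-9998", "12.5", "1000", "0"))
)

# Convert via character first; as.numeric() applied directly to a factor
# returns the internal level codes, not the original numbers
ECC$value <- as.numeric(as.character(ECC$value))

# The original subset() call now works on the numeric column
ECC1 <- subset(ECC, value > 0, select = c(sensor, value))
```

If some entries are not numeric (for example a text placeholder for missing data), the conversion turns them into NA with a warning, which is worth checking before subsetting.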
5. Scoring a subset in SPSS?

Hi! I'm trying to apply an existing XML model to my SPSS dataset. However, I only want to apply the model to a subset of my data -- is it possible to create some kind of "filter" or "selection variable" ... so that the regression will only be applied to a subset of the data? For instance...
6. Create simple function to average column over different ranges of rows

I've learned how to subset a group of rows I am interested in from a matrix, and average those values. The data set shows the percent of light reflected at many different wavelengths (rows). If I want to know the average amount of light reflected at ~900 nm, I use the following code to...
7. Averaging rows from a matrix

I'm fairly new to using R, and wonder if someone could give me some help subsetting large data matrices in specific ways. I've attached a data file. It measures the percent of light reflected off of plants at 2048 different wavelengths. The first column is wavelength, which ranges from 339.99...
8. Comparison of average values of data sets

Hi, suppose the mean of A is greater than the mean of B, where A1 and A2 are the two subsets of A and, similarly, B1 and B2 are the two subsets of B. Is it mathematically possible that A1<B1 and A2<B2? I think it is not. I think that either both or at least one subset must have shown the same relationship in their...
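
Assuming the comparisons refer to subset means, a small numeric check shows the situation can occur when the subsets have unequal sizes (this is usually discussed under the name Simpson's paradox). The numbers below are made up purely for illustration.

```r
# Made-up values: A is mostly in the "high" subset, B mostly in the "low" one
A1 <- c(10)
A2 <- c(100, 100, 100)
B1 <- c(20, 20, 20)
B2 <- c(110)

A <- c(A1, A2)
B <- c(B1, B2)

# Overall, A has the larger mean ...
overall_A_greater <- mean(A) > mean(B)

# ... yet within each pair of subsets, A has the smaller mean
subset1_A_smaller <- mean(A1) < mean(B1)
subset2_A_smaller <- mean(A2) < mean(B2)
```

All three comparisons are TRUE here, because the overall means weight the subsets by their sizes and those weights differ between A and B.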
9. Making Seasonal Data from Monthly

I've got a climate data set spanning a number of years in monthly intervals, but I'm only interested in what happens in the winter months, October through March. I've narrowed my data set to include only these months, but now I need to group them into yearly winters, e.g. winter of 2006-2007...
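
One way to label each month with its winter, sketched under assumptions: months October to December are assigned to the winter starting in their own year, and January to March to the winter that started the year before. The data frame `climate` and its columns `year`, `month`, and `temp` are hypothetical names, not taken from the original data set.

```r
# Hypothetical monthly records, already restricted to October through March
climate <- data.frame(
  year  = c(2006, 2006, 2006, 2007, 2007, 2007, 2007),
  month = c(10, 11, 12, 1, 2, 3, 10),
  temp  = c(8, 5, 2, 1, 2, 6, 9)
)

# October-December belong to the winter starting that year;
# January-March belong to the winter that started the previous year
start_year <- ifelse(climate$month >= 10, climate$year, climate$year - 1)
climate$winter <- paste(start_year, start_year + 1, sep = "-")

# Example summary: mean temperature per winter
winter_means <- aggregate(temp ~ winter, data = climate, FUN = mean)
```

The new `winter` column can then serve as the grouping variable for any per-winter summary, not just the mean shown here.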
10. Probability Distribution of a Subset

I was wondering, if you have a set with data that follows a certain probability distribution, will the subset have that same distribution? Lately it has come up for me in several questions, one being: Say that you have a pool of applicants for a job and that pool has a certain diversity. If...
11. Random Resampling from Nested Data

First off, thanks for your help if you have any comments, and I apologize if there is a post about this somewhere (I was unable to find one). So, here is my problem. I need to randomly draw samples from my dataset of grouped data. I have observations in separate rows. Each participant belongs...
12. iterative data subsetting

I have 15 data sets that I want to take specific observations from and create a new data structure with them. Rather than doing all of this by hand, I am trying to make a function that will do it for me. This is what I have thus far: function(){ coding <- 7 for (i in 7:99){...
13. Error detection when using package "plyr"

Problem: I'm running a custom function on subsets of a data.frame using the dlply function in Wickham's package "plyr." It threw an error related to the ID variable (on which the subsetting is based), and I've been trying to figure out which of the IDs is causing the issue. I added the...