I'm fairly new to using R, and wonder if someone could give me some help subsetting large data matrices in specific ways.
I've attached a data file. It measures the percent of light reflected off of plants at 2048 different wavelengths. The first column is wavelength, which ranges from 339.99 to 1024.14 nm in steps of approximately 0.3 nm. The first row gives the label of the field plot that was sampled; the labels are mostly numbers but sometimes characters (WB for white board, which is a control).
This sample data file has 2048 rows (wavelengths) by 10 columns (field plots), plus a column and row with labels. The real data files have several hundred columns (field plots).
I am interested in seeing how the values of certain wavelengths differ between samples (columns). For example, the parameter called the Water Index (WI) describes the drought tolerance of different plants. Water Index = reflectance at 970 nm / reflectance at 900 nm.
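To make the Water Index calculation concrete, here is a minimal sketch in R. It assumes the file has been read into a data frame whose first column is the wavelength and whose remaining columns are the field plots. The names `spec`, `wl`, and `reflectance_at` are hypothetical, and the toy values are made up only so the snippet runs on its own.

```r
# Toy stand-in for the attached file (values are made up, not real reflectance):
# first column is wavelength in nm, the other columns are field plots.
wl <- seq(339.99, 1024.14, length.out = 2048)
spec <- data.frame(wavelength = wl, `1` = wl / 10, `2` = wl / 20, WB = 1,
                   check.names = FALSE)

# Reflectance of every plot at the single row closest to `target` nm.
reflectance_at <- function(spec, target) {
  row <- which.min(abs(spec$wavelength - target))
  unlist(spec[row, -1])
}

# Water Index = reflectance at 970 nm / reflectance at 900 nm, one value per plot.
water_index <- reflectance_at(spec, 970) / reflectance_at(spec, 900)
print(water_index)
```

Because no row sits at exactly 970 or 900 nm, this version uses the single closest row for each; averaging several neighbouring rows is a separate choice.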
I would like to be able to ask R to do things like:
- Average the ____ values closest to ____ nm for each sample, and save this as a vector.
- Average the values of all rows between ____ and ____ nm, and save this as a vector.
Specifically this would be
- Average the 3 values (rows) closest to 970 nm and save this as a vector called 970_avg3.
- Average the 5 values (rows) closest to 970 nm and save this as a vector called 970_avg5.
- Average the 10 values (rows) closest to 970 nm and save this as a vector called 970_avg10.
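The "n closest rows" requests above could be sketched as follows. This is only an illustration under the same assumed layout (wavelength in the first column, one column per plot). `spec` and `avg_nearest` are hypothetical names, and the toy values are made up. The results are stored as `avg3_970` and so on, because R names cannot start with a digit unless they are wrapped in backticks.

```r
# Toy stand-in for the attached file (values are made up, not real reflectance).
wl <- seq(339.99, 1024.14, length.out = 2048)
spec <- data.frame(wavelength = wl, `1` = wl / 10, `2` = wl / 20, WB = 1,
                   check.names = FALSE)

# Average the n rows whose wavelength is closest to `target`, per plot.
avg_nearest <- function(spec, target, n) {
  # Rank rows by distance from the target and keep the n closest.
  idx <- order(abs(spec$wavelength - target))[seq_len(n)]
  # Column means over those rows, leaving out the wavelength column.
  colMeans(spec[idx, -1, drop = FALSE])
}

avg3_970  <- avg_nearest(spec, 970, 3)
avg5_970  <- avg_nearest(spec, 970, 5)
avg10_970 <- avg_nearest(spec, 970, 10)
print(avg3_970)
```

Changing the wavelength is then just a matter of changing the second argument, for example `avg_nearest(spec, 500, 3)`. For a target outside the measured range, this sketch simply returns the average of the rows at the nearest edge.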
But also I would want to be able to look at discrete ranges, like
- Average all rows that are >969 & <971 nm, and save this as a vector called 970
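A range-based average like the one above might look like this. Again it is a sketch under the assumed layout (wavelength in the first column, one column per plot). `spec` and `avg_range` are hypothetical names, and the toy values are made up.

```r
# Toy stand-in for the attached file (values are made up, not real reflectance).
wl <- seq(339.99, 1024.14, length.out = 2048)
spec <- data.frame(wavelength = wl, `1` = wl / 10, `2` = wl / 20, WB = 1,
                   check.names = FALSE)

# Average all rows with lower < wavelength < upper, per plot.
avg_range <- function(spec, lower, upper) {
  keep <- spec$wavelength > lower & spec$wavelength < upper
  # Fail loudly instead of returning NaN when the window holds no rows.
  if (!any(keep)) stop("no wavelengths between ", lower, " and ", upper, " nm")
  colMeans(spec[keep, -1, drop = FALSE])
}

avg_970 <- avg_range(spec, 969, 971)
print(avg_970)
```

With steps of roughly 0.3 nm, a 2 nm window holds about six rows. A window that lies entirely outside 339.99 to 1024.14 nm holds none, which is why the sketch stops with an error in that case.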
What's most important, though, is that I can easily change the wavelength being requested, so instead of 970 nm I could just as easily ask for 500, 700, 1025, or any other number.
Thanks in advance for your help.
Sarah