Hey, I am new to this forum and would like to say thanks in advance to anyone who replies to this message. I have tried thoroughly to discover the solution myself over the last couple of days and have become confused.
I have been trying to work out which statistical technique I should use to determine how strongly individual drivers correlate with carbon production (also known as Net Ecosystem Production, or NEP) in a dataset of 500 forest sites I am to analyse. Simply plotting carbon production in tonnes per ha on one axis against each site's value for a given driver on the other will not accurately show whether less influential drivers, such as intensity of forest management, are correlated with carbon production. This is because extremely influential drivers, such as solar radiation, account for the majority of the variance in carbon production.
Currently I have identified that my proposed drivers of carbon production include both discrete and continuous variables, as listed below with their measurement scales.
Proposed NEP (Carbon Production) Drivers and NEP, with measurement scale
1. Precipitation in mm per year: Ratio (continuous)
2. Available water capacity in mm: Ratio (continuous)
3. Yearly stand net solar radiation in W/m2: Ratio (continuous)
4. Soil moisture: Ratio (continuous)
5. Nitrogen density in g/m2: Ratio (continuous)
6. Species age: Ratio (continuous)
7. Mean annual temperature in Celsius: Interval (continuous)
8. Number of dominant species: Interval (continuous)
9. Nutrient availability (scale of 1-5): Ordinal (discrete)
10. Dominant species (represents 100% of the site): Nominal (discrete)
11. Co-dominant species (represents 50% of the site): Nominal (discrete)
12. Stand management (type): Nominal (discrete)
13. Net Ecosystem Productivity (NEP, or carbon produced): Ratio (continuous)
If I were trying to find the correlation between management type and carbon production, am I right in thinking that by conducting a Principal Components Analysis (PCA) I would be able to eliminate the variance caused by the more influential drivers of carbon production, such as solar radiation and precipitation? Am I also right in thinking I could repeat this process for each of the other factors? Basically, I want to remove the variance caused by all the other NEP drivers, to show whether there is a more robust correlation between one known driver and NEP.
Once the analysis has been completed, I want to compare my results with lab experiments and discuss what is already known about how these factors drive carbon production.
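To make concrete what I mean by removing the variance caused by the other drivers, here is a minimal sketch in Python. Note that it uses a regression-based partial correlation rather than PCA, since that is the simplest way I can express the idea. The data are synthetic, and every variable name and coefficient is a made-up assumption, not something taken from my dataset.

```python
import numpy as np

def residualize(target, controls):
    """Return the part of `target` not linearly explained by `controls`."""
    # Add an intercept column so the fit is not forced through the origin.
    X = np.column_stack([np.ones(len(target)), controls])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ coef

def partial_corr(x, y, controls):
    """Correlation between x and y after regressing `controls` out of both."""
    return np.corrcoef(residualize(x, controls), residualize(y, controls))[0, 1]

# Synthetic stand-in for 500 sites (hypothetical values only).
rng = np.random.default_rng(0)
n = 500
radiation = rng.normal(size=n)
precipitation = rng.normal(size=n)
management = rng.normal(size=n)  # hypothetical continuous management intensity

# NEP dominated by radiation, with a weak management effect.
nep = 5.0 * radiation + 2.0 * precipitation + 0.5 * management + rng.normal(size=n)

raw_r = np.corrcoef(management, nep)[0, 1]
partial_r = partial_corr(management, nep, np.column_stack([radiation, precipitation]))

print(f"raw correlation:     {raw_r:.3f}")
print(f"partial correlation: {partial_r:.3f}")
```

In this toy setup the weak management effect is swamped in the raw correlation and becomes much clearer once radiation and precipitation are regressed out, which is the behaviour I am hoping to get from whichever technique is appropriate.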
Apart from not fully understanding whether PCA is the best method for this kind of analysis, I have become even more confused after learning that PCA can only be reliably used on continuous data and not on discrete data. If my assumptions about the suitability of a PCA-type analysis for my desired outcome are correct, can anyone advise me on a similar technique that handles both continuous and discrete data in the same dataset? From what I have read, a polychoric principal components analysis works best for this.
If you have got this far, once again, thanks for reading.
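For reference, the simplest workaround I can picture for mixed data is sketched below in Python: dummy-code the nominal variable, standardise every column, and run an ordinary PCA through the SVD. This is not a polychoric PCA, and I am not sure it is statistically appropriate; it only illustrates the kind of input I have. The data are synthetic and the management categories are hypothetical.

```python
import numpy as np

# Synthetic stand-in data (hypothetical values and category names).
rng = np.random.default_rng(1)
n = 200
precip = rng.normal(800, 150, size=n)      # mm per year
temperature = rng.normal(9, 2, size=n)     # degrees Celsius
management = rng.choice(["unmanaged", "thinned", "clearcut"], size=n)

# Dummy-code the nominal variable: one 0/1 column per category.
levels = np.unique(management)
dummies = (management[:, None] == levels[None, :]).astype(float)

# Combine continuous and dummy columns, then standardise each column.
X = np.column_stack([precip, temperature, dummies])
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via singular value decomposition of the standardised matrix.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained = s**2 / np.sum(s**2)   # share of variance per component
scores = Z @ Vt.T                 # site coordinates on the components

# The dummy columns always sum to 1, so the last component carries
# essentially no variance.
print(np.round(explained, 3))
```

Dummy coding treats the categories as unordered, so an ordinal variable such as the 1-5 nutrient scale would lose its ordering under this approach, which is part of why I am asking about alternatives.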