Longitudinal analysis help

Hi there
I'm struggling to figure out which statistical test I need to use on my data, and then how to carry it out in R or SPSS.

I have a time series of measurements of pollutant levels on leaves from 4 different plant species that were re-sampled twelve times over a period of 5 weeks. This occurred at three sampling locations.

So I have 4 measurements (one per species, each the average of 2 plant samples) at 3 locations on 12 separate occasions, i.e. 4 x 3 x 12 = 144 values in total.
Graphs of the results show clear species and site differences in pollutant levels, but I now need to show these differences statistically.

My research so far has pointed me towards longitudinal analysis, perhaps with an investigation of how the pollutant levels have changed from the baseline (the first measurement, taken when the 'clean' plants were placed in each of the 3 polluted sites), but I'm really not sure how to do this! A book I'm reading mentions constructing correlation and covariance matrices, but this is too confusing for me.
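
For anyone following along, the change-from-baseline idea can be sketched in a few lines of Python. This is only a rough illustration: the species and site labels, the numbers, and the helper name `change_from_baseline` are all invented here and are not my real data.

```python
# Hypothetical data: pollutant level per sampling occasion, keyed by (species, site).
# All labels and numbers are invented purely to illustrate the calculation.
levels = {
    ("species_A", "site_1"): [2.0, 2.5, 3.5, 4.0],
    ("species_A", "site_2"): [2.0, 2.25, 2.5, 3.0],
}


def change_from_baseline(series):
    """Subtract the first (baseline) reading from every later reading."""
    baseline = series[0]
    return [value - baseline for value in series[1:]]


# One list of changes per (species, site) series.
changes = {key: change_from_baseline(series) for key, series in levels.items()}

for (species, site), deltas in changes.items():
    print(species, site, deltas)
```

The resulting changes could then be compared between species and sites instead of the raw levels, which removes any differences that were already present at the first sampling.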

I've also considered derived attribute analysis, perhaps comparing the slopes of the regression lines when pollutant level is regressed on time for each series. But this feels a little basic and clumsy.
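
To make the slope idea concrete, here is a minimal sketch in plain Python, assuming one series of readings per species and site combination. The sampling days, labels, and levels are made up (and constructed to be exactly linear), and `ols_slope` is just a name I picked for the usual least-squares slope formula.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Invented sampling days and pollutant levels, for illustration only.
days = [0, 3, 6, 9, 12, 15]
series = {
    ("species_A", "site_1"): [1.0, 2.5, 4.0, 5.5, 7.0, 8.5],
    ("species_A", "site_2"): [1.0, 1.6, 2.2, 2.8, 3.4, 4.0],
    ("species_B", "site_1"): [1.0, 1.9, 2.8, 3.7, 4.6, 5.5],
}

# One derived attribute (the slope) per series.
slopes = {key: ols_slope(days, levels) for key, levels in series.items()}

for (species, site), slope in sorted(slopes.items()):
    print(f"{species} at {site}: {slope:.3f} per day")
```

Each series is reduced to a single number, so the 12 repeated measurements no longer need to be modelled directly; the slopes themselves become the data to compare across species and sites.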

Any help would be greatly appreciated!



No cake for spunky
What are you trying to show? If you think some type of intervention is occurring (that pollution is growing as a result of some event), then within-subject ANOVA is a way to go. If you are looking for trends over time, then ARIMA might be best (but it is not simple, particularly for beginners). It's a specialized form of time series model. If you think some event occurred, then interrupted time series is of value.

The point is you need to decide what you think is occurring, and what you want to know, to answer this question. There are many other options besides the ones I suggested.
It's a fairly simple process that is happening...

The 4 plant species are accumulating pollutants on their leaves at different rates (based on physiological differences), and there is a difference in ambient pollutant levels at the three sites, which also affects the rate of increase in pollutant levels.

So there is a steady increase in pollution levels on the leaves over the time period and this increase has different gradients depending on species and site.

I didn't think ANOVA would be useful as I only have 2 samples of each plant (at each site) at each of the 12 sampling times.


No cake for spunky
If you mean your sample size is too small, then that would be a problem for any statistical method. It is common in ANOVA to have no replicates in a cell (one observation per cell).
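
As a rough illustration of ANOVA with a single value per cell, here is a sketch of a two-way layout without replication in plain Python, where the residual (interaction) mean square serves as the error term. The table values are invented, and treating each cell as a per-series slope for a species and site is only an assumption made for the example. The function name `two_way_anova_no_replication` is mine, not from any library.

```python
def two_way_anova_no_replication(table):
    """Two-way ANOVA with one value per cell.

    table[i][j] is the single value for row level i and column level j.
    With no replicates, the row-by-column residual is used as the error term.
    """
    r = len(table)
    c = len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]

    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    ss_resid = ss_total - ss_rows - ss_cols

    df_rows, df_cols = r - 1, c - 1
    df_resid = df_rows * df_cols
    ms_resid = ss_resid / df_resid
    return {
        "F_rows": (ss_rows / df_rows) / ms_resid,
        "F_cols": (ss_cols / df_cols) / ms_resid,
        "df_rows": df_rows,
        "df_cols": df_cols,
        "df_resid": df_resid,
    }


# Invented values: rows = 4 species, columns = 3 sites (e.g. one slope per cell).
slope_table = [
    [0.50, 0.35, 0.22],
    [0.41, 0.30, 0.15],
    [0.33, 0.21, 0.12],
    [0.27, 0.18, 0.06],
]

result = two_way_anova_no_replication(slope_table)
print(result)
```

The F statistics would be compared against an F distribution with the listed degrees of freedom (for example via a table or a statistics package). Note that with one value per cell the species-by-site interaction cannot be tested separately, because it is what stands in for the error.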


TS Contributor
Maybe give us an example of your dataset (or a simulated one) so that we know the kind of between and within replication you are dealing with?

So these graphs show the time series of pollutant levels against day of trial for two of the species investigated. There is a clear difference between the sites, which I want to explain statistically. Each data point is an average of two samples.

When you plot the data by site you can also see a difference between the species, which I also want to explain statistically.