PRIMER MDS shows identical biological community samples as equally different. Why?

#1
Hi,
I have a large data set of marine intertidal organism abundance (measured as percent cover). I also have environmental data on substrate type and beach exposure to wind/waves. The research question is “Do unconsolidated beaches (e.g. sand or gravel, not bedrock) have different communities depending on their exposure level?”

The data set has a lot of zeros. One third of all sites were completely void of life. Also, many quadrats were mostly bare, with only a few pieces of algae or a few barnacles. A few sites are “lush” with life (large kelps and associated invertebrates).
I am using PRIMER v6 to create MDS plots of the biological community data.
My steps are as follows:
- Pre-treatment: square root transform (overall)
- Resemblance > Analyse between samples > Bray-Curtis similarity (did not include a dummy variable)
o All the ‘BARE’ sites have 100% similarity in my resemblance matrix (Bray-Curtis), as expected.
- MDS > restarts 25, minimum stress 0.01, Kruskal fit scheme 1
o The resulting MDS shows bare sites (all zeros) as highly spread out (starburst shape) and all sites with species present clumped together in the very middle. When I zoom in on the middle, the sites do not overlap exactly, which I would expect because they have different species.
o The starburst is unexpected. When my colleague runs the same data through the equivalent process in R, she gets a plot with two highly separated clusters of points: bare and not bare. Zooming in on the bare sites shows that they completely overlap each other, while the not-bare sites are spread out within their cluster.
To check that there is nothing ‘wrong’ in the data, and to better understand PRIMER, I created a test data matrix with 4 identical samples, each having a value of 1 for all variables.
S1: 1, 1, 1, 1, 1, 1 / S2: 1, 1, 1, 1, 1, 1 / S3: 1, 1, 1, 1, 1, 1 / S4: 1, 1, 1, 1, 1, 1
I then created a resemblance matrix from this (same settings as before)- all similarities are 100 as expected.
The MDS plot shows these samples spread out instead of right on top of one another as expected. Does anyone know why?

I then changed the data to:
S1: 5, 5, 5, 5, 5, 5 / S2: 5, 5, 5, 5, 5, 5 / S3: 1, 1, 1, 1, 1, 1 / S4: 1, 1, 1, 1, 1, 1
I expect S1 and S2 to fall exactly on top of one another in the MDS, and S3 and S4 to do the same. Instead, all 4 samples spread out, with S1 and S2 grouped to one side and S3 and S4 grouped to the other. The resemblance matrix for these data seems correct, with 100s between S1 and S2 and between S3 and S4, and 33.3 between non-identical samples.
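For anyone who wants to check the resemblance values by hand, here is a minimal Python sketch of the Bray-Curtis similarity for the second test matrix. It is not PRIMER's code; the function name `bray_curtis_similarity` is just an illustrative helper. Note that it is applied to the untransformed values, which is what reproduces the 33.3 quoted above (after a square root transform the same pair would come out at about 61.8).

```python
def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity on a 0-100 scale between two abundance vectors."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return 100.0 * (1.0 - numerator / denominator)

# Second test matrix from the post (untransformed values).
samples = {
    "S1": [5, 5, 5, 5, 5, 5],
    "S2": [5, 5, 5, 5, 5, 5],
    "S3": [1, 1, 1, 1, 1, 1],
    "S4": [1, 1, 1, 1, 1, 1],
}

names = list(samples)
matrix = {
    (p, q): bray_curtis_similarity(samples[p], samples[q])
    for p in names
    for q in names
}

# Print the full similarity matrix, one row per sample.
for p in names:
    print(p, [round(matrix[(p, q)], 1) for q in names])
```

The identical pairs come out at 100 and the mixed pairs at 33.3, so the resemblance step itself is behaving as expected; the spreading happens in the MDS step.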

I also re-ran the test data with the dummy variable option selected, and the identical sites still do not match up exactly in the MDS.

I have discussed the issue extensively with our lead researcher and my colleagues (no one is a Primer expert) and our expectations for how the MDS should look are consistent.

My conclusion is that there is something in the MDS step that I do not understand and possibly an option in Primer that I need to check off to resolve this.
Does anyone know what option in Primer could fix this issue? Can you explain how Primer is treating the data?


Thank you for your time,
Stefania
 

bugman

Super Moderator
#2
Please post the MDS plot, the Shepard plot, and a sample of the resemblance matrix. If you don't mind, also post a subset of the raw data and I'll take a look for you.
 
#4
Thank you for your responses. I also contacted the program writers and got my solution.
I have attached a subset of the data anyway for your interest (pre-solution). Looking at the Shepard diagram (Graph 50), I can see that it is unusual as well.

(I do not have PRIMER 7, and I don't think I have the PERMANOVA+ add-on.)

Solution:
I needed to include a dummy variable (= 1, or larger if necessary), because Bray-Curtis in PRIMER treats zeros as dissimilar, and two completely bare sites would otherwise have an undefined similarity. I also decreased the minimum stress from 0.01 to 0.0001; the MDS process was probably stopping too soon and not converging on the 'best' solution.
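To illustrate the dummy variable part, here is a small Python sketch (an assumption-laden illustration, not PRIMER's implementation). The `dummy` parameter is a hypothetical name: it appends one extra "species" with the same value to both samples before the Bray-Curtis similarity is computed. The example samples are made up for illustration.

```python
import math

def bray_curtis_similarity(a, b, dummy=0.0):
    """Bray-Curtis similarity (0-100), optionally with a dummy variable.

    `dummy` is a hypothetical parameter for this sketch: its value is
    appended to both samples as an extra variable.
    """
    a = list(a) + [dummy]
    b = list(b) + [dummy]
    denominator = sum(x + y for x, y in zip(a, b))
    if denominator == 0:
        # Two all-zero samples give 0/0, so the similarity is undefined.
        return math.nan
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    return 100.0 * (1.0 - numerator / denominator)

# Made-up example samples: two bare quadrats and one with a single barnacle.
bare_1 = [0, 0, 0, 0]
bare_2 = [0, 0, 0, 0]
sparse = [0, 1, 0, 0]

print(bray_curtis_similarity(bare_1, bare_2))           # undefined (nan)
print(bray_curtis_similarity(bare_1, bare_2, dummy=1))  # defined once the dummy is added
print(bray_curtis_similarity(bare_1, sparse))
print(bray_curtis_similarity(bare_1, sparse, dummy=1))
```

Without the dummy, the two bare samples have no defined similarity and the bare sample is 0% similar to the sparse one. With a dummy of 1, the bare samples become 100% similar to each other and the bare and sparse samples become partly similar, which is the behaviour I was expecting for near-empty quadrats.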

Thank you,

Stefania