This post carries on from the previous one, examining a model of clone libraries to work out which analyses we can rely on to give us an accurate idea of the microbial community that was sampled.
To produce these graphs, I ran the same simulation as before (create a virtual microbial community with two variables, number of OTUs and Growth Factor; create a virtual clone library; randomly sample the clone library; and finally 'sequence' the samples), but this time each model was sampled 100 times. The results for various parameters and analyses are presented below in box and whisker plots. As a reminder, here are the abundance charts for models 1-5.
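For anyone who wants to try this at home, here is a rough sketch of the repeated-sampling loop in Python. It is not the code behind these graphs. In particular, how Growth Factor shapes the community is my assumption for illustration (OTU i gets a relative abundance proportional to `growth_factor ** i`), the clone library and the random picking are collapsed into a single weighted draw, and every function name and parameter value is hypothetical.

```python
import random
from collections import Counter


def make_community(n_otus, growth_factor):
    """Relative abundances for n_otus OTUs.

    ASSUMPTION: abundances are geometric, OTU i has weight growth_factor**i.
    A growth_factor of 1.0 gives a perfectly even community.
    """
    weights = [growth_factor ** i for i in range(n_otus)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_library(abundances, n_clones, rng):
    """Draw n_clones clones at random and count how many fall in each OTU."""
    picked = rng.choices(range(len(abundances)), weights=abundances, k=n_clones)
    return Counter(picked)


def repeat_sampling(n_otus, growth_factor, n_clones, n_repeats=100, seed=0):
    """Sample the same model community n_repeats times."""
    rng = random.Random(seed)  # fixed seed so a run can be reproduced
    community = make_community(n_otus, growth_factor)
    return [sample_library(community, n_clones, rng) for _ in range(n_repeats)]


# Hypothetical parameter values, chosen only to make the example run.
samples = repeat_sampling(n_otus=50, growth_factor=1.1, n_clones=96)
```

Each entry in `samples` is a count of clones per OTU for one 'sequenced' sample, which is the input the indices further down need.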
Figure 2 - Chao1 Richness Estimator
The Chao1 Richness Estimator tries to give us an idea of how many OTUs were present in our original community, based on the sample (there's a full explanation in a previous post).
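As a quick refresher, a minimal sketch of Chao1 is below. I'm assuming the classic form, S_obs + F1² / (2 × F2), where F1 is the number of OTUs seen exactly once and F2 the number seen exactly twice, with the bias-corrected form used as a fallback when there are no doubletons. The function name is mine, and the formula in the earlier post may differ in detail.

```python
def chao1(counts):
    """Chao1 richness estimate from per-OTU clone counts in one sample."""
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)                          # OTUs actually seen
    f1 = sum(1 for c in observed if c == 1)        # singletons
    f2 = sum(1 for c in observed if c == 2)        # doubletons
    if f2 == 0:
        # ASSUMPTION: fall back to the bias-corrected form to avoid dividing by zero
        return s_obs + f1 * (f1 - 1) / 2
    return s_obs + f1 ** 2 / (2 * f2)


# 7 OTUs observed, 3 singletons, 2 doubletons: 7 + 9/4
print(chao1([10, 5, 2, 2, 1, 1, 1]))  # 9.25
```

The more singletons there are relative to doubletons, the more unseen OTUs the estimator assumes are still out there.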
Figure 3 - Coverage
Coverage tries to give us an idea of how well the community has been sampled.
As you would expect, it goes down as the number of OTUs and the evenness in the model increase. We'll come back to the coverage later.
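For reference, here is a sketch of one common way to calculate coverage. I'm assuming Good's coverage, C = 1 - (singletons / total clones), which may not be exactly the calculation used for the plot; the function name is hypothetical.

```python
def goods_coverage(counts):
    """Good's coverage for one sample: 1 - (number of singleton OTUs / total clones)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample contains no clones")
    singletons = sum(1 for c in counts if c == 1)
    return 1 - singletons / total


# 22 clones, 3 of them from OTUs seen only once: 1 - 3/22
coverage = goods_coverage([10, 5, 2, 2, 1, 1, 1])
```

More OTUs, or a more even spread of clones across them, means more singletons for the same number of clones, which is why the value drops.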
Figures 4 & 5 - Shannon Index and Simpson Index of Diversity
These two plots give quite a nice illustration of how the Shannon Index and Simpson Index of Diversity (SIoD) take into account both the species diversity and the evenness. As you can see, both indices increase as the number of OTUs increases, but then begin to decrease in less even communities.
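To make the evenness effect concrete, here is a small sketch of both indices. I'm assuming the Shannon Index uses the natural log, H = -Σ p_i ln(p_i), and that the SIoD is the finite-sample form 1 - Σ n_i(n_i - 1) / (N(N - 1)). The two example samples are made up.

```python
import math


def shannon_index(counts):
    """Shannon Index, H = -sum(p * ln(p)) over the OTUs that were observed."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_index_of_diversity(counts):
    """Simpson Index of Diversity, 1 - sum(n * (n - 1)) / (N * (N - 1)).

    Needs at least 2 clones in the sample.
    """
    total = sum(counts)
    return 1 - sum(c * (c - 1) for c in counts) / (total * (total - 1))


even_sample = [25, 25, 25, 25]   # 4 OTUs, perfectly even (made-up numbers)
uneven_sample = [97, 1, 1, 1]    # 4 OTUs, one dominant (made-up numbers)

for name, sample in [("even", even_sample), ("uneven", uneven_sample)]:
    print(name, shannon_index(sample), simpson_index_of_diversity(sample))
```

Both samples contain four OTUs, but both indices come out much lower for the uneven one.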
Can We Use Coverage To Judge Accuracy?
One of the questions that the previous post threw up was whether or not we can use Coverage to judge if other indices (Chao1, Shannon and SIoD) calculated for the sampled community reflect those of the original community. Our initial results suggested that we might be able to; however, that doesn't seem to hold up very well. Below are 3 scatter plots showing the difference between the index of the model and sampled communities plotted against the coverage of the sample.
There is a moderate negative correlation (Spearman's Rank = -0.479) between the difference in Shannon Index and the coverage. So a low coverage is moderately correlated with a bigger difference between the Shannon Index of the model and that of the sample.
There is a weak negative correlation (Spearman's Rank = -0.389) between coverage and the difference in the Chao1 Richness Estimator. So a low coverage might lead us to suspect that the Chao1 is inaccurate, but that's not always the case.
Finally, there is only a very weak negative correlation (Spearman's Rank = -0.225) between the difference in SIoD and coverage, so a high coverage doesn't necessarily mean an accurate SIoD.
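If you want to check correlations like these on your own simulations, Spearman's Rank is just a Pearson correlation worked out on the ranks of the two variables. Below is a standard-library-only sketch, with ties given their average rank. The function names are mine, and the numbers in the usage example are made up to show the call, not taken from the plots.

```python
def average_ranks(values):
    """Rank values from 1 upwards, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values: coverage of each sample, and |model index - sample index|.
coverage = [0.95, 0.90, 0.80, 0.70, 0.60]
index_difference = [0.05, 0.20, 0.10, 0.40, 0.30]
rho = spearman(coverage, index_difference)
```

A value near -1 would mean the index difference shrinks steadily as coverage rises; the values reported here are a long way from that.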
Coverage Is Best At Indicating % of Unsampled OTUs
As you can see from the scatter plot below, coverage is best correlated (Spearman's Rank = -0.497) with the % of unsampled OTUs. So a low coverage generally means that more of the OTUs were missed and the sample is less representative of the original community.
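For completeness, the % of unsampled OTUs is easy to work out in a simulation because, unlike in real life, we know how many OTUs the model community contained. A sketch, with a hypothetical function name and made-up numbers:

```python
def percent_unsampled(n_model_otus, sample_counts):
    """Percentage of the model's OTUs that never turned up in the sample."""
    observed = sum(1 for c in sample_counts if c > 0)
    return 100 * (n_model_otus - observed) / n_model_otus


# Model had 10 OTUs; the sample picked up 4 of them.
print(percent_unsampled(10, [5, 3, 1, 1]))  # 60.0
```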
I think that's enough about Clone Libraries... next stop Denaturing Gradient Gel Electrophoresis... the fun never stops.