PCR artefacts are DNA sequences produced by errors in the PCR process. Some involve a change to a single base pair, but others are severe enough to inflate the apparent diversity in subsequent analysis of the amplified DNA. It's important to take steps to minimise these errors and to identify sequences that might be the result of PCR errors rather than actual bacteria.
Errors in Replication
The amplification of DNA is a laboratory imitation of DNA replication, and so is subject to the same errors. Since DNA polymerase is not 100% accurate, point mutations (where an incorrect base is added) and deletions can occur, altering the replicated sequence from its original template. The error is then amplified in later cycles and may appear in the results as a different sequence, especially when using techniques that can identify single-nucleotide differences between sequences. The observed error rate for Taq DNA polymerase during PCR depends on the reaction conditions, and ranges from one error per 290 nucleotides to one error per 5411 nucleotides (1). The error rate also differs between DNA polymerases; for example, using Pfu instead of Taq DNA polymerase leads to a 10-fold improvement in the error rate (2). Replication errors also accumulate with the number of PCR cycles, so it is worth keeping the cycle number to a minimum.
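To get a feel for what those two figures mean in practice, here is a rough back-of-the-envelope sketch. It rests on assumptions that are mine rather than the cited papers': the figures are treated as errors per nucleotide in the final product, the amplicon is taken to be 1500 bp (roughly a full-length 16S rRNA gene), and errors are assumed to be independent, so a Poisson approximation applies. The function names are made up for illustration.

```python
import math


def expected_errors(amplicon_length, nucleotides_per_error):
    """Mean number of errors per amplicon at a given error frequency."""
    return amplicon_length / nucleotides_per_error


def fraction_error_free(amplicon_length, nucleotides_per_error):
    """Poisson estimate of the fraction of amplicons with no error at all."""
    return math.exp(-expected_errors(amplicon_length, nucleotides_per_error))


AMPLICON_LENGTH = 1500  # assumed length in bp, not taken from the source

# The two ends of the range quoted for Taq DNA polymerase (1).
for label, nucleotides_per_error in (("one error per 290 nt", 290),
                                     ("one error per 5411 nt", 5411)):
    mean = expected_errors(AMPLICON_LENGTH, nucleotides_per_error)
    clean = fraction_error_free(AMPLICON_LENGTH, nucleotides_per_error)
    print(f"{label}: {mean:.2f} errors per amplicon, "
          f"{clean:.1%} of amplicons error-free")
```

Under these assumptions the two ends of the range give very different pictures, which is one reason the reaction conditions and the choice of polymerase matter so much.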
PCR is Blind
When someone describes PCR and how it works, it sounds like a very orderly affair. The DNA strands nicely denature, the primers form a queue and bind, replication occurs and then the corresponding DNA strands join back together again, ready for the next cycle. This makes for a nice explanation, but of course it's not like that in reality. In reality it's a messy, messy process. There's a whole load of molecules bumping around in your reaction vessel. DNA spends most of its time in a double-stranded form. Double-stranded DNA is a very stable structure and molecules like to be stable. A single-stranded DNA molecule is the neediest molecule you'll ever find: it just wants stability! Surely this is something we can all relate to. If the perfect complementary strand isn't nearby, a similar sequence will do. Equally, your PCR machine won't wait patiently for all of the DNA polymerases to finish replicating their strands of DNA. If they're not done by the time the temperature changes, then it's tough titties and you end up with a partial sequence floating around. This can lead to some really funny DNA sequences cropping up when you sequence everything.
Let's imagine we have two 16S rRNA genes in our PCR, A and B:
Heteroduplexes
The formation of heteroduplexes during PCR presents a problem. Heteroduplexes are double-stranded DNA molecules formed of single strands from different sources. As PCR progresses, you get more and more template DNA, but the pool of primers is fixed and is steadily used up. The primer:template ratio therefore decreases and can reach a point where primer annealing is no longer favoured. As we said earlier, DNA loves to bind to other bits of DNA. If there's no perfect match nearby, either the complementary strand or a primer, it'll take what it can get. This leads to hybridisation of heterologous (from a different organism) template DNA and the formation of heteroduplexes. Heteroduplexes can increase the number of bands if the sample is analysed using DGGE or TGGE, and can introduce biases during the construction of clone libraries (3). Various methods for reducing heteroduplex formation have been proposed. These include the addition of more Taq polymerase after the 27th cycle, limiting the cycle number (4) and 10-fold dilution of the PCR product followed by three cycles of re-amplification (3).
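The falling primer:template ratio can be illustrated with a deliberately crude model. This is only a sketch: it assumes perfect doubling in every cycle until the primers run out, that each new strand uses up exactly one primer, and starting quantities that are invented for the example. Real reactions plateau for other reasons as well.

```python
def primer_template_ratio(initial_templates, initial_primers, cycles):
    """Return the primer:template ratio after each cycle of an idealised PCR."""
    templates = initial_templates
    primers = initial_primers
    ratios = []
    for _ in range(cycles):
        # Every template strand is copied if a primer is available,
        # and each new strand incorporates (consumes) one primer.
        new_strands = min(templates, primers)
        templates += new_strands
        primers -= new_strands
        ratios.append(primers / templates)
    return ratios


# Hypothetical starting amounts: 1e5 template strands, 1e12 primer molecules.
ratios = primer_template_ratio(10**5, 10**12, cycles=30)
for cycle in (1, 10, 20, 23, 24, 30):
    print(f"cycle {cycle:2d}: primer:template ratio = {ratios[cycle - 1]:.3g}")
```

In this toy model the primers vastly outnumber the templates early on, but the ratio collapses within the last few cycles. That is the window in which template strands are most likely to find each other instead of a primer.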
Chimeras
Chimeras are more troublesome artefacts. They occur when a partial 16S rDNA fragment left over from the extension phase binds to a heterologous fragment during the annealing phase to form a heteroduplex. The incomplete fragment then acts as a primer for extension, creating a chimera of two 16S rRNA genes from different species. This chimera is then amplified and can easily be mistaken for a sequence from a new, but unfortunately non-existent, bacterial species. Amplification of 16S rDNA is prone to the formation of chimeras because of the conserved regions of the gene. PCR amplification of 16S rDNA can produce between 5.4 and 8.6% chimeras (5).
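The mechanism can be mimicked with a toy example. The two sequences and the breakpoint below are made up, the sequences are assumed to be pre-aligned and of equal length, and the sketch ignores strand complementarity entirely: it only shows how an incomplete copy of one gene finished on another template ends up as a hybrid of the two.

```python
def make_chimera(template_a, template_b, breakpoint):
    """Join an incomplete copy of A to the remainder of B (toy model)."""
    if len(template_a) != len(template_b):
        raise ValueError("toy model assumes aligned, equal-length sequences")
    if not 0 < breakpoint < len(template_a):
        raise ValueError("breakpoint must fall inside the sequence")
    partial_fragment = template_a[:breakpoint]  # extension on A stopped early
    finished_on_b = template_b[breakpoint:]     # fragment primed on B and completed
    return partial_fragment + finished_on_b


# Invented sequences sharing conserved ends but differing in the middle.
gene_a = "ACGTACGTTTGACCAGGTCA"
gene_b = "ACGTACGTAAGACCTCGTCA"

chimera = make_chimera(gene_a, gene_b, breakpoint=10)
print(chimera)  # ACGTACGTTTGACCTCGTCA
```

The result matches gene A in its first half and gene B in its second half, so it is identical to neither parent and would show up as a third, spurious sequence.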
The frequency of chimera formation increases with:
- The availability of partial rDNA fragments (6).
- Damage to DNA by restriction enzymes, UV irradiation, sonication, depurination and rigorous cell lysis (7, 8).
- The percentage similarity between DNA templates (9).
The incidence of chimera formation can be decreased by:
- Increasing the elongation time (9, 10, 11).
- Decreasing the number of cycles and thereby limiting the opportunity for formation and amplification of chimeras (5, 6).
Chimeras can also be identified after sequencing of amplified rDNA. Chimeric sequences are difficult to distinguish from true biological sequences; however, there are several computer programs which search for and identify chimeric sequences. Chimeras can also be identified by the production of incongruent trees following phylogenetic analyses on opposite ends of the rDNA sequences (6).
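The idea behind comparing opposite ends of a sequence can be sketched in a few lines. This is a simplified stand-in, not how any particular chimera-checking program works: instead of building trees, it finds the closest reference for each half of the query by simple percent identity. The reference sequences are invented, assumed to be aligned and of equal length, and the function names are hypothetical.

```python
def identity(seq1, seq2):
    """Fraction of matching positions between two equal-length sequences."""
    matches = sum(a == b for a, b in zip(seq1, seq2))
    return matches / len(seq1)


def closest_reference(fragment, references, start, end):
    """Name of the reference whose start:end region best matches the fragment."""
    return max(references,
               key=lambda name: identity(fragment, references[name][start:end]))


def looks_chimeric(query, references):
    """Flag a query whose two halves are closest to different references."""
    mid = len(query) // 2
    left = closest_reference(query[:mid], references, 0, mid)
    right = closest_reference(query[mid:], references, mid, len(query))
    return left != right, left, right


# Invented reference sequences and queries, for illustration only.
references = {
    "A": "ACGTACGTTTGACCAGGTCA",
    "B": "ACGTACGTAAGACCTCGTCA",
}
suspect = "ACGTACGTTTGACCTCGTCA"
genuine = "ACGTACGTTTGACCAGGTCA"

print(looks_chimeric(suspect, references))
print(looks_chimeric(genuine, references))
```

When the two ends of a sequence point to different relatives, as they do for the first query here, that disagreement is the signal that the sequence may be a chimera.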
PCR artefacts are not the only problem with samples amplified using PCR. Not all DNA strands in the sample are amplified to the same extent; this is called differential amplification and is covered in another post.
References
1. Eckert KA, Kunkel TA. DNA polymerase fidelity and the polymerase chain reaction. Genome Res. 1991;1(1):17–24.
2. Lundberg KS, Shoemaker DD, Adams MWW, Short JM, Sorge JA, Mathur EJ. High-fidelity amplification using a thermostable DNA polymerase isolated from Pyrococcus furiosus. Gene. 1991;108(1):1–6.
3. Thompson JR, Marcelino LA, Polz MF. Heteroduplexes in mixed-template amplifications: formation, consequence and elimination by "reconditioning PCR". Nucleic Acids Res. 2002;30(9):2083–8.
4. Michu E, Mráčková M, Vyskot B, Žlůvová J. Reduction of heteroduplex formation in PCR amplification. Biol Plant. 2010;54(1):173–6.
5. Wintzingerode F, Göbel UB, Stackebrandt E. Determination of microbial diversity in environmental samples: pitfalls of PCR-based analysis. FEMS Microbiol Rev. 1997;21:213–29.
6. Osborn MA, Smith CJ. Molecular Microbial Ecology. Vol. 51. 2009. 370 p.
7. Pääbo S, Irwin DM, Wilson AC. DNA damage promotes jumping between templates during enzymatic amplification. J Biol Chem. 1990;265:4721.
8. Possemiers S, Verthé K, Uyttendaele S, Verstraete W. PCR-DGGE-based quantification of stability of the microbial community in a simulator of the human intestinal microbial ecosystem. FEMS Microbiol Ecol. 2004;49(3):495–507.
9. Wang GC, Wang Y. The frequency of chimeric molecules as a consequence of PCR co-amplification of 16S rRNA genes from different bacterial species. Microbiology. 1996;142(Pt 5):1107–14.
10. Shen J, Zhang B, Wei G, Pang X, Wei H, Li M, et al. Molecular profiling of the Clostridium leptum subgroup in human fecal microflora by PCR-denaturing gradient gel electrophoresis and clone library analysis. Appl Environ Microbiol. 2006;72(8):5232–8.
11. Tourlomousis P, Kemsley EK, Ridgway KP, Toscano MJ, Humphrey TJ, Narbad A. PCR-denaturing gradient gel electrophoresis of complex microbial communities: a two-step approach to address the effect of gel-to-gel variation and allow valid comparisons across a large dataset. Microb Ecol. 2010;59(4):776–86.