TY - JOUR
T1 - Theoretical analysis of mutation hotspots and their DNA sequence context specificity
AU - Rogozin, Igor B.
AU - Pavlov, Youri I.
N1 - Funding Information: This work was partially supported by RFBR (grant nos. 96-04-49957, 99-04-49535 and 02-04-48342). We thank B.A. Rogozin, N.N. Khromov-Borisov, N.A. Kolchanov, G.V. Glazko, O.I. Sinitsina, V.V. Solovyev, V.N. Babenko and A.S. Kondrashov for helpful discussions and P.V. Shcherbakova, V.N. Babenko, E.A. Vasunina, T.A. Kunkel, W.C. Copeland and anonymous referees for helpful comments on the manuscript. Miriam Sander (Page One Editorial Services) is acknowledged for professional scientific editorial work.
PY - 2003/9
Y1 - 2003/9
AB - Mutation frequencies vary significantly along nucleotide sequences such that mutations often concentrate at certain positions called hotspots. Mutation hotspots in DNA reflect intrinsic properties of the mutation process, such as sequence specificity, that manifests itself at the level of interaction between mutagens, DNA, and the action of the repair and replication machineries. The hotspots might also reflect structural and functional features of the respective DNA sequences. When mutations in a gene are identified using a particular experimental system, resulting hotspots could reflect the properties of the gene product and the mutant selection scheme. Analysis of the nucleotide sequence context of hotspots can provide information on the molecular mechanisms of mutagenesis. However, the determinants of mutation frequency and specificity are complex, and there are many analytical methods for their study. Here we review computational approaches for analyzing mutation spectra (distribution of mutations along the target genes) that include many mutable (detectable) positions. The following methods are reviewed: derivation of a consensus sequence, application of regression approaches to correlate nucleotide sequence features with mutation frequency, mutation hotspot prediction, analysis of oligonucleotide composition of regions containing mutations, pairwise comparison of mutation spectra, analysis of multiple spectra, and analysis of "context-free" characteristics. The advantages and pitfalls of these methods are discussed and illustrated by examples from the literature. The most reliable analyses were obtained when several methods were combined and information from theoretical analysis and experimental observations was considered simultaneously. Simple, robust approaches should be used with small samples of mutations, whereas combinations of simple and complex approaches may be required for large samples. We discuss several well-documented studies where analysis of mutation spectra has substantially contributed to the current understanding of molecular mechanisms of mutagenesis. The nucleotide sequence context of mutational hotspots is a fingerprint of interactions between DNA and DNA repair, replication, and modification enzymes, and the analysis of hotspot context provides evidence of such interactions.
KW - Classification analysis
KW - DNA sequence context
KW - Direct repeat
KW - Hotspot
KW - Microsatellite
KW - Mutable motif
KW - Mutation spectra
KW - Oligonucleotides
KW - Palindrome
UR - http://www.scopus.com/inward/record.url?scp=0042847380&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0042847380&partnerID=8YFLogxK
U2 - 10.1016/S1383-5742(03)00032-2
DO - 10.1016/S1383-5742(03)00032-2
M3 - Review article
C2 - 12888108
AN - SCOPUS:0042847380
SN - 1383-5742
VL - 544
SP - 65
EP - 85
JO - Mutation Research - Reviews in Mutation Research
JF - Mutation Research - Reviews in Mutation Research
IS - 1
ER -