Theoretical analysis of mutation hotspots and their DNA sequence context specificity

Igor B. Rogozin, Youri I. Pavlov

Research output: Contribution to journal, Review article, peer-review

142 Scopus citations


Mutation frequencies vary significantly along nucleotide sequences such that mutations often concentrate at certain positions called hotspots. Mutation hotspots in DNA reflect intrinsic properties of the mutation process, such as sequence specificity, which manifests itself at the level of interaction between mutagens, DNA, and the action of the repair and replication machineries. The hotspots might also reflect structural and functional features of the respective DNA sequences. When mutations in a gene are identified using a particular experimental system, the resulting hotspots could reflect the properties of the gene product and the mutant selection scheme. Analysis of the nucleotide sequence context of hotspots can provide information on the molecular mechanisms of mutagenesis. However, the determinants of mutation frequency and specificity are complex, and there are many analytical methods for their study. Here we review computational approaches for analyzing mutation spectra (distribution of mutations along the target genes) that include many mutable (detectable) positions. The following methods are reviewed: derivation of a consensus sequence, application of regression approaches to correlate nucleotide sequence features with mutation frequency, mutation hotspot prediction, analysis of oligonucleotide composition of regions containing mutations, pairwise comparison of mutation spectra, analysis of multiple spectra, and analysis of "context-free" characteristics. The advantages and pitfalls of these methods are discussed and illustrated by examples from the literature. The most reliable analyses were obtained when several methods were combined and information from theoretical analysis and experimental observations was considered simultaneously. Simple, robust approaches should be used with small samples of mutations, whereas combinations of simple and complex approaches may be required for large samples.
We discuss several well-documented studies where analysis of mutation spectra has substantially contributed to the current understanding of molecular mechanisms of mutagenesis. The nucleotide sequence context of mutational hotspots is a fingerprint of interactions between DNA and DNA repair, replication, and modification enzymes, and the analysis of hotspot context provides evidence of such interactions.
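
To make the terms "mutation spectrum" and "hotspot context" concrete, the sketch below flags unusually mutable positions in a toy spectrum and tallies the trinucleotide context around them. It is a minimal illustration, not a method taken from the review: the uniform Poisson null model, the Bonferroni cutoff, the function names (`poisson_sf`, `find_hotspots`, `hotspot_contexts`), and the sequence and counts are all assumptions invented for this example.

```python
import math
from collections import Counter


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - cdf


def find_hotspots(counts, alpha=0.05):
    """Return 0-based positions whose mutation count is unexpectedly high.

    Assumption: under the null model, mutations fall uniformly at random
    over all sites, so each site's count is roughly Poisson with the same
    mean. A Bonferroni correction accounts for testing every site.
    """
    n_sites = len(counts)
    lam = sum(counts) / n_sites   # expected mutations per site
    cutoff = alpha / n_sites      # Bonferroni-corrected significance level
    return [i for i, c in enumerate(counts)
            if c > 0 and poisson_sf(c, lam) < cutoff]


def hotspot_contexts(sequence, hotspots, flank=1):
    """Count sequence windows: the hotspot base plus `flank` bases each side."""
    contexts = Counter()
    for pos in hotspots:
        # Skip hotspots too close to either end to have a full window.
        if pos - flank >= 0 and pos + flank < len(sequence):
            contexts[sequence[pos - flank: pos + flank + 1]] += 1
    return contexts


# Toy data, invented for illustration only (one count per sequence position).
sequence = "ACGTAGCTAGCTTAGCAGCT"
counts = [0, 1, 0, 0, 9, 0, 1, 0, 0, 0, 1, 0, 0, 0, 8, 0, 0, 1, 0, 0]

hotspots = find_hotspots(counts)
print(hotspots, hotspot_contexts(sequence, hotspots))
```

A real spectrum would violate the uniform null in the ways the abstract describes (undetectable positions, selection by the experimental system), so a threshold of this kind is at best a first pass before the context analysis.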

Original language: English (US)
Pages (from-to): 65-85
Number of pages: 21
Journal: Mutation Research - Reviews in Mutation Research
Issue number: 1
State: Published - Sep 2003
Externally published: Yes


Keywords

  • Classification analysis
  • DNA sequence context
  • Direct repeat
  • Hotspot
  • Microsatellite
  • Mutable motif
  • Mutation spectra
  • Oligonucleotides
  • Palindrome

ASJC Scopus subject areas

  • Genetics
  • Health, Toxicology and Mutagenesis

