Genome-wide variation in betacoronaviruses

Katherine LaTourrette, Natalie M. Holste, Rosalba Rodriguez-Peña, Raquel Arruda Leme, Hernan Garcia-Ruiz

Research output: Contribution to journal › Article › peer-review

3 Scopus citations


Severe acute respiratory syndrome coronavirus (SARS-CoV) and SARS-CoV-2 originated in bats and adapted to infect humans. Several SARS-CoV-2 strains have been identified. Genetic variation is fundamental to virus evolution and, in response to selection pressure, is manifested as the emergence of new strains and species adapted to different hosts or with novel pathogenicity. The combination of variation and selection forms a genetic footprint on the genome, consisting of the preferential accumulation of mutations in particular areas. Properties of betacoronaviruses contributing to variation and the emergence of new strains and species are beginning to be elucidated. To better understand their variation, we profiled the accumulation of mutations in all species in the genus Betacoronavirus, including SARS-CoV-2 and two other species that infect humans: SARS-CoV and Middle East respiratory syndrome coronavirus (MERS-CoV). Variation profiles identified both genetically stable and variable areas at homologous locations across species within the genus Betacoronavirus. The S glycoprotein is the most variable part of the genome and is structurally disordered. Other variable parts include proteins 3 and 7 and ORF8, which participate in replication and suppression of antiviral defense. In contrast, replication proteins in ORF1b are the least variable. Collectively, our results show that variation and structural disorder in the S glycoprotein are general features of all members of the genus Betacoronavirus, including SARS-CoV-2. These findings highlight the potential for the continual emergence of new species and strains with novel biological properties and indicate that the S glycoprotein has a critical role in host adaptation.

IMPORTANCE Natural infection with SARS-CoV-2 and vaccination trigger the formation of antibodies against the S glycoprotein, which are detected by antibody-based diagnostic tests. Our analysis showed that variation in the S glycoprotein is a general feature of all species in the genus Betacoronavirus, including three species that infect humans: SARS-CoV, SARS-CoV-2, and MERS-CoV. The variable nature of the S glycoprotein provides an explanation for the emergence of SARS-CoV-2, the differentiation of SARS-CoV-2 into strains, and the probability of repeated SARS-CoV-2 infections in people. Variation of the S glycoprotein also has important implications for the reliability of SARS-CoV-2 antibody-based diagnostic tests and the design and deployment of vaccines and antiviral drugs. These findings indicate that adjustments to vaccine design and deployment and to antibody-based diagnostic tests are necessary to account for S glycoprotein variation.
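As a rough illustration of what a variation profile can look like, the sketch below marks which columns of a sequence alignment are variable and reports the fraction of variable sites per window. It is a minimal sketch under stated assumptions, not the pipeline used in the article: the function names (`variable_sites`, `window_profile`), the toy alignment, and the choice of non-overlapping windows are all hypothetical and chosen only for illustration.

```python
def variable_sites(alignment):
    """Flag each alignment column that holds more than one residue.

    `alignment` is a list of equal-length strings; "-" marks a gap and
    is ignored when deciding whether a column is variable.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    flags = []
    for i in range(length):
        residues = {seq[i] for seq in alignment if seq[i] != "-"}
        flags.append(len(residues) > 1)
    return flags


def window_profile(flags, window):
    """Fraction of variable sites in each non-overlapping window."""
    profile = []
    for start in range(0, len(flags), window):
        chunk = flags[start:start + window]
        profile.append(sum(chunk) / len(chunk))
    return profile


# Toy alignment (hypothetical data, not real betacoronavirus sequences).
toy_alignment = [
    "ATGCATGCAT",
    "ATGCATGAAT",
    "ATGCTTGCAC",
    "ATGCATG-AT",
]

flags = variable_sites(toy_alignment)
profile = window_profile(flags, window=5)
print(profile)  # [0.2, 0.4]
```

In this toy case the second window has twice the fraction of variable sites of the first, which is the kind of contrast between stable and variable genome areas that the abstract describes.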

Original language: English (US)
Article number: e00496-21
Journal: Journal of Virology
Issue number: 15
State: Published - Aug 2021


Keywords

  • COVID-19
  • Coronavirus
  • Genomic variation
  • Glycoprotein S
  • MERS-CoV
  • Protein S
  • S protein
  • SARS-CoV
  • SARS-CoV-2
  • Vaccine

ASJC Scopus subject areas

  • Microbiology
  • Immunology
  • Insect Science
  • Virology

