Another lesson from unmapped reads: in-depth analysis of RNA-Seq reads from various horse tissues

Artur Gurgul, Tomasz Szmatoła, Ewa Ocłoń, Igor Jasielczuk, Ewelina Semik-Gurgul, Carrie J. Finno, Jessica L. Petersen, Rebecca Bellone, Erin N. Hales, Tomasz Ząbek, Zbigniew Arent, Małgorzata Kotula-Balak, Monika Bugno-Poniewierska

Research output: Contribution to journal › Article › peer-review

2 Scopus citations


In recent years, a vast amount of sequencing data has been generated and large improvements have been made to reference genome sequences. Despite these advances, significant portions of reads still do not map to reference genomes, and these reads have been considered junk or artificial sequences. Recent studies have shown that these reads can be useful, e.g., for refining reference genomes or detecting contaminating microorganisms present in the analyzed biological samples. A special case of this is RNA sequencing (RNA-Seq) reads that come from tissue transcriptomes. Unmapped reads from RNA-Seq have received much less attention than those from whole-genome sequencing. In particular, no analysis of unmapped RNA reads has yet been performed in the horse. Thus, in this study, we analyzed the unmapped reads originating from the RNA-Seq performed through the Functional Annotation of Animal Genomes (FAANG) project in the horse, using eight different tissues from two mares. We demonstrated that unmapped reads from RNA-Seq could be easily assembled into transcripts relating to many important genes present in the sequences of other mammals. Large portions of these transcripts did not have coding potential and, thus, can be considered non-coding RNA. Moreover, reads that were not mapped to the reference genome but aligned to entries in the NCBI database of horse proteins were enriched for biological processes that largely correspond to the functions of the organ from which the RNA was isolated, and thus are presumably true transcripts of genes associated with cell metabolism in those tissues. In addition, a portion of reads aligned to common pathogenic or neutral microbiota, of which the most common was Brucella spp. These data suggest that unmapped reads can be an important target for in-depth analysis that may substantially enrich the results of initial RNA-Seq experiments for various tissues and organs.

Original language: English (US)
Pages (from-to): 571-581
Number of pages: 11
Journal: Journal of Applied Genetics
Issue number: 3
State: Published - Sep 2022


Keywords

  • Equine
  • Genome assembly
  • Misassembled
  • Transcriptome

ASJC Scopus subject areas

  • Genetics

