RNA-GPS predicts high-resolution RNA subcellular localization and highlights the role of splicing

  1. James Zou (1, 2)
  1. Department of Computer Science, Stanford University, Stanford, California 94305, USA
  2. Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, California 94305, USA
  3. Center for Personal and Dynamic Regulomes, Stanford University School of Medicine, Stanford, California 94305, USA
  4. Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, California 94305, USA
  1. Corresponding authors: howchang{at}stanford.edu, jamesz{at}stanford.edu

Abstract

Subcellular localization is essential to RNA biogenesis, processing, and function across the gene expression life cycle. However, the specific nucleotide sequence motifs that direct RNA localization are incompletely understood. Fortunately, new sequencing technologies have provided transcriptome-wide atlases of RNA localization, creating an opportunity to leverage computational modeling. Here we present RNA-GPS, a new machine learning model that uses nucleotide-level features to predict RNA localization across eight different subcellular locations, the first to provide such a wide range of predictions. RNA-GPS's design enables high-throughput sequence ablation and feature importance analyses to probe the sequence motifs that drive localization prediction. We find localization-informative motifs to be concentrated in 3′-UTRs and scattered along the coding sequence, and we find motifs related to splicing to be important drivers of predicted localization, even for cytotopic distinctions for membraneless bodies within the nucleus or for organelles within the cytoplasm. Overall, our results suggest that transcript splicing is one of many elements influencing RNA subcellular localization.


  • Received November 26, 2019.
  • Accepted March 19, 2020.

This article is distributed exclusively by the RNA Society for the first 12 months after the full-issue publication date (see http://rnajournal.cshlp.org/site/misc/terms.xhtml). After 12 months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
