nature structural & molecular biology
Technical Report
RNA-binding proteins
Abstract
RNA-binding sites (RBSs) can be identified by liquid chromatography and tandem mass spectrometry analyses of the protein–RNA conjugates created by crosslinking, but RBS mapping remains highly challenging due to the complexity of the formed RNA adducts. Here, we introduce RBS-ID, a method that uses hydrofluoride to fully cleave RNA into mono-nucleosides, thereby minimizing the search space to drastically enhance coverage and to reach single amino acid resolution. Moreover, the simple mono-nucleoside adducts offer a confident and quantitative measure of direct RNA–protein interaction. Using RBS-ID, we profiled ~2,000 human RBSs and probed Streptococcus pyogenes Cas9 to discover residues important for genome editing.
Fig. 1: RBS-ID robustly identifies RBSs at the proteome level.
Fig. 2: Domain and site-level analysis of identified RBSs.
Fig. 3: RBS-ID identifies functionally important residues in spCas9.
Data availability
MS data have been deposited at the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository with dataset identifier PXD016254. Source data for Figs. 2a, 2d, 2e, 3c and 3d and Extended Data Figs. 9a, 9b and 10a are available with the paper online.
References
1.
Castello, A. et al. Insights into RNA biology from an atlas of mammalian mRNA-binding proteins. Cell 149, 1393–1406 (2012).
2.
Baltz, A. G. et al. The mRNA-bound proteome and its global occupancy profile on protein-coding transcripts. Mol. Cell 46, 674–690 (2012).
3.
Leitner, A., Dorn, G. & Allain, F. H. T. Combining mass spectrometry (MS) and nuclear magnetic resonance (NMR) spectroscopy for integrative structural biology of protein–RNA complexes. Cold Spring Harb. Perspect. Biol. 11, a032359 (2019).
4.
Castello, A. et al. Comprehensive identification of RNA-binding domains in human cells. Mol. Cell 63, 696–710 (2016).
5.
He, C. et al. High-resolution mapping of RNA-binding regions in the nuclear proteome of embryonic stem cells. Mol. Cell 64, 416–430 (2016).
6.
Kramer, K. et al. Photo-cross-linking and high-resolution mass spectrometry for assignment of RNA-binding sites in RNA-binding proteins. Nat. Methods 11, 1064–1070 (2014).
7.
Panhale, A. et al. CAPRI enables comparison of evolutionarily conserved RNA interacting regions. Nat. Commun. 10, 2682 (2019).
8.
Shchepachev, V. et al. Defining the RNA interactome by total RNA-associated protein purification. Mol. Syst. Biol. 15, e8689 (2019).
9.
Jeong, K., Kim, S. & Bandeira, N. False discovery rates in spectral identification. BMC Bioinformatics 13(Suppl. 16), S2 (2012).
10.
Bogdanow, B., Zauber, H. & Selbach, M. Systematic errors in peptide and protein identification and quantification by modified peptides. Mol. Cell Proteom. 15, 2791–2801 (2016).
11.
Trendel, J. et al. The human RNA-binding proteome and its dynamics during translational arrest. Cell 176, 391–403 (2019).
12.
Crean, C., Uvaydov, Y., Geacintov, N. E. & Shafirovich, V. Oxidation of single-stranded oligonucleotides by carbonate radical anions: generating intrastrand cross-links between guanine and thymine bases separated by cytosines. Nucleic Acids Res. 36, 742–755 (2008).
13.
Woo, E. M., Fenyo, D., Kwok, B. H., Funabiki, H. & Chait, B. T. Efficient identification of phosphorylation by mass spectrometric phosphopeptide fingerprinting. Anal. Chem. 80, 2419–2425 (2008).
14.
Na, S., Bandeira, N. & Paek, E. Fast multi-blind modification search through tandem mass spectrometry. Mol. Cell Proteom. 11, 010199 (2012).
15.
Kong, A. T., Leprevost, F. V., Avtonomov, D. M., Mellacheruvu, D. & Nesvizhskii, A. I. MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry-based proteomics. Nat. Methods 14, 513–520 (2017).
16.
Kim, S. & Pevzner, P. A. MS-GF+ makes progress towards a universal database search tool for proteomics. Nat. Commun. 5, 5277 (2014).
17.
Edwards, N. J. PepArML: a meta-search peptide identification platform for tandem mass spectra. Curr. Protoc. Bioinformatics 44, 13.23.1–13.23.23 (2013).
18.
Chalkley, R. J. & Clauser, K. R. Modification site localization scoring: strategies and performance. Mol. Cell Proteom. 11, 3–14 (2012).
19.
Chang, C. et al. PANDA: a comprehensive and flexible tool for quantitative proteomics data analysis. Bioinformatics 35, 898–900 (2019).
20.
UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 47, D506–D515 (2019).
21.
Huang, D. W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57 (2009).
22.
Schafer, I. B. et al. Molecular basis for poly(A) RNP architecture and recognition by the Pan2-Pan3 deadenylase. Cell 177, 1619–1631 (2019).
23.
Kuhn, U. & Pieler, T. Xenopus poly(A) binding protein: functional domains in RNA binding and protein–protein interaction. J. Mol. Biol. 256, 20–30 (1996).
24.
Hawley, B. R., Lu, W. T., Wilczynska, A. & Bushell, M. The emerging role of RNAs in DNA damage repair. Cell Death Differ. 24, 580–587 (2017).
25.
Shetlar, M. D., Carbone, J., Steady, E. & Hom, K. Photochemical addition of amino acids and peptides to polyuridylic acid. Photochem. Photobiol. 39, 141–144 (1984).
26.
Hornbeck, P. V. et al. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res. 43, D512–D520 (2015).
27.
Yoon, J. H. et al. Tyrosine phosphorylation of HuR by JAK3 triggers dissociation and degradation of HuR target mRNAs. Nucleic Acids Res. 42, 1196–1208 (2014).
28.
Jiang, F., Zhou, K., Ma, L., Gressel, S. & Doudna, J. A. A Cas9–guide RNA complex preorganized for target DNA recognition. Science 348, 1477–1481 (2015).
29.
Anders, C., Niewoehner, O., Duerst, A. & Jinek, M. Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature 513, 569–573 (2014).
30.
Ran, F. A. et al. Genome engineering using the CRISPR-Cas9 system. Nat. Protoc. 8, 2281–2308 (2013).
31.
Leonetti, M. D., Sekine, S., Kamiyama, D., Weissman, J. S. & Huang, B. A scalable strategy for high-throughput GFP tagging of endogenous human proteins. Proc. Natl Acad. Sci. USA 113, E3501–3508 (2016).
32.
He, L., Diedrich, J., Chu, Y. Y. & Yates, J. R. III. Extracting accurate precursor information for tandem mass spectra by RawConverter. Anal. Chem. 87, 11361–11367 (2015).
33.
Chambers, M. C. et al. A cross-platform toolkit for mass spectrometry and proteomics. Nat. Biotechnol. 30, 918–920 (2012).
34.
Vizcaino, J. A. et al. The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013. Nucleic Acids Res. 41, D1063–D1069 (2013).
Acknowledgements
We thank S. Shin, K. Baeg and S. Lee for insightful comments and discussion. We are also grateful to J. Kim, J. Yang, D. Choi and E. Kim for technical help, and all members of our laboratories for helpful discussion. We thank J. S. Kim (Seoul National University), Pacific Northwest National Laboratory and the OMICS.PNL.GOV for providing valuable plasmid and software. This work was supported by IBS-R008-D1 of the Institute for Basic Science from the Ministry of Science and ICT of Korea (J.W.B., S.-C.K., Y.N., V.N.K. and J.-S.K.) and BK21 Research Fellowships (J.W.B.) from the Ministry of Education, Science and Technology of Korea.
Author information
Center for RNA Research, Institute for Basic Science, Seoul, Korea
Jong Woo Bae
School of Biological Sciences, Seoul National University, Seoul, Korea
Jong Woo Bae
Contributions
J.W.B., V.N.K. and J.-S.K. conceived the project and designed the experiments. J.W.B. developed the protocol and performed all biochemical experiments with the support of S.C.K. and Y.N. J.W.B. generated and analyzed all LC-MS/MS datasets. J.W.B., V.N.K. and J.-S.K. wrote the manuscript.
Corresponding authors
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Peer reviewer reports are available. Anke Sparmann was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 HF treatment on RNA and peptides.
a, Schematic illustration of HF treatment on RNA (left) and the resulting products (right). HF-mediated cleavage sites are highlighted. b, UV-absorbance chromatogram of 20-mer RNA (AUGCAUGCAUGCAUGCAUGC) digested with HF (black solid and dashed lines from duplicate experiments), merged with those of undigested RNA (gray) and reference chemicals (colored, solid). Peaks of reference chemicals were re-sized for better visualization. The black solid line is the source data for Fig. 1b. c, UV-absorbance chromatogram of HeLaT total RNA digested with HF (black, solid or dashed lines from duplicate experiments), merged with those of reference chemicals (colored, solid). Peaks of reference chemicals were re-sized for better visualization. d, Proportion of semi-tryptic PSMs identified from HeLaT digest peptides upon HF treatment, compared to a negative control treated with H2O. A two-sided unpaired Student's t-test was performed between n = 3 biologically independent samples (H2O vs. HF), assuming equal variance. Mean values are shown with error bars that indicate the standard deviation between replicates. P-value, rounded up to the fourth decimal point: 0.0258. e, Proportion of identified PSMs (below PSM-level FDR = 0.01) upon HF treatment, compared to H2O treatment. A two-sided unpaired Student's t-test was performed as described above. Mean values are shown with error bars that indicate the standard deviation between replicates. P-value, rounded up to the fourth decimal point: 0.1359.
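The test named in panels d and e can be sketched in a few lines. This is a minimal illustration of a two-sided unpaired Student's t-test with pooled (equal) variance, not the authors' analysis code; the function name `student_t_test` and the replicate values are hypothetical, and the p-value is obtained by numerically integrating the t density rather than by a statistics library.

```python
import math
from statistics import mean, variance


def student_t_test(a, b, steps=2000):
    """Two-sided unpaired Student's t-test assuming equal variance.

    Returns (t statistic, degrees of freedom, two-sided p-value).
    `steps` must be even (Simpson's rule). Raises ZeroDivisionError
    if both groups have zero variance.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled sample variance across both groups.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))

    # Probability density of Student's t distribution with df degrees of freedom.
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return norm * (1 + x * x / df) ** (-(df + 1) / 2)

    # Simpson's rule for the area under the density between 0 and |t|.
    upper = abs(t)
    h = upper / steps
    area = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3

    # Both tails beyond |t| make up the two-sided p-value.
    p = max(0.0, 1 - 2 * area)
    return t, df, p


# Hypothetical percentages for n = 3 replicates per condition (not the paper's data).
h2o = [1.0, 2.0, 3.0]
hf = [2.0, 3.0, 4.0]
t_stat, dof, p_value = student_t_test(h2o, hf)
print(f"t = {t_stat:.4f}, df = {dof}, p = {p_value:.4f}")
```

With n = 3 per group the test has only 4 degrees of freedom, so modest differences between means yield fairly large p-values.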
Extended Data Fig. 2 Open search on total RNA-RBS and mRNA-RBS using MODa and MSFragger.
a, MODa (ref. 14) search for modified mass on mRNA-RBS. The y-axis indicates mean spectral counts from duplicate experiments. b-c, MSFragger (ref. 15) search for modified mass on total RNA-RBS (b) and mRNA-RBS (c). d, Mean percentage, between replicate experiments, of uracil modification (a highly likely in-source fragmentation product) relative to uracil plus uridine modification, calculated from MSFragger search results as (#PSM with uracil adduct: 112, and 55 to account for Cys)/(#PSM with uracil or uridine adduct: 244 and 112, and 187 and 55 to account for Cys). The error bars indicate standard deviation. Free Cys was carbamidomethylated, so the corresponding adduct mass was used as a fixed modification. Because U-crosslinking and carbamidomethylation on Cys are mutually exclusive, the observed conjugate mass of uridine on Cys (187) was smaller than that on other amino acids (244) by the mass of the carbamidomethyl group (57). Thus, the modification mass on Cys was corrected by adding the mass of the carbamidomethyl group. The percentages were rounded up to the second decimal point. e, Comparison of closed search results on total RNA and mRNA RBS-ID experiments, allowing up to one or two modifications per peptide. Modification-specific peptide-level FDR was set to 0.01. PSM counts for peptides with two modifications are shown.
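The ratio in panel d and the Cys mass correction can be made concrete with a short sketch. It uses the nominal masses quoted in the caption (244, 112, 57); the function name `uracil_percentage` and the PSM counts are hypothetical and serve only to illustrate the arithmetic.

```python
CARBAMIDOMETHYL = 57          # nominal mass of the carbamidomethyl group
URIDINE, URACIL = 244, 112    # nominal adduct masses on non-Cys residues

# On Cys, crosslinking replaces the fixed carbamidomethylation, so the
# observed modification mass is smaller by the carbamidomethyl mass.
URIDINE_CYS = URIDINE - CARBAMIDOMETHYL   # 187
URACIL_CYS = URACIL - CARBAMIDOMETHYL     # 55


def uracil_percentage(psm_counts):
    """Percentage of uracil-adduct PSMs among uracil- or uridine-adduct PSMs.

    `psm_counts` maps an observed nominal modification mass to a PSM count.
    """
    uracil = psm_counts.get(URACIL, 0) + psm_counts.get(URACIL_CYS, 0)
    uridine = psm_counts.get(URIDINE, 0) + psm_counts.get(URIDINE_CYS, 0)
    total = uracil + uridine
    if total == 0:
        raise ValueError("no uracil or uridine adduct PSMs found")
    return 100 * uracil / total


# Hypothetical open-search PSM counts per modification mass (not the paper's data).
counts = {244: 900, 187: 50, 112: 45, 55: 5}
print(uracil_percentage(counts))
```

Counting 187 and 55 alongside 244 and 112 keeps Cys-crosslinked peptides in both the numerator and the denominator, which is the purpose of the correction described in the caption.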
Extended Data Fig. 3 Comparison between RBS-ID and previous RBS- or RBD-profiling studies.
a, Comparison of the position of RBS in peptides shared between RBS-ID and RNPxl. The Venn diagram (left) shows the overlap between the peptides with uniquely localized RBS in RBS-ID (1,972) and those in human proteins of RNPxl (29; ref. 6). Most peptides from RNPxl are also detected by RBS-ID, demonstrating the comprehensiveness of our method. The pie chart (right) shows that among 25 common peptides, 24 show consistent localization of RBS between the datasets, indicating the accuracy of the methods. b, Relative position of the peptides identified as 'RBDpep' (ref. 4). The x-axis shows the position of the terminus of peptides relative to RBSs identified by RBS-ID (n = 1,478). c, Relative position of the peptides identified as 'XL-peptide' (ref. 7). The x-axis shows the position of the terminus of peptides relative to RBSs identified by RBS-ID (n = 869).
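The comparison in panel a amounts to intersecting two peptide sets and checking whether the localized site agrees for each shared peptide. The sketch below assumes each dataset is a mapping from peptide sequence to the index of the crosslinked residue; the function name `compare_localization`, the sequences and the site indices are hypothetical and are not taken from either dataset.

```python
def compare_localization(sites_a, sites_b):
    """Return (common peptides, peptides whose localized site agrees).

    Each argument maps a peptide sequence to the 0-based index of the
    residue assigned as the RNA-binding site within that peptide.
    """
    common = sorted(set(sites_a) & set(sites_b))
    consistent = [pep for pep in common if sites_a[pep] == sites_b[pep]]
    return common, consistent


# Hypothetical example data for illustration only.
dataset_a = {"AFGYVEK": 3, "LKFDGR": 2, "MYSLNK": 1, "QWEGTR": 0}
dataset_b = {"AFGYVEK": 3, "LKFDGR": 4, "TTPLVK": 2}

common, consistent = compare_localization(dataset_a, dataset_b)
print(f"{len(consistent)} of {len(common)} shared peptides agree on the site")
```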
Extended Data Fig. 5 RBS-identified protein groups and regions.
a, Top 5 GO terms associated with proteins that are not annotated as RBPs (MF: molecular function, BP: biological process, CC: cellular component, 5 each; refs. 20, 21). b-c, Top 5 GO terms associated with proteins whose RBSs are identified exclusively in total RNA enrichment but not in poly(A)+ RNA enrichment (b) or in poly(A)+ RNA enrichment but not in total RNA enrichment (c) (MF, BP, CC, 5 each).
Extended Data Fig. 6 Examples of RBS-identified proteins.
a, Example of RBSs identified in distant primary sequence positions (PABPC1). Sequence homology of amino acids at positions -5 to +5 from Y194, Y297 and Y364 is shown relative to that of Y222, F325 and Y393 in yeast Pab1, respectively. Identical amino acids in the same positions are bold-faced. b, Partial structure of yeast Pab1 bound to poly(A) RNA (PDB 6R5K; ref. 22). Y222, F325 and Y393 are indicated. c-h, Examples of RBSs identified in regions that are not annotated as RBDs. RBSs identified in NSUN5 (c), RTCA (d), APOBEC3C (e), TRIM25 (f), SERBP1 (g) and HNRNPA1 (h) are depicted. RBSs with high spectral counts are indicated.
Extended Data Fig. 7 Purified spCas9 protein and template DNA for sgRNA synthesis.
a, Purified His6-HA-NLS-TEV-spCas9 on SDS-PAGE gel stained with Coomassie G25. b, Template DNA prepared for T7 in vitro transcription of anti-CBX1 sgRNA.