Individually double minimum-distance definition of protein–RNA binding residues and application to structure-based prediction

Hu, Wen; Qin, Liu; Li, Menglong; Pu, Xuemei; Guo, Yanzhi

doi:10.1007/s10822-018-0177-z

Published: 26 November 2018

Volume 32, pages 1363–1373 (2018)

Journal of Computer-Aided Molecular Design

Abstract

Identifying protein–RNA binding residues is essential for understanding the mechanism of protein–RNA interactions. To date, binding residues have commonly been defined with rigid distance thresholds. However, after investigating 182 non-redundant protein–RNA complexes, we find that a rigid threshold is unsuitable for a number of complexes because the distances between proteins and RNAs vary widely. In this work, we propose a novel definition method based on a flexible distance cutoff. The method accounts for individual differences among complexes by setting a variable tolerance limit on protein–RNA interactions, i.e. the double minimum-distance, which yields a different distance threshold for each complex. To validate the method, we compared the flexible definition comprehensively with traditional rigid definitions in terms of interface structure, amino acid composition, interface area, interaction force and other properties. The results indicate that the flexible method is more reasonable because it incorporates the specificity of different complexes, recovering important residues missed by rigid distance methods and discarding some redundant residues. Finally, to further test the double minimum-distance definition strategy, we developed a classifier that predicts the binding sites derived from the new definition, using structural features and a random forest machine learning algorithm. The model achieved satisfactory prediction performance, with an accuracy of 85.0% on independent data sets. To the best of our knowledge, it is the first prediction model to define positive and negative samples using a flexible cutoff. The comparison analysis and modeling results demonstrate that our method is a promising strategy for defining protein–RNA binding sites more precisely.
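
The contrast between a rigid cutoff and a per-complex cutoff can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so the sketch assumes that the "double minimum-distance" threshold is twice the smallest protein–RNA atom distance observed in a given complex. The function names, the toy coordinates and the 3.5 Å rigid cutoff are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math


def min_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance between any atom in atoms_a and any atom in atoms_b."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def flexible_binding_residues(residues, rna_atoms, factor=2.0):
    """Label binding residues with a per-complex cutoff.

    ASSUMPTION (not stated in the abstract): the cutoff is `factor` times the
    smallest residue-to-RNA distance found in this complex.
    `residues` maps a residue id to a list of (x, y, z) atom coordinates.
    """
    per_residue = {rid: min_distance(atoms, rna_atoms) for rid, atoms in residues.items()}
    cutoff = factor * min(per_residue.values())
    binding = {rid for rid, d in per_residue.items() if d <= cutoff}
    return cutoff, binding


def rigid_binding_residues(residues, rna_atoms, cutoff=3.5):
    """Label binding residues with one fixed cutoff applied to every complex."""
    return {rid for rid, atoms in residues.items() if min_distance(atoms, rna_atoms) <= cutoff}


# Toy complex (hypothetical coordinates in angstroms): one RNA atom at the origin.
toy_rna = [(0.0, 0.0, 0.0)]
toy_residues = {
    "A10": [(2.0, 0.0, 0.0)],
    "A11": [(3.8, 0.0, 0.0)],
    "A12": [(6.0, 0.0, 0.0)],
}

toy_cutoff, toy_flexible = flexible_binding_residues(toy_residues, toy_rna)
toy_rigid = rigid_binding_residues(toy_residues, toy_rna)
print(toy_cutoff, sorted(toy_flexible), sorted(toy_rigid))
```

In this toy complex the closest contact is 2.0 Å, so the assumed per-complex cutoff becomes 4.0 Å and residue A11 is labelled as binding, whereas the fixed 3.5 Å cutoff would miss it. A complex with a tighter closest contact would get a correspondingly smaller cutoff.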

Funding

This work was funded by the National Natural Science Foundation of China (Nos. 21675114, 21573151).

Author information

Authors and Affiliations

College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, People’s Republic of China
Wen Hu, Liu Qin, Menglong Li, Xuemei Pu & Yanzhi Guo

Corresponding author

Correspondence to Yanzhi Guo.

Ethics declarations

Conflict of interest

The authors declare no competing financial interests.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOC 296 KB)

About this article

Cite this article

Hu, W., Qin, L., Li, M. et al. Individually double minimum-distance definition of protein–RNA binding residues and application to structure-based prediction. J Comput Aided Mol Des 32, 1363–1373 (2018). https://doi.org/10.1007/s10822-018-0177-z

Received: 29 May 2018
Accepted: 14 November 2018
Published: 26 November 2018
Issue Date: December 2018
DOI: https://doi.org/10.1007/s10822-018-0177-z
