Date of Graduation
Doctor of Philosophy (PhD)
George A. Calin
The past decade has witnessed an era of RNA biology; despite considerable discoveries, challenges remain in screening for RNA-interacting small molecules and RNA-interacting proteins. These challenges imply an immediate need for cost-efficient yet predictive computational tools capable of generating insightful hypotheses for discovering novel RNA-interacting small molecules and proteins. Thus, in this dissertation we implemented novel computational models to predict RNA-ligand interactions (Chapter 1) and RNA-protein interactions (Chapter 2).
Targeting RNA has not garnered as much interest as targeting proteins, and it is restricted by a lack of computational tools for structure-based drug design. To test the potential of translating molecular docking tools designed for proteins to RNA-ligand docking and virtual screening, we benchmarked 5 docking programs and 11 scoring functions, assessing their performance in pose reproduction, pose ranking, score-RMSD correlation, and virtual screening. From this benchmark, we proposed a three-step docking pipeline optimized for virtual screening against RNAs with different flexibility properties. Using this pipeline, we successfully identified a compound that binds selectively to the GA:UU motif. Both NMR and subsequent MD simulation confirmed its selective binding to the GA:UU motif flanked by two tandem flexible base pairs next to GA. Consistent with the 3D model, SAR analysis revealed that any R-group substitution would abolish binding.
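To illustrate two of the benchmark metrics named above, the following minimal sketch computes a pose-reproduction RMSD and a score-RMSD rank correlation. It is an assumption-laden illustration, not the dissertation's benchmark code: the function names, the use of Spearman correlation, and the premise that both poses share atom order and coordinate frame (no superposition step) are all hypothetical choices made here.

```python
import math

def rmsd(pose, reference):
    """RMSD between two poses given as equal-length lists of (x, y, z).

    Assumes identical atom ordering and a shared coordinate frame,
    so no superposition is performed (illustrative assumption).
    """
    if not pose or len(pose) != len(reference):
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum((a - b) ** 2
                  for p, r in zip(pose, reference)
                  for a, b in zip(p, r))
    return math.sqrt(squared / len(pose))

def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(scores, rmsds):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(scores), average_ranks(rmsds))

# Hypothetical toy data: three docked poses, scores where lower is better.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
pose_rmsd = rmsd(shifted, reference)
rho = spearman([-9.0, -7.5, -6.0], [0.8, 2.1, 5.3])
```

Under this convention, a scoring function whose better (lower) scores track lower RMSD values gives a positive correlation; `scipy.stats.spearmanr` computes the same statistic if SciPy is available.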
Current computational methods for RNA-protein interaction prediction (sequence-based or structure-based) lack either interpretability or robustness. Aware of these pitfalls, we implemented RNA-Protein interaction prediction through Interface Threading (RPIT), which identifies a known RNA-protein interface and uses it as a template to infer the region where the interaction occurs and to predict the interaction propensity from the interface profiles. To estimate the propensity more accurately, we implemented five statistical scoring functions based on our unique non-redundant database of protein-RNA interactions. Our benchmark, using leave-protein-out cross-validation and two external validation sets, yielded an overall accuracy of 70%-80% for RPIT. Compared with other methods, RPIT offers an inexpensive but robust approach for in silico prediction of RNA-protein interaction networks and for prioritizing putative RNA-protein pairs by virtual screening.
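The leave-protein-out cross-validation mentioned above can be sketched as follows, under the assumption that it means holding out every pair involving one protein per fold so that no test protein is seen during training. The function name, the tuple layout, and the toy identifiers are hypothetical and are not taken from RPIT.

```python
from collections import defaultdict

def leave_protein_out_splits(pairs):
    """Yield (held_out_protein, train_indices, test_indices) per fold.

    `pairs` is a list of (protein_id, rna_id, label) tuples; this layout
    is an illustrative assumption, not the RPIT data format.
    """
    by_protein = defaultdict(list)
    for index, (protein_id, _rna_id, _label) in enumerate(pairs):
        by_protein[protein_id].append(index)
    for protein_id in sorted(by_protein):
        test_indices = by_protein[protein_id]
        held_out = set(test_indices)
        train_indices = [i for i in range(len(pairs)) if i not in held_out]
        yield protein_id, train_indices, test_indices

# Hypothetical toy data: two proteins, three RNAs, binary interaction labels.
toy_pairs = [
    ("P1", "R1", 1),
    ("P1", "R2", 0),
    ("P2", "R1", 0),
    ("P2", "R3", 1),
]
folds = list(leave_protein_out_splits(toy_pairs))
```

Splitting by protein rather than by individual pair keeps an evaluation from rewarding memorization of a protein seen in training; scikit-learn's `LeaveOneGroupOut` provides the same grouping behavior when protein identifiers are passed as groups.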
RNA-small molecule interaction, RNA-protein interaction, molecular docking, virtual screening, NMR, scoring function, molecular dynamics, interface threading, microRNA