Laboratoire d'informatique de l'École polytechnique

Comparative prediction of RNA structure

Speaker: Sebastian Will (AMIBio team)
Location: URL: https://cisco.webex.com/cisco/j.php?MTID=mae752ddacf0979949253908809455d2a
Date: Thu, 18 Mar 2021, 13:00-14:00

In this talk, I am going to sketch a subjective history of comparative RNA structure prediction. For this purpose, I describe exact combinatorial algorithms, which combine physics-based models with evolutionary information. The scientific interest in RNAs, these molecular key players of life, is ever growing, since massive sequencing revealed their numerous, essential roles. More recently, RNAs received even more popular attention: RNA is the genomic material of Corona viruses and RNA-based Covid19 vaccines became especially relevant.

Following the demands of biological research, computer science had a large impact on RNA research by studying RNA secondary structures as combinatorial objects of context-free nature. This perspective enabled efficient, exact optimization algorithms. In an interdisciplinary endeavor, such methods moreover integrate different sources of information to predict RNA structures reliably. On the one hand, chemistry and physics provide us with accurate thermodynamic models that allow predicting energetically optimal structures for an RNA of a given sequence. On the other hand, one can improve predictions due to picking up imprints of evolutionary selective pressure.

In cases where the evolutionary relation between several RNAs is already known, predicting structures with evolutionary support is not more complex than predicting single structures. However, when it is not known a priori, this evolutionary relation can be uncovered reliably only by simultaneously predicting the RNA structures. While the first algorithms for this central task of “simultaneous folding and alignment” (SFA) were prohibitively complex, I achieved a major break through by exploiting the sparsity of the problem. I am presenting several generations of combinatorial SFA algorithms and explain their controlled use of sparsity. Finally, I will discuss results from biological applications as well as exciting related open problems.