PhD position in algorithms for the integrative modeling of viral RNA architectures

Position funded by the Agence Nationale de la Recherche (INSSANE project)
Supervision by Sebastian Will and Sarah Berkemer

Application closed

Project description

The PhD project aims to overcome algorithmic hurdles in the processing of sequencing data produced by RNA structure-targeting experiments, and ultimately to contribute computational structural modeling strategies of increased accuracy. All modern RNA probing protocols rely on sequencing technologies and reveal structural information indirectly, through observable alterations at the RNA sequence level (mutations, stops/cuts). Meanwhile, novel crosslinking protocols reveal interacting RNA regions. The selected candidate will design an integrative structure modeling method that interprets probing data and combines the resulting reactivity profiles with crosslinking interactions and evolutionary information (compensatory mutations). Complete virus architectures will be predicted as the solution of a Maximum Independent Set (MIS) problem on an associated conflict graph that includes both alternative local structures and long-range interactions. The method will be implemented as a fixed-parameter tractable algorithm, parameterized by the treewidth of this graph, to produce models with maximal experimental support and thermodynamic stability.
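To give a feel for the kind of computation involved, here is a minimal Python sketch of a maximum-weight independent set on a conflict graph that happens to be a tree (treewidth 1), the simplest case in which dynamic programming along the graph structure is exact. It is an illustration only, not the project's method: the candidate names, the weights (standing in for combined experimental support and stability), and the function `max_weight_independent_set_tree` are all hypothetical, and the graph is assumed to be connected and acyclic.

```python
from collections import defaultdict


def max_weight_independent_set_tree(weights, edges, root):
    """Maximum-weight independent set on a tree-shaped conflict graph.

    weights: dict mapping each candidate (node) to a non-negative score.
    edges:   list of (u, v) pairs; an edge means u and v are incompatible.
    root:    any node; the graph is assumed connected and acyclic.
    Returns (best_total_weight, set_of_selected_candidates).
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative DFS: every node appears in `order` after its parent.
    parent = {root: None}
    order = []
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                stack.append(v)

    # take[u]: best score in u's subtree when u is selected.
    # skip[u]: best score in u's subtree when u is not selected.
    take = {u: weights[u] for u in order}
    skip = {u: 0 for u in order}
    for u in reversed(order):  # children are finalized before their parent
        p = parent[u]
        if p is not None:
            take[p] += skip[u]
            skip[p] += max(take[u], skip[u])

    # Traceback from the root: select u only if its parent was not selected
    # and selecting u is strictly better within its subtree.
    chosen = set()
    for u in order:
        p = parent[u]
        if (p is None or p not in chosen) and take[u] > skip[u]:
            chosen.add(u)
    return max(take[root], skip[root]), chosen


# Hypothetical candidates: three local structures and one long-range
# interaction, with conflicts forming a path A - B - LR - C.
weights = {"hairpin_A": 3, "hairpin_B": 5, "long_range_1": 4, "hairpin_C": 3}
conflicts = [("hairpin_A", "hairpin_B"),
             ("hairpin_B", "long_range_1"),
             ("long_range_1", "hairpin_C")]
best, selected = max_weight_independent_set_tree(weights, conflicts, "hairpin_A")
print(best, sorted(selected))
```

On general conflict graphs the same take/skip idea is applied to the bags of a tree decomposition, with a table of size exponential in the bag size, which is why the running time depends on the treewidth.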

Candidate profile and skills

  • The successful candidate will hold a Master's or Engineering degree in Computer Science or Bioinformatics. They should have prior knowledge of discrete algorithms and data structures, should know basic statistics and, ideally, already have a background in RNA bioinformatics and/or RNA biology.
  • Since the project requires concrete, efficient implementations of the developed methods, they should have demonstrated programming skills in C++ and Python; Java may be used for visualization and will be considered a plus.
  • A good command of English is required; knowledge of French will be appreciated.

Context

Understanding the structure of RNA molecules and their complexes is essential for modern molecular biology, directly impacting fundamental insights and biomedical applications. Notable examples of large RNAs include the genomes of RNA viruses (Influenza, HIV, Chikungunya, SARS-CoV-2…), whose lengths exceed the current capabilities of predictive computational methods as well as of high-resolution experimental structural techniques. Within the ANR-funded project INSSANE, we are developing integrated experimental protocols together with efficient computational methods for the structural modeling of large RNAs. We aim to accurately probe and predict the genomic RNA architectures of biomedically relevant viruses. Our bioinformatics methods will be applicable beyond viruses and could be used to model the structure of other large RNAs (lncRNAs, introns).

References

  • [1] N. A Siegfried, S Busan, G. M Rice, J. A Nelson, and K. M Weeks. RNA motif discovery by SHAPE and mutational profiling (SHAPE-MaP). Nature Methods, 11(9):959–965, 2014. doi: 10.1038/nmeth.3029.
  • [2] O Ziv, J Price, L Shalamova, T Kamenova, I Goodfellow, F Weber, and E. A Miska. The short- and long-range RNA-RNA interactome of SARS-CoV-2. Mol. Cell, 2020. doi: 10.1016/j.molcel.2020.11.004.
  • [3] R Lorenz, D Luntzer, I. L Hofacker, P. F Stadler, and M. T Wolfinger. SHAPE directed RNA folding. Bioinformatics, 32(1):145–147, 2016. doi: 10.1093/bioinformatics/btv523.
  • [4] A Spasic, S. M Assmann, P. C Bevilacqua, and D. H Mathews. Modeling RNA secondary structure folding ensembles using SHAPE mapping data. Nucleic Acids Research, 46(1):314–323, 2017. doi: 10.1093/nar/gkx1057.

More information

Proposal online: https://adum.fr/as/ed/voirproposition.pl?langue=en&site=X&matricule_prop=48153#version