Over the past decade, my research interests have been at the intersection of Computer Science, Mathematics and Molecular Biology. On the applied level, I mainly design analytic approaches, efficient algorithms and tools to answer key questions in Bioinformatics, with a special focus on RNA biology. These questions include, but are not limited to:
- How to predict RNA structure in the presence of pseudoknots?
- What is the prevalence of kinetics within the RNA folding process?
- What is the interplay between RNA structure and evolution?
- How can knowledge of RNA structure help in the analysis of experiments? Conversely, how can coarse-grained experimental data be used to predict RNA structure accurately?
- How to design RNA sequences that perform predefined functions in vivo?
Some of these questions are related to universal properties of biopolymers. In such cases, they do not necessarily require considering a specific sequence, or overly sophisticated (and complex) energy models. When a question can be accurately rephrased at such an abstract level, I typically strive to provide (asymptotic) analytic results using standard tools and techniques (generating functions, singularity analysis, etc.) borrowed from the fields of enumerative combinatorics and analytic combinatorics.
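As a small illustration of this sequence-independent viewpoint, the sketch below counts pseudoknot-free secondary structures on n positions under a deliberately simplified model, in which the only constraint is that paired positions i < j satisfy j - i > theta. The function name `count_structures` and the parameter `theta` are choices made for this example, not part of any existing tool. For theta = 1, the generating function S(z) of these counts satisfies S = 1 + zS + z^2 S(S - 1), whose dominant singularity is (3 - sqrt(5))/2, so the counts should grow roughly like ((3 + sqrt(5))/2)^n up to a polynomial factor.

```python
import math


def count_structures(n_max, theta=1):
    """Return counts[0..n_max], where counts[n] is the number of
    pseudoknot-free secondary structures on n positions such that
    any paired positions i < j satisfy j - i > theta.
    (Simplified combinatorial model, assumed for illustration.)"""
    counts = [1] * (n_max + 1)
    for n in range(1, n_max + 1):
        total = counts[n - 1]  # case 1: the last position is unpaired
        # case 2: the last position pairs with position i (1-based),
        # splitting the rest into i - 1 positions on the left of i
        # and n - i - 1 positions enclosed by the pair
        for i in range(1, n - theta):
            total += counts[i - 1] * counts[n - i - 1]
        counts[n] = total
    return counts


# Exponential growth rate suggested by the dominant singularity (theta = 1)
GROWTH_RATE = (3 + math.sqrt(5)) / 2

counts = count_structures(400)
print(counts[:8])                 # [1, 1, 1, 2, 4, 8, 17, 37]
print(counts[400] / counts[399])  # slowly approaches GROWTH_RATE from below
```

The ratio of consecutive counts only converges slowly, because of the polynomial correction factor. This is where singularity analysis gives sharper information than numerical experiments alone.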
More complicated questions may still lend themselves quite nicely to exact resolution through polynomial time/space algorithms, usually based on dynamic programming. The concepts and design principles underlying such tools can sometimes be generalized to some other application contexts in bioinformatics, such as comparative genomics.
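To give a flavor of this family of algorithms, here is a minimal sketch of a textbook dynamic programming scheme, Nussinov-style base-pair maximization. It is a generic example rather than one of the tools alluded to above. The function name `max_base_pairs`, the set of allowed pairs, and the `min_loop` parameter (minimum number of unpaired bases enclosed by a pair) are assumptions made for this illustration. Real predictors rely on far richer energy models.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in seq,
    allowing Watson-Crick and G-U wobble pairs, with at least
    min_loop unpaired bases inside every hairpin.
    Runs in O(n^3) time and O(n^2) space."""
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    # best[i][j] = optimal number of pairs within seq[i..j] (inclusive);
    # entries with j - i <= min_loop stay at 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j is unpaired
            # case 2: position j pairs with some k, with j - k > min_loop
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1] if n else 0


print(max_base_pairs("GGGAAACCC"))  # 3: a single three-pair hairpin
```

The same decomposition (last position unpaired, or paired with some partner that splits the interval in two) underlies many polynomial-time algorithms on nested structures, which is part of why these designs tend to carry over to other settings.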
Sometimes, the problem turns out to be computationally intractable, that is, provably hard in the precise sense given to the term by computational complexity theory. In these situations, I try to establish what makes the problem hard, and how to work around the hardness result, either by adopting a classic parameterized complexity approach, or by simplifying the model in order to achieve an acceptable tradeoff between expressivity and tractability.
In the extreme cases where the problem is hard to analyze, or increasingly as a first approach to test exploratory hypotheses, I tend to adopt a probabilistic perspective based on random sampling within adequately controlled distributions, such as the uniform distribution or the Boltzmann distribution.
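As a sketch of what sampling within a controlled distribution can look like, the example below draws pseudoknot-free secondary structures uniformly at random, using a counting table to choose each decomposition step with probability proportional to the number of structures it leads to. The simplified model (paired positions i < j must satisfy j - i > theta) and the names `count_structures` and `sample_structure` are assumptions made for this illustration. Replacing the counts with Boltzmann-weighted partition functions would, under the same scheme, yield Boltzmann-distributed samples instead.

```python
import random


def count_structures(n_max, theta=1):
    """counts[n] = number of structures on n positions, where paired
    positions i < j must satisfy j - i > theta (simplified model)."""
    counts = [1] * (n_max + 1)
    for n in range(1, n_max + 1):
        total = counts[n - 1]            # last position unpaired
        for i in range(1, n - theta):    # last position paired with i
            total += counts[i - 1] * counts[n - i - 1]
        counts[n] = total
    return counts


def sample_structure(n, counts, theta=1, rng=random):
    """Draw a dot-bracket structure on n positions uniformly at random."""
    struct = ["."] * n
    stack = [(0, n)]                     # half-open intervals [lo, hi)
    while stack:
        lo, hi = stack.pop()
        length = hi - lo
        if length <= theta + 1:          # too short to hold any pair
            continue
        r = rng.randrange(counts[length])
        if r < counts[length - 1]:       # last position left unpaired
            stack.append((lo, hi - 1))
            continue
        r -= counts[length - 1]
        for i in range(1, length - theta):
            weight = counts[i - 1] * counts[length - i - 1]
            if r < weight:               # pair position i with the last one
                p = lo + i - 1
                struct[p], struct[hi - 1] = "(", ")"
                stack.append((lo, p))          # part left of the pair
                stack.append((p + 1, hi - 1))  # part enclosed by the pair
                break
            r -= weight
    return "".join(struct)


counts = count_structures(30)
print(sample_structure(30, counts, rng=random.Random(0)))
```

Because every choice is weighted by exact counts, each structure of the model is produced with the same probability, so empirical averages over samples estimate expectations under the uniform distribution.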