# LIX Seminar

The LIX Seminar is intended to encourage exchanges between the teams and members of the laboratory. It is open more generally to the entire local community, and in particular to students of the Institut Polytechnique de Paris. The format is a one-hour presentation followed by discussion, once a month (1 p.m.-2 p.m. on the 3rd Thursday of the month).

Unless otherwise stated, the seminar takes place at 13:00 in salle Gilles Kahn of the Turing Building, with coffee 15 minutes before. It is also accessible remotely via a Webex link.

Please contact me for suggestions and contributed talks.

## Upcoming seminars

#### TBA

**Thomas Debris-Alazard** (Grace team)

6 June 2024.

## Past seminars

#### Quantum simulation with Neural Quantum States

**Filippo Vicentini** (CPHT)

25 April 2024.

video

Machine Learning algorithms have advanced tremendously in the last two decades, tackling difficult problems such as image and natural language recognition, classification, and generative synthesis. Part of this success owes to the development of variational approximators that can efficiently approximate unknown functions living in high-dimensional spaces: Neural Networks. Neural Networks have recently been proposed as efficient approximations of the exponentially complex wave-function of quantum systems, and in recent years several algorithms have been proposed to compute the ground state of complicated interacting systems or to simulate their dynamics. Even if the field is rapidly expanding, some foundational open questions remain: what features of quantum states can be efficiently compressed by neural networks? How does a neural network “learn” a quantum state? In this talk I will give a broad overview of the field, attempting to sketch a connection between quantum information, resource theory and machine learning. Then, I will discuss how it is possible to efficiently simulate the dynamics of a quantum system by repeatedly compressing the wave-function, side-stepping several issues with pre-existing algorithms.
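
As a toy illustration of the idea (not the speaker's actual method), the sketch below encodes a tiny restricted-Boltzmann-machine ansatz for a 3-spin wave-function in plain Python. The function name `rbm_amplitude` and the random weights are hypothetical; real implementations use complex parameters and variational Monte Carlo sampling rather than exact enumeration.

```python
import itertools
import math
import random

def rbm_amplitude(sigma, a, b, W):
    """Unnormalized RBM wave-function amplitude psi(sigma), real weights:
    exp(sum_i a_i s_i) * prod_j 2 cosh(b_j + sum_i W_ji s_i)."""
    visible = math.exp(sum(ai * si for ai, si in zip(a, sigma)))
    hidden = 1.0
    for bj, Wj in zip(b, W):
        hidden *= 2.0 * math.cosh(bj + sum(w * s for w, s in zip(Wj, sigma)))
    return visible * hidden

rng = random.Random(0)
n_visible, n_hidden = 3, 2
a = [rng.uniform(-0.1, 0.1) for _ in range(n_visible)]
b = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
W = [[rng.uniform(-0.1, 0.1) for _ in range(n_visible)] for _ in range(n_hidden)]

# Born probabilities over all 2^3 spin configurations
configs = list(itertools.product([-1, 1], repeat=n_visible))
amps = [rbm_amplitude(s, a, b, W) for s in configs]
norm = sum(x * x for x in amps)
probs = [x * x / norm for x in amps]
```

The point of the ansatz is that the number of parameters grows polynomially in the number of spins even though the state space grows exponentially; here, with 3 spins, we can still normalize exactly.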

#### Hallucinations in textual generation

**Oana Balalau** (Cedar team)

21 March 2024.

video

In my work, I have studied two inverse problems: extracting facts in the form (subject, predicate, object) from text, and, given such facts, transforming them back into text. These problems are very interesting as they touch upon the issue of hallucinations in large language models. But they also open the question of when two sentences describe the same facts and how we can measure textual similarity in an informative and explainable manner. In this presentation, I will give a short overview of hallucinations and constraint generation, i.e., forcing the model to respect constraints defined for the task.

#### Real-Time Expressive Dynamic Animation for Skeleton Based Character

**Damien Rohmer** (Vista team)

15 February 2024.

video

Expressive animation effects featuring, for instance, exaggerated deformations are commonly used to enhance dynamic-related effects in virtual characters for 3D production such as Cartoon entertainment or Video Games. Creating such effects requires either a tedious manual set-up of geometric deformers that are character and time-dependent, or the use of computationally costly physics-based simulations.
In this presentation, we will introduce a general framework suited to automatically synthesize inertia-like effects on arbitrary hierarchical characters, while remaining extremely computationally efficient. The method can be seen as an extension of the so-called Linear Blend Skinning (LBS), the most standard method currently used in production pipelines to compute 3D character deformations in real-time.

Starting from the standard LBS formulation, we will derive its relation to joint-based velocity along the kinematic chain of the character, and show that the vertex velocity and acceleration can also be decomposed into a LBS-like relation. We will leverage this relation to propose a per-vertex deformation framework taking into account the skeleton joint velocity and the character hierarchy. This new relation can then be used to parameterize, at the local joint level of the character, spatial and temporal deformers while relying on skinning weights to propagate through the hierarchy. This approach has the advantage of fully relying on existing rigs which are readily available in animation production studios, and thanks to its parametric formulation, the deformation can be computed in real-time in a single vertex shader. Throughout this presentation, we will show various dynamic effects that can be applied within this framework, as well as its art-directability.
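
A minimal sketch of the standard LBS formula the talk starts from, written in plain Python for a 2D vertex. The helper name `lbs` is ours, and the talk's dynamic, velocity-dependent deformers are not reproduced here; this only shows the weighted blend of rigid joint transforms.

```python
def lbs(v, weights, transforms):
    """Linear Blend Skinning in 2D: v' = sum_i w_i * (R_i v + t_i),
    blending each joint's rigid transform (R_i, t_i) with skinning weight w_i."""
    out = [0.0, 0.0]
    for w, (R, t) in zip(weights, transforms):
        out[0] += w * (R[0][0] * v[0] + R[0][1] * v[1] + t[0])
        out[1] += w * (R[1][0] * v[0] + R[1][1] * v[1] + t[1])
    return out

identity = [[1.0, 0.0], [0.0, 1.0]]
# blend a unit translation with the rest pose, half weight each
blended = lbs([1.0, 2.0], [0.5, 0.5],
              [(identity, [1.0, 0.0]), (identity, [0.0, 0.0])])
```

Because the blend is linear in the transforms, exactly this computation can be evaluated per vertex in a vertex shader, which is what makes LBS the real-time standard.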

#### Jacobi's bound. From combinatorial optimization to aircraft control

**François Ollivier** (Max team)

18 January 2024.

video

From 1836 to 1844, Jacobi developed a theory of differential equations that was meant to appear in a book entitled “Phoronomia”. After abandoning this project, some of his manuscripts remained unpublished in his lifetime. Among them is a bound, still conjectural in the general case, on the order of a differential system, expressed by a “tropical determinant”, i.e. replacing, in the determinant formula, products by sums and sums by maxima.

Jacobi gave a polynomial-time algorithm to compute it, instead of trying all n! permutations. This algorithm relies on the notion of “minimal canon”, which is computed by building a “path relation”. In modern terms, Jacobi builds a weighted oriented graph and computes shortest paths. Indeed, he gave an algorithm to compute the minimal canon from a known canon, which is equivalent to Dijkstra’s algorithm, and a second algorithm to compute a canon from a permutation achieving the maximum, which is equivalent to the Bellman–Ford algorithm.

An equivalent algorithm, the “Hungarian method”, which uses Egerváry’s “minimal covers”, was obtained by Kuhn in 1955 to solve the “assignment problem”: assigning n workers to n jobs in the best possible way, with many applications, for example to optimize transportation problems.
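
To make the object concrete, here is a brute-force sketch of the tropical determinant in Python (the name `tropical_det` is ours). Jacobi's algorithm, or equivalently the Hungarian method, computes the same value in polynomial time instead of enumerating all n! permutations; brute force is used here only for clarity.

```python
import itertools

def tropical_det(A):
    """Jacobi's "tropical determinant": in the determinant formula, replace
    products by sums and sums by max, i.e. max over permutations s of
    sum_i A[i][s[i]]."""
    n = len(A)
    return max(sum(A[i][s[i]] for i in range(n))
               for s in itertools.permutations(range(n)))

A = [[1, 2, 3],
     [2, 0, 1],
     [0, 3, 2]]
best = tropical_det(A)  # the optimal "assignment" picks one entry per row/column
```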

When there are more functions than equations, under some genericity hypotheses, we can still bound the order using the maximum of all tropical determinants obtained by considering square submatrices of the order matrix, and this may be done in polynomial time. But what about the “saddle Jacobi number”, which is the minimum of those tropical determinants? No polynomial time algorithm seems to be known to compute it. Yet, in a work with Y. Kaminski, we could provide a polynomial time algorithm to test whether the saddle Jacobi number is 0. In such a case, if the associated Jacobian determinant does not vanish, the system is “flat”. This important property of control systems, introduced by Fliess, Lévine, Martin and Rouchon, means that one may parametrize the solutions of the system, using functions of some “flat outputs” and their derivatives. Here, the flat outputs are those functions in the system that were set aside to restrict to a square system with a square order matrix.

Flat systems with such a property are “oudephippical systems”, which generalize many notions of chained systems. We can easily define for them sets of flat outputs and their regularity conditions. This allows one to choose flat outputs adapted to the desired trajectory. With some simplifications, aircraft models are flat, as shown by Martin in 1991, but the flat outputs he introduced do not work for 0-gravity flight. We were able to provide a new set of flat outputs that work in that case and to show that the only possible flat singularities of the simplified aircraft model correspond to stalling conditions.

Joint work with Yirmeyahu J. Kaminski, Holon Institute of Technology, Israel.

For more details:

http://www.lix.polytechnique.fr/%7Eollivier/PRODUCTION_SCIENT/publications/jacobi3V17.2.pdf

http://www.lix.polytechnique.fr/%7Eollivier/PRODUCTION_SCIENT/publications/aircraft-hal-V2.pdf

#### Bell theorem and its generalization: From foundations to applications

**Marc-Olivier Renou** (INRIA Saclay & DIX)

14 December 2023.

video

Quantum information theory has several puzzling properties. In particular, it predicts that information carried by quantum objects is incompatible with the use of the logical ‘or’. While in our intuition of the world, a cat hidden in a box can be dead or alive, according to quantum mechanics this assertion is no longer valid. Hence, we need a new concept: we say that a quantum cat can be dead ‘and’ alive. This new concept is however hard to accept: couldn’t we imagine a new theory replacing quantum information theory and compatible again with our ‘or’? In 1964, Bell proved that, in some sense, this puzzling feature of quantum theory is a necessity: there does not exist any ‘reasonable’ theory of information (which would replace quantum information theory) compatible with this logical ‘or’. Bell’s seminal proof is based on quantum correlations, which are obtained when multiple parties perform independent measurements on a shared quantum state, and on very weak assumptions. I’ll first review the Bell theorem. Then, I will discuss some of its main foundational consequences and some of its concrete applications to the certification of quantum devices. Finally, I will discuss recent generalizations of this theorem in the context of causal quantum networks, in which multiple parties perform independent measurements on several independent quantum states. In particular, I’ll show that Real Quantum Theory (the information theory obtained by replacing complex numbers with real numbers in standard Quantum Information Theory) can be ruled out by a Bell-like experiment, and discuss questions related to quantum distributed computing.
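
The CHSH form of Bell's argument can be checked numerically. Below is a small sketch (not from the talk, and with our own helper names): the singlet-state correlation E(a, b) = -cos(a - b) evaluated at the standard optimal angles yields the Tsirelson value 2√2, strictly above the bound of 2 that any 'local' classical theory must obey.

```python
import math

def E(a, b):
    """Quantum correlation of a singlet state for measurement angles a and b."""
    return -math.cos(a - b)

def chsh(a, a2, b, b2):
    """CHSH combination |E(a,b) - E(a,b') + E(a',b) + E(a',b')|: bounded by 2
    for any local classical model, but reaching 2*sqrt(2) quantum-mechanically."""
    return abs(E(a, b) - E(a, b2) + E(a2, b) + E(a2, b2))

# standard Tsirelson-optimal measurement angles
S = chsh(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
```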

#### A diagram editor to mechanise categorical proofs

**Ambroise Lafont** (Partout team)

23 November 2023.

video

Diagrammatic proofs are ubiquitous in certain areas of mathematics, especially in category theory. Mechanising such proofs is a tedious task because proof assistants (such as Coq) are text based. We present a prototypical diagram editor to make this process easier, building upon the vscode extension coq-lsp for the Coq proof assistant and a web application (you can try it on my personal website).

#### Why are they doing quantum computer science?

**Titouan Carette** (Cosynus team)

19 October 2023.

video

Quantum computing could be defined as the art of harvesting quantum weirdness for computational purposes. But as computer science is about a little more than computers, this definition is not fully adequate and thus only provides an incomplete answer to the question: Why are we doing quantum computer science? This talk aims to provide a hopefully more accurate depiction of the field and its stakes. I will succinctly present the basics of quantum computer science and roughly sketch the landscape of its different subfields. Then, I'll present various motivations for studying quantum mechanics' computational aspects based on my experience and my work on the ZX-calculus. I will argue that quantum computing is a relevant addition to computer science beyond the technological challenge of building and exploiting a quantum computer.

#### Compact data structures for triangle meshes: fast solutions with theoretical guarantees

**Luca Castelli Aleardi** (Combi team)

21 September 2023.

video

We consider the problem of designing fast and compact solutions for representing the connectivity information of triangle meshes. Classical data structures (such as Half-Edge, Corner Table, ...) are redundant and store a significant amount of information in order to offer a user-friendly interface and a fast implementation of common mesh navigational operators. Their memory consumption is expressed in terms of references per vertex (rpv) and typically ranges from 13 to 19 rpv, which may prevent their use in the case of very large graphs. Compression schemes allow us to match asymptotically the information-theory lower bounds and to encode a triangulation with few bits per vertex, but they do not allow navigating the graph structure without first decompressing the entire mesh.
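
For concreteness, here is a minimal sketch of one such classical structure, a corner table, in plain Python (helper names are ours). It stores two references per corner (vertex and opposite corner), which is far above the compact bounds the talk is about, but it makes the O(1) navigation operators explicit.

```python
def build_corner_table(tris):
    """Corner-table connectivity: corner c lives in triangle c // 3.
    V maps a corner to its vertex; O maps a corner to the opposite corner
    across the shared edge (-1 on the boundary)."""
    V = [v for t in tris for v in t]
    nxt = lambda c: c - c % 3 + (c + 1) % 3   # next corner in the same triangle
    prv = lambda c: c - c % 3 + (c + 2) % 3   # previous corner
    O = [-1] * len(V)
    half = {}                                  # directed edge -> corner opposite it
    for c in range(len(V)):
        u, w = V[nxt(c)], V[prv(c)]
        twin = half.pop((w, u), None)          # same edge, walked the other way
        if twin is None:
            half[(u, w)] = c
        else:
            O[c], O[twin] = twin, c
    return V, O, nxt, prv

# two triangles glued along the edge (1, 2)
V, O, nxt, prv = build_corner_table([(0, 1, 2), (2, 1, 3)])
```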

In this work, we introduce novel compact representations for triangle meshes that take graph regularity into account and offer various trade-offs between space requirements and runtime performance (local navigation is performed in O(1) time). For instance, one of our solutions exhibits a storage cost of 2 rpv in the case of very regular triangulations (with a large majority of low-degree vertices), while having a worst-case cost of 3.67 rpv. We also propose another structure, slightly more involved, that requires at most 3.33 rpv in the worst case: this is currently the best-known worst-case bound, outperforming all previous results.

Our experimental results confirm the practical interest of our compact data structures, which are only moderately slower (between 1.7x and 3.8x) than classical mesh data structures.

This is joint work with Olivier Devillers, Loria.

#### Active queue management for alleviating Internet congestion: Preliminary results via new control-theoretic tools

**Michel Fliess** (Optimix team)

8 June 2023.

In order to mitigate Internet congestion, this work is among the first to use a 20-year-old model based on a nonlinear differential equation with a variable delay, where a new flatness-like property is encountered: the delay is a flat output. Combining flatness-based open-loop control and closed-loop control via the intelligent proportional controller deduced from model-free control yields easily implementable and convincing computer experiments, which display a remarkable robustness with respect to large uncertainties on the number of TCP connections.

Until today, the above nonlinear model has mainly been employed to derive time-invariant linear delay approximate systems, which are quite popular, not only for investigating queue management but also for computer experiments. The flatness-like property of the nonlinear model shows that freezing the delay implies that all other system variables, including the control one, are kept constant. The validity of the linear approximations is therefore questioned, from a control standpoint as well as for digital simulations.

This is joint work with H. Mounier (Université Paris-Sud), C. Join (Université de
Lorraine), E. Delaleau (École Nationale d'Ingénieurs de Brest)

#### Making proofs by hand: attempt for a novel interface for Coq

**Benjamin Werner** (Partout team)

11 May 2023.

video

Interactive formal proof systems like HOL, Isabelle, Lean or Coq generally have two levels of language:

- One is the language of mathematical objects and statements. These go from « 2+2=4 » to the most complex theorems.

- The second is a language to allow the user to progressively construct mechanically verified proofs of these statements. Very often this is a kind of script language built around basic commands called « proof tactics ».

Learning and mastering this proof language is difficult. One can consider this to be a hindrance to a wider use of proof systems, especially in education but also in the mathematical community in general.
I will present and demonstrate the prototype of Actema, a novel user interface for building formal proofs without text, through gestures on the screen. We hope this approach is a step towards more intuitive and easy-to-use proof systems.
More specifically, I will present a version of Actema which acts as a front end to the Coq theorem prover.
I will also present the logical mechanisms at play, which involve the approach called Deep Inference, and say a few words about how the gestures are translated into Coq proof terms.
I hope this will be an entertaining talk, and I am curious about feedback from the audience.

A first version of the interface is accessible online at:
http://actema.xyz/
The version I will show is more advanced, as it relies on Coq as a back-end.

This is joint work with Kaustuv Chaudhuri, Pablo Donato and Pierre-Yves Strub.

#### Can we safeguard the micro-targeting of political ads? An algorithmic and regulatory perspective

**Oana Goga** (Cedar team)

20 April 2023.

video

Ad platforms have been used in the past years as a weapon by political actors to engineer polarization, promote voter disengagement and influence/manipulate voters. For example, during the 2016 U.S. Presidential Election, targeting technologies have been abused by Cambridge Analytica to send messages that resonated with each user’s personality, while the Russian Internet Research Agency has instrumented targeting technologies to instigate social division.
In this presentation, I will present several works we have done on making political advertising safer:

(i) Behavior analysis: We have analyzed the political ad campaigns run by the Russian Internet Research Agency (IRA) before the U.S. Presidential elections to understand what mechanisms were exploited to manipulate voters.

(ii) Auditing: To make advertisers more accountable, Facebook asks advertisers to self-declare whether they are sending political ads and provides a central repository (the Facebook Ad Library) that contains information about the political ads that have run on their platform. Ad Libraries are a positive step forward as they allow researchers and journalists to scrutinize political ads, but they come with limitations. One big problem is that advertisers need to self-declare whether their ad is political, and we do not know what enforcement mechanisms are put in place by Facebook to check compliance. Hence, we have no idea how many (undetected) political ads are incorrectly missing from the Ad Library. We have proposed an auditing system to assess how many political ads are missing from the Ad Library.

(iii) Content moderation: In a follow-up study, we wanted to understand, at a more fundamental level, if there is hope to reliably distinguish political from non-political ads (e.g., given an ad, can users agree whether it is political or not). This is important because political ads are subject to higher levels of transparency and restrictions both from ad platforms and country specific legislations; and unreliability can lead to discrimination and circumvention.

From a legal perspective, I will present how we articulated these findings and contributed to European law through the Digital Services Act and the European Democracy Action Plan.

#### The need for causality to address fairness in ML

**Sami Zhioua** (Comete team)

16 February 2023.

video

Addressing the problem of fairness is crucial to safely use machine learning algorithms to support decisions with a critical impact on people's lives, such as job hiring, child maltreatment, disease diagnosis, loan granting, etc. Several notions of fairness have been defined and examined in the past decade, such as statistical parity and equalized odds. The most recent fairness notions, however, are causal-based and reflect the now widely accepted idea that using causality is necessary to appropriately address the problem of fairness.

The main objective of our research is to measure discrimination as accurately as possible. To this end, we make a distinction between the concepts of "bias" (a deviation of an estimation from the quantity it estimates) and "discrimination" (the unjust or prejudicial treatment of different categories of people on the ground of race, age, gender, disability, etc.).

In this seminar we start by illustrating why causality is essential to reach this objective of accurately measuring fairness. Then, we show how causality can be used to characterise and quantify four types of bias, namely, representation bias, confounding bias, selection bias, and measurement bias. Finally, we briefly present the different research directions we are pursuing in the Comète team related to ethical/responsible AI.
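
For reference, here is the simplest of the statistical notions mentioned above, statistical parity, computed on toy data (function name and data are ours, purely illustrative). The talk's point is precisely that such purely observational measures can themselves be biased, and that causal reasoning is needed to measure discrimination accurately.

```python
def statistical_parity_diff(decisions, group):
    """Statistical parity difference: P(positive | group A) - P(positive | group B).
    decisions: list of 0/1 outcomes; group: parallel list of 'A'/'B' labels."""
    def rate(g):
        members = [d for d, x in zip(decisions, group) if x == g]
        return sum(members) / len(members)
    return rate('A') - rate('B')

# toy hiring data: 3 of 4 group-A applicants hired vs 1 of 4 group-B
spd = statistical_parity_diff([1, 1, 1, 0, 1, 0, 0, 0],
                              ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'])
```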

#### FUSIC: Accurate WiFi-based Indoor Localization in the Presence of Multipath

**Kevin Jiokeng** (Epizeuxis team)

12 January 2023.

video

Localization inside buildings, where GPS is generally not accessible, has stood as a major scientific and economic challenge over the last two decades. Recently, the IEEE has thrown its weight into the balance by standardizing the Fine Timing Measurement (FTM) protocol, a time-of-flight-based approach to ranging which has the potential to be a turning point in bridging the gap between the rich literature in this domain and the so-far tepid market adoption. However, experiments with the first WiFi cards supporting FTM show that while it offers meter-level ranging in clear line-of-sight (LOS) settings, its accuracy can collapse in non-line-of-sight (NLOS) scenarios.

In this talk, after a brief presentation of the main WiFi-based indoor localization techniques, I will present FUSIC, our approach that extends FTM’s LOS accuracy to NLOS settings, without requiring any changes to the standard. FUSIC leverages the results from FTM and MUSIC – both erroneous in NLOS – to solve the double challenge of 1) detecting when FTM returns an inaccurate value and 2) correcting the errors as necessary. Experiments in 4 different physical locations reveal that a) FUSIC extends FTM’s LOS ranging accuracy to NLOS settings – hence achieving its stated goal; and b) it significantly improves FTM’s capability to offer room-level indoor positioning.
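
As background for the ranging-to-positioning step (a generic textbook method, not FUSIC itself; all names are ours), here is a sketch of 2D trilateration from three anchor ranges. Real FTM ranges corrupted by NLOS error would of course not intersect this cleanly, which is exactly the problem the talk addresses.

```python
import math

def trilaterate(anchors, ranges):
    """2D positioning from three anchors and their measured ranges: linearize
    the circle equations by subtracting the first one, then solve the
    resulting 2x2 linear system by Cramer's rule."""
    (x0, y0), (x1, y1), (x2, y2) = anchors
    r0, r1, r2 = ranges
    a11, a12 = 2 * (x1 - x0), 2 * (y1 - y0)
    a21, a22 = 2 * (x2 - x0), 2 * (y2 - y0)
    b1 = r0**2 - r1**2 + x1**2 - x0**2 + y1**2 - y0**2
    b2 = r0**2 - r2**2 + x2**2 - x0**2 + y2**2 - y0**2
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
truth = (1.0, 1.0)
ranges = [math.dist(a, truth) for a in anchors]  # noise-free LOS ranges
est = trilaterate(anchors, ranges)
```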

#### Infectious Diseases in the SIRS Model Are Epidemic on Expanders

**Martin Krejca** (AlCo team)

15 December 2022.

video

The SIRS model is an abstract representation of an infectious disease on a graph in which susceptible (S) vertices become infected (I) when adjacent to an infected vertex, infected vertices become recovered (R) and are immune to becoming infected, and recovered vertices become susceptible again. Each transition occurs at a certain (random) rate.

An important question for such models is for which topologies and rates the respective disease is epidemic, that is, it survives for a long time in expectation. The infection rate above which a process is epidemic is known as the epidemic threshold (for a given graph). For a reduced variant of the SIRS model, which removes the recovered state from the process, known as the SIS model, epidemic thresholds on various graph classes are known. Most strikingly, stars have a very low epidemic threshold for the SIS model. Surprisingly, for the SIRS model, no such mathematical guarantees existed so far.

In this talk, I present the very first rigorous results for the epidemic threshold of the SIRS model. Our focus is on the high-level ideas of our proofs. We show that, in stark contrast to the SIS model, the SIRS model has, for a large regime of the recovery rate, an expected polynomial survival time on stars, regardless of the infection rate. However, on expander graphs, the epidemic threshold is similar to that of the SIS model and thus very low. This result even holds if the expander is a subgraph, which is far from trivial in the SIRS model.
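
As a toy illustration of the process itself (not of the proofs), here is a synchronous discrete-time caricature of SIRS dynamics in Python. The function name and parametrization are ours, and the actual model in the talk is a continuous-time Markov chain with transition rates rather than per-step probabilities.

```python
import random

def simulate_sirs(edges, n, beta, gamma, rho, t_max, source=0, seed=0):
    """Discrete-time SIRS caricature on a graph: S -> I with probability
    1 - (1-beta)^k given k infected neighbours, I -> R with probability
    gamma, R -> S with probability rho.  Returns the survival time of the
    infection, capped at t_max."""
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    state = ['S'] * n
    state[source] = 'I'
    t = 0
    while t < t_max and 'I' in state:
        new = state[:]                      # synchronous update from old state
        for v in range(n):
            if state[v] == 'S':
                k = sum(1 for u in adj[v] if state[u] == 'I')
                if k and rng.random() < 1 - (1 - beta) ** k:
                    new[v] = 'I'
            elif state[v] == 'I' and rng.random() < gamma:
                new[v] = 'R'
            elif state[v] == 'R' and rng.random() < rho:
                new[v] = 'S'
        state = new
        t += 1
    return t

star = [(0, i) for i in range(1, 5)]
survival = simulate_sirs(star, 5, beta=1.0, gamma=1.0, rho=1.0, t_max=100)
```

With all probabilities set to 1, the dynamics are deterministic: the leaves of the star get infected while the centre recovers, the leaves then recover, and the infection dies out.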

#### From theory to practice, from the PCP theorem to proofs in blockchains

**Daniel Augot** (Grace team)

17 November 2022.

video

Languages in the NP complexity class have instances with a proof of membership which can be checked in polynomial time. What about reading only a random part of such a proof?

The PCP (probabilistically checkable proofs) Theorem states that any
language in NP has a verifier which checks a (PCP) proof of polynomial
size, by reading only a constant number of random bits of it.

Although the original constructions were astronomically far from practical, 30 years of research have turned this result into practical algorithms and protocols. They are now so practical that companies implement and deploy such proof systems very successfully in the blockchain world.

In this talk, the PCP theorem will be recalled. Then, the transition from complexity theory to cryptographic protocols is made via cryptographic hash functions. Central to this topic are error-correcting codes. Finally, we will explain why such proofs are useful for blockchains with low bandwidth.

#### Graph Representation Learning via Graph Neural Networks

**Johannes Lutzeyer** (Dascim team)

20 October 2022.

video

Graph Neural Networks (GNNs) have celebrated many academic and industrial successes in the past years; providing a rich ground for theoretical analysis and achieving state-of-the-art results in several learning tasks. In this talk, I will give an accessible introduction to the area of Graph Representation Learning with a focus on GNNs. I will then give an introductory overview of three recent projects I have worked on. Firstly, we explore the question of how to optimally represent graphs by learning a parametrised graph representation in GNN frameworks. Then, I will discuss work in which we design a GNN to consider neighbourhoods in a graph as a whole, instead of only considering nodes pairwise as is usually done in GNNs. Finally, I will introduce our recent Graph Autoencoder framework which is capable of both detecting communities and predicting links in various real-world graphs, notably including industrial-scale graphs provided by the Deezer music streaming service.
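
As a minimal illustration of the message-passing idea behind GNNs (not code from the talk, and with our own function name), the sketch below performs one parameter-free mean-aggregation step on a path graph; real GNN layers compose such aggregation with learned linear maps and nonlinearities.

```python
def message_passing_layer(adj, features):
    """One parameter-free message-passing step: each node's new feature
    vector is the mean over itself and its neighbours (the aggregation at
    the heart of GCN-style layers, minus the learned weights)."""
    out = []
    for v in range(len(features)):
        nbrs = [v] + adj[v]                     # self-loop plus neighbours
        dim = len(features[v])
        out.append([sum(features[u][d] for u in nbrs) / len(nbrs)
                    for d in range(dim)])
    return out

# path graph 0 - 1 - 2, one-hot signal on node 0
adj = [[1], [0, 2], [1]]
h = message_passing_layer(adj, [[1.0], [0.0], [0.0]])
```

Stacking such layers lets information from node 0 diffuse one hop further per layer, which is why depth controls the receptive field of a GNN.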

#### Integer multiplication in time O(n log n)

**Joris van der Hoeven** (Max team)

22 September 2022.

Integer multiplication is one of the oldest mathematical operations and still a central problem for computer arithmetic. The complexities of many other basic operations such as division, base conversion, gcds, computing e and pi, FFTs, etc. can be expressed in terms of the complexity of integer multiplication. In our talk, we will survey a new algorithm for multiplying two n-digit integers in time O(n log n).
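
For flavor, here is a sketch of Karatsuba's classical divide-and-conquer multiplication, historically the first method to beat the quadratic school algorithm. It is emphatically not the O(n log n) algorithm of the talk, which relies on much deeper Fourier-analytic and number-theoretic machinery; it only illustrates the recursive idea of trading multiplications for additions.

```python
def karatsuba(x, y):
    """Karatsuba multiplication of non-negative integers, O(n^1.585) digit
    operations: split each operand around a power of 10 and recurse, using
    three sub-products instead of four."""
    if x < 10 or y < 10:
        return x * y
    m = max(len(str(x)), len(str(y))) // 2
    p = 10 ** m
    x1, x0 = divmod(x, p)
    y1, y0 = divmod(y, p)
    z0 = karatsuba(x0, y0)
    z2 = karatsuba(x1, y1)
    z1 = karatsuba(x0 + x1, y0 + y1) - z0 - z2   # the single extra product
    return z2 * p * p + z1 * p + z0
```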

Joris van der Hoeven and David Harvey received the de Bruijn Medal 2022 for this work.

#### Compositional properties of alignments

**Sarah Berkemer** (Amibio team)

16 June 2022.

video

Alignments, i.e., position-wise comparisons of two or more strings or ordered lists, are of utmost practical importance in computational biology and a host of other fields, including historical linguistics and emerging areas of research in the Digital Humanities. The problem is well-known to be computationally hard as soon as the number of input strings is not bounded. Due to its practical importance, a huge number of heuristics have been devised, which have proved very successful in a wide range of applications. Alignments nevertheless have received hardly any attention as formal, mathematical structures. Here, we focus on the compositional aspects of alignments, which underlie most algorithmic approaches to computing alignments. We also show that the concepts naturally generalize to finite partially ordered sets and partial maps between them that in some sense preserve the partial orders. As a consequence of this discussion we observe that alignments of even more general structure, in particular graphs, are essentially characterized by the fact that the restriction of alignments to a row must coincide with the corresponding input graphs. Pairwise alignments of graphs are therefore determined completely by common induced subgraphs. In this setting alignments of alignments are well-defined, and alignments can be decomposed recursively into subalignments. This provides a general framework within which different classes of alignment algorithms can be explored for objects very different from sequences and other totally ordered data structures.
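
A minimal sketch of the compositional viewpoint for the totally ordered (sequence) case, with hypothetical helper names: a pairwise alignment modelled as a strictly order-preserving partial map between positions, and the composition of two such maps. The talk's generalization to partial orders and graphs is, of course, richer than this.

```python
def is_alignment(m):
    """A pairwise alignment viewed as a partial map between positions:
    matched positions must increase together (strict order preservation)."""
    pairs = sorted(m.items())
    return all(pairs[k][1] < pairs[k + 1][1] for k in range(len(pairs) - 1))

def compose(ab, bc):
    """Compose alignments A~B and B~C into A~C: position i of A maps to
    bc[j] whenever ab matches i to j and bc matches j to something."""
    return {i: bc[j] for i, j in ab.items() if j in bc}

ab = {0: 0, 1: 2, 3: 3}   # positions of A matched to positions of B
bc = {0: 1, 2: 2, 4: 5}   # positions of B matched to positions of C
ac = compose(ab, bc)      # an A~C alignment, mediated by B
```

Composition of order-preserving partial maps is again order-preserving, which is the basic fact underlying progressive ("alignments of alignments") algorithms.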

#### Checking Correctness against and under Database Isolation Levels

**Constantin Enea** (Cosynus team)

19 May 2022.

video

Modern applications, such as social networking systems and e-commerce platforms, are centered around using large-scale databases for storing and retrieving data. In the presence of concurrent accesses, databases trade off isolation for performance. The weaker the isolation level, the more behaviors a database is allowed to exhibit, and it is up to the developer to ensure that their application can tolerate those behaviors. However, these weak behaviors occur only rarely in practice and outside the control of the application, making it difficult for developers to check the robustness of their code against weak isolation levels.

In this talk I will present a framework for formalizing database isolation levels, and algorithmic methods for checking whether a database conforms to a certain isolation level or whether an application satisfies its intended specification when run under a given isolation level.
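
As a baseline illustration of this kind of algorithmic check (the talk targets weaker isolation levels, whose formalizations are more involved), here is a classical conflict-serializability test on a toy history; all names are ours.

```python
from collections import defaultdict

def conflict_serializable(history):
    """Classical correctness check: build the conflict graph of a history
    (edge t1 -> t2 whenever an operation of t1 precedes a conflicting
    operation of t2 on the same key, at least one of them a write) and
    report whether it is acyclic."""
    ops = list(history)                      # tuples (txn, 'r' | 'w', key)
    graph = defaultdict(set)
    for i, (t1, a1, k1) in enumerate(ops):
        for t2, a2, k2 in ops[i + 1:]:
            if t1 != t2 and k1 == k2 and 'w' in (a1, a2):
                graph[t1].add(t2)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = defaultdict(int)
    def has_cycle(u):                        # depth-first cycle detection
        color[u] = GRAY
        for v in list(graph[u]):
            if color[v] == GRAY or (color[v] == WHITE and has_cycle(v)):
                return True
        color[u] = BLACK
        return False
    txns = {t for t, _, _ in ops}
    return not any(color[t] == WHITE and has_cycle(t) for t in txns)

# a serial history, and a "lost update" anomaly on key 'x'
ok = conflict_serializable([(1, 'r', 'x'), (1, 'w', 'x'), (2, 'r', 'x'), (2, 'w', 'x')])
bad = conflict_serializable([(1, 'r', 'x'), (2, 'r', 'x'), (1, 'w', 'x'), (2, 'w', 'x')])
```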

#### Modeling Bias and Polarization in Social Networks

**Frank Valencia** (Comete team)

21 April 2022.

video

Consider a group of users (agents) in a social network among whom some process of opinion formation takes place. In general, these users will take into account the opinions of others, subject to their own biases. Indeed, users in social networks may shape their beliefs by attributing more value to the opinions of influential figures, a common cognitive bias known as *authority bias* [1]. Furthermore, social networks often target their users with information that they may already agree with, in order to keep engagement. It is known that users tend to give more value to opinions that confirm their own preexisting beliefs [2], another common cognitive bias known as *confirmation bias*. As a result, users can become radical and isolated in their own ideological circle, causing dangerous splits in society [3], a phenomenon known as polarization [2].

There is a growing interest in studying opinion formation, consensus and polarization in the context of social networks by adapting measures and models from economics and statistics, such as DeGroot learning and Esteban and Ray's polarization measure. Roughly speaking, in these models the agents update their beliefs about the proposition of interest by taking into account the beliefs of their neighbors in an underlying weighted influence graph. Agents update their beliefs giving more value to the opinions of agents with higher influence (authority bias) and to the opinions of agents with similar views (confirmation bias). Esteban and Ray's measure captures the intuition that polarization is accentuated by both intra-group homogeneity and inter-group heterogeneity.

In this talk I will motivate and introduce some of these models and measures, present our contributions to this subject and discuss some ongoing and future work. In particular, our most insightful result establishes that, under some natural assumptions, if polarization does not eventually vanish then either there is a disconnected subgroup of agents, or some agent influences others more than she is influenced.
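
A toy sketch of the two ingredients mentioned above, with hypothetical function names: one DeGroot update on a weighted influence graph, and the Esteban-Ray polarization measure. On a connected influence graph with self-confidence, repeated DeGroot updates reach consensus, so the belief spread (and hence polarization) vanishes, consistent with the dichotomy stated in the talk.

```python
def degroot_step(beliefs, W):
    """One DeGroot update: agent i's new belief is the W[i]-weighted
    average of everyone's current beliefs (rows of W sum to 1)."""
    n = len(beliefs)
    return [sum(W[i][j] * beliefs[j] for j in range(n)) for i in range(n)]

def esteban_ray(pi, y, alpha=1.0):
    """Esteban-Ray polarization of a discrete belief distribution:
    sum_ij pi_i^(1+alpha) * pi_j * |y_i - y_j|."""
    n = len(pi)
    return sum(pi[i] ** (1 + alpha) * pi[j] * abs(y[i] - y[j])
               for i in range(n) for j in range(n))

# a 4-agent "line" influence graph: each agent listens to itself and neighbours
W = [[0.50, 0.50, 0.00, 0.00],
     [0.25, 0.50, 0.25, 0.00],
     [0.00, 0.25, 0.50, 0.25],
     [0.00, 0.00, 0.50, 0.50]]
beliefs = [0.0, 0.2, 0.8, 1.0]
for _ in range(200):
    beliefs = degroot_step(beliefs, W)
spread = max(beliefs) - min(beliefs)   # connected influence -> consensus
```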

[1] Ramos,V.J.: Analyzing the Role of Cognitive Biases in the Decision Making Process. IGI Global (2019)

[2] Aronson, E., Wilson, T., Akert, R.: Social Psychology. Upper Saddle River, NJ : Prentice Hall, 7 edn. (2010)

[3] Bozdag, E.: Bias in algorithmic filtering and personalization. Ethics and Information Technology (2013).

#### A cartographic quest between lambda calculus, logic, and combinatorics

**Noam Zeilberger** (Partout team)

17 March 2022.

LambdaComb is an interdisciplinary ANR project whose broad goal is to deepen connections between lambda calculus and logic on the one hand and combinatorics on the other. One important motivation for the project is the discovery over recent years of a host of surprising links between subsystems of lambda calculus and enumeration of graphs on surfaces, or "maps", the latter being an active subfield of combinatorics (cf. another recent LIX seminar!) with roots in W. T. Tutte's work in the 1960s. In the talk I will give a broad overview of these connections, as well as some indications of what we can hope to learn by exploring them further.

#### Exploiting the unreasonable effectiveness of geometry in computing

**Mathieu Desbrun** (GeomeriX team)

17 February 2022.

video

Computer graphics, as routinely seen in movies nowadays, involves increasingly sophisticated numerics. While differential equations used to model reality can be numerically simulated in countless manners, the geometric structures these equations are supposed to preserve are often lost in the process. Through various vignettes describing recent developments, I will argue for a geometric, variational approach to graphics and computational physics for improved numerics and robustness: besides blurring the line between continuous mathematics and the resolutely discrete nature of digital data, a discrete variational approach to geometry is a powerful tool for applications. The importance of approaching computations through the lens of geometry will be a recurring theme throughout the talk.

#### Implicit Proofs of Membership

**Olivier Blazy**(Grace team)

20 January 2022.

video

Cryptography is increasingly about manipulating secret or encrypted data while still performing operations tied to the underlying plaintexts. One of the major advances in cryptography came from Zero-Knowledge Proofs of Knowledge (ZKPK), which make it possible to prove the truth of a statement without leaking any further information. This technique led to several major improvements in protocols. However, it still leaks the fact that you can prove something, and this can be quite critical. For example: without knowing what is in your patient file, learning that an oncologist accesses it every week may already be information you want to keep secret. To this end, in this talk we will look at Implicit Proofs of Knowledge and see how they could supersede ZKPK in most modern applications, with a gain in both privacy and efficiency.

We will focus on Smooth Projective Hash Functions (hash proof systems), a tool introduced by Cramer and Shoup to build compact and efficient CCA2 encryption in the standard model, and see that it can be used to achieve a completely different goal, one more in line with modern protocols.
We will start with a panorama of the languages that can be handled by such a primitive, then show that this is enough to instantiate various protocols efficiently.
We will give examples of such constructions, first with 'vanilla' cryptography (elliptic curves, Paillier), but also show that post-quantum constructions can be achieved at non-prohibitive cost in both lattice-based and code-based cryptography, widening the range of primitives available under those hypotheses.
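
To make the key correctness property concrete, here is a toy numeric sketch of a Cramer-Shoup-style smooth projective hash for the language of Diffie-Hellman tuples. The group parameters below are made up and far too small to be secure; this only illustrates that the hash can be computed two ways, with the secret hashing key or with the projection key plus a witness.

```python
# Toy smooth projective hash function for the language
# {(u, v) : u = g^r, v = h^r} over a tiny prime-order subgroup.
# Illustration only -- these parameters offer no security whatsoever.
p, q = 23, 11           # subgroup of order q = 11 inside Z_p^*
g = 2                   # generator of the order-q subgroup
h = pow(g, 3, p)        # a second generator, h = g^3

alpha, beta = 4, 7      # secret hashing key hk = (alpha, beta)
hp = (pow(g, alpha, p) * pow(h, beta, p)) % p   # public projection key

r = 5                   # the prover's witness for the word (u, v)
u, v = pow(g, r, p), pow(h, r, p)

hash_hk = (pow(u, alpha, p) * pow(v, beta, p)) % p  # computed with hk
hash_hp = pow(hp, r, p)                             # computed with hp and r

assert hash_hk == hash_hp  # correctness: (g^a h^b)^r = (g^r)^a (h^r)^b
```

For words outside the language, the hash value computed with hk is statistically independent of hp (smoothness), which is what makes the proof of membership "implicit".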

#### Local limit of random discrete surface with (or without !) a statistical physics model

**Marie Albenque**(Combi team)

16 December 2021.

video

Random planar maps (planar graphs embedded in the plane) are a very natural model of random discrete surfaces, which have been widely studied over the last 30 years (in particular by several members of the Combi team!). In my talk, I will present some convergence results, in the local limit sense introduced by Benjamini and Schramm, for these models. I will try to give an overview of the field: I will first consider models of maps sampled from a uniform distribution, for which many results are available. I will then move to random planar maps sampled with a weight coming from a statistical physics model, for which many problems are still open and yield fascinating research perspectives.

#### Mathematical optimization to guarantee safety in urban air mobility

**Claudia d'Ambrosio**(Optimix team)

18 November 2021.

video

Mathematical optimization is a formal language for expressing an optimization problem as a mathematical model: the decisions to be made are represented by variables, technical and strategic constraints by inequalities, and a function of the variables is optimized. In this seminar, we present how such a powerful tool can be applied in the urban air mobility context to solve the tactical deconfliction (TD) problem. TD is an online problem aimed at identifying and resolving potential conflicts between pairs of aircraft passing through the same airspace. While TD is a well-known problem in air traffic management, its specialization to urban air mobility has some peculiarities. Our mathematical model is one of the first attempts to fully formalize the problem and propose a solution to it. We test our method on three scenarios with different sources of conflict: i. standard conflicts due to delays w.r.t. the original planning; ii. a priority flight that has to be integrated into the original planning; iii. an intruder flight. The tested instances are generated to be realistic. We compare our approach to standard local deconfliction and to a fairness-oriented variant.
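
Before any deconfliction decision, one must detect the conflict. A minimal, self-contained sketch of the detection step (assuming constant-velocity trajectories and a 2D horizontal plane; this is a toy check, not the model from the talk):

```python
def min_separation(p1, v1, p2, v2, horizon):
    """Closest approach of two constant-velocity aircraft over [0, horizon].
    Minimizes |dp + t*dv| in closed form, clamping t to the horizon.
    Returns (minimal distance, time at which it is attained)."""
    dp = (p2[0] - p1[0], p2[1] - p1[1])   # relative position
    dv = (v2[0] - v1[0], v2[1] - v1[1])   # relative velocity
    dv2 = dv[0] ** 2 + dv[1] ** 2
    # unconstrained minimizer -dp.dv / |dv|^2, clamped to [0, horizon]
    t = 0.0 if dv2 == 0 else max(0.0, min(horizon, -(dp[0] * dv[0] + dp[1] * dv[1]) / dv2))
    dx, dy = dp[0] + t * dv[0], dp[1] + t * dv[1]
    return (dx ** 2 + dy ** 2) ** 0.5, t

# Head-on encounter: the separation shrinks to zero at t = 5.
d, t = min_separation((0, 0), (1, 0), (10, 0), (-1, 0), horizon=10)
```

A deconfliction model would add decision variables (delays, heading changes) constrained so that this minimal separation stays above a safety threshold for every pair.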

#### High-Assurance and High-Speed Cryptographic Implementations

**Pierre-Yves Strub**(Typical (ex-)team)

14 October 2021.

video

The security of communications heavily relies on cryptography. For
decades, mathematicians have invested an immense effort in providing
cryptography that is built on firm mathematical foundations. This has
led to a collection of cryptographic primitives whose correctness can
be proved by drawing on results from branches of mathematics such as
complex analysis, algebraic geometry, representation theory and number
theory. A stated goal of recent competitions for cryptographic
standards is to gain trust from the broad cryptography community
through open and transparent processes. These processes generally
involve open-source reference and optimized implementations for
performance evaluation, rigorous security analyses for provable
security evaluation and, often, informal evaluation of security
against side-channel attacks. These artefacts contribute to building
trust in candidates, and ultimately in the new standard. However, the
disconnect between implementations and security analyses is a major
cause for concern.

In this talk, I will present the Jasmin-EasyCrypt project that
explores how formal approaches could eliminate this disconnect and
bring together implementations (most importantly, efficient
implementations) and software artefacts, in particular machine-checked
proofs, supporting security analyses.

#### Multi-Output Prediction: On the Question of Modeling Outputs Together (When; Why; And Implications for Transfer Learning)

**Jesse Read**(Dascim team)

23 September 2021.

video

In machine learning we often need to build models that predict multiple outputs for a single instance (consider the large areas of multi-label classification and multi-target regression, with applications in diverse domains: text categorization, image labeling, signal classification, time-series forecasting, recommender systems, ...). A common assumption throughout much of this literature is that one should model the outputs together due to the presence of dependence among them. Intuitively this makes sense, but others argue, often convincingly, that modeling the outputs independently is sufficient. Much of this discrepancy can be resolved once we know which loss metric(s) are under consideration; however, there is a more interesting story to tell, since empirical and theoretical results sometimes contradict each other, and years of activity in these areas still do not give us a full picture of the mechanisms behind the relative success (or lack thereof) of modeling outputs/labels/tasks together. In exploring what it means for one label to be 'dependent' on another, we take a path through some old and some new areas of the literature. We come across interesting results which we then take into the area of transfer learning, to challenge the long-held assumption that the source task must be similar to the target task.
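
The role of the loss metric can be seen on a tiny worked example (the joint distribution below is hypothetical, chosen so the two optima disagree): under subset 0/1 loss the best prediction is the joint mode, while under Hamming loss it is the vector of marginal modes, which is exactly what independent per-label models target.

```python
# A made-up joint distribution over two binary labels (y1, y2);
# outcomes not listed have probability 0.
joint = {(0, 0): 0.4, (1, 0): 0.3, (1, 1): 0.3}

# Subset 0/1 loss: the Bayes-optimal prediction is the joint mode.
subset_opt = max(joint, key=joint.get)

# Hamming loss: the Bayes-optimal prediction takes each label's
# marginal mode, i.e. what independent per-label models estimate.
def marginal_mode(k):
    p1 = sum(p for y, p in joint.items() if y[k] == 1)
    return int(p1 > 0.5)

hamming_opt = (marginal_mode(0), marginal_mode(1))

print(subset_opt)   # joint mode: (0, 0)
print(hamming_opt)  # marginal modes: (1, 0), which has joint probability 0.3
```

So "should we model labels together?" has no loss-independent answer: here the Hamming-optimal prediction (1, 0) is not even the most likely joint outcome.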

#### Personal musings on semantic models and verification of programs, hybrid, concurrent and distributed systems

**Eric Goubault**(Cosynus team)

20 May 2021.

video

In semantics and verification, the end justifies any means. Applying this old saying to the letter, program and systems modeling and verification have produced an enormous amount of carefully crafted mathematics over the last 50 years or so. Being approximately of the same age, it seemed like a good idea for me to try to describe some of the methods I have participated in developing, within the Cosynus group at LIX, based on both numerical and symbolic methods and their interplay.

On the numerical mathematics side, I will describe some of the abstract-interpretation/set-based methods we designed; on the symbolic side, I will pick a few examples of algebraic (mostly categorical and higher-categorical), logical (varieties of temporal and epistemic logics) and geometric methods (invariants of topological and directed topological spaces). These are in fact all inter-related when you take the right categorical perspective, but that may go too far for a general presentation.

These mathematics constitute a Swiss Army knife for dealing with a variety of applications: checking control systems (for drones and swarms of drones), possibly including learning-based components (neural networks for perception and control); solving computability and complexity problems for diverse models of computation (rewriting systems, distributed systems, etc.) and for control and motion-planning algorithms; expressing correctness for concurrent data structures; etc. If time permits, I will point the audience to some of these applications.

#### Factoring integers with oracles

**Francois Morain**(Grace team)

15 April 2021.

video

The famous RSA cryptosystem is in danger if factoring integers is easy. In the classical world, factoring is considered difficult, contrary to the quantum world, where it can be done in polynomial time using Shor's algorithm. Yet no efficient implementation of this algorithm is known, and we have some time to improve our understanding of the classical difficulty of factoring before one exists. One way to look at this is to add some knowledge (given as an oracle) about the factors we are looking for. Some oracles can come from bad generation of the primes used in RSA, for instance revealing some fraction of their bits. Other oracles are associated with arithmetic functions, such as Euler's totient function. The aim of this talk is to give a survey of these oracles and present new results obtained with G. Renault and B. Smith.
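
The totient-function oracle mentioned above admits a classic two-line attack, sketched here on a toy modulus: knowing n = pq and phi(n) = (p-1)(q-1) gives p + q = n - phi(n) + 1, so p and q are the roots of a quadratic.

```python
from math import isqrt

def factor_with_phi(n, phi):
    """Given n = p*q and an oracle value phi = (p-1)*(q-1), recover p, q.
    Since p + q = n - phi + 1, p and q are the roots of x^2 - s*x + n."""
    s = n - phi + 1                  # p + q
    d = isqrt(s * s - 4 * n)         # p - q (discriminant square root)
    return (s + d) // 2, (s - d) // 2

print(factor_with_phi(3233, 3120))  # toy RSA modulus 3233 = 61 * 53
```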

#### Comparative prediction of RNA structure

**Sebastian Will**(Amibio team)

18 March 2021.

video

In this talk, I am going to sketch a subjective history of comparative RNA structure prediction. For this purpose, I describe exact combinatorial algorithms which combine physics-based models with evolutionary information. The scientific interest in RNAs, these molecular key players of life, is ever growing, since massive sequencing revealed their numerous essential roles. More recently, RNAs received even more popular attention: RNA is the genomic material of coronaviruses, and RNA-based Covid-19 vaccines became especially relevant.

Following the demands of biological research, computer science has had a large impact on RNA research by studying RNA secondary structures as combinatorial objects of context-free nature. This perspective enabled efficient, exact optimization algorithms. In an interdisciplinary endeavor, such methods moreover integrate different sources of information to predict RNA structures reliably. On the one hand, chemistry and physics provide us with accurate thermodynamic models that allow predicting energetically optimal structures for a given RNA sequence. On the other hand, one can improve predictions by picking up imprints of evolutionary selective pressure.

In cases where the evolutionary relation between several RNAs is already known, predicting structures with evolutionary support is no more complex than predicting single structures. However, when it is not known a priori, this evolutionary relation can be uncovered reliably only by simultaneously predicting the RNA structures. While the first algorithms for this central task of "simultaneous folding and alignment" (SFA) were prohibitively complex, I achieved a major breakthrough by exploiting the sparsity of the problem. I will present several generations of combinatorial SFA algorithms and explain their controlled use of sparsity. Finally, I will discuss results from biological applications as well as exciting related open problems.

#### Algorithms for exploring the structure of differential equations

**Gleb Pogudin**(Max team)

11 February 2021.

video

Differential equations are one of the standard tools for describing models in the natural sciences. Traditionally, methods to analyze such models come from analysis and numerical computing. On the other hand, the equations themselves are symbolic expressions, and one can use algorithms from symbolic computation to deal with them. Interestingly, a number of important questions about the structural properties of systems of differential equations can be answered this way. In the talk, I will describe two such questions and recent algorithms and software implementations that answer them:

Q1 (identifiability): Assume we have a model described by a system of differential equations with unknown parameters, and the experimental setup allows us to measure only a few of the variables. How can we compute the set of parameters and/or parameter combinations whose values can be inferred if we collect enough data?
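
A minimal made-up example of non-identifiability: in the model x' = -a*b*x with output y = x, only the product a*b can be inferred, because two parameter sets with the same product produce identical measurements at every time point.

```python
import math

def output(a, b, x0=1.0, times=(0.0, 0.5, 1.0, 2.0)):
    """Observed output of the toy model x' = -a*b*x, y = x,
    using the closed-form solution x(t) = x0 * exp(-a*b*t)."""
    return [x0 * math.exp(-a * b * t) for t in times]

# (a, b) = (2, 3) and (6, 1) share the product a*b = 6 ...
y1 = output(a=2.0, b=3.0)
y2 = output(a=6.0, b=1.0)
# ... so no amount of output data can tell them apart.
assert all(abs(u - v) < 1e-12 for u, v in zip(y1, y2))
```

The algorithms discussed in the talk detect such situations symbolically, without sampling trajectories.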

Q2 (exact reduction): Assume we have a large system of differential equations, and we are really interested in the behavior of only a few components of the solution. Can we automatically produce a smaller system which reproduces exactly the same behavior? And how can we make such an algorithm preserve the interpretability of the model after the reduction?
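
A toy illustration of exact reduction by lumping (my own minimal example, not from the talk): if x1' = -x1 and x2' = -x2 and we only observe s = x1 + x2, then the single equation s' = -s reproduces the observable exactly.

```python
import math

def full_model(x1_0, x2_0, t):
    """Observable s(t) = x1(t) + x2(t) of the 2-dimensional system."""
    return x1_0 * math.exp(-t) + x2_0 * math.exp(-t)

def reduced_model(s0, t):
    """Solution of the 1-dimensional lumped system s' = -s."""
    return s0 * math.exp(-t)

# The reduction is exact: both models agree at every time point.
for t in (0.0, 0.3, 1.0, 2.5):
    assert abs(full_model(2.0, 5.0, t) - reduced_model(7.0, t)) < 1e-12
```

The algorithmic challenge is to find such lumpings automatically in large nonlinear systems while keeping the new variables meaningful.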

For both of these questions, Polytechnique students have made some interesting contributions recently, and I will highlight them.

#### Optimizing Big Data Computations: Queries and Algebra in a Single Framework

**Ioana Manolescu**(Cedar team)

21 January 2021.

video

The database management industry is worth $55 bn according to a recent Gartner report, and its importance keeps growing, in particular as more
platforms specialize in very-large-scale data management in the cloud.
At the core of database management systems' (DBMSs') success lie declarativeness and its twin, automatic performance-oriented optimization.
This has enabled users to specify what computation they want executed over the data, not how to execute it. Instead, a dedicated DBMS module
called the query optimizer enumerates alternative evaluation methods and, with the help of a computational cost model, selects the one estimated
to be the most efficient.

For data stored in relational tables, optimizers have, since 1970, helped launch a booming industry, whose products are
at work several times in the average day of everyone in a modern society. More recently, novel systems, in particular the so-called NoSQL
stores, were developed, each with specific performance advantages, going beyond tabular data to XML or JSON documents, key-value
stores, etc. In parallel, the growing interest in machine learning on Big Data has led to hybrid workloads, which mix database-style queries
(fundamentally, logical formulas) and ML-specific operations (complex matrix computations). These developments have complicated the
landscape of modern Big Data management and the life of developers, since computations split across systems are often executed as-is
and do not benefit from automatic optimization.

I will describe Estocada, a project started in 2016 with the purpose of providing a unified optimization framework (a) for queries specified
across a large variety of data models, and (b) for workloads mixing queries with ML computations. At the heart of Estocada lies an old,
powerful tool in database research, namely the chase. I will explain Estocada's approach for uniformly modeling heterogeneous data models,
including numerical matrices, in a relational model with constraints, and how it leverages a modern variant of the chase to automatically rewrite
computations spanning heterogeneous data models and systems such as Spark, SystemML, TensorFlow and SparkSQL, as well
as relational databases and document stores such as MongoDB.

This is joint work with: Rana Al-Otaibi (UCSD), Damian Bursztyn (Inria), Bogdan Cautis (U. Paris-Saclay), Alin Deutsch (UCSD), Stamatis Zampetakis (Inria).

#### Computing with Ordinary Differential Equations

**Olivier Bournez**(AlCo team)

17 December 2020.

video

Computability, complexity, programming, continuous time computations, old and recent analog models of computations:
Why ordinary differential equations are fun and a pleasant way to do computer science in 2020.
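
A first taste of continuous-time computation (a minimal sketch of my own, not from the talk): the polynomial ODE y' = y with y(0) = 1 "computes" the exponential function, and even a crude Euler integration recovers e = y(1) to several digits.

```python
import math

def euler_exp(t_end=1.0, steps=100_000):
    """Integrate y' = y, y(0) = 1 with the explicit Euler scheme,
    so that the returned value approximates exp(t_end)."""
    y, h = 1.0, t_end / steps
    for _ in range(steps):
        y += h * y          # one Euler step of y' = y
    return y

approx = euler_exp()
print(abs(approx - math.e))  # small: the ODE has "computed" e
```

Analog models of computation push this idea much further: suitable systems of polynomial ODEs can simulate Turing machines.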

Olivier and his co-authors François Fages, Guillaume Le Guludec and
Amaury Pouly received the 2019 prize of the journal La Recherche for
their work on the analog model of computation, in particular the
article Strong Turing Completeness of Continuous Chemical Reaction Networks and Compilation of Mixed Analog-Digital Programs.
See also la-revanche-du-calcul-analogique.

#### Consensus Problems

**Bernadette Charron-Bost**(Comete team)

19 November 2020.

video

Consensus problems arise in a great number of multi-agent systems, whether natural or artificial: agents must, for example, coordinate in order to synchronize, or adopt the same memory state. In any consensus problem, each agent holds an output variable, and the agents must eventually agree on a common value for all these variables. Beyond this common agreement clause, consensus problems differ essentially in the stability property required of the output variables. One may require each agent to make an irrevocable decision (corresponding to write-once output variables). This condition can be weakened by allowing agents to change their decision a finite number of times, or even to change their mind infinitely often, in which case the output variables must simply converge to a common limit value. One thus speaks of irrevocable consensus, stabilizing consensus, or asymptotic consensus. I will examine in what form these different consensus problems arise depending on the type of application considered: while irrevocable consensus is required in replication techniques for fault tolerance, only asymptotic consensus is needed for many problems in distributed control and distributed optimization, or for modeling natural synchronization phenomena (e.g., flocks of birds, flashing of fireflies). As for the Nakamoto consensus at the heart of blockchain technology, its stability clause sits between that of irrevocable consensus and that of stabilizing consensus in the hierarchy described above. In a second part, I will present the major computability and complexity results for these different consensus problems: the theorem of Fischer, Lynch and Paterson (FLP), which led to the question "was e-mail a mistake?"
raised in an issue of the New Yorker last year; the so-called "Byzantine generals" theorem of Shostak, Pease and Lamport; and the more recent but equally spectacular result of Moreau, proving the superiority of symmetric interactions. I will sketch the very varied techniques used to establish these different results, and explain how my own work fits in and builds on these central results.
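
Asymptotic consensus has a particularly simple prototype, sketched below with made-up values: each agent repeatedly moves to the average of its neighbors' values, and on a connected graph all values converge to a common limit without any agent ever deciding irrevocably.

```python
def averaging_step(values, neighbors):
    """One round of local averaging: agent i replaces its value by the
    mean over its neighborhood neighbors[i] (which includes i itself)."""
    return [sum(values[j] for j in neighbors[i]) / len(neighbors[i])
            for i in range(len(values))]

# A 4-agent ring; each agent averages itself with its two neighbors.
neighbors = [[3, 0, 1], [0, 1, 2], [1, 2, 3], [2, 3, 0]]
values = [0.0, 10.0, 4.0, 2.0]
for _ in range(100):
    values = averaging_step(values, neighbors)

spread = max(values) - min(values)
print(spread)  # tiny: all agents (approximately) agree
```

Here the averaging matrix is doubly stochastic, so the agents in fact converge to the average of the initial values; with asymmetric interactions the limit can be biased, which is one reason symmetry matters in Moreau-type results.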

Bernadette received the 2019 Grand Prix Huy Duong Bui, awarded by the Académie des Sciences.

#### Creative Visual Simulation : Combining User control, Knowledge & Learning

**Marie-Paule Cani**(Vista team)

15 October 2020.

video

While the use of digital models and simulation has already spread to many disciplines, recent advances in Computer Graphics open the way to much lighter ways of creating 3D content. In this talk, I'll show how the expressive modeling paradigm, namely the combination of graphical models embedding knowledge with gestural interfaces such as sketching, sculpting or copy-pasting, can allow seamless 3D modeling, progressive refinement, and animation of a variety of models, from individual shapes to full, animated 3D worlds. We discuss how prior knowledge can be enhanced by learning from examples, the latter possibly injected on the fly during a modeling session. As we show through examples, this methodology opens the way to seamless creation and interaction of scientists and engineers with the mental models of their objects of study, in areas as different as soft-product manufacturing, medicine, plant and ecosystem simulation, or geology.

Marie-Paule was elected to the Académie des Sciences in December 2019.