ANR-Project ESIGMA (ANR-17-CE23-0010)

Participants: LIX, LAMSADE, LIRMM

Title: Efficiency and Structure in Graph Mining Applications

Acronym: ESIGMA

Project coordinator: Michalis Vazirgiannis, LIX (Laboratoire d’Informatique de l’École polytechnique)

Partners’ intitutions: LAMSADE (CNRS/Univ. Paris-Dauphine), LIRMM (CNRS/Univ. Montpellier), LIX (Laboratoire d’Informatique de l’École polytechnique)

Project duration: 48 Months

Starting Date: 01/04/2018

Summary: As a general modelling tool, graphs are gaining increasing attention. They capture the interactions of entities in an easily comprehensible way and yet provide a rich model of the underlying structure. As information technology scrapes up ever more data in diverse areas of human activities, it becomes vital to discover meaningful information from the massive amounts of available data. Data sets can be naturally depicted as graphs, and characteristic examples include web graphs, internet topologies, social networks, and text corpora. This is the central theme of graph mining, whose applications can be found in many areas such as computer vision, bioinformatics, sociology, and cognitive science. In the meanwhile, graph theory and algorithm design with provable bounds, as a field of theoretical com- puter science, witnessed considerable advances during the last decades. Most of interesting computational problems are intractable and thus we cannot expect to have algorithms which, ideally, solve such problems both efficiently and exactly. Approximation and parameterized algorithms are prominent frameworks for designing algorithms with (mathematically) provable bounds on running time and performance guarantee. Modern graph theory offers rich results, from which algorithm design has largely benefited as well. Graph data that arise in real-world applications are often highly structured. Consequently, to learn and analyze this structure is an important issue when dealing with such data sets, especially when the graph data are huge. This project aims to forge a structural viewpoint for handling graph data so that algorithms can be de- signed and analyzed on a structure-aware principle rather than ad-hoc approaches with empirical warrant. The novelty of this project is that, to this aim, we deploy knowledge from graph theory and theoretical algorithm design/analysis frameworks. In particular, we will use graph theory as a guideline to navigate the various struc- tural aspects of graph data, and take advantage of theoretical algorithms analysis in order to implement/discern such structure embedded in algorithm design. In this project, we shall study diverse facets of structure in graph data, which involve (a) their global structural aspects (b) their structure-aware succinct representation, and (c) structural resemblance between graphs. We will also tackle topics that are requested for concrete applications such as identifying influential nodes in information spreading process and detecting an event on social media stream, and examine structure emerging in the relevant contexts. Such application-driven study of structure can benefit from methodologies as in (a), (b), and (c), but also would extensively require data- and instance-specific knowledge.

This project inherently calls for interdisciplinary collaboration. Our consortium comprises strong expertise in domains pertinent to the goal and methodologies of this project, including machine learning/graph mining, bioinformatics, structural graph theory, algorithms design and analysis.

Keywords: Machine learning for graphs, Graph mining, Structural graph theory, Design and analysis of graph algorithms.