Projects

OpenLLM (2024–2026)

OpenLLM aims to advance the state of the art in pragmatic and resource-efficient LLM development. The project explores the potential of smaller, specialized models (~1.5B parameters) trained on high-quality data, aiming to show that they can compete with much larger systems.

ANR LLM4ALL (2024–2027)

LLM4ALL develops open, updatable, and efficient multilingual language models. DaSciM contributes to low-cost training and inference techniques, such as quantization and transfer learning. The project applies these models to real-world tasks like meeting summarization and hospital emergency call understanding.

ANR HELAS Chair (2020–2024)

HELAS explored advanced deep learning for graphs and NLP. DaSciM led work on hybrid neural architectures combining graph-based structures with language data, enabling more expressive and interpretable models for knowledge extraction, recommendation, and search.

ANR XCOVIF (2020–2021)

XCOVIF tackled predictive modeling during the COVID-19 pandemic. DaSciM developed models leveraging mobility data, social media, and language trends to anticipate epidemic dynamics and inform public health interventions.

ANR SUMRE

SUMRE addresses automatic summarization of multi-party meetings. The DaSciM team focuses on hybrid summarization techniques combining extractive and abstractive methods, with attention to dialogue structure and discourse relations to generate high-quality meeting notes.

ANR Esigma (2018–2021)

Esigma developed scalable and interpretable graph mining techniques. DaSciM worked on mining meaningful patterns and anomalies from large-scale graphs, with applications in scientific data, social networks, and security.

Linto (2018–2021) – BPI-France

Linto created intelligent meeting assistants capable of understanding, summarizing, and recommending content. DaSciM provided the NLP backbone, developing the summarization models and enhancing the real-time recommendations.

OpenPaas (2016–2019) – BPI-France

This project extended the OpenPaas collaborative platform with automated summarization services for meetings. DaSciM’s contribution centered on designing extractive and abstractive models tailored to business and collaborative environments.

AXA Chair (2015–2018)

This chair investigated data science applications in the insurance industry. DaSciM designed machine learning pipelines for fraud detection, client segmentation, and policy optimization, with an emphasis on interpretability and fairness in data-driven decision-making.