
Transverse projects

MAP/INF630 (3 ECTS): Students will work half a day a week on a transverse project, i.e., a case study corresponding to a challenging question raised either by an industrial partner or by a researcher in the domain covered by the graduate degree.

You can work on your own on one subject, or you can work with another student on two subjects. So that we can assign you a subject, please fill in the following document, specifying the student you want to work with (if any) and ranking the subjects (1 being the most interesting, 11 being the least interesting). Please rank the 10 subjects before October 6.

If you have any questions, do not hesitate to contact,, or the supervisor of the project for specific questions. Nicolas will be in charge of supporting you with your project throughout the year. You will have to write an interim report by December 17.

Enedis - contact DIAS Paul and NAJIBE Khalid

  • Business model project & AI - contact: Eric TEYSSEDRE -

  • AI for better customer relationships - contact: Eric TEYSSEDRE -

Google - contact Damien Henry

* Google 1 : Automatic in-painting stories generation

Google Arts and Culture has access to many high-resolution paintings. The goal of the project is to automatically generate a story from a painting. The story should be created as a video built from features automatically detected in the painting, and it should respect cinematographic rules. Input: a high-resolution picture of a painting. Output: a video. Ref.:

* Google 2 : Augmented art selfie

Could we render your own face within the art piece?

Could we warp the art piece so that it better matches your photo?

Idemia - contact LANNES Sarah

* Idemia 1 : Semi-supervised learning for object localization by image transformation. (contact Sarah LANNES)

* Idemia 2 : Generative adversarial morphing model (contact Stephane GENTRIC)

Inria - contact TBC

* Inria 1 : Can we teach computers to draw and read plots? (contact Adrien Boursseau, Inria Sophia Antipolis)

Designing effective infographics, such as data plots, is a challenging task because complex information needs to be conveyed with simple 2D shapes (dots, lines, rectangles…). Conversely, understanding a plot requires making sense of individual shapes as well as their relationships. The goal of this project is to attempt to teach an AI agent to draw and read simple plots. We will take inspiration from methods originally designed for machine translation and image captioning, and cast the problem of generating a plot as the problem of translating a data table into an image, and the problem of interpreting a plot as translating an image into a data table. We hope to solve these two problems jointly such that the agent learns to generate plots that, once interpreted, minimize loss of information.

GoJob - contact Julien Rialan

Gojob is the French leader in digital temporary staffing. Gojob's ambition is to "hack" unemployment by bringing the best technologies to the job market. In its 4 years of existence, Gojob has gathered more than 70,000 temporary workers and 650 companies on its platform. The team now counts more than 50 people, split between the sales offices in Paris and the R&D center in Aix-en-Provence, and is recruiting new talent to support the rapid development of the project.

* GoJob : Scoring

Objective: using the data available for each candidate on the Gojob temporary-work platform (degrees, experience, location, references, types of jobs sought) and for each job opportunity (skills, experience, required degrees, keywords), propose several matching scores between job opportunities and candidates. Explain how they work. Describe the advantages and drawbacks of each score for both parties.
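As an illustration of what such a matching score could look like, here is a minimal Python sketch that combines a keyword-overlap term with a distance term. Everything in it is an assumption made for illustration: the `Candidate` and `Offer` fields, the Jaccard overlap, the exponential distance decay, the 30 km scale and the 0.7/0.3 weights are hypothetical and do not describe Gojob's actual scoring.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    tags: frozenset  # hypothetical: job types or skills the candidate is looking for
    lat: float
    lon: float


@dataclass(frozen=True)
class Offer:
    tags: frozenset  # hypothetical: skills or keywords requested by the offer
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def jaccard(a, b):
    """Share of tags in common: |a & b| / |a | b| (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_score(candidate, offer, scale_km=30.0, w_tags=0.7):
    """Weighted sum of tag overlap and a distance term that decays with distance."""
    tag_term = jaccard(candidate.tags, offer.tags)
    distance = haversine_km(candidate.lat, candidate.lon, offer.lat, offer.lon)
    dist_term = math.exp(-distance / scale_km)
    return w_tags * tag_term + (1 - w_tags) * dist_term


def rank_offers(candidate, offers):
    """Return the offers sorted from best to worst match for one candidate."""
    return sorted(offers, key=lambda o: match_score(candidate, o), reverse=True)
```

A score of this kind is easy to explain to both parties, which is one of the criteria listed in the objective, but it ignores everything that is not a tag or a location.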
Context: Gojob already has a simple scoring based on tags of the jobs sought and the distance to the search city. The idea of this project is to experiment with more complex scorings, using data that is harder to exploit (interpretation of the keywords used in the offer) and external data (for example on recruitment difficulties in the area, the local unemployment rate, etc.).

- Variant 1: rank the offers for which a given candidate is most likely to be accepted (application recommendation).
- Variant 2: rank the candidates for whom a given offer is most likely to lead to an application (candidate prospecting recommendation).
- Variant 3: recommend the relevant data to collect in order to improve this scoring.

* GoJob : Yielding

Objective: propose an approach for adjusting Gojob's administrative costs according to seasonal employment periods. The tool could serve as a decision aid for our sales staff: depending on how difficult it is to find candidates for a type of position, Gojob's coefficients (used to bill our services) could vary within a range. The project should propose a way to tell whether a job offer will be hard to staff or not. It may rely on external data.

Context: yielding is a practice from the airline and hotel industries that consists in varying the price of a service according to the season. Since temporary work is strongly subject to seasonality (notably the holiday season), Gojob wishes to experiment with what this practice could bring to its business model.

* Ynsect : insect species identification

To be defined

===== List of available projects for 2018-2019 =====

Enedis - contact Frédéric Boutaud

* Dataposte: Automatic detection of the type of equipment in place from photos, to assist interventions during network maintenance missions.
- contact Séverine MULATIER -

Numerous photo campaigns have been organized nationwide by Enedis inside secondary substations in order to collect information about equipment (manufacturer, model…). This summer, a first version of an algorithm was developed to identify text on printed labels (OCR processing) for high-voltage cell units, LV switchboards and fault detectors. The aim of this project is to:

- Determine whether AI can identify the printed labels of transformers (more complex than those of other equipment).
- Set up a deep learning process to classify a large number of pictures.
- Join a transverse task force organised around this topic.

* Predictive maintenance - contact: Eric TEYSSEDRE -

Context. Enedis is deploying the Linky communicating meter, and the program will end by 2022. More than 13 million meters are already installed. The data recorded in each Linky meter provides valuable information about events occurring on the electricity grid, which is a new opportunity for Enedis. On the other hand, failures affect the low-voltage network and lead to power outages, about 40,000 a year.

Problem. How can we take advantage of the newly available data (Linky meter data, weather, network load…) to detect anomalies on the electrical network and avoid power outages? In other words, how can we develop predictive maintenance to optimize our resources?

Description. The idea is to use the data recorded by Linky meters (short outages, voltage excursions, surges), together with other relevant data that remains to be identified, in order to build an algorithm based on Artificial Intelligence (AI) that can predict and characterize failures. Specifically, the aim is to search for correlations between the available data and network outages in order to define failure “signatures”. The algorithm would then recognize these signatures and issue recommendations for field action to correct the anomaly before an incident occurs.
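To make the search for correlations more concrete, here is a small Python sketch that screens candidate signals by how strongly they correlate with recorded outages. The data layout (one event count and one outage flag per network segment), the function names and the toy numbers are all assumptions made for illustration; they are neither Enedis data nor the method used in the existing study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(vx * vy)


def rank_signals(signals, outages):
    """Sort candidate signals by absolute correlation with the outage flags."""
    scores = {name: pearson(values, outages) for name, values in signals.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Synthetic toy data: one value per low-voltage network segment.
signals = {
    "voltage_excursions": [0, 1, 5, 7, 2, 9],
    "short_outages": [3, 3, 2, 4, 3, 3],
}
outages = [0, 0, 1, 1, 0, 1]  # 1 = a failure was later recorded on the segment

ranking = rank_signals(signals, outages)
```

A screen like this only suggests which signals deserve a closer look; it does not by itself predict or characterize failures.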
More precisely, the study will focus on:

- A quick benchmark of the use of AI to predict failures at other electricity companies
- A critical analysis of the work already undertaken by Enedis on the subject (method, algorithm…)
- A search for new correlations between certain types of failures and the available data
- The design of an algorithm, or the improvement of an existing one, for predicting failures

The student would work in collaboration with the Enedis data scientist who started a study on the subject. The student would work on site at Enedis in Nanterre with the data scientist during the data analysis and data processing phase (sensitive data). For the other parts of the study, or to analyse non-sensitive data, the student will be able to work at Polytechnique. Depending on the first results, this study could lead to a project of several months to continue the work undertaken.

* Customer relationship: Creating user-friendly chatbots to answer user requests - contact: Richard BAVARIN -

Goal: assess how AI could take into account the “emotional” dimension of the customer calls received at our Troubleshooting Call Center. Real recordings will be used for training. The first part of this project will consist of a state of the art on the subject, to assess initial feasibility for the proposed use case.

Google contact Damien Henry, project Google Arts and Culture.

* Automatic detection of Art Style in paintings: There are many unstructured databases for which it could be interesting to automatically detect metadata, in particular the artistic movement, the author, etc. (see here, and ).

* Image generation: Image generation using ML is booming, with many possible applications. The classic technique is based on Generative Adversarial Networks. A more recent and promising technique is based on Normalizing Flows (GAN and GLOW).
The objective of this project is to compare several approaches for generating face images from a training image database. In both cases, data can be found here.

Idemia contact Stéphane Gentric

* Semi-supervised learning for a localization task (possible continuation in an internship, and eventually a CIFRE PhD).

Using a DCNN (Deep Convolutional Neural Network), we want to learn the absolute position, scale and rotation of an object in an image. Standard methods rely on annotated data and are limited by the precision of those annotations. We want to study the feasibility and performance of a learning process without any annotations, using only the fact that when a given similarity transform is applied to the image, the expected changes in position, scale and rotation are known. We will start with a toy problem and hopefully move on to real objects and more complex scenes.

* Building an image-based algorithm selector for face recognition based on speed and performance of candidate algorithms (possible continuation in an internship).

The increasingly ubiquitous presence of biometric solutions, and of face recognition in particular, in everyday life requires Idemia to adapt its solutions to practical requirements, whether memory space, speed or performance. Idemia has developed several solutions, but while global decisions can be made, they are far less efficient than tailoring such decisions to the complexity of each image, which allows for the best compromise between constraints such as speed and performance. We would like to build a DCNN (Deep Convolutional Neural Network) that selects the best-suited solution for each input image. For this purpose, we will lend a coding/matching software suite capable of generating different options.

Ynsect (start-up; not official partners yet) contact Arturo Escaroz Cetina

* Conduite d’élevage 4.0 (rearing management 4.0): Automated Insect Physiological Data Retrieval from Insect Population Pictures.
At Ynsect, our insects are raised in trays of various sizes. Pictures of them are taken regularly to perform quality control operations. We would like to enhance our data collection methods in order to significantly improve our insect population modeling tools. The objective of this project is to convert pictures into already known data of interest (visual computing) and to carry out data-driven R&D to discover any observable behavior patterns in the pictures (AI). Already known data of interest can be chosen among: insect size, stage, density, color, number of rings, defects, behavior, amount of feed / top layer description, population distribution pattern, etc. The project might include: a revision of the data collection methods; image characterization and classification; pattern recognition and predictive tools; and any other methods or tools that could be of interest and that we do not know of yet!