Abstract:

Stability in machine learning (ML) is important for ensuring consistent performance and enhancing interpretability. For example, in healthcare, patient mortality risk prediction models may change upon retraining with new batches of data, but should ideally do so in a “smooth” way; in sustainability, the same holds for energy consumption prediction over consecutive hours of the day or air quality prediction over adjacent spatial areas. Abrupt changes in a model’s structure or in the resulting analytical insights can make practitioners hesitant to adopt it. In this talk, I will present my research on stable ML through the lens of industry collaborations in healthcare and sustainability. First, I will introduce the framework of slowly varying regression under sparsity, which allows sparse regression models to exhibit controlled variations under a temporal, spatial, or general graph-based structure. Assessing stability in decision trees presents new challenges compared to regression models; to address them, I will propose a novel distance metric for decision trees and use it to determine a tree’s level of stability. Finally, I will present a model-agnostic framework for stabilizing interpretable models’ structures or black-box models’ insights upon retraining with new data. We have tested the proposed methodologies on numerous real-world case studies and have shown that a controlled, and often negligible, decrease in predictive power can significantly improve models’ stability and interpretability.
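To make the idea of slowly varying sparse regression concrete, the sketch below fits one coefficient vector per time period while penalizing both the number of active features and the change between adjacent periods. This is only an illustrative convex relaxation, not the talk’s exact formulation (which enforces exact sparsity via mixed-integer optimization); the data, variable names, and penalty weights (lam, mu) are made up for the example.

# Illustrative sketch: slowly varying sparse regression over T consecutive periods.
# One coefficient vector per period; an L1 penalty (a relaxation of exact sparsity)
# encourages few active features, and a fused penalty keeps coefficients of
# adjacent periods close. All data below is synthetic.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
T, n, p = 4, 100, 10                      # periods, samples per period, features
X = [rng.standard_normal((n, p)) for _ in range(T)]
true_beta = np.zeros(p)
true_beta[:3] = 1.0                       # only the first 3 features matter
y = [X[t] @ true_beta + 0.1 * rng.standard_normal(n) for t in range(T)]

lam, mu = 0.1, 1.0                        # sparsity and slow-variation weights (illustrative)
beta = [cp.Variable(p) for _ in range(T)]
fit = sum(cp.sum_squares(y[t] - X[t] @ beta[t]) for t in range(T))
sparsity = lam * sum(cp.norm1(beta[t]) for t in range(T))
variation = mu * sum(cp.norm1(beta[t] - beta[t - 1]) for t in range(1, T))
cp.Problem(cp.Minimize(fit + sparsity + variation)).solve()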

Bio:
Vassilis Digalakis Jr. is an Assistant Professor of Operations Management at HEC Paris. He completed his PhD in operations research at MIT in 2023. His research addresses the adoption gap for analytics and ML in high-stakes applications such as healthcare and sustainability, where a lack of transparency in decision-making is a barrier. Leveraging tools from optimization, he develops “trustworthy” analytics and ML methodologies that ensure models exhibit characteristics such as interpretability, stability and robustness, fairness, and privacy. He has collaborated with, among others, OCP, the world’s largest producer of phosphate products, to develop a robust optimization framework guiding their $2Bn investment in renewable energy, and with FEMA to help fairly select locations for COVID-19 mass vaccination centers. His research has been published in journals including Operations Research and M&SOM, and has earned recognition including the 2023 Harold Kuhn Award, a finalist spot in the 2023 M&SOM Practice-Based Research Competition, and the 2021 INFORMS Pierskalla Award.