• Ph.D. September 2015-present

    Ph.D. Student in Representations, Regularization and Visualization for text and graph data

    École Polytechnique, France

  • MSc 2014-2015

    Master M2 in Math, Computer Vision and Machine Learning (MVA)

    École Normale Supérieure de Cachan, France

  • BSc May 2013

    Bachelor in Computer Science (4-year curriculum)

    Athens University of Economics and Business, Greece

© Konstantinos Skianis

  • Natural Language Processing
  • Graph Mining
  • Machine Learning
  • Data Science


Data Management for the World Wealth & Income Database

Christos Giatsidis, Antonis Skandalis, Konstantinos Skianis, Michalis Vazirgiannis
Facundo Alvaredo, Lucas Chancel, Thomas Piketty, Emmanuel Saez, Gabriel Zucman
Ben Grillet, Francois Prosper, Brice Terdjman, Anthony Veyssiere
Poster ParisBD 2017, Paris, France


The World Wealth & Income Database is a project that publishes data on inequality around the world. It is an open portal that distributes time series on economic concepts such as wealth and income, where users can select time series of interest through multi-attribute queries. Most of the project's components are hosted in the cloud on Amazon Web Services. The data management functionality consists of two major parts: a) a modern relational database that uses state-of-the-art indexing techniques along with JSON features and b) a web API through which the database can be accessed and where some data transformations take place. To reduce latency across the globe, the project is currently deployed at two sites (EU and US). The database currently holds data for 319 geographical regions (countries, continents, states), with about 150 combinations of attributes for each region over 50 years on average. The project is a joint collaboration between the World Inequality Lab at Paris School of Economics, the DaSciM team at Laboratoire d’Informatique de l’X and WEDODATA.

SpreadViz: Analytics and Visualization of Spreading Processes in Social Networks

Konstantinos Skianis, Maria Evgenia G. Rossi, Fragkiskos D. Malliaros, Michalis Vazirgiannis
Demo Paper ICDM 2016, Barcelona, Spain


In this paper, we propose SpreadViz, a web tool for exploration and visualization of spreading properties in social networks. SpreadViz consists of three main modules, namely graph exploration and analytics, detection of influential nodes, and interactive visualization. More precisely, SpreadViz offers the following functionalities: (i) It computes and visualizes various centrality criteria towards understanding how the position of a node in the network affects its spreading properties; (ii) It offers a wide range of criteria for the detection of single and multiple influential nodes and comparison among them; (iii) It effectively visualizes the spread of influence in the network as well as the performance of each method. In our demonstration, we invite the audience to interact with SpreadViz, exploring, analyzing, and visualizing the spreading processes over various real-world social networks.

Regularizing Text Categorization with Clusters of Words

Konstantinos Skianis, Francois Rousseau, Michalis Vazirgiannis
Conference Paper EMNLP 2016, Austin, USA


Regularization is a critical step in any supervised learning problem, crucial not only for addressing overfitting but also for taking into account any prior knowledge we may have about the problem features and their relationships. In this paper we explore state-of-the-art structured regularizers for textual data and propose novel ones based on topics from LSI, clusters from word2vec and the graph-of-words document representation. We show that, for text categorization, our proposed regularizers are faster than the state-of-the-art ones while also improving classification accuracy.

GoWvis: A web application for Graph-of-Words-based text visualization and summarization

Antoine J.-P. Tixier, Konstantinos Skianis, Michalis Vazirgiannis
Demo Paper ACL 2016, Berlin, Germany


We introduce GoWvis, an interactive web application that represents any piece of text inputted by the user as a Graph-of-Words and leverages graph degeneracy and community detection to generate an extractive summary (keyphrases and paragraph) of the inputted text in an unsupervised fashion. The entire analysis can be fully customized via the tuning of many text preprocessing, graph building, and graph mining parameters. Our system is thus well suited to educational purposes, exploration and early research experiments. The new summarization strategy we propose also shows promise.
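As a rough illustration of the degeneracy idea mentioned in the abstract, the sketch below computes k-core numbers on a small word co-occurrence graph and keeps the main core as candidate keywords. It assumes k-core decomposition as the degeneracy measure; the function names (`core_numbers`, `main_core`) and the toy edge list are hypothetical and are not taken from GoWvis.

```python
from collections import defaultdict


def core_numbers(adj):
    """Core number of each node, found by repeatedly peeling a minimum-degree node."""
    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])  # the core level never decreases while peeling
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


def main_core(adj):
    """Nodes with the highest core number (the densest part of the graph)."""
    core = core_numbers(adj)
    top = max(core.values(), default=0)
    return {node for node, c in core.items() if c == top}


# Hypothetical toy graph-of-words: four densely linked terms plus a short tail.
edges = [
    ("graph", "words"), ("graph", "summary"), ("graph", "text"),
    ("words", "summary"), ("words", "text"), ("summary", "text"),
    ("graph", "user"), ("user", "input"),
]
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

keywords = main_core(adj)
```

Here the four mutually connected terms form the main core, while the loosely attached tail terms are peeled away first.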

Graph-Based Term Weighting for Text Categorization

Fragkiskos Malliaros, Konstantinos Skianis
Conference Paper ASONAM 2015, Someris workshop, Paris, France


Text categorization is an important task with plenty of applications, ranging from sentiment analysis to automated news classification. In this paper, we introduce a novel graph-based approach for text categorization. Contrary to the traditional Bag-of-Words model for document representation, we consider a model in which each document is represented by a graph that encodes relationships between the different terms. The importance of a term to a document is indicated using graph-theoretic node centrality criteria. The proposed weighting scheme is able to meaningfully capture the relationships between the terms that co-occur in a document, creating feature vectors that can improve the categorization task. We perform experiments on well-known document collections, applying popular classification algorithms. Our preliminary results indicate that the proposed graph-based weighting mechanism is able to outperform existing frequency-based term weighting criteria, under appropriate parameter settings.
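The weighting idea described above can be sketched in a few lines. This is a minimal illustration, assuming an undirected co-occurrence graph built with a sliding window and degree centrality as the node criterion; the window size, the function names and the sample sentence are assumptions and are not the paper's implementation.

```python
from collections import defaultdict


def graph_of_words(tokens, window=3):
    """Undirected co-occurrence graph: terms within `window` tokens are linked."""
    adj = defaultdict(set)
    for i, term in enumerate(tokens):
        adj[term]  # register the term so isolated terms still appear as nodes
        for other in tokens[i + 1 : i + window]:
            if other != term:
                adj[term].add(other)
                adj[other].add(term)
    return adj


def degree_centrality_weights(tokens, window=3):
    """Weight each term by its degree centrality instead of its raw frequency."""
    adj = graph_of_words(tokens, window)
    n = len(adj)
    if n <= 1:
        return {term: 0.0 for term in adj}
    return {term: len(nbrs) / (n - 1) for term, nbrs in adj.items()}


doc = "graph based term weighting for text categorization with graph centrality".split()
weights = degree_centrality_weights(doc, window=3)
top_term = max(weights, key=weights.get)
```

In this toy document, "graph" occurs in two different contexts and so gathers the most neighbours, giving it the largest weight. The resulting weights could replace term frequencies in a TF-IDF-style feature vector.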

Learning for Text and Graph Data - 2017, Learning for Text and Graph Data (2016-2017)

Licence, Master M2

The courses provide an introduction to advanced machine learning and combinatorial methods for large-scale text and graph data. The syllabus included:

  • Advanced graph kernels and classification, clustering / community mining (Louvain, modularity, degeneracy)
  • Influence maximization models (SIR/SIS, LT, IC,…), degeneracy-based spreaders selection
  • Graph-of-words advanced topics: tw-icw, graph kernels for document similarity, graph-based regularization for text classification
  • Word embeddings, Unsupervised document classification with the Word Mover’s Distance, WMD vs cosine similarity
  • Deep learning for NLP, Supervised document classification (TF-IDF vs TW-IDF)
  • Keyword extraction for summarization: graph-based keyword extraction, summarization (offline, online), Filippova’s word graph for multi-sentence fusion

  • April 2016

    2nd place, Fintech Crowdhackathon 2016

    National Bank of Greece

    Our team (RSK project) took 2nd place in the Fintech Crowdhackathon organized by the National Bank of Greece. We built a platform that detects fraudulent e-transactions using Deep Learning.

  • January 2015

    2nd place, Dreem challenge 2015

    Inclass Kaggle

    Trying to analyze dreams. During deep sleep, crucial mechanisms occur: memory consolidation, cellular regeneration, growth hormone release and biological clock reset. A lack of deep sleep impairs memory, focus and judgment during work. DREEM introduces a way to increase the duration and quality of deep sleep to ensure optimal performance.

  • May 2013

    6th place, Data Mining Cup 2013


    Our team from the Department of Informatics, consisting of undergraduate students G. Papoutsakis, G. Zografos, G. Theofilis and myself, and Ph.D. students M. Karkali and S. Thomaidou, took 6th place in the Data Mining Cup 2013 competition.

    There were 99 participating teams from 77 universities around the world.

  • April 2013

    Top 25%, Employee Access Challenge 2013


    The objective of this competition is to build a model, learned using historical data, that will determine an employee's access needs, such that manual access transactions (grants and revokes) are minimized as the employee's attributes change over time. The model will take an employee's role information and a resource code and will return whether or not access should be granted.

Laboratoire d'Informatique (LIX), École Polytechnique
Bâtiment Alan Turing, 1 Rue Honoré d'Estienne d'Orves
Campus de l'École Polytechnique
91120 Palaiseau, France
Office 1071
  • kskianis at
  • rob.cs.aueb at
  • kostas.skianis
  • #kskianis
  • My LinkedIn profile