The Cedar team has scheduled a seminar on January 21st by Benoît Groz.
Title: Exploring multidimensional data through skylines: computing and sampling skylines
Abstract: In everyday life, users often wish to order data items according to multiple criteria. Since those orders need not be correlated, there is generally no single best item. As a consequence, the set of Pareto optima (the skyline of the set) has proved popular for identifying the items most relevant to the specified criteria. However, filtering items through skylines has some limitations: computing skylines can be computationally intensive, and the skyline may not be very selective. We outline approaches that we developed to address these two issues:
- we propose algorithms to compute skylines efficiently in a probabilistic setting where items can only be compared through a random oracle (a simplistic model for crowdsourcing scenarios).
- we investigate algorithms to sample skylines when we are given a set of points in "R^d" together with a value "k", and we wish to compute a set of "k" points maximizing the volume dominated by the sample.
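
For readers unfamiliar with the notions used in the abstract, the following Python sketch illustrates what a skyline and a dominated volume are. It is not the speaker's method: it is a naive quadratic skyline computation and an exhaustive search over samples, written under several assumptions that the abstract does not state (larger values are better on every criterion, points are 2-dimensional with non-negative coordinates, and the dominated region of a point (x, y) is the rectangle between the origin and that point). All function names are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """p dominates q if p is at least as good everywhere and strictly better somewhere.

    Assumption: larger is better on every coordinate.
    """
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def dominated_area_2d(sample: Sequence[Point]) -> float:
    """Area of the union of the rectangles [0, x] x [0, y] over the sample.

    Assumption: 2D points with non-negative coordinates, origin as reference.
    """
    area = 0.0
    best_y = 0.0
    # Sweep from the largest x to the smallest; each point that reaches
    # higher than the points seen so far adds a horizontal strip of width x.
    for x, y in sorted(sample, key=lambda p: p[0], reverse=True):
        if y > best_y:
            area += x * (y - best_y)
            best_y = y
    return area


def best_sample_bruteforce(points: Sequence[Point], k: int) -> Tuple[Point, ...]:
    """Exhaustive search for k skyline points maximizing the dominated area.

    Only meant to make the objective concrete; it is exponential in k.
    """
    return max(combinations(skyline(points), k), key=dominated_area_2d)


if __name__ == "__main__":
    pts = [(1, 4), (2, 3), (3, 1), (2, 2), (1, 1)]
    sky = skyline(pts)
    print("skyline:", sky)
    print("area dominated by the skyline:", dominated_area_2d(sky))
    print("best sample of size 2:", best_sample_bruteforce(pts, 2))
```

The exhaustive search is only there to state the objective precisely; it examines every subset of size "k", which is exactly the cost that a dedicated sampling algorithm would aim to avoid.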