Demo "Ease.ml/snoopy in Action: Towards Automatic Feasibility Analysis for Machine Learning Application Development" accepted at VLDB 2020

The following demo has been accepted at the 46th International Conference on Very Large Data Bases (VLDB 2020):

"Ease.ml/snoopy in Action: Towards Automatic Feasibility Analysis for Machine Learning Application Development" by Cedric Renggli (ETH Zurich), Luka Rimanic (ETH Zurich), Luka Kolar (ETH Zurich), Wentao Wu (Microsoft Research), Ce Zhang (ETH Zurich).

Abstract:

We demonstrate ease.ml/snoopy, a data analytics system that performs feasibility analysis for machine learning (ML) applications before they are developed. Given a performance target of an ML application (e.g., accuracy above 0.95), ease.ml/snoopy provides a decisive answer to ML developers regarding whether the target is achievable or not. We formulate the feasibility analysis problem as an instance of Bayes error estimation. That is, for a data (distribution) on which the ML application should be performed, ease.ml/snoopy provides an estimate of the Bayes error -- the minimum error rate that can be achieved by any classifier. It is well-known that estimating the Bayes error is a notoriously hard task. In ease.ml/snoopy we explore and employ estimators based on the combination of (1) nearest neighbor (NN) classifiers and (2) pre-trained feature transformations. To the best of our knowledge, this is the first work on Bayes error estimation that combines (1) and (2). In today's cost-driven business world, feasibility of an ML project is an ideal piece of information for ML application developers -- ease.ml/snoopy plays the role of a reliable "consultant".
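
To make the idea in the abstract concrete, here is a minimal Python sketch of how a nearest-neighbor error can be turned into a statement about the Bayes error. It is not the ease.ml/snoopy implementation; the abstract does not specify the exact estimator. The sketch assumes that the inputs have already been mapped to feature vectors by some pre-trained transformation, and it uses the classical Cover-Hart inequality, which relates the asymptotic 1NN error to the Bayes error, as one standard way to obtain a lower bound. All function names are hypothetical, and on a finite sample the result is only a rough estimate.

```python
import numpy as np


def one_nn_error(features, labels):
    """Leave-one-out error rate of a 1-nearest-neighbor classifier.

    `features` is assumed to be the output of some pre-trained
    feature transformation (an assumption of this sketch).
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Pairwise squared Euclidean distances via the expansion
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq_norms = np.sum(features ** 2, axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(dists, np.inf)  # a point may not be its own neighbor
    nearest = np.argmin(dists, axis=1)
    return float(np.mean(labels[nearest] != labels))


def cover_hart_lower_bound(nn_error, num_classes):
    """Lower bound on the Bayes error implied by a 1NN error.

    Inverts the Cover-Hart inequality R_1NN <= R* (2 - C/(C-1) R*),
    which holds for the asymptotic 1NN error with C classes.
    """
    c = num_classes
    inner = max(0.0, 1.0 - c / (c - 1) * nn_error)
    return float((c - 1) / c * (1.0 - np.sqrt(inner)))


def is_target_plausible(features, labels, num_classes, target_accuracy):
    """Return (plausible, nn_error, bayes_lower_bound) for an accuracy target."""
    nn_error = one_nn_error(features, labels)
    lower = cover_hart_lower_bound(nn_error, num_classes)
    # A target is ruled out if it demands an error below the lower bound.
    plausible = (1.0 - target_accuracy) >= lower
    return plausible, nn_error, lower
```

For example, a 1NN error of 0.18 on a binary task gives a lower bound of 0.1 on the Bayes error, so under the assumptions of this sketch an accuracy target of 0.95 would be flagged as not achievable.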