Title: Statistical Optimality for General Performance Metrics in Machine Learning
Speaker: Harish G Prasad (IBM Research)
Details: Tue, 31 Jan, 2017, 12:00 PM @ BSB 361
Abstract: Supervised learning is the task of learning a mapping from an input space to an output space given several examples of input-output pairs. For example, it might learn a classifier mapping MRI images to diagnoses given several such (MRI image, diagnosis) pairs.
A learning algorithm is said to be statistically consistent if it returns the optimal classifier in the limit of infinite data. Statistical consistency is a fundamental notion in supervised machine learning, and an important question is therefore how to design consistent algorithms for various learning problems. While this has been well studied for binary classification and some other specific learning problems, the question of consistent algorithms for general multiclass learning problems remains open.
In this talk I will give a brief overview of a new framework for analyzing statistical consistency, and illustrate it with an application to the problem of hierarchical classification.
Hierarchical classification problems are multiclass supervised learning problems with a pre-defined hierarchy over the set of class labels, in which predictions are penalized according to their tree distance from the true label. We show that the Bayes optimal classifier for this loss classifies an instance according to the deepest node in the hierarchy such that the total conditional probability of the sub-tree rooted at that node is greater than 0.5. We exploit this insight to develop new consistent algorithms for hierarchical classification, which use binary classification algorithms as subroutines.
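The prediction rule described above can be sketched in a few lines. This is a minimal illustration, not the speaker's implementation: it assumes the label hierarchy is given as a parent-to-children dictionary and that the conditional class probabilities p(y|x) are already known; all names are illustrative. Since at most one child's sub-tree can hold more than half of the probability mass, a greedy walk from the root finds the unique deepest qualifying node.

```python
# Illustrative sketch of the Bayes-optimal rule for tree-distance loss:
# predict the deepest node whose sub-tree has conditional probability > 0.5.
# `children` maps each node to its child nodes; `prob` maps leaf/class
# labels to their conditional probabilities p(y|x). Hypothetical names.

def subtree_probability(node, children, prob):
    """Total conditional probability of the sub-tree rooted at `node`."""
    total = prob.get(node, 0.0)
    for child in children.get(node, []):
        total += subtree_probability(child, children, prob)
    return total

def bayes_optimal_label(root, children, prob):
    """Deepest node whose sub-tree probability exceeds 0.5.

    Walk down from the root; at each step, descend into the (at most one)
    child whose sub-tree keeps more than half the probability mass.
    """
    node = root
    while True:
        next_node = None
        for child in children.get(node, []):
            if subtree_probability(child, children, prob) > 0.5:
                next_node = child
                break
        if next_node is None:
            return node
        node = next_node

# Toy hierarchy: root -> {a, b}, a -> {a1, a2}, with
# p(a1|x)=0.35, p(a2|x)=0.2, p(b|x)=0.45. The sub-tree at `a` has mass
# 0.55 > 0.5 but neither of its children does, so the rule predicts `a`.
children = {"root": ["a", "b"], "a": ["a1", "a2"]}
prob = {"a1": 0.35, "a2": 0.2, "b": 0.45}
print(bayes_optimal_label("root", children, prob))  # prints "a"
```

Note how the rule may stop at an internal node: even though `b` is the single most probable class, predicting the internal node `a` is optimal under tree-distance loss because the majority of the probability mass lies in its sub-tree.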
Bio: Harish is currently a Research Scientist at IBM Research Labs, India, working on machine learning applied to document and image understanding. Harish completed his PhD and Master of Engineering in the Department of Computer Science and Automation at IISc, Bangalore. His research interests include statistical machine learning, learning theory, computational optimization, geometry, and information theory.