Title | : | Towards Unsupervised Metric Learning: A Riemannian Take |
Speaker | : | Ujjal Kr Dutta (IITM) |
Details | : | Wed, 18 Sep, 2019 3:00 PM @ AM Turing Hall |
Abstract | : | Metric learning aims to obtain an embedding of data in which similar examples are grouped together while dissimilar ones are pushed apart. In many challenging machine learning problems, for example zero-shot learning, extreme classification, and fine-grained visual categorization, metric learning is the machinery of choice. However, despite their significant success, state-of-the-art supervised metric learning approaches require a huge number of labeled examples for training. In many applications, obtaining manual annotations may be infeasible, either because of the large size of the dataset or because of the nature of the task. This necessitates unsupervised metric learning, which does not use class labels. As our first contribution, we employ a graph-based clustering approach to obtain a partitioning of the data, and hence pseudo-labels. These pseudo-labels are used to form a triplet set, in which each triplet consists of an anchor, a positive, and a negative example. The anchor and positive are semantically similar, while the negative is dissimilar to both. This triplet set provides weak supervision for metric learning. However, instead of naively using the triplets obtained in an unsupervised manner, we scale the loss associated with each triplet using a weight function. We use a probabilistic notion to impose a geometric constraint on a triplet for learning the associated embedding. Because of the nature of the joint search space of the parameters of the weight function and the embedding, we employ optimization on a Riemannian product manifold. Our approach, which we name Reweighted Probabilistic unsupervised eMbedding Learning (RPML), achieves performance competitive with both supervised and unsupervised methods for learning metrics and embeddings. We additionally formulate a similar approach that uses tuples of examples instead of triplets for learning a similarity metric. 
We call it N-pair loss based Unsupervised Metric Learning (NUML). Lastly, we propose a novel unsupervised approach to learning an embedding that induces a metric. We sample random triplets from the unlabeled data and associate with each a hypothetical label that represents all possible semantic permutations of the examples in the triplet. For each permutation, we generate synthetic triplets or pairs, which are used to learn an embedding. These synthetic constraints are generated using an adversarial training principle. We again use Riemannian product manifold based optimization to learn our parameters. We call our approach Synthetic Unsupervised pseudo Metric Learning (SUML). |
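
The pseudo-label and weighted-triplet idea from the abstract can be illustrated with a minimal sketch. This is not the RPML formulation. It assumes pseudo-labels are already available from some clustering step, uses a plain hinge triplet loss under a linear embedding `L` in place of the probabilistic constraint, and takes the per-triplet weights as fixed inputs rather than learning them on a Riemannian product manifold. The names `sample_triplets` and `weighted_triplet_loss` are hypothetical.

```python
import numpy as np


def sample_triplets(pseudo_labels, n_triplets, rng):
    """Draw (anchor, positive, negative) index triplets from pseudo-labels."""
    labels = np.asarray(pseudo_labels)
    idx = np.arange(len(labels))
    counts = {c: int(np.sum(labels == c)) for c in np.unique(labels)}
    # An anchor needs one other example in its cluster and one outside it.
    anchors = [i for i in idx if 1 < counts[labels[i]] < len(labels)]
    if not anchors:
        raise ValueError("pseudo-labels admit no valid triplet")
    triplets = []
    for _ in range(n_triplets):
        a = rng.choice(anchors)
        pos = rng.choice(idx[(labels == labels[a]) & (idx != a)])
        neg = rng.choice(idx[labels != labels[a]])
        triplets.append((a, pos, neg))
    return np.array(triplets)


def weighted_triplet_loss(X, L, triplets, weights, margin=1.0):
    """Weighted mean of hinge triplet losses under the embedding x -> L x.

    Assumption: a hinge loss on squared Euclidean distances stands in for
    the probabilistic geometric constraint mentioned in the abstract.
    """
    Z = X @ L.T
    a, p, n = Z[triplets[:, 0]], Z[triplets[:, 1]], Z[triplets[:, 2]]
    d_ap = np.sum((a - p) ** 2, axis=1)  # anchor-positive distance
    d_an = np.sum((a - n) ** 2, axis=1)  # anchor-negative distance
    losses = np.maximum(0.0, margin + d_ap - d_an)
    return float(np.sum(weights * losses) / np.sum(weights))
```

With `rng = np.random.default_rng(0)`, calling `sample_triplets(labels, 100, rng)` and passing the result to `weighted_triplet_loss` gives a scalar that could be minimized over `L`. Down-weighting triplets whose pseudo-labels are likely to be noisy is the role the weight function plays in the abstract.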