Weakly supervised machine learning algorithms, such as multi-instance learning and partial-label learning, can learn from ambiguous samples or labels. However, in some real-world tasks, each training sample is associated with not only multiple instances but also a candidate label set that contains one ground-truth label and some false positive labels. Specifically, at least one instance pertains to the ground-truth label, while no instance belongs to the false positive labels. In this paper, we formalize such problems as multi-instance partial-label learning (MIPL). Existing multi-instance learning algorithms and partial-label learning algorithms are suboptimal for solving MIPL problems, since the former fail to disambiguate a candidate label set and the latter cannot handle a multi-instance bag. To address these issues, we propose a tailored algorithm named MIPLGP, i.e., Multi-Instance Partial-Label learning with Gaussian Processes. MIPLGP first assigns each instance a candidate label set in an augmented label space, then transforms the candidate label set into a logarithmic space to yield disambiguated, continuous labels via an exclusive disambiguation strategy, and finally induces a model based on Gaussian processes. Experimental results on various datasets validate that MIPLGP is superior to well-established multi-instance learning and partial-label learning algorithms for solving MIPL problems. Our code and datasets will be made publicly available.