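In this paper, we revisit the online non-monotone DR-submodular maximization problem over a down-closed convex set, which finds wide real-world applications in machine learning, economics, and operations research. First, we present the Meta-MFW algorithm, which achieves a $1/e$-regret of $O(\sqrt{T})$ at a cost of $T^{3/2}$ stochastic gradient evaluations per round. To the best of our knowledge, Meta-MFW is the first algorithm to obtain a $1/e$-regret of $O(\sqrt{T})$ for this problem. Furthermore, in sharp contrast with the ODC algorithm \citep{thang2021online}, Meta-MFW relies on a simple online linear oracle without discretization, lifting, or rounding operations. Considering practical restrictions, we then propose the Mono-MFW algorithm, which reduces the number of stochastic gradient evaluations per function from $T^{3/2}$ to 1 and achieves a $1/e$-regret bound of $O(T^{4/5})$. Next, we extend Mono-MFW to the bandit setting and propose the Bandit-MFW algorithm, which attains a $1/e$-regret bound of $O(T^{8/9})$. To the best of our knowledge, Mono-MFW and Bandit-MFW are the first sublinear-regret algorithms for online non-monotone DR-submodular maximization over a down-closed convex set in the one-shot and bandit settings, respectively. Finally, we conduct numerical experiments on both synthetic and real-world datasets to verify the effectiveness of our methods.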
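For readers unfamiliar with the term, the $1/e$-regret quoted above is usually defined as follows. The notation is an assumption, not taken from the abstract: $\mathcal{K}$ denotes the down-closed convex set, $f_t$ the objective revealed in round $t$, and $x_t$ the point the algorithm plays.

```latex
\[
  \mathcal{R}_{1/e}(T)
  \;=\;
  \frac{1}{e}\,\max_{x \in \mathcal{K}} \sum_{t=1}^{T} f_t(x)
  \;-\;
  \sum_{t=1}^{T} f_t(x_t)
\]
```

Under this reading, a bound such as $O(\sqrt{T})$ means the algorithm's cumulative value falls short of a $1/e$ fraction of the best fixed point in hindsight by an amount that grows sublinearly in $T$.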
Translated by Google Translate
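The space of value functions is a fundamental concept in reinforcement learning. Characterizing its geometric properties may provide insights for optimization and representation. Existing works mainly focus on the value space of Markov Decision Processes (MDPs). In this paper, we study the geometry of the robust value space in the more general setting of Robust MDPs (RMDPs), where transition uncertainties are considered. Specifically, since we find it hard to adapt prior approaches to RMDPs directly, we start by revisiting the non-robust case and introduce a new perspective that lets us characterize both the non-robust and the robust value space in a similar fashion. The key to this perspective is to decompose the value space, in a state-wise manner, into unions of hypersurfaces. Through our analysis, we show that the robust value space is determined by a set of conic hypersurfaces, each of which contains the robust values of all policies that agree on one state. Furthermore, we find that taking only the extreme points of the uncertainty set is sufficient to determine the robust value space. Finally, we discuss some other aspects of the robust value space, including its non-convexity and policy agreement on multiple states.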
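Class Incremental Learning (CIL) aims to learn a multi-class classifier in a phase-by-phase manner, in which only data for a subset of the classes is provided at each phase. Previous works mainly focus on mitigating forgetting in the phases after the initial one. However, we find that improving CIL at its initial phase is also a promising direction. Specifically, we show experimentally that directly encouraging the CIL learner at the initial phase to output representations similar to those of a model jointly trained on all classes can greatly boost CIL performance. Motivated by this, we study the difference between a naïvely trained initial-phase model and the oracle model. Since one major difference between these two models is the number of training classes, we investigate how this difference affects the model representations. We find that, with fewer training classes, the data representations of each class lie in a long and narrow region; with more training classes, the representations of each class scatter more uniformly. Inspired by this observation, we propose Class-wise Decorrelation (CwD), which effectively regularizes the representations of each class to scatter more uniformly, thus mimicking the model jointly trained with all classes (i.e., the oracle model). CwD is simple to implement and easy to plug into existing methods. Extensive experiments on various benchmark datasets show that CwD consistently and significantly improves the performance of existing state-of-the-art methods by around 1% to 3%. Code will be released.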
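The abstract does not give the form of the CwD regularizer. As a minimal sketch of what a class-wise decorrelation penalty could look like, the hypothetical function below standardizes the features of one class and penalizes the squared entries of their correlation matrix. The function name, the normalization by the squared dimension, and the handling of constant features are all assumptions made for illustration.

```python
import math


def class_decorrelation_penalty(features):
    """Hypothetical class-wise decorrelation penalty (illustrative only).

    features: list of n feature vectors (each of length d) from ONE class.
    Returns the squared Frobenius norm of the d x d correlation matrix,
    divided by d**2. Strongly correlated dimensions give a larger value.
    """
    n, d = len(features), len(features[0])
    means = [sum(f[j] for f in features) / n for j in range(d)]
    # Population standard deviation per dimension; fall back to 1.0 for
    # constant dimensions to avoid division by zero (an assumption).
    stds = [
        math.sqrt(sum((f[j] - means[j]) ** 2 for f in features) / n) or 1.0
        for j in range(d)
    ]
    z = [[(f[j] - means[j]) / stds[j] for j in range(d)] for f in features]

    penalty = 0.0
    for a in range(d):
        for b in range(d):
            corr = sum(row[a] * row[b] for row in z) / n
            penalty += corr ** 2
    return penalty / d ** 2
```

In this sketch, features that lie along a single line (the "long and narrow region" case) score higher than features that spread evenly across dimensions, so minimizing the penalty pushes each class toward the more uniform scatter described above.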
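Scaling reinforcement learning (RL) to recommender systems (RS) is promising, since maximizing the expected cumulative reward of an RL agent matches the objective of RS, i.e., improving customers' long-term satisfaction. A key approach to this goal is offline RL, which aims to learn policies from logged data. However, the high-dimensional action space and non-stationary dynamics of commercial RS intensify the distributional shift problem, which makes it challenging to apply offline RL methods to RS. To alleviate the action distribution shift that arises when extracting an RL policy from static trajectories, we propose Value Penalized Q-learning (VPQ), an uncertainty-based offline RL algorithm. It penalizes unstable Q-values in the regression target with uncertainty-aware weights, without the need to estimate the behavior policy, and is therefore suitable for RS with a large number of items. We derive the penalty weights from the variance across an ensemble of Q-functions. To alleviate distributional shift at test time, we further introduce a critic framework that integrates the proposed method with classic RS models. Extensive experiments on two real-world datasets show that the proposed method can serve as a gain plug-in for existing RS models.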
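The abstract does not state the exact penalty. One simple way to penalize an uncertain Q-value in a regression target is to subtract a multiple of the ensemble's standard deviation from its mean, as sketched below. The function name, the subtractive form, and the `penalty_weight` parameter are illustrative assumptions rather than the VPQ formula itself.

```python
import statistics


def penalized_target(reward, next_q_ensemble, gamma=0.99, penalty_weight=1.0):
    """Illustrative uncertainty-penalized TD target (not the paper's formula).

    next_q_ensemble: Q-value estimates for the same next state-action pair,
    one per ensemble member. Disagreement among members lowers the target.
    """
    mean_q = statistics.fmean(next_q_ensemble)
    std_q = statistics.pstdev(next_q_ensemble)  # ensemble disagreement
    return reward + gamma * (mean_q - penalty_weight * std_q)
```

When all ensemble members agree, the target reduces to the usual `reward + gamma * Q`. The more they disagree, the more pessimistic the target becomes, and no estimate of the behavior policy is needed.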
Benefiting from its ability to exploit intrinsic supervision information, contrastive learning has recently achieved promising performance in deep graph clustering. However, we observe that two drawbacks of the positive and negative sample construction mechanisms keep existing algorithms from improving further. 1) The quality of positive samples depends heavily on carefully designed data augmentations, and inappropriate augmentations easily lead to semantic drift and indiscriminative positive samples. 2) The constructed negative samples are unreliable because they ignore important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) that mines the intrinsic supervision information in high-confidence clustering results. Specifically, instead of performing complex node or edge perturbation, we construct two views of the graph with specially designed Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct positive samples from the same high-confidence cluster in the two views. Moreover, to construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, which improves the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function that pulls together samples from the same cluster and pushes away those from other clusters by maximizing the cross-view cosine similarity between positive samples and minimizing it between negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with existing state-of-the-art algorithms.
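The objective described above can be sketched roughly as follows. This is a minimal illustration rather than the CCGC loss itself: the squared-error form, the function names, and the assumption that inputs are non-zero embeddings of high-confidence nodes from at least two clusters are all choices made here for clarity.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def cluster_centers(embeddings, labels):
    """Mean embedding of each cluster id."""
    groups = {}
    for vec, lab in zip(embeddings, labels):
        groups.setdefault(lab, []).append(vec)
    return {lab: [sum(col) / len(vecs) for col in zip(*vecs)]
            for lab, vecs in groups.items()}


def cluster_guided_loss(view1, view2, labels):
    """Illustrative cluster-guided contrastive loss over two views.

    view1, view2: embeddings of the same high-confidence nodes from the two
    encoders; labels: their (pseudo) cluster ids.
    """
    # Positive pairs: cross-view samples from the same cluster, pulled
    # toward cosine similarity 1.
    pos = [(1.0 - cosine(a, b)) ** 2
           for a, la in zip(view1, labels)
           for b, lb in zip(view2, labels) if la == lb]
    # Negative pairs: cross-view centers of different clusters, pushed
    # toward cosine similarity 0.
    c1, c2 = cluster_centers(view1, labels), cluster_centers(view2, labels)
    neg = [cosine(c1[k], c2[l]) ** 2 for k in c1 for l in c2 if k != l]
    return sum(pos) / len(pos) + sum(neg) / len(neg)
```

Using cluster centers as negatives means a sample is never pushed away from individual members of its own cluster, which is the reliability problem the abstract attributes to conventional negative sampling.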
As one of the prevalent methods for building automation systems, Imitation Learning (IL) shows promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, research on the explainability of IL models is still limited. Inspired by recent approaches in explainable artificial intelligence, we propose a model-agnostic explanation framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in the demonstrations. It iteratively retrains the black-box IL model on randomly masked demonstrations and uses the environment returns, the conventional evaluation outcome, as coefficients to build an importance map. We also conduct experiments to investigate three major questions: whether frames are equally important, how effective the importance map is, and how importance maps from different IL models relate to each other. The results show that R2RISE successfully distinguishes the important frames in the demonstrations.
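The retrain-and-score loop described above can be sketched as follows. This is a simplified illustration, not the R2RISE implementation: `train_and_evaluate` is a hypothetical callback standing in for retraining the IL model on the masked demonstrations and returning its environment return, and normalizing by how often each frame was kept is an assumption.

```python
import random


def r2rise_importance(num_frames, train_and_evaluate,
                      num_masks=200, keep_prob=0.5, seed=0):
    """Illustrative RISE-style importance map over demonstration frames.

    train_and_evaluate(mask) -> float is a hypothetical callback: it should
    retrain the black-box IL model using only frames where mask[i] == 1 and
    return the resulting environment return.
    """
    rng = random.Random(seed)
    weighted = [0.0] * num_frames
    kept = [0] * num_frames
    for _ in range(num_masks):
        mask = [1 if rng.random() < keep_prob else 0 for _ in range(num_frames)]
        ret = train_and_evaluate(mask)
        for i, m in enumerate(mask):
            weighted[i] += ret * m  # the return acts as the mask's coefficient
            kept[i] += m
    # Average return observed over the runs in which each frame was kept.
    return [w / k if k else 0.0 for w, k in zip(weighted, kept)]
```

For example, with a toy callback such as `lambda mask: 10.0 * mask[2]`, where only frame 2 affects the return, frame 2 receives the highest score. In practice each callback invocation is a full training run, so `num_masks` controls the computational cost.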
Text clustering and topic extraction are two important tasks in text mining. Usually, these two tasks are performed separately. To let topic extraction facilitate clustering, we can first project texts into a topic space and then run a clustering algorithm to obtain clusters. To let clustering promote topic extraction, we can first obtain clusters with a clustering algorithm and then extract cluster-specific topics. However, this naive strategy ignores the fact that text clustering and topic extraction are strongly correlated and follow a chicken-and-egg relationship. Performing them separately prevents them from benefiting each other and achieving the best overall performance. In this paper, we propose an unsupervised text clustering and topic extraction framework (ClusTop), which integrates text clustering and topic extraction into a unified framework and can achieve high-quality clustering results and extract topics from each cluster simultaneously. Our framework includes four components: enhanced language model training, dimensionality reduction, clustering, and topic extraction, where the enhanced language model can be viewed as a bridge between clustering and topic extraction. On the one hand, it provides text embeddings with a strong cluster structure, which facilitates effective text clustering; on the other hand, its self-attention architecture pays close attention to topic-related words, which supports topic extraction. Moreover, the training of the enhanced language model is unsupervised. Experiments on two datasets demonstrate the effectiveness of our framework and provide benchmarks for different model combinations within it.
An increasing number of public datasets have shown a marked clinical impact on assessing anatomical structures. However, each of the datasets is small, partially labeled, and rarely investigates severe tumor subjects. Moreover, current models are limited to segmenting specific organs/tumors and cannot be extended to novel domains and classes. To tackle these limitations, we introduce embeddings learned from Contrastive Language-Image Pre-training (CLIP) to segmentation models, dubbed the CLIP-Driven Universal Model. The Universal Model can better segment 25 organs and 6 types of tumors by exploiting the semantic relationship between abdominal structures. The model is developed from an assembly of 14 datasets with 3,410 CT scans and evaluated on 6,162 external CT scans from 3 datasets. We rank first on the public leaderboard of the Medical Segmentation Decathlon (MSD) and achieve state-of-the-art results on Beyond The Cranial Vault (BTCV). Compared with dataset-specific models, the Universal Model is computationally more efficient (6x faster), generalizes better to CT scans from varying sites, and shows stronger transfer learning performance on novel tasks. The design of the CLIP embedding enables the Universal Model to be easily extended to new classes without catastrophically forgetting the previously learned classes.
Recent advances in self-supervised learning (SSL) in computer vision are primarily comparative: their goal is to preserve invariant and discriminative semantics in latent representations by comparing siamese image views. However, the preserved high-level semantics do not contain enough local information, which is vital in medical image analysis (e.g., image-based diagnosis and tumor segmentation). To mitigate the locality problem of comparative SSL, we propose to incorporate the task of pixel restoration to explicitly encode more pixel-level information into high-level semantics. We also address the preservation of scale information, a powerful aid to image understanding that has not drawn much attention in SSL. The resulting framework can be formulated as a multi-task optimization problem on the feature pyramid. Specifically, we conduct multi-scale pixel restoration and siamese feature comparison in the pyramid. In addition, we propose a non-skip U-Net to build the feature pyramid and develop sub-crop to replace multi-crop in 3D medical imaging. The proposed unified SSL framework (PCRLv2) surpasses its self-supervised counterparts on various tasks, including brain tumor segmentation (BraTS 2018), chest pathology identification (ChestX-ray, CheXpert), pulmonary nodule detection (LUNA), and abdominal organ segmentation (LiTS), sometimes outperforming them by large margins with limited annotations.
Because they offer more comprehensive information than data from a single view, multi-view (multi-source, multi-modal, multi-perspective, etc.) data are being used more frequently in remote sensing tasks. However, as the number of views grows, the issue of data quality becomes more apparent, limiting the potential benefits of multi-view data. Although recent deep neural network (DNN) based models can learn the weight of data adaptively, the lack of research on explicitly quantifying the data quality of each view during fusion leaves these models hard to explain, unsatisfactory in performance, and inflexible in downstream remote sensing tasks. To fill this gap, this paper introduces evidential deep learning to the task of aerial-ground dual-view remote sensing scene classification to model the credibility of each view. Specifically, the theory of evidence is used to calculate an uncertainty value that describes the decision-making risk of each view. Based on this uncertainty, a novel decision-level fusion strategy is proposed to ensure that the view with lower risk obtains more weight, making the classification more credible. On two well-known, publicly available datasets of aerial-ground dual-view remote sensing images, the proposed approach achieves state-of-the-art results, demonstrating its effectiveness. The code and datasets of this article are available at the following address: https://github.com/gaopiaoliang/Evidential.
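The uncertainty-weighted fusion described above can be sketched as follows. The per-view uncertainty uses the common evidential deep learning formulation, with Dirichlet parameters equal to evidence plus one and uncertainty equal to the number of classes divided by the Dirichlet strength. The fusion rule, which weights each view by one minus its uncertainty, and all function names are assumptions for illustration, not the paper's strategy.

```python
def view_opinion(evidence):
    """Class probabilities and uncertainty from non-negative evidence.

    Standard evidential formulation: alpha_k = e_k + 1, S = sum(alpha),
    expected probability alpha_k / S, uncertainty u = K / S.
    """
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)
    probs = [a / strength for a in alpha]
    return probs, num_classes / strength


def fuse_views(evidence_aerial, evidence_ground):
    """Illustrative decision-level fusion: lower uncertainty, higher weight."""
    p_a, u_a = view_opinion(evidence_aerial)
    p_g, u_g = view_opinion(evidence_ground)
    w_a, w_g = 1.0 - u_a, 1.0 - u_g
    total = w_a + w_g
    if total == 0.0:
        # Neither view carries any evidence: fall back to equal weights.
        w_a = w_g = 0.5
    else:
        w_a, w_g = w_a / total, w_g / total
    return [w_a * a + w_g * g for a, g in zip(p_a, p_g)]
```

A view with no evidence has uncertainty 1 and therefore contributes nothing to the fused prediction, while a confident view dominates it.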