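In unsupervised domain adaptation (UDA) for semantic segmentation, distillation-based methods currently dominate in performance. However, distillation techniques require a complicated multi-stage process and many training tricks. In this paper, we propose a simple yet effective method that achieves performance competitive with advanced distillation methods. Our core idea is to fully explore target-domain information from the perspectives of boundaries and features. First, we propose a novel mix-up strategy to generate high-quality target-domain boundaries with ground-truth labels. Unlike the source-domain boundaries used in previous works, we select high-confidence target-domain regions and paste them onto source-domain images. Such a strategy generates object boundaries in the target domain (the edges of target-domain object regions) with correct labels. Consequently, the boundary information of the target domain can be effectively captured by learning from the mixed samples. Second, we design a multi-level contrastive loss to improve the representation of target-domain data, including pixel-level and prototype-level contrastive learning. By combining the two proposed methods, more discriminative features can be extracted and hard object boundaries in the target domain can be better handled. Experimental results on two commonly adopted benchmarks (\textit{i.e.}, GTA5 $\rightarrow$ Cityscapes and SYNTHIA $\rightarrow$ Cityscapes) show that our method achieves performance competitive with complicated distillation methods. Notably, for the SYNTHIA $\rightarrow$ Cityscapes scenario, our method achieves state-of-the-art performance with $57.8\%$ mIoU and $64.6\%$ mIoU on 16 classes and 13 classes, respectively. Code is available at https://github.com/ljjcoder/ehtdi.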
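The paste step of the mix-up strategy can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `paste_confident_target`, the per-pixel confidence threshold of 0.9, and the use of the arg-max prediction as the pasted label are all assumptions made for illustration; the abstract only states that high-confidence target regions are pasted onto source images.

```python
import numpy as np

def paste_confident_target(src_img, src_lbl, tgt_img, tgt_prob, threshold=0.9):
    """Paste high-confidence target pixels onto a source image (illustrative only).

    src_img, tgt_img: (H, W, 3) arrays.
    src_lbl:          (H, W) integer ground-truth labels of the source image.
    tgt_prob:         (C, H, W) per-class probabilities predicted for the target image.
    threshold:        hypothetical confidence cut-off, not taken from the paper.
    """
    confidence = tgt_prob.max(axis=0)        # (H, W) highest class probability
    pseudo_lbl = tgt_prob.argmax(axis=0)     # (H, W) predicted class per pixel
    mask = confidence >= threshold           # True where the target is trusted

    # Target pixels replace source pixels inside the mask; labels follow suit,
    # so the mask border becomes a labelled target-domain object boundary.
    mixed_img = np.where(mask[..., None], tgt_img, src_img)
    mixed_lbl = np.where(mask, pseudo_lbl, src_lbl)
    return mixed_img, mixed_lbl, mask
```

The mixed image and label pair can then be fed to the segmentation loss like any labelled sample; how regions are actually selected (per pixel, per class, or per connected component) is a design choice that this sketch does not settle.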
translated by 谷歌翻译
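Single-view RGB-D human reconstruction with implicit functions is often formulated as per-point classification. Specifically, a set of 3D locations in the camera view is first projected onto the image, and a corresponding feature is then extracted for each 3D location. The feature of each 3D location is used to classify, independently, whether the corresponding 3D point is inside or outside the observed object. This procedure leads to sub-optimal results because correlations between predictions for neighboring locations are taken into account only implicitly, through the extracted features. To obtain more accurate results, we propose the occupancy planes (OPlanes) representation, which formulates single-view RGB-D human reconstruction as occupancy prediction on planes that slice through the camera's view frustum. This representation provides more flexibility than voxel grids and leverages correlations better than per-point classification. On the challenging S3D data, we observe that a simple classifier based on the OPlanes representation yields compelling results, especially in difficult situations with partial occlusions due to other objects and partial visibility, which have not been addressed by prior work.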
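The per-point pipeline described above (project each 3D location onto the image, then look up a feature for it) can be sketched as follows. This is a generic illustration, not the paper's code: the pinhole intrinsics `fx, fy, cx, cy`, the nearest-pixel lookup, and the function name `sample_point_features` are assumptions.

```python
import numpy as np

def sample_point_features(points, feat_map, fx, fy, cx, cy):
    """Project camera-space 3D points to pixels and fetch one feature per point.

    points:   (N, 3) array of (x, y, z) in camera coordinates, with z > 0.
    feat_map: (H, W, C) image-aligned feature map.
    fx, fy, cx, cy: assumed pinhole intrinsics (focal lengths, principal point).
    """
    height, width, _ = feat_map.shape
    # Pinhole projection: u is the column coordinate, v is the row coordinate.
    u = fx * points[:, 0] / points[:, 2] + cx
    v = fy * points[:, 1] / points[:, 2] + cy
    # Nearest-pixel lookup, clamped to the image bounds.
    col = np.clip(np.rint(u).astype(int), 0, width - 1)
    row = np.clip(np.rint(v).astype(int), 0, height - 1)
    return feat_map[row, col]                # (N, C), one feature per 3D point
```

Each returned row would be classified on its own as inside or outside, which is exactly why neighboring predictions interact only through the shared feature map.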
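Despite recent progress in light field super-resolution (LFSR) achieved by convolutional neural networks, the correlation information of light field (LF) images has not been sufficiently studied and exploited, owing to the complexity of 4D LF data. To cope with such high-dimensional data, most existing LFSR methods decompose it into lower dimensions and then perform optimization on the decomposed sub-spaces. However, these methods are inherently limited: they neglect the characteristics of the decomposition operations and use only a limited set of LF sub-spaces, so they fail to extract spatial-angular features comprehensively and run into a performance bottleneck. To overcome these limitations, in this paper we thoroughly explore the potential of LF decomposition and propose the novel concept of decomposition kernels. In particular, we systematically unify the decomposition operations of various sub-spaces into a series of such decomposition kernels, which are incorporated into our proposed Decomposition Kernel Network (DKNet) for comprehensive spatial-angular feature extraction. Experiments verify that the proposed DKNet achieves substantial improvements over state-of-the-art methods at the 2x, 3x and 4x LFSR scales. To further improve DKNet so that it produces more visually pleasing LFSR results, we propose an LFVGG loss that guides the Texture-Enhanced DKNet (TE-DKNet) to generate rich, authentic textures and significantly enhances the visual quality of LF images. We also propose an indirect evaluation metric that uses LF material recognition to objectively assess the perceptual enhancement brought by the LFVGG loss.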
The objective of this paper is to learn dense 3D shape correspondence for topology-varying generic objects in an unsupervised manner. Conventional implicit functions estimate the occupancy of a 3D point given a shape latent code. Instead, our novel implicit function produces a probabilistic embedding that represents each 3D point in a part embedding space. Assuming that corresponding points are similar in the embedding space, we implement dense correspondence through an inverse function that maps a part embedding vector to the corresponding 3D point. Both functions are learned jointly, together with the encoder that generates the shape latent code, using several effective and uncertainty-aware loss functions that realize this assumption. During inference, if a user selects an arbitrary point on the source shape, our algorithm automatically generates a confidence score indicating whether a correspondence exists on the target shape, as well as the corresponding semantic point if there is one. Such a mechanism inherently benefits man-made objects with different part constitutions. The effectiveness of our approach is demonstrated through unsupervised 3D semantic correspondence and shape segmentation.
Machine-Generated Text (MGT) detection, the task of discriminating MGT from Human-Written Text (HWT), plays a crucial role in preventing misuse of text generative models, which have recently come to excel at mimicking human writing style. Recently proposed detectors usually take coarse text sequences as input and obtain good results by fine-tuning pretrained models with a standard cross-entropy loss. However, these methods fail to consider linguistic aspects of text (e.g., coherence) and sentence-level structure. Moreover, they cannot handle the low-resource problem, which often arises in practice given the enormous amount of textual data online. In this paper, we present a coherence-based contrastive learning model named CoCo to detect possible MGT in low-resource scenarios. Inspired by the distinctiveness and permanence properties of linguistic features, we represent text as a coherence graph that captures its entity consistency, which is further encoded by a pretrained model and a graph neural network. To tackle the challenge of limited data, we employ a contrastive learning framework and propose an improved contrastive loss that makes full use of hard negative samples during training. Experimental results on two public datasets show that our approach significantly outperforms state-of-the-art methods.
Communication is supposed to improve multi-agent collaboration and overall performance in cooperative multi-agent reinforcement learning (MARL). However, such improvements are often limited in practice, since most existing communication schemes ignore communication overheads (e.g., communication delays). In this paper, we demonstrate that ignoring communication delays has detrimental effects on collaboration, especially in delay-sensitive tasks such as autonomous driving. To mitigate this impact, we design a delay-aware multi-agent communication model (DACOM) that adapts communication to delays. Specifically, DACOM introduces a component, TimeNet, that adjusts how long an agent waits to receive messages from other agents, so that the uncertainty associated with delay can be addressed. Our experiments reveal that DACOM achieves a non-negligible performance improvement over other mechanisms by making a better trade-off between the benefits of communication and the cost of waiting for messages.
Online learning arises naturally in many statistical and machine learning problems. The most widely used methods in online learning are stochastic first-order algorithms. Among this family of algorithms is the recently developed Recursive One-Over-T SGD (ROOT-SGD). ROOT-SGD is advantageous in that it converges at a fast non-asymptotic rate and its estimator further converges to a normal distribution. However, this normal distribution has an unknown asymptotic covariance and therefore cannot be applied directly to measure uncertainty. To fill this gap, we develop two estimators for the asymptotic covariance of ROOT-SGD, which are useful for statistical inference with ROOT-SGD. Our first estimator adopts the plug-in idea: each unknown component in the formula of the asymptotic covariance is replaced by its empirical counterpart. The plug-in estimator converges at the rate $\mathcal{O}(1/\sqrt{t})$, where $t$ is the sample size. Despite its quick convergence, the plug-in estimator has the limitation that it relies on the Hessian of the loss function, which might be unavailable in some cases. Our second estimator is a Hessian-free estimator that overcomes this limitation. It uses the random-scaling technique, and we show that it is an asymptotically consistent estimator of the true covariance.
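The plug-in idea can be sketched in a few lines of NumPy. The abstract does not give the covariance formula, so the sketch assumes the usual sandwich form $H^{-1} S H^{-1}$, with $H$ the Hessian of the population loss and $S$ the second moment of the stochastic gradient at the optimum; the function name `plugin_covariance` and its inputs are likewise hypothetical.

```python
import numpy as np

def plugin_covariance(grads, hessians):
    """Plug-in estimate of an assumed sandwich covariance H^{-1} S H^{-1}.

    grads:    (t, d) per-sample stochastic gradients evaluated at the current estimate.
    hessians: (t, d, d) per-sample Hessians evaluated at the current estimate.
    Both unknown components are replaced by their empirical counterparts.
    """
    h_hat = hessians.mean(axis=0)                # empirical Hessian
    s_hat = grads.T @ grads / grads.shape[0]     # empirical gradient second moment
    h_inv = np.linalg.inv(h_hat)                 # requires an invertible Hessian estimate
    return h_inv @ s_hat @ h_inv.T
```

The dependence on `hessians` is the limitation noted above: when per-sample Hessians are unavailable, this estimator cannot be formed, which is what motivates a Hessian-free alternative.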
It is widely believed that the higher the uncertainty of a word in a caption, the more inter-correlated context information is required to determine it. However, current image captioning methods usually generate all words in a sentence sequentially and treat them equally. In this paper, we propose an uncertainty-aware image captioning framework that iteratively inserts discontinuous candidate words between existing words in parallel, from easy to difficult, until convergence. We hypothesize that high-uncertainty words in a sentence need more prior information to be decided correctly and should therefore be produced at a later stage. The resulting non-autoregressive hierarchy makes caption generation explainable and intuitive. Specifically, we use an image-conditioned bag-of-words model to measure word uncertainty and apply a dynamic programming algorithm to construct the training pairs. During inference, we devise an uncertainty-adaptive parallel beam search technique that yields an empirically logarithmic time complexity. Extensive experiments on the MS COCO benchmark show that our approach outperforms the strong baseline and related methods in both captioning quality and decoding speed.
Quantum computing is a game-changing technology for academia, research centers and industries worldwide, with applications in computational science, mathematics, finance, pharmaceuticals, materials science, chemistry and cryptography. Although it has seen a major boost in the last decade, we are still a long way from a mature, full-fledged quantum computer. We will therefore remain in the Noisy Intermediate-Scale Quantum (NISQ) era for a long time, working with quantum computing systems of dozens or even thousands of qubits. An outstanding challenge, then, is to come up with an application that can reliably carry out a nontrivial task of interest on near-term quantum devices with non-negligible quantum noise. To address this challenge, several near-term quantum computing techniques, including variational quantum algorithms, error mitigation, quantum circuit compilation and benchmarking protocols, have been proposed to characterize and mitigate errors and to implement algorithms with a certain resistance to noise, so as to enhance the capabilities of near-term quantum devices and explore the boundaries of their ability to realize useful applications. In addition, the development of near-term quantum devices is inseparable from efficient classical simulation, which plays a vital role in quantum algorithm design and verification, error-tolerant verification and other applications. This review provides a thorough introduction to these near-term quantum computing techniques, reports on their progress, and discusses their future prospects, which we hope will motivate researchers to undertake additional studies in this field.
Computing the empirical Wasserstein distance in the independence test is an optimal transport (OT) problem with a special structure. This observation inspires us to study a special type of OT problem and to propose a modified Hungarian algorithm that solves it exactly. For an OT problem involving two marginals with $m$ and $n$ atoms ($m\geq n$), respectively, the computational complexity of the proposed algorithm is $O(m^2n)$. Computing the empirical Wasserstein distance in the independence test requires solving this special type of OT problem with $m=n^2$. The computational complexity of the proposed algorithm is then $O(n^5)$, whereas that of the classic Hungarian algorithm is $O(n^6)$. Beyond this special type of OT problem, we show that the modified Hungarian algorithm can be adopted to solve a wider range of OT problems. We also discuss broader applications of the proposed algorithm, namely solving the one-to-many and many-to-many assignment problems. Numerical experiments are conducted to validate our theoretical results. The experimental results demonstrate that the proposed modified Hungarian algorithm compares favorably with the Hungarian algorithm and the well-known Sinkhorn algorithm.
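The two complexity figures follow from substituting $m=n^2$. The second expression assumes that the classic Hungarian algorithm is run on a square assignment problem of size $m$ at its standard cubic cost $O(m^3)$, a reading that matches the stated $O(n^6)$ but is not spelled out in the abstract:

$$
m = n^2 \;\Longrightarrow\; O(m^2 n) = O\big((n^2)^2 \cdot n\big) = O(n^5),
\qquad
O(m^3) = O\big((n^2)^3\big) = O(n^6).
$$

Under this reading, the modified algorithm saves a factor of $n$ in the independence-test setting.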