In this paper, we are interested in learning a generalizable person re-identification (re-ID) representation from unlabeled videos. Compared with 1) the popular unsupervised re-ID setting, where the training and test sets are typically from the same domain, and 2) the popular domain generalization (DG) re-ID setting, where the training samples are labeled, our novel scenario combines their key challenges: the training samples are unlabeled, and they are collected from various domains which do not align with the test domain. In other words, we aim to learn a representation in an unsupervised manner and directly use the learned representation for re-ID in novel domains. To fulfill this goal, we make two main contributions. First, we propose Cycle Association (CycAs), a scalable self-supervised learning method for re-ID with low training complexity. Second, we construct a large-scale unlabeled re-ID dataset named LMP-video, tailored for the proposed method. Specifically, CycAs learns re-ID features by enforcing cycle consistency of instance association between temporally successive video frame pairs, and its training cost is merely linear in the data size, making large-scale training possible. The LMP-video dataset, in turn, is extremely large, containing 50 million unlabeled person images cropped from over 10K YouTube videos, and is therefore sufficient to serve as fertile soil for self-supervised learning. We show that CycAs, trained on LMP-video, generalizes well to novel domains, and its results sometimes even outperform supervised domain generalizable models. Remarkably, CycAs achieves 82.2% Rank-1 on Market-1501 and 49.0% Rank-1 on MSMT17 with zero human annotation, surpassing state-of-the-art supervised DG re-ID methods. Moreover, we also demonstrate the superiority of CycAs under the canonical unsupervised re-ID and the pretrain-and-finetune scenarios.
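The cycle-consistency idea in the abstract above can be illustrated with a small sketch. It assumes L2-normalised embeddings for the person crops in two successive frames, a softmax-based soft assignment, and a negative-log penalty on the diagonal of the round-trip matrix. The function names, the temperature, and the exact loss form are illustrative assumptions, not the formulation used by CycAs.

```python
import numpy as np

def softmax(x, axis, temperature):
    # Numerically stable softmax along one axis.
    z = x / temperature
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cycle_association_loss(feats_a, feats_b, temperature=0.1):
    """Hypothetical cycle-consistency loss between two frames.

    feats_a: (n, d) embeddings of the instances in frame A
    feats_b: (m, d) embeddings of the instances in frame B
    Both are assumed to be L2-normalised.
    """
    sim = feats_a @ feats_b.T                        # (n, m) cosine similarities
    a_to_b = softmax(sim, axis=1, temperature=temperature)    # soft association A -> B
    b_to_a = softmax(sim.T, axis=1, temperature=temperature)  # soft association B -> A
    cycle = a_to_b @ b_to_a                          # (n, n) round trip A -> B -> A
    # A consistent association returns each instance to itself,
    # so the round-trip matrix should be close to the identity.
    diag = np.clip(np.diag(cycle), 1e-12, 1.0)
    return float(-np.log(diag).mean()), cycle
```

No identity labels appear anywhere in this loss: the only supervision is that going from frame A to frame B and back should land on the starting instance.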
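Reinforcement learning (RL) has exceeded human performance in many synthetic settings, such as video games and Go. However, real-world deployment of end-to-end RL models is less common, because RL models can be very sensitive to slight perturbations of the environment. The robust Markov decision process (MDP) framework, in which the transition probabilities belong to an uncertainty set around a nominal model, provides one way to develop robust models. While previous analysis shows that RL algorithms are effective assuming access to a generative model, it remains unclear whether RL can be efficient under a more realistic online setting, which requires a careful balance between exploration and exploitation. In this work, we consider online robust MDPs by interacting with an unknown nominal system. We propose a robust optimistic policy optimization algorithm that is provably efficient. To address the additional uncertainty caused by an adversarial environment, our model features a new optimistic update rule derived via Fenchel conjugates. Our analysis establishes the first regret bound for online robust MDPs.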
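To make the robust MDP framework concrete, the following sketch runs value iteration with a worst-case Bellman backup. It is not the optimistic policy optimization algorithm proposed in the abstract. It assumes a known, finite, (s, a)-rectangular uncertainty set given as a list of candidate transition models, and the function name and data layout are hypothetical.

```python
def robust_value_iteration(rewards, models, gamma=0.9, iters=500):
    """Worst-case value iteration over a finite uncertainty set.

    rewards[s][a]   : immediate reward for action a in state s
    models          : list of transition models p, with p[s][a][s2] the
                      probability of moving from s to s2 under action a
    The adversary may pick a different model for every (s, a) pair,
    which is the (s, a)-rectangularity assumption.
    """
    n_states = len(rewards)
    values = [0.0] * n_states
    for _ in range(iters):
        new_values = []
        for s in range(n_states):
            best = float("-inf")
            for a in range(len(rewards[s])):
                # Adversary: least favourable expected next-state value.
                worst = min(
                    sum(p[s][a][s2] * values[s2] for s2 in range(n_states))
                    for p in models
                )
                # Agent: best action against that adversary.
                best = max(best, rewards[s][a] + gamma * worst)
            new_values.append(best)
        values = new_values
    return values
```

Passing a single model recovers ordinary value iteration, which makes it easy to compare the nominal and robust values of the same problem.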
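We introduce a new approach to image forensics: placing physical refractive objects, which we call totems, into a scene so as to protect any photograph taken of that scene. Totems bend and redirect light rays, thus providing multiple, albeit distorted, views of the scene within a single image. A defender can use these distorted totem pixels to detect whether an image has been manipulated. Our approach unscrambles the light rays passing through the totems by estimating their positions in the scene and using their known geometric and material properties. To verify a totem-protected image, we detect inconsistencies between the scene reconstructed from the totem viewpoints and the scene's appearance from the camera viewpoint. Such an approach makes the adversarial manipulation task more difficult, because the adversary must modify both the totem and image pixels in a geometrically consistent manner without knowing the physical properties of the totem. Unlike prior learning-based approaches, our method does not require training on datasets of specific manipulations, and instead uses physical properties of the scene and camera to solve the forensics problem.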
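In this paper, we present our solution to the MuSe-Humor sub-challenge of the Multimodal Emotional Challenge (MuSe) 2022. The goal of the MuSe-Humor sub-challenge is to detect humor, evaluated by AUC, in audiovisual recordings of German football Bundesliga press conferences. The recordings are annotated for humor displayed by the coaches. For this sub-challenge, we first build a discriminative model using a Transformer module and a BiLSTM module, and then propose a hybrid fusion strategy that uses the prediction results of each modality to improve the performance of the model. Our experiments demonstrate the effectiveness of the proposed model and hybrid fusion strategy for multimodal fusion, and the proposed model achieves an AUC of 0.8972 on the test set.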
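As an illustration of the evaluation metric and of fusing per-modality predictions, the sketch below computes AUC from its pairwise-ranking definition and combines modality scores by weighted averaging. The abstract does not specify the hybrid fusion strategy, so the averaging rule, the function names, and the weights are assumptions made only for this example.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def late_fusion(per_modality_scores, weights):
    """Weighted average of per-modality prediction scores (assumed fusion rule)."""
    total = sum(weights)
    n_items = len(per_modality_scores[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, per_modality_scores)) / total
        for i in range(n_items)
    ]
```

The fused scores can rank better than either modality alone when the modalities make different mistakes, which is the usual motivation for fusing predictions at the decision level.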
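Automatic detection of anomalous trajectories is an important problem with a large number of applications in intelligent transportation systems. Many existing studies have focused on distinguishing anomalous trajectories from normal ones, ignoring the large differences among anomalous trajectories. A recent study made great progress in identifying anomalous trajectory patterns and proposed a two-stage algorithm for anomalous trajectory detection and classification (ATDC). This algorithm has excellent performance but suffers from a few limitations, such as high time complexity and poor interpretability. Here, we present a careful theoretical and empirical analysis of the ATDC algorithm, showing that the calculation of anomaly scores in both stages can be simplified, and that the second stage of the algorithm is much more important than the first. Hence, we develop a FastATDC algorithm that introduces a random sampling strategy in both stages. Experimental results show that FastATDC is 10 to 20 times faster than ATDC on real datasets. Moreover, FastATDC outperforms the baseline algorithms and is comparable to the ATDC algorithm.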
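Large-scale vector mapping is important for transportation, city planning, and survey and census. We propose GraphMapper, a unified framework for end-to-end vector map extraction from satellite images. Our key idea is a novel unified representation of shapes of different topologies named the "primitive graph", which is a set of shape primitives and their pairwise relationship matrix. We then convert vector shape prediction, regularization, and topology reconstruction into a single primitive graph learning problem. Specifically, GraphMapper is a generic primitive graph learning network based on global shape context modelling through multi-head attention. An embedding space sorting method is developed for accurate primitive relationship modelling. We empirically demonstrate the effectiveness of GraphMapper on two challenging mapping tasks: building footprint regularization and road network topology reconstruction. Our model outperforms state-of-the-art methods in both tasks on public benchmarks. All code will be made publicly available.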
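Dynamic graph visualization attracts researchers' attention because it represents time-varying relationships between entities in multiple domains (e.g., social media analysis, academic cooperation analysis, team sports analysis). Integrating visual analytic methods is important for presenting, comparing, and reviewing dynamic graphs. Even though dynamic graph visualization has been developed for many years, effectively visualizing large-scale and time-intensive dynamic graph data with subtle changes remains challenging for researchers. To provide an effective analysis method for this type of dynamic graph data, we propose a human-in-the-loop snapshot generation algorithm that helps users divide dynamic graphs into multi-granularity, hierarchical snapshots for further analysis. In addition, we design a visual analysis prototype system (DGSVI) to help users access dynamic graph insights effectively. DGSVI integrates a graphical operation interface that helps users generate snapshots visually and interactively. It provides overview and detail views for visualizing the hierarchical snapshots of the dynamic graph data. To illustrate the usability and efficiency of our proposed methods for this type of dynamic graph data, we present two case studies based on basketball player networks in a competition. In addition, we conduct an evaluation and receive encouraging feedback from experienced visualization experts.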
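We consider a new problem of adapting a human mesh reconstruction model to out-of-domain streaming videos, where the performance of existing SMPL-based models is significantly affected by distribution shifts in camera parameters, bone lengths, backgrounds, and occlusions. We tackle this problem through online adaptation, gradually correcting the model bias during testing. There are two main challenges. First, the lack of 3D annotations increases the training difficulty and results in 3D ambiguities. Second, the non-stationary data distribution makes it difficult to strike a balance between fitting regular frames and fitting hard samples with severe occlusions or dramatic changes. To this end, we propose the Dynamic Bilevel Online Adaptation algorithm (DynaBOA). It first introduces temporal constraints to compensate for the unavailable 3D annotations, and leverages a bilevel optimization procedure to address the conflicts between multiple objectives. DynaBOA provides additional 3D guidance by using similar source examples, despite the distribution shift. Furthermore, it can adaptively adjust the number of optimization steps on individual frames to fully fit hard samples and avoid overfitting regular frames. DynaBOA achieves state-of-the-art results on three out-of-domain human mesh reconstruction benchmarks.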
The goal of this paper is to estimate the 6D pose and dimensions of unseen object instances in an RGB-D image. Contrary to "instance-level" 6D pose estimation tasks, our problem assumes that no exact object CAD models are available during either training or testing time. To handle different and unseen object instances in a given category, we introduce Normalized Object Coordinate Space (NOCS), a shared canonical representation for all possible object instances within a category. Our region-based neural network is then trained to directly infer the correspondence from observed pixels to this shared object representation (NOCS) along with other object information such as class label and instance mask. These predictions can be combined with the depth map to jointly estimate the metric 6D pose and dimensions of multiple objects in a cluttered scene. To train our network, we present a new context-aware technique to generate large amounts of fully annotated mixed reality data. To further improve our model and evaluate its performance on real data, we also provide a fully annotated real-world dataset with large environment and instance variation. Extensive experiments demonstrate that the proposed method is able to robustly estimate the pose and size of unseen object instances in real environments while also achieving state-of-the-art performance on standard 6D pose estimation benchmarks.
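The step of combining predicted NOCS coordinates with the depth map can be sketched as a similarity-transform fit between two sets of corresponding 3D points. The abstract does not name the solver, so the least-squares alignment of Umeyama used here is an assumption chosen as one standard option. The function name and the array layout are hypothetical, and outlier handling is omitted.

```python
import numpy as np

def umeyama_similarity(src, dst):
    """Least-squares scale s, rotation R, translation t with dst ~ s * R @ src + t.

    src: (n, 3) predicted canonical coordinates of the object pixels
    dst: (n, 3) the same pixels back-projected to camera space using depth
    """
    mu_src = src.mean(axis=0)
    mu_dst = dst.mean(axis=0)
    src_c = src - mu_src
    dst_c = dst - mu_dst
    cov = dst_c.T @ src_c / len(src)           # 3x3 cross-covariance
    u, d, vt = np.linalg.svd(cov)
    sign = np.eye(3)
    if np.linalg.det(u) * np.linalg.det(vt) < 0:
        sign[2, 2] = -1.0                      # avoid returning a reflection
    rotation = u @ sign @ vt
    var_src = (src_c ** 2).sum() / len(src)    # mean squared deviation of src
    scale = np.trace(np.diag(d) @ sign) / var_src
    translation = mu_dst - scale * rotation @ mu_src
    return scale, rotation, translation
```

Under this reading, the rotation and translation give the 6D pose, while the recovered scale relates the normalized coordinates to metric size, which is why pose and dimensions can be estimated jointly.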
Digital engineering transformation is a crucial process for the engineering paradigm shifts in the fourth industrial revolution (4IR), and artificial intelligence (AI) is a critical enabling technology in digital engineering transformation. This article discusses the following research questions: What are the fundamental changes in the 4IR? More specifically, what are the fundamental changes in engineering? What is digital engineering? What are the main uncertainties there? What is trustworthy AI? Why is it important today? What are emerging engineering paradigm shifts in the 4IR? What is the relationship between the data-intensive paradigm and digital engineering transformation? What should we do for digitalization? From investigating the pattern of industrial revolutions, this article argues that ubiquitous machine intelligence (uMI) is the defining power brought by the 4IR. Digitalization is a condition to leverage ubiquitous machine intelligence. Digital engineering transformation towards Industry 4.0 has three essential building blocks: digitalization of engineering, leveraging ubiquitous machine intelligence, and building digital trust and security. The engineering design community at large is facing an excellent opportunity to bring the new capabilities of ubiquitous machine intelligence and trustworthy AI principles, as well as digital trust, together in various engineering systems design to ensure the trustworthiness of systems in Industry 4.0.