We study the backward-compatibility problem in person re-identification (Re-ID), which aims to constrain the features of an updated new model to be comparable with the existing features of the old model stored in the gallery. Most existing works adopt distillation-based methods, which focus on pushing the new features to imitate the old ones. However, distillation-based methods are inherently sub-optimal because they force the new feature space to imitate the old feature space. To address this issue, we propose Ranking-based Backward-Compatible Learning (RBCL), which directly optimizes a ranking metric between the new features and the old features. Unlike previous methods, RBCL only pushes the new features to find their best-ranking positions in the old feature space instead of strictly aligning with it, which is consistent with the ultimate goal of backward retrieval. However, the sharp sigmoid function used to make the ranking metric differentiable also causes a gradient-vanishing problem, which hinders the refinement of the ranking during later training epochs. To address this, we propose Dynamic Gradient Reactivation (DGR), which reactivates the suppressed gradients by adding a dynamically computed constant in the forward step. To further help target the best-ranking positions, we include Neighbor Context Agents (NCAs) to approximate the entire old feature space during training. Unlike previous works, which are tested only on in-domain settings, we make the first attempt to introduce cross-domain settings (including both supervised and unsupervised), which are more meaningful and more difficult. Experimental results on all five settings show that the proposed RBCL outperforms previous state-of-the-art methods by large margins under all settings.
Recently, domain generalization (DG) person re-identification (ReID) has attracted much attention due to the poor generalization ability of supervised person ReID, and it aims to learn a domain-insensitive model that can resist the influence of domain bias. In this paper, we first verify experimentally that style factors are an important component of domain bias. Based on this conclusion, we propose a Style Variable and Irrelevant Learning (SVIL) method to eliminate the effect of style factors on the model. Specifically, we design a Style Jitter Module (SJM) in SVIL. The SJM module can enrich the style diversity of a specific source domain and reduce the style differences among the various source domains. This leads the model to focus on identity-related information and to be insensitive to style changes. In addition, we organically combine the SJM module with a meta-learning algorithm, maximizing its benefits and further improving the generalization ability of the model. Note that our SJM module is plug-and-play and adds no cost at inference. Extensive experiments confirm the effectiveness of our SVIL, and our method outperforms state-of-the-art methods on DG-ReID benchmarks.
This paper explores a simple and efficient baseline for person re-identification (ReID). Person re-identification (ReID) with deep neural networks has made progress and achieved high performance in recent years. However, many state-of-the-art methods design complex network structures and concatenate multi-branch features. In the literature, some effective training tricks appear only briefly in a few papers or source codes. This paper collects and evaluates these effective training tricks for person ReID. By combining these tricks together, the model achieves 94.5% rank-1 and 85.9% mAP on Market1501 using only global features. Our codes and models are available at https://github.com/michuanhaohao/reid-strong-baseline * Equal contributions. This work was partially done when Hao Luo and Xingyu Liao were interns at Megvii Inc.
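The abstract does not enumerate the tricks, so as an illustration only, the following PyTorch sketch shows one representative trick popularized by this strong-baseline line of work: a BatchNorm "neck" (BNNeck) that separates the embedding used by the triplet loss from the one fed to the identity classifier. The feature dimension and identity count are placeholders, not the paper's exact configuration.

import torch.nn as nn

class BNNeckHead(nn.Module):
    # A BatchNorm "neck" between the backbone feature and the ID classifier:
    # the pre-BN feature feeds the triplet loss, the post-BN logits feed the
    # cross-entropy ID loss, and the post-BN feature is used at inference.
    def __init__(self, feat_dim=2048, num_ids=751):   # placeholder sizes
        super().__init__()
        self.bn = nn.BatchNorm1d(feat_dim)
        self.bn.bias.requires_grad_(False)             # bias commonly frozen in this trick
        self.classifier = nn.Linear(feat_dim, num_ids, bias=False)

    def forward(self, feat):                           # feat: [B, feat_dim] global feature
        feat_bn = self.bn(feat)
        logits = self.classifier(feat_bn)
        return feat, feat_bn, logits

In such a setup, cross-entropy (often with label smoothing) supervises the logits while a triplet loss acts on the pre-BN feature, which is one way the paper's combination of tricks is commonly realized.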
Person re-identification (Re-ID) aims at retrieving a person of interest across multiple non-overlapping cameras. With the advancement of deep neural networks and the increasing demand for intelligent video surveillance, it has gained significantly increased interest in the computer vision community. By dissecting the components involved in developing a person Re-ID system, we categorize it into the closed-world and open-world settings. The widely studied closed-world setting is usually applied under various research-oriented assumptions, and has achieved inspiring success using deep learning techniques on a number of datasets. We first conduct a comprehensive overview with in-depth analysis of closed-world person Re-ID from three different perspectives, including deep feature representation learning, deep metric learning and ranking optimization. With the performance saturation under the closed-world setting, the research focus for person Re-ID has recently shifted to the open-world setting, facing more challenging issues. This setting is closer to practical applications under specific scenarios. We summarize the open-world Re-ID in terms of five different aspects. By analyzing the advantages of existing methods, we design a powerful AGW baseline, achieving state-of-the-art or at least comparable performance on twelve datasets for four different Re-ID tasks. Meanwhile, we introduce a new evaluation metric (mINP) for person Re-ID, indicating the cost of finding all the correct matches, which provides an additional criterion to evaluate the Re-ID system for real applications. Finally, some important yet under-investigated open issues are discussed.
Recent studies show that both explicit deep feature matching and large-scale, diverse training data can significantly improve the generalization of person re-identification. However, the efficiency of learning deep matchers on large-scale data has not been fully studied. Although using classification parameters or a class memory is a popular way, it incurs large memory and computational costs. In contrast, pairwise deep metric learning within mini-batches would be a better choice. However, the most popular random sampling method, the well-known PK sampler, is neither informative nor efficient for deep metric learning. Although online hard example mining improves learning efficiency to a certain extent, the mini-batches obtained after random sampling are still limited. This motivates us to explore the use of hard example mining earlier, at the data sampling stage. To this end, in this paper we propose an efficient cross-batch sampling method, called Graph Sampling (GS), for large-scale deep metric learning. The basic idea is to build a nearest-neighbor relation graph over all classes at the beginning of each epoch. Then, each mini-batch is composed of a randomly selected class and its nearest neighboring classes, so as to provide informative and challenging examples for learning. Together with an adapted competitive baseline, we improve the previous state of the art in generalizable person re-identification significantly, most notably in mAP, by up to 24% and 13.8%. Moreover, the proposed method also outperforms the competitive baseline in Rank-1 and by 5.3% in mAP. Meanwhile, the training time is significantly reduced, by up to five times, e.g., from 12.2 hours to 2.3 hours for training on a large-scale dataset with 8,000 identities. Code is available at https://github.com/shengcailiao/qaconv.
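A rough sketch of the graph-sampling procedure as described (a class-level nearest-neighbour graph rebuilt at the start of each epoch, mini-batches formed from a random anchor class plus its neighbour classes); how class prototypes are computed, the number of neighbours, and the instances per class are assumptions here, not the paper's exact settings.

import random
import torch.nn.functional as F

def build_class_graph(class_feats, k=4):
    # class_feats: [num_classes, d] class prototypes computed at the start of each epoch.
    feats = F.normalize(class_feats, dim=1)
    sim = feats @ feats.t()
    sim.fill_diagonal_(-1)                      # exclude the class itself
    return sim.topk(k, dim=1).indices           # indices of the k nearest-neighbour classes

def sample_graph_batch(neighbors, instances_by_class, per_class=4):
    # One mini-batch = a random anchor class together with its neighbour classes,
    # so every batch contains visually similar (hence informative) identities.
    # instances_by_class: dict mapping class id -> list of image indices.
    anchor = random.randrange(neighbors.size(0))
    classes = [anchor] + neighbors[anchor].tolist()
    batch = []
    for c in classes:
        idx = instances_by_class[c]
        batch += random.sample(idx, min(per_class, len(idx)))
    return batch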
Optimizing an approximation of Average Precision (AP) has been widely studied for image retrieval. Limited by the definition of AP, such methods must consider both negative and positive instances ranked before each positive instance. However, we claim that penalizing only the negative instances ranked before positive ones is sufficient, because the loss comes only from those negative instances. To this end, we propose a novel loss, namely Penalizing Negative instances before Positive ones (PNP), which directly minimizes the number of negative instances ranked before each positive one. In addition, AP-based methods adopt a fixed and sub-optimal gradient assignment strategy. Therefore, we systematically investigate different gradient assignment solutions by constructing the derivative function of the loss, resulting in PNP-I with increasing derivative functions and PNP-D with decreasing ones. PNP-I focuses more on hard positive instances by assigning larger gradients to them and tries to pull all relevant instances closer together. In contrast, PNP-D pays less attention to such instances and corrects them slowly. For most real-world data, a class usually contains several local clusters; PNP-I blindly gathers these clusters, while PNP-D keeps them as they are, which makes PNP-D superior. Experiments on three standard retrieval datasets show results consistent with the above analysis, and extensive evaluations demonstrate that PNP-D achieves state-of-the-art performance. Code is available at https://github.com/interestingzhuo/pnp_loss
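The quantity PNP minimizes, as described, is the number of negatives ranked above each positive. Below is a hedged PyTorch sketch using a sigmoid surrogate for the hard indicator; the temperature and the simple mean reduction are assumptions, and the PNP-I/PNP-D variants would only change how this count is re-weighted through the outer derivative function.

import torch

def pnp_loss(sim, labels, tau=0.05):
    # sim: [B, B] similarity matrix of a batch; labels: [B] class ids.
    pos = labels.unsqueeze(0) == labels.unsqueeze(1)
    neg = ~pos
    # diff[i, j, k] = sim(i, k) - sim(i, j): positive when candidate k outranks
    # candidate j for query i.
    diff = sim.unsqueeze(1) - sim.unsqueeze(2)
    soft_above = torch.sigmoid(diff / tau)
    # Soft count of *negatives* ranked above each (query i, positive j) pair.
    n_neg_above = (soft_above * neg.unsqueeze(1).float()).sum(dim=2)
    valid_pos = pos & ~torch.eye(len(sim), dtype=torch.bool, device=sim.device)
    return n_neg_above[valid_pos].mean()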
In this paper, we are interested in learning a generalizable person re-identification (re-ID) representation from unlabeled videos. Compared with 1) the popular unsupervised re-ID setting where the training and test sets are typically under the same domain, and 2) the popular domain generalization (DG) re-ID setting where the training samples are labeled, our novel scenario combines their key challenges: the training samples are unlabeled, and are collected from various domains which do not align with the test domain. In other words, we aim to learn a representation in an unsupervised manner and directly use the learned representation for re-ID in novel domains. To fulfill this goal, we make two main contributions: First, we propose Cycle Association (CycAs), a scalable self-supervised learning method for re-ID with low training complexity; and second, we construct a large-scale unlabeled re-ID dataset named LMP-video, tailored for the proposed method. Specifically, CycAs learns re-ID features by enforcing cycle consistency of instance association between temporally successive video frame pairs, and the training cost is merely linear in the data size, making large-scale training possible. On the other hand, the LMP-video dataset is extremely large, containing 50 million unlabeled person images cropped from over 10K Youtube videos, and is therefore sufficient to serve as fertile soil for self-supervised learning. Trained on LMP-video, we show that CycAs learns good generalization towards novel domains. The achieved results sometimes even outperform supervised domain generalizable models. Remarkably, CycAs achieves 82.2% Rank-1 on Market-1501 and 49.0% Rank-1 on MSMT17 with zero human annotation, surpassing state-of-the-art supervised DG re-ID methods. Moreover, we also demonstrate the superiority of CycAs under the canonical unsupervised re-ID and the pretrain-and-finetune scenarios.
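A simplified sketch of the cycle-association idea as it is described here: person embeddings from two temporally close frames are softly associated in both directions, and the round trip is required to return to the starting instance. The softmax temperature and the exact loss form are assumptions.

import torch
import torch.nn.functional as F

def cycle_association_loss(feat_a, feat_b, tau=0.1):
    # feat_a: [N, d], feat_b: [M, d] L2-normalized embeddings of person crops from
    # two temporally close frames of the same video (no labels needed).
    sim = feat_a @ feat_b.t() / tau
    a2b = F.softmax(sim, dim=1)            # soft assignment frame A -> frame B
    b2a = F.softmax(sim.t(), dim=1)        # soft assignment frame B -> frame A
    cycle = a2b @ b2a                      # round trip A -> B -> A; ideally the identity
    target = torch.arange(feat_a.size(0), device=feat_a.device)
    return F.nll_loss(torch.log(cycle + 1e-8), target)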
Recently, many methods tackle the unsupervised domain adaptive person re-identification (UDA re-ID) problem via pseudo-label-based contrastive learning. During training, a uni-centroid representation is obtained by simply averaging all the instance features from a cluster with the same pseudo label. However, a cluster may contain images with different identities (label noise) due to imperfect clustering results, which makes the uni-centroid representation inappropriate. In this paper, we present a novel Multi-Centroid Memory (MCM) to adaptively capture the different identity information within a cluster. MCM can effectively alleviate the label-noise problem by selecting proper positive/negative centroids for a query image. Moreover, we further propose two strategies to improve the contrastive learning process. First, we present a Domain-Specific Contrastive Learning (DSCL) mechanism to fully explore intra-domain information by comparing samples only from the same domain. Second, we propose Second-Order Nearest Interpolation (SONI) to obtain rich and informative negative samples. We integrate MCM, DSCL and SONI into a unified framework named Multi-Centroid Representation Network (MCRN). Extensive experiments demonstrate the superiority of MCRN over state-of-the-art approaches on multiple UDA re-ID tasks and the fully unsupervised re-ID task.
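A hedged sketch of the multi-centroid idea as described: each pseudo-label cluster keeps several centroids, and for a query the closest same-cluster centroid serves as the positive while centroids of other clusters serve as negatives. How the sub-centroids are produced (e.g., by sub-clustering each cluster) and this particular selection rule are assumptions, not the paper's exact design.

import torch
import torch.nn.functional as F

def multi_centroid_loss(query, label, centroids, centroid_labels, tau=0.05):
    # query: [d] normalized feature; centroids: [M, d] normalized sub-centroids
    # (several per pseudo-label cluster); centroid_labels: [M] cluster id of each.
    sim = centroids @ query / tau                    # [M] similarities to all sub-centroids
    pos_mask = centroid_labels == label
    pos_sim = sim[pos_mask].max()                    # closest same-cluster centroid as positive
    neg_sim = sim[~pos_mask]                         # centroids of other clusters as negatives
    logits = torch.cat([pos_sim.view(1), neg_sim]).view(1, -1)
    target = torch.zeros(1, dtype=torch.long, device=query.device)
    return F.cross_entropy(logits, target)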
Unsupervised person re-identification (re-ID) has attracted increasing research interest due to its scalability and potential for real-world applications. State-of-the-art unsupervised re-ID methods usually follow a clustering-based strategy, which generates pseudo labels by clustering and maintains a memory to store instance features and to represent the cluster centroids for contrastive learning. This approach suffers from two problems. First, the centroid produced by unsupervised learning may not be a perfect prototype; forcing images to get closer to the centroid emphasizes the result of clustering, which may accumulate clustering errors during iterations. Second, previous methods use features obtained at different training iterations to represent one centroid, which is inconsistent with the current training sample, since these features are not directly comparable. To this end, we propose an unsupervised re-ID approach with a stochastic learning strategy. Specifically, we adopt a stochastically updated memory, in which a random instance from a cluster is used to update the cluster-level memory for contrastive learning. In this way, the relationships between randomly selected pairs of images are learned, avoiding the training bias caused by unreliable pseudo labels. The stochastic memory is also always up-to-date, which keeps it consistent. Besides, to alleviate the problem of camera variance, a unified distance matrix is proposed for the clustering process, in which the distance bias across different camera domains is reduced and the variance among identities is emphasized.
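A minimal sketch of a stochastically updated cluster memory matching the description: the memory slot of each cluster present in a batch is refreshed with the feature of one randomly chosen instance from that cluster, rather than with a running average of all its members. The plain replacement (no momentum) and the InfoNCE-style loss are assumptions.

import torch
import torch.nn.functional as F

class StochasticClusterMemory:
    def __init__(self, num_clusters, dim):
        self.mem = F.normalize(torch.randn(num_clusters, dim), dim=1)

    @torch.no_grad()
    def update(self, feats, pseudo_labels):
        # Overwrite each cluster's memory slot with one randomly chosen instance
        # from that cluster (instead of a running average of all members).
        for c in pseudo_labels.unique():
            idx = (pseudo_labels == c).nonzero(as_tuple=True)[0]
            pick = idx[torch.randint(len(idx), (1,))].item()
            self.mem[c] = F.normalize(feats[pick], dim=0)

    def loss(self, feats, pseudo_labels, tau=0.05):
        # InfoNCE-style contrast of each sample against all cluster-level slots.
        logits = feats @ self.mem.t() / tau
        return F.cross_entropy(logits, pseudo_labels)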
Most deep metric learning (DML) methods employ a strategy that forces all positive samples to be close in the embedding space while keeping them away from negative ones. However, such a strategy ignores the internal relationships among positive (negative) samples and often leads to overfitting, especially in the presence of hard samples and mislabeled samples. In this work, we propose a simple yet effective regularization, namely Listwise Self-Distillation (LSD), which progressively distills the model's own knowledge to adaptively assign a more appropriate distance target to each sample pair within a batch. LSD encourages smoother embeddings and more informative mining within positive (negative) samples, so as to mitigate overfitting and thus improve generalization. Our LSD can be directly integrated into general DML frameworks. Extensive experiments show that LSD consistently boosts the performance of various metric learning methods on multiple datasets.
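The abstract leaves the exact distillation target open, so the following is only one plausible instantiation under stated assumptions: an EMA (teacher) copy of the model provides listwise similarity distributions over the batch, and a KL term pulls the student's distributions toward them, softening the usual binary positive/negative targets.

import torch
import torch.nn.functional as F

def listwise_self_distillation(student_emb, teacher_emb, tau_s=0.1, tau_t=0.05):
    # Row-wise similarity distributions over the other samples in the batch.
    s = F.normalize(student_emb, dim=1) @ F.normalize(student_emb, dim=1).t()
    t = F.normalize(teacher_emb, dim=1) @ F.normalize(teacher_emb, dim=1).t()
    off_diag = ~torch.eye(s.size(0), dtype=torch.bool, device=s.device)
    s = s[off_diag].view(s.size(0), -1) / tau_s
    t = t[off_diag].view(t.size(0), -1) / tau_t
    # The teacher's listwise ranking acts as a soft target for the student;
    # this term would be added to the base metric learning loss.
    return F.kl_div(F.log_softmax(s, dim=1), F.softmax(t, dim=1), reduction='batchmean')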
In image retrieval, standard evaluation metrics rely on score ranking, e.g., Average Precision (AP). In this paper, we introduce a method for Robust And Decomposable Average Precision (ROADMAP), addressing two major challenges of end-to-end training of deep neural networks with AP: non-differentiability and non-decomposability. First, we propose a new differentiable approximation of the rank function, which provides an upper bound on the AP loss and ensures robust training. Second, we design a simple yet effective loss function to reduce the decomposability gap between the AP over the whole training set and its batch-averaged approximation, for which we provide theoretical guarantees. Extensive experiments on three image retrieval datasets show that ROADMAP outperforms several recent AP approximation methods and highlight the importance of our two contributions. Finally, using ROADMAP for training, deep models yield very good performance, with state-of-the-art results on the three datasets.
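For context, the standard smooth-rank surrogate underlying this family of AP losses replaces the hard comparison inside the rank with a sigmoid; the sketch below shows only that generic surrogate (ROADMAP's specific upper-bound approximation and its decomposability term are not reproduced here).

import torch

def smooth_ap(scores, labels, tau=0.01):
    # scores: [N] retrieval scores for one query; labels: [N] binary relevance (1 = match).
    diff = scores.unsqueeze(0) - scores.unsqueeze(1)     # diff[i, j] = s_j - s_i
    sig = torch.sigmoid(diff / tau)                      # soft indicator that item j outranks item i
    pos = labels.bool()
    rank_all = 1 + sig.sum(dim=1) - sig.diagonal()       # smooth rank of each item among all items
    rank_pos = 1 + (sig * pos.unsqueeze(0).float()).sum(dim=1) - sig.diagonal()
    ap = (rank_pos[pos] / rank_all[pos]).mean()          # smooth AP for this query
    return 1 - ap                                        # loss to minimize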
Person re-identification (re-ID) in scenarios with large spatial and temporal spans has not been fully explored. This is partly because existing benchmark datasets were mainly collected with limited spatial and temporal ranges, e.g., using videos recorded by cameras in a specific region of a campus. Such limited spatial and temporal ranges make it hard to simulate the difficulties of person re-ID in real scenarios. In this work, we contribute a novel Large-scale Spatio-Temporal (LaST) person re-ID dataset, including 10,862 identities with over 228k images. Compared with existing datasets, LaST presents a challenging and highly diverse re-ID setting, with significantly larger spatial and temporal ranges. For instance, each person can appear in different cities or countries, in various time slots from daytime to night, and in different seasons from spring to winter. To the best of our knowledge, LaST is a novel person re-ID dataset with the largest spatio-temporal ranges. Based on LaST, we verify its challenge by conducting a comprehensive performance evaluation of 14 re-ID algorithms. We further propose an easy-to-implement baseline that works well on such challenging re-ID settings. We also verify that models pre-trained on LaST can generalize well on existing datasets with short-term and clothes-changing scenarios. We expect LaST to continue inspiring future works toward more realistic and challenging re-ID tasks. More information about the dataset is available at https://github.com/shuxjweb/last.git.
Recently, unsupervised person re-identification (re-ID) has attracted increasing attention due to its open-world scenario setting, in which only limited annotated data are available. Existing supervised methods often fail to generalize well on unseen domains, while the unsupervised methods, most of which lack multi-granularity information, are prone to suffer from confirmation bias. In this paper, we aim to find better feature representations on the unseen target domain from two aspects: 1) performing unsupervised domain adaptation on the labeled source domain and 2) mining potential similarities on the unlabeled target domain. Besides, a collaborative pseudo-labeling strategy is proposed to alleviate the influence of confirmation bias. First, a generative adversarial network is utilized to transfer images from the source domain to the target domain. Moreover, person identity and identity mapping losses are introduced to improve the quality of the generated images. Second, we propose a novel Collaborative Multiple-feature Clustering framework (CMFC) to learn the internal data structure of the target domain, including a global feature branch and a partial feature branch. The Global feature Branch (GB) employs unsupervised clustering on the global features of person images, while the Partial feature Branch (PB) mines similarities within different body regions. Finally, extensive experiments on two benchmark datasets show the competitive performance of our method under the unsupervised person re-ID setting.
This paper addresses the zero-shot sketch-based image retrieval (ZS-SBIR) problem from the viewpoint of cross-modality metric learning. This task has two characteristics: 1) the zero-shot setting requires a metric space with good within-class compactness and between-class discrepancy for recognizing novel classes, and 2) the sketch queries and the photo gallery are in different modalities. The metric learning viewpoint benefits ZS-SBIR from two aspects. First, it facilitates improvement through recent good practices in deep metric learning (DML). By combining two fundamental learning approaches in DML, namely classification training and pairwise training, we set up a strong baseline for ZS-SBIR. Without bells and whistles, this baseline achieves competitive retrieval accuracy. Second, it provides the insight that properly suppressing the modality gap is critical. To this end, we design a novel method named Modality-Aware Triplet Hard Mining (MATHM). MATHM enhances the baseline with three types of pairwise learning, e.g., a cross-modality sample pair, a within-modality sample pair, and their combination. We also design an adaptive weighting method to balance these three components dynamically during training. Experimental results confirm that MATHM brings another round of significant improvement on top of the strong baseline and establishes new state-of-the-art performance. For example, on the TU-Berlin dataset, we achieve 47.88+2.94% mAP@all and 58.28+2.34% Prec@100. Code will be publicly available at: https://github.com/huangzongheng/mathm.
Person re-identification is a challenging task because of the high intra-class variance induced by the unrestricted nuisance factors of variations such as pose, illumination, viewpoint, background, and sensor noise. Recent approaches postulate that powerful architectures have the capacity to learn feature representations invariant to nuisance factors, by training them with losses that minimize intra-class variance and maximize inter-class separation, without modeling nuisance factors explicitly. The dominant approaches use either a discriminative loss with margin, like the softmax loss with the additive angular margin, or a metric learning loss, like the triplet loss with batch hard mining of triplets. Since the softmax imposes feature normalization, it limits the gradient flow supervising the feature embedding. We address this by joining the losses and leveraging the triplet loss as a proxy for the missing gradients. We further improve invariance to nuisance factors by adding the discriminative task of predicting attributes. Our extensive evaluation highlights that when only a holistic representation is learned, we consistently outperform the state-of-the-art on the three most challenging datasets. Such representations are easier to deploy in practical systems. Finally, we found that joining the losses removes the requirement for having a margin in the softmax loss while increasing performance.
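A hedged sketch of the joint objective described above: a softmax with an additive angular margin supervises the normalized features, and a batch-hard triplet loss on the raw embedding acts as a proxy for the gradient flow that feature normalization suppresses. The scale, margins, and loss weight are assumed values, not the paper's exact settings.

import torch
import torch.nn.functional as F

def arc_margin_logits(emb, weight, labels, s=30.0, m=0.5):
    # Additive angular margin on the target class (simplified ArcFace-style head).
    cos = F.normalize(emb, dim=1) @ F.normalize(weight, dim=1).t()
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
    margin = m * F.one_hot(labels, weight.size(0)).float()
    return s * torch.cos(theta + margin)

def joint_loss(emb, weight, labels, tri_margin=0.3, w_tri=1.0):
    # Margin softmax on normalized features + batch-hard triplet on raw embeddings.
    id_loss = F.cross_entropy(arc_margin_logits(emb, weight, labels), labels)
    dist = torch.cdist(emb, emb)
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    hardest_pos = (dist * same.float()).max(dim=1).values          # farthest same-ID sample
    hardest_neg = (dist + 1e6 * same.float()).min(dim=1).values    # closest other-ID sample
    tri_loss = F.relu(hardest_pos - hardest_neg + tri_margin).mean()
    return id_loss + w_tri * tri_loss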
Person re-identification (ReID) aims to match pedestrians across different cameras. Existing ReID methods that adopt real-valued feature descriptors have achieved high accuracy, but their efficiency is low due to the slow Euclidean distance computation as well as complex fast-search algorithms. Recently, some works have proposed to yield binary encoded person descriptors, which require only fast Hamming distance computation and simple counting algorithms. However, the performance of such binary-encoded descriptors, especially with short codes (e.g., 32 and 64 bits), is hardly satisfactory given the sparse binary space. To strike a balance between model accuracy and efficiency, we propose a novel Sub-space Consistency Regularization (SCR) algorithm, which can speed up the ReID procedure by 0.25 times compared with real-valued features under the same dimensions, while maintaining competitive accuracy, especially under short codes. SCR transforms a real-valued feature vector (e.g., 2048-dimensional float32) into a short binary code (e.g., 64 bits) by first dividing the real-valued feature vector into M sub-spaces, each quantized to its nearest clustered centroid. Thus, the distance between two samples can be expressed as the summation of the respective distances to the centroids, which can be accelerated by offline computation and maintained via a look-up table. On the other hand, these real-valued centroids help to achieve significantly higher accuracy than using binary codes directly. Finally, we convert the distance look-up table into integers and apply a counting algorithm to speed up the ranking stage. We also propose a novel consistency regularization with an iterative framework. Experimental results on Market-1501 and DukeMTMC-reID show promising and exciting results: under short codes, our proposed SCR enjoys real-value-level accuracy and hashing-level speed.
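The encoding and distance mechanism described (divide the feature into M sub-spaces, represent each part by its nearest real-valued centroid, and rank with a per-query look-up table of sub-space distances) can be sketched as follows; the centroid training, the integer quantization of the table, and the counting-based ranking are omitted, and the shapes are assumptions.

import numpy as np

def encode(x, codebooks):
    # x: [d] real-valued feature; codebooks: [M, C, d/M] per-subspace centroids.
    M, C, sub = codebooks.shape
    parts = x.reshape(M, sub)
    # Code = index of the nearest centroid in each sub-space (M small integers).
    return np.array([np.argmin(np.linalg.norm(codebooks[m] - parts[m], axis=1)) for m in range(M)])

def query_distance(q, code, codebooks):
    # Asymmetric distance: real-valued query vs. encoded gallery item, summed over
    # sub-spaces; the per-subspace terms form a [M, C] look-up table computed once
    # per query and then only indexed while ranking the whole gallery.
    M, C, sub = codebooks.shape
    q_parts = q.reshape(M, sub)
    lut = np.linalg.norm(codebooks - q_parts[:, None, :], axis=2) ** 2   # [M, C]
    return lut[np.arange(M), code].sum()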
Lifelong person re-identification (LReID) is in significant demand for real-world development as a large amount of ReID data is captured from diverse locations over time and cannot be accessed at once inherently. However, a key challenge for LReID is how to incrementally preserve old knowledge and gradually add new capabilities to the system. Unlike most existing LReID methods, which mainly focus on dealing with catastrophic forgetting, our focus is on a more challenging problem, which is, not only trying to reduce the forgetting on old tasks but also aiming to improve the model performance on both new and old tasks during the lifelong learning process. Inspired by the biological process of human cognition where the somatosensory neocortex and the hippocampus work together in memory consolidation, we formulated a model called Knowledge Refreshing and Consolidation (KRC) that achieves both positive forward and backward transfer. More specifically, a knowledge refreshing scheme is incorporated with the knowledge rehearsal mechanism to enable bi-directional knowledge transfer by introducing a dynamic memory model and an adaptive working model. Moreover, a knowledge consolidation scheme operating on the dual space further improves model stability over the long term. Extensive evaluations show KRC's superiority over the state-of-the-art LReID methods on challenging pedestrian benchmarks.
Person re-identification (re-ID) is a crucial technique in video surveillance systems and has achieved significant success in the supervised setting. However, it is difficult to directly apply a supervised model to arbitrary unseen domains due to the domain gap between the available source domains and the unseen target domains. In this paper, we propose a novel Label Distribution Learning (LDL) method to tackle the generalizable multi-source person re-ID task (i.e., multiple source domains are available and the target domain is unseen during training), which aims to explore the relations among different classes and mitigate the domain shift across different domains, so as to improve the discrimination ability of the model and learn domain-invariant features at the same time. Specifically, during the training process, we produce the label distribution in an online manner to mine the relational information among different classes, which is beneficial for extracting discriminative features. Moreover, for the label distribution of each class, we further revise it to give more and equal attention to the other domains that the class does not belong to, which can effectively reduce the domain gap across different domains and obtain domain-invariant features. Furthermore, we also provide a theoretical analysis to demonstrate that the proposed method can effectively handle the domain-shift issue. Extensive experiments on multiple benchmark datasets validate the effectiveness of the proposed method and show that it can outperform state-of-the-art methods. Besides, further analysis also reveals the superiority of the proposed method.
Recent methods for deep metric learning have been focusing on designing different contrastive loss functions between positive and negative pairs of samples so that the learned feature embedding is able to pull positive samples of the same class closer and push negative samples from different classes away from each other. In this work, we recognize that there is a significant semantic gap between features at the intermediate feature layer and class labels at the final output layer. To bridge this gap, we develop a contrastive Bayesian analysis to characterize and model the posterior probabilities of image labels conditioned on their feature similarity in a contrastive learning setting. This contrastive Bayesian analysis leads to a new loss function for deep metric learning. To improve the generalization capability of the proposed method to new classes, we further extend the contrastive Bayesian loss with a metric variance constraint. Our experimental results and ablation studies demonstrate that the proposed contrastive Bayesian metric learning method significantly improves the performance of deep metric learning in both supervised and pseudo-supervised scenarios, outperforming existing methods by a large margin.
Recent years have witnessed the breakthrough of face recognition (FR) with deep convolutional neural networks. Dozens of papers in the field of FR are published every year. Some of them have been applied in industry and play an important role in everyday life, such as device unlocking, mobile payment, and so on. This paper provides an introduction to face recognition, including its history, pipeline, algorithms based on conventional manually designed features or deep learning, mainstream training and evaluation datasets, and related applications. We have analyzed and compared as many state-of-the-art works as possible, and have also carefully designed a set of experiments to study the effect of backbone size and data distribution. This survey serves as material for the tutorial named The Practical Face Recognition Technology in the Industrial World at FG2023.