已显示现有的面部分析系统对某些人口统计亚组产生偏见的结果。由于其对社会的影响,因此必须确保这些系统不会根据个人的性别,身份或肤色歧视。这导致了在AI系统中识别和减轻偏差的研究。在本文中,我们封装了面部分析的偏置检测/估计和缓解算法。我们的主要贡献包括对拟议理解偏见的算法的系统审查,以及分类和广泛概述现有的偏置缓解算法。我们还讨论了偏见面部分析领域的开放挑战。
translated by 谷歌翻译
The presence of bias in deep models leads to unfair outcomes for certain demographic subgroups. Research in bias focuses primarily on facial recognition and attribute prediction with scarce emphasis on face detection. Existing studies consider face detection as binary classification into 'face' and 'non-face' classes. In this work, we investigate possible bias in the domain of face detection through facial region localization which is currently unexplored. Since facial region localization is an essential task for all face recognition pipelines, it is imperative to analyze the presence of such bias in popular deep models. Most existing face detection datasets lack suitable annotation for such analysis. Therefore, we web-curate the Fair Face Localization with Attributes (F2LA) dataset and manually annotate more than 10 attributes per face, including facial localization information. Utilizing the extensive annotations from F2LA, an experimental setup is designed to study the performance of four pre-trained face detectors. We observe (i) a high disparity in detection accuracies across gender and skin-tone, and (ii) interplay of confounding factors beyond demography. The F2LA data and associated annotations can be accessed at http://iab-rubric.org/index.php/F2LA.
translated by 谷歌翻译
深度神经网络在人类分析中已经普遍存在,增强了应用的性能,例如生物识别识别,动作识别以及人重新识别。但是,此类网络的性能通过可用的培训数据缩放。在人类分析中,对大规模数据集的需求构成了严重的挑战,因为数据收集乏味,廉价,昂贵,并且必须遵守数据保护法。当前的研究研究了\ textit {合成数据}的生成,作为在现场收集真实数据的有效且具有隐私性的替代方案。这项调查介绍了基本定义和方法,在生成和采用合成数据进行人类分析时必不可少。我们进行了一项调查,总结了当前的最新方法以及使用合成数据的主要好处。我们还提供了公开可用的合成数据集和生成模型的概述。最后,我们讨论了该领域的局限性以及开放研究问题。这项调查旨在为人类分析领域的研究人员和从业人员提供。
translated by 谷歌翻译
当前用于面部识别的模型(FR)中存在人口偏见。我们在野外(BFW)数据集中平衡的面孔是衡量种族和性别亚组偏见的代理,使一个人可以表征每个亚组的FR表现。当单个分数阈值确定样本对是真实还是冒名顶替者时,我们显示的结果是非最佳选择的。在亚组中,性能通常与全球平均水平有很大差异。因此,仅适用于与验证数据相匹配的人群的特定错误率。我们使用新的域适应性学习方案来减轻性能不平衡,以使用最先进的神经网络提取的面部特征。该技术平衡了性能,但也可以提高整体性能。该建议的好处是在面部特征中保留身份信息,同时减少其所包含的人口统计信息。人口统计学知识的去除阻止了潜在的未来偏见被注入决策。由于对个人的可用信息或推断,因此此删除可改善隐私。我们定性地探索这一点;我们还定量地表明,亚组分类器不再从提出的域适应方案的特征中学习。有关源代码和数据描述,请参见https://github.com/visionjo/facerec-bias-bfw。
translated by 谷歌翻译
The emergence of COVID-19 has had a global and profound impact, not only on society as a whole, but also on the lives of individuals. Various prevention measures were introduced around the world to limit the transmission of the disease, including face masks, mandates for social distancing and regular disinfection in public spaces, and the use of screening applications. These developments also triggered the need for novel and improved computer vision techniques capable of (i) providing support to the prevention measures through an automated analysis of visual data, on the one hand, and (ii) facilitating normal operation of existing vision-based services, such as biometric authentication schemes, on the other. Especially important here, are computer vision techniques that focus on the analysis of people and faces in visual data and have been affected the most by the partial occlusions introduced by the mandates for facial masks. Such computer vision based human analysis techniques include face and face-mask detection approaches, face recognition techniques, crowd counting solutions, age and expression estimation procedures, models for detecting face-hand interactions and many others, and have seen considerable attention over recent years. The goal of this survey is to provide an introduction to the problems induced by COVID-19 into such research and to present a comprehensive review of the work done in the computer vision based human analysis field. Particular attention is paid to the impact of facial masks on the performance of various methods and recent solutions to mitigate this problem. Additionally, a detailed review of existing datasets useful for the development and evaluation of methods for COVID-19 related applications is also provided. Finally, to help advance the field further, a discussion on the main open challenges and future research direction is given.
translated by 谷歌翻译
长期以来,面部识别一直是人工智能领域的一个积极研究领域,尤其是自近年来深度学习的兴起以来。在某些实际情况下,每个身份只有一个可以培训的样本。在这种情况下的面部识别被称为单个样本识别,并对深层模型的有效培训构成了重大挑战。因此,近年来,研究人员试图释放更多的深度学习潜力,并在单个样本情况下提高模型识别性能。尽管已经对传统的单个样本面部识别方法进行了几项全面的调查,但这些评论很少涉及新兴的基于深度学习的方法。因此,我们将重点放在本文中的基于深度学习的方法上,将其分类为虚拟示例方法和通用学习方法。在前一种类别中,生成虚拟图像或虚拟特征以使深层模型的训练受益。在后者中,使用了其他多样本通用集。通用学习方法有三种类型:结合传统方法和深度特征,改善损失功能并改善网络结构,所有这些都涵盖了我们的分析。此外,我们回顾了通常用于评估单个样本面部识别模型的面部数据集,并继续比较不同类型的模型的结果。此外,我们讨论了现有的单个样本面部识别方法的问题,包括虚拟样本方法中的身份信息保存,通用学习方法中的域适应性。此外,我们认为开发无监督的方法是一个有希望的未来方向,并指出语义差距是需要进一步考虑的重要问题。
translated by 谷歌翻译
公平性是一个标准,重点是评估不同人口组的算法性能,它引起了自然语言处理,推荐系统和面部识别的关注。由于医学图像样本中有很多人口统计学属性,因此了解公平的概念,熟悉不公平的缓解技术,评估算法的公平程度并认识到医疗图像分析(媒体)中的公平问题中的挑战很重要。在本文中,我们首先给出了公平性的全面和精确的定义,然后通过在媒体中引入当前使用的技术中使用的技术。之后,我们列出了包含人口统计属性的公共医疗图像数据集,以促进公平研究并总结有关媒体公平性的当前算法。为了帮助更好地理解公平性,并引起人们对媒体中与公平性有关的问题的关注,进行了实验,比较公平性和数据失衡之间的差异,验证各种媒体任务中不公平的存在,尤其是在分类,细分和检测以及评估不公平缓解算法的有效性。最后,我们以媒体公平性的机会和挑战得出结论。
translated by 谷歌翻译
在本文中,我们提出了一个新颖的解释性框架,旨在更好地理解面部识别模型作为基本数据特征的表现(受保护的属性:性别,种族,年龄;非保护属性:面部毛发,化妆品,配件,脸部,面部,面部,面部,面部,面部,它们被测试的变化的方向和阻塞,图像失真,情绪)。通过我们的框架,我们评估了十种最先进的面部识别模型,并在两个数据集上的安全性和可用性方面进行了比较,涉及基于性别和种族的六个小组。然后,我们分析图像特征对模型性能的影响。我们的结果表明,当考虑多归因组时,单属分析中出现的趋势消失或逆转,并且性能差异也与非保护属性有关。源代码:https://cutt.ly/2xwrlia。
translated by 谷歌翻译
Despite being responsible for state-of-the-art results in several computer vision and natural language processing tasks, neural networks have faced harsh criticism due to some of their current shortcomings. One of them is that neural networks are correlation machines prone to model biases within the data instead of focusing on actual useful causal relationships. This problem is particularly serious in application domains affected by aspects such as race, gender, and age. To prevent models from incurring on unfair decision-making, the AI community has concentrated efforts in correcting algorithmic biases, giving rise to the research area now widely known as fairness in AI. In this survey paper, we provide an in-depth overview of the main debiasing methods for fairness-aware neural networks in the context of vision and language research. We propose a novel taxonomy to better organize the literature on debiasing methods for fairness, and we discuss the current challenges, trends, and important future work directions for the interested researcher and practitioner.
translated by 谷歌翻译
计算机视觉(CV)取得了显着的结果,在几个任务中表现优于人类。尽管如此,如果不正确处理,可能会导致重大歧视,因为CV系统高度依赖于他们所用的数据,并且可以在此类数据中学习和扩大偏见。因此,理解和发现偏见的问题至关重要。但是,没有关于视觉数据集中偏见的全面调查。因此,这项工作的目的是:i)描述可能在视觉数据集中表现出来的偏差; ii)回顾有关视觉数据集中偏置发现和量化方法的文献; iii)讨论现有的尝试收集偏见视觉数据集的尝试。我们研究的一个关键结论是,视觉数据集中发现和量化的问题仍然是开放的,并且在方法和可以解决的偏见范围方面都有改进的余地。此外,没有无偏见的数据集之类的东西,因此科学家和从业者必须意识到其数据集中的偏见并使它们明确。为此,我们提出了一个清单,以在Visual DataSet收集过程中发现不同类型的偏差。
translated by 谷歌翻译
由于视觉识别的社会影响一直受到审查,因此出现了几个受保护的平衡数据集,以解决不平衡数据集中的数据集偏差。但是,在面部属性分类中,数据集偏置既源于受保护的属性级别和面部属性级别,这使得构建多属性级别平衡的真实数据集使其具有挑战性。为了弥合差距,我们提出了一条有效的管道,以产生具有所需面部属性的高质量和足够的面部图像,并将原始数据集补充为两个级别的平衡数据集,从理论上讲,这在理论上满足了几个公平标准。我们方法的有效性在性别分类和面部属性分类方面得到了验证,通过将可比的任务性能作为原始数据集,并通过广泛的度量标准进行全面的公平评估,并进一步提高公平性。此外,我们的方法优于重采样和平衡的数据集构建来解决数据集偏差,以及解决任务偏置的模型模型。
translated by 谷歌翻译
Recent studies demonstrate that machine learning algorithms can discriminate based on classes like race and gender. In this work, we present an approach to evaluate bias present in automated facial analysis algorithms and datasets with respect to phenotypic subgroups. Using the dermatologist approved Fitzpatrick Skin Type classification system, we characterize the gender and skin type distribution of two facial analysis benchmarks, IJB-A and Adience. We find that these datasets are overwhelmingly composed of lighter-skinned subjects (79.6% for IJB-A and 86.2% for Adience) and introduce a new facial analysis dataset which is balanced by gender and skin type. We evaluate 3 commercial gender classification systems using our dataset and show that darker-skinned females are the most misclassified group (with error rates of up to 34.7%). The maximum error rate for lighter-skinned males is 0.8%. The substantial disparities in the accuracy of classifying darker females, lighter females, darker males, and lighter males in gender classification systems require urgent attention if commercial companies are to build genuinely fair, transparent and accountable facial analysis algorithms.
translated by 谷歌翻译
数据和算法中固有的偏见使得基于广泛的机器学习(ML)的决策系统的公平性不如最佳。为了提高此类ML决策系统的信任性,至关重要的是要意识到这些解决方案中的固有偏见并使它们对公众和开发商更加透明。在这项工作中,我们旨在提供一组解释性工具,以分析处理不同人口组时面部识别模型行为的差异。我们通过利用基于激活图的高阶统计信息来构建解释性工具来做到这一点,以将FR模型的行为差异与某些面部区域联系起来。与参考组相比,在两个数据集和两个面部识别模型上的实验结果指出了FR模型对某些人口组的反应不同。这些分析的结果有趣地与分析人体测量学差异和人类判断差异的研究结果非常相吻合。因此,这是第一项专门解释不同人口组上FR模型的偏见行为并将其直接链接到空间面部特征的研究。该代码在这里公开可用。
translated by 谷歌翻译
Although significant progress has been made in face recognition, demographic bias still exists in face recognition systems. For instance, it usually happens that the face recognition performance for a certain demographic group is lower than the others. In this paper, we propose MixFairFace framework to improve the fairness in face recognition models. First of all, we argue that the commonly used attribute-based fairness metric is not appropriate for face recognition. A face recognition system can only be considered fair while every person has a close performance. Hence, we propose a new evaluation protocol to fairly evaluate the fairness performance of different approaches. Different from previous approaches that require sensitive attribute labels such as race and gender for reducing the demographic bias, we aim at addressing the identity bias in face representation, i.e., the performance inconsistency between different identities, without the need for sensitive attribute labels. To this end, we propose MixFair Adapter to determine and reduce the identity bias of training samples. Our extensive experiments demonstrate that our MixFairFace approach achieves state-of-the-art fairness performance on all benchmark datasets.
translated by 谷歌翻译
媒体报道指责人们对“偏见”',“”性别歧视“和”种族主义“的人士指责。研究文献中有共识,面部识别准确性为女性较低,妇女通常具有更高的假匹配率和更高的假非匹配率。然而,几乎没有出版的研究,旨在识别女性准确性较低的原因。例如,2019年的面部识别供应商测试将在广泛的算法和数据集中记录较低的女性准确性,并且数据集也列出了“分析原因和效果”在“我们没有做的东西”下''。我们介绍了第一个实验分析,以确定在去以前研究的数据集上对女性的较低人脸识别准确性的主要原因。在测试图像中控制相等的可见面部可见面积减轻了女性的表观更高的假非匹配率。其他分析表明,化妆平衡数据集进一步改善了女性以实现较低的虚假非匹配率。最后,聚类实验表明,两种不同女性的图像本质上比两种不同的男性更相似,潜在地占错误匹配速率的差异。
translated by 谷歌翻译
能够分析和量化人体或行为特征的系统(称为生物识别系统)正在使用和应用变异性增长。由于其从手工制作的功能和传统的机器学习转变为深度学习和自动特征提取,因此生物识别系统的性能增加到了出色的价值。尽管如此,这种快速进步的成本仍然尚不清楚。由于其不透明度,深层神经网络很难理解和分析,因此,由错误动机动机动机的隐藏能力或决定是潜在的风险。研究人员已经开始将注意力集中在理解深度神经网络及其预测的解释上。在本文中,我们根据47篇论文的研究提供了可解释生物识别技术的当前状态,并全面讨论了该领域的发展方向。
translated by 谷歌翻译
Speech-centric machine learning systems have revolutionized many leading domains ranging from transportation and healthcare to education and defense, profoundly changing how people live, work, and interact with each other. However, recent studies have demonstrated that many speech-centric ML systems may need to be considered more trustworthy for broader deployment. Specifically, concerns over privacy breaches, discriminating performance, and vulnerability to adversarial attacks have all been discovered in ML research fields. In order to address the above challenges and risks, a significant number of efforts have been made to ensure these ML systems are trustworthy, especially private, safe, and fair. In this paper, we conduct the first comprehensive survey on speech-centric trustworthy ML topics related to privacy, safety, and fairness. In addition to serving as a summary report for the research community, we point out several promising future research directions to inspire the researchers who wish to explore further in this area.
translated by 谷歌翻译
商业和政府部门中自动面部识别的扩散引起了个人的严重隐私问题。解决这些隐私问题的一种方法是采用逃避攻击针对启动面部识别系统的度量嵌入网络的攻击:面部混淆系统会产生不透彻的扰动图像,从而导致面部识别系统误解用户。受扰动的面孔是在公制嵌入网络上产生的,在面部识别的背景下,这是不公平的。人口公平的问题自然而然:面部混淆系统表现是否存在人口统计学差异?我们通过对最近的面部混淆系统的分析和经验探索来回答这个问题。指标嵌入网络在人口统计学上很有意识:面部嵌入由人口统计组群聚集。我们展示了这种聚类行为如何导致少数群体面孔的面部混淆实用性减少。直观的分析模型可以深入了解这些现象。
translated by 谷歌翻译
深度学习系统needlargedatafortraining.Datasets的面部验证系统难以获得并容易出现隐私问题。由GAN等生成模型生成的合成数据可以是一个很好的选择。但是,我们表明,甘恩产生的数据容易出现偏见和公平问题。特别是在FFHQ数据集中训练的甘斯表明,在20-29岁年龄段的年龄组中产生白脸。我们还证明,当用于微调面部验证系统时,合成面部面孔会引起不同的影响,特别是针对种族属性的影响。这是使用$ dob_ {fv} $ metric测量的,该公制定义为gar@far far for face验证的标准偏差。
translated by 谷歌翻译
本文介绍了一个新颖的数据集,以帮助研究人员评估他们的计算机视觉和音频模型,以便在各种年龄,性别,表观肤色和环境照明条件下进行准确性。我们的数据集由3,011名受试者组成,并包含超过45,000个视频,平均每人15个视频。这些视频被录制在多个美国国家,各种成年人在各种年龄,性别和明显的肤色群体中。一个关键特征是每个主题同意参与他们使用的相似之处。此外,我们的年龄和性别诠释由受试者自己提供。一组训练有素的注释器使用FitzPatrick皮肤型刻度标记了受试者的表观肤色。此外,还提供了在低环境照明中记录的视频的注释。作为衡量某些属性的预测稳健性的申请,我们对DeepFake检测挑战(DFDC)的前五名获胜者提供了全面的研究。实验评估表明,获胜模型对某些特定人群的表现较小,例如肤色较深的肤色,因此可能对所有人都不概括。此外,我们还评估了最先进的明显年龄和性别分类方法。我们的实验在各种背景的人们的公平待遇方面对这些模型进行了彻底的分析。
translated by 谷歌翻译