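There is broad consensus that face recognition accuracy has a "gender gap", with women having higher false match rates and higher false non-match rates. However, relatively little is understood about the causes of this gender gap. Even the recent NIST report on demographic effects lists "analyze cause and effect" under "what we did not do". We first demonstrate that female and male hairstyles differ in ways that affect face recognition accuracy. In particular, male facial hair contributes to a greater average difference in appearance between different male faces than exists between different female faces. We then demonstrate that when the data used to estimate recognition accuracy is balanced across gender for how hairstyles occlude the face, the initially observed gender gap in accuracy largely disappears. We show this result for two different matchers, analyzing images of both Caucasians and African-Americans. These results suggest that future research on demographic variation in accuracy should include a check for balanced quality of the test data as part of the problem formulation. To promote reproducible research, the matchers, attribute classifier, and datasets used in this research will be made publicly available.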
translated by 谷歌翻译
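Media reports have accused face recognition of being "biased", "sexist", and "racist". There is consensus in the research literature that face recognition accuracy is lower for women, who typically have both a higher false match rate and a higher false non-match rate. However, there is little published research aimed at identifying the cause of lower accuracy for women. For instance, the 2019 Face Recognition Vendor Test, which documents lower female accuracy across a broad range of algorithms and datasets, also lists "analyze cause and effect" under "what we did not do". We present the first experimental analysis to identify the major causes of lower face recognition accuracy for women on datasets where this has previously been observed. Controlling for an equal amount of visible face area in the test images mitigates the apparently higher false non-match rate for women. Additional analysis shows that makeup-balanced datasets further improve results for women, yielding a lower false non-match rate. Finally, a clustering experiment suggests that images of two different women are inherently more similar than images of two different men, potentially accounting for the difference in false match rates.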
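The two error rates named in this abstract can be illustrated with a minimal sketch. It assumes a matcher that outputs similarity scores and declares a match when the score is at or above a threshold; the function name `error_rates` and the toy scores are hypothetical and do not come from the paper.

```python
import numpy as np

def error_rates(genuine, impostor, threshold):
    """Return (FMR, FNMR) for similarity scores at a given threshold.

    genuine:  scores of same-person image pairs
    impostor: scores of different-person image pairs
    A pair is declared a match when score >= threshold (an assumption).
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    # False match rate: fraction of impostor pairs accepted as matches.
    fmr = float(np.mean(impostor >= threshold))
    # False non-match rate: fraction of genuine pairs rejected.
    fnmr = float(np.mean(genuine < threshold))
    return fmr, fnmr

# Hypothetical toy scores, for illustration only.
genuine = [0.9, 0.8, 0.4, 0.7]
impostor = [0.1, 0.6, 0.3, 0.2]
fmr, fnmr = error_rates(genuine, impostor, threshold=0.5)
print(fmr, fnmr)
```

Computing these rates separately per demographic group at one shared threshold is the kind of comparison in which a gap like the one described would show up.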
Traditionally, data analysis and theory have been viewed as separate disciplines, each feeding into fundamentally different types of models. Modern deep learning technology is beginning to unify these two disciplines and will produce a new class of predictively powerful space weather models that combine the physical insights gained from data and theory. We call on NASA to invest in the research and infrastructure necessary for the heliophysics community to take advantage of these advances.
Point-of-Care Ultrasound (POCUS) refers to clinician-performed and interpreted ultrasonography at the patient's bedside. Interpreting these images requires a high level of expertise, which may not be available during emergencies. In this paper, we support POCUS by developing classifiers that can aid medical professionals by diagnosing whether or not a patient has pneumothorax. We decomposed the task into multiple steps, using YOLOv4 to extract relevant regions of the video and a 3D sparse coding model to represent video features. Given the difficulty in acquiring positive training videos, we trained a small-data classifier with a maximum of 15 positive and 32 negative examples. To counteract this limitation, we leveraged subject matter expert (SME) knowledge to limit the hypothesis space, thus reducing the cost of data collection. We present results using two lung ultrasound datasets and demonstrate that our model is capable of achieving performance on par with SMEs in pneumothorax identification. We then developed an iOS application that runs our full system in less than 4 seconds on an iPad Pro, and less than 8 seconds on an iPhone 13 Pro, labeling key regions in the lung sonogram to provide interpretable diagnoses.
Previous work has shown that a neural network with the rectified linear unit (ReLU) activation function leads to a convex polyhedral decomposition of the input space. These decompositions can be represented by a dual graph with vertices corresponding to polyhedra and edges corresponding to polyhedra sharing a facet, which is a subgraph of a Hamming graph. This paper illustrates how one can utilize the dual graph to detect and analyze adversarial attacks in the context of digital images. When an image passes through a network containing ReLU nodes, the firing or non-firing at a node can be encoded as a bit ($1$ for ReLU activation, $0$ for ReLU non-activation). The sequence of all bit activations identifies the image with a bit vector, which identifies it with a polyhedron in the decomposition and, in turn, identifies it with a vertex in the dual graph. We identify ReLU bits that are discriminators between non-adversarial and adversarial images and examine how well collections of these discriminators can ensemble vote to build an adversarial image detector. Specifically, we examine the similarities and differences of ReLU bit vectors for adversarial images, and their non-adversarial counterparts, using a pre-trained ResNet-50 architecture. While this paper focuses on adversarial digital images, ResNet-50 architecture, and the ReLU activation function, our methods extend to other network architectures, activation functions, and types of datasets.
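The bit encoding described above can be sketched on a toy fully connected network. This is only an illustration under stated assumptions: the random two-layer network, the helper names `relu_bit_vector` and `hamming_distance`, and the small random perturbation are all hypothetical stand-ins, not the paper's ResNet-50 setup or an actual adversarial attack.

```python
import numpy as np

def relu_bit_vector(x, weights, biases):
    """Concatenate the on/off pattern of every ReLU unit for input x."""
    bits = []
    h = np.asarray(x, dtype=float)
    for W, b in zip(weights, biases):
        pre = W @ h + b                           # pre-activation of this layer
        bits.append((pre > 0).astype(np.uint8))   # 1 = ReLU active, 0 = inactive
        h = np.maximum(pre, 0.0)                  # ReLU output feeds the next layer
    return np.concatenate(bits)

def hamming_distance(a, b):
    """Number of ReLU units whose on/off state differs between two inputs."""
    return int(np.sum(a != b))

# Hypothetical toy network: 4 inputs -> 8 ReLU units -> 6 ReLU units.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 4)), rng.normal(size=(6, 8))]
biases = [rng.normal(size=8), rng.normal(size=6)]

x = rng.normal(size=4)
x_perturbed = x + 0.05 * rng.normal(size=4)  # stand-in for a perturbed image

bits = relu_bit_vector(x, weights, biases)
bits_perturbed = relu_bit_vector(x_perturbed, weights, biases)
print(bits, hamming_distance(bits, bits_perturbed))
```

Two inputs with the same bit vector lie in the same polyhedron of the decomposition, and the Hamming distance counts how many ReLU units changed state between them.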
Recent research in clustering face embeddings has found that unsupervised, shallow, heuristic-based methods -- including $k$-means and hierarchical agglomerative clustering -- underperform supervised, deep, inductive methods. While the reported improvements are indeed impressive, experiments are mostly limited to face datasets, where the clustered embeddings are highly discriminative or well-separated by class (Recall@1 above 90% and often nearing ceiling), and the experimental methodology seemingly favors the deep methods. We conduct a large-scale empirical study of 17 clustering methods across three datasets and obtain several robust findings. Notably, deep methods are surprisingly fragile for embeddings with more uncertainty, where they match or even perform worse than shallow, heuristic-based methods. When embeddings are highly discriminative, deep methods do outperform the baselines, consistent with past results, but the margin between methods is much smaller than previously reported. We believe our benchmarks broaden the scope of supervised clustering methods beyond the face domain and can serve as a foundation on which these methods could be improved. To enable reproducibility, we include all necessary details in the appendices, and plan to release the code.
Although prediction models for delirium, a condition that commonly occurs during general hospitalization or after surgery, have not been widely adopted, evaluating their algorithmic bias is crucial because of the known association between social determinants of health and delirium risk. In this context, using MIMIC-III and another academic hospital dataset, we present initial experimental evidence showing how sociodemographic features such as sex and race can affect model performance across subgroups. With this work, our intent is to initiate a discussion about the intersectionality effects of old age, race, and socioeconomic factors on the early-stage detection and prevention of delirium using ML.
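In many contexts, simpler models are preferable to more complex ones, and controlling model complexity is the goal of many methods in machine learning, such as regularization, hyperparameter tuning, and architecture design. In deep learning, it has been difficult to understand the underlying mechanisms of complexity control, since many traditional measures are not naturally suited to deep neural networks. Here we develop the notion of geometric complexity, a measure of the variability of the model function, computed using a discrete Dirichlet energy. Using a combination of theoretical arguments and empirical results, we show that many common training heuristics, such as parameter norm regularization, spectral norm regularization, flatness regularization, implicit gradient regularization, noise regularization, and the choice of parameter initialization, all act to control geometric complexity, providing a unifying framework in which to characterize the behavior of deep learning models.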
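One way to read "discrete Dirichlet energy" is as the mean squared Frobenius norm of the model's input-output Jacobian over a set of data points. The sketch below follows that reading as an assumption, since the abstract does not spell out the formula. The names `numerical_jacobian` and `geometric_complexity` are hypothetical, and the Jacobian is estimated with central finite differences rather than automatic differentiation.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x, shape (out_dim, in_dim)."""
    x = np.asarray(x, dtype=float)
    out_dim = np.asarray(f(x), dtype=float).size
    J = np.zeros((out_dim, x.size))
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        diff = np.asarray(f(x + step), dtype=float) - np.asarray(f(x - step), dtype=float)
        J[:, i] = diff.ravel() / (2 * eps)
    return J

def geometric_complexity(f, X):
    """Assumed form: mean over data points of the squared Frobenius norm of the Jacobian."""
    return float(np.mean([np.sum(numerical_jacobian(f, x) ** 2) for x in X]))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))

# Sanity check on a linear map f(x) = A x, whose Jacobian is A everywhere,
# so the energy should equal the squared Frobenius norm of A (here 30).
A = np.array([[1.0, 2.0], [3.0, 4.0]])
gc_linear = geometric_complexity(lambda x: A @ x, X)

# A small tanh network with hypothetical random weights, for comparison.
W1, W2 = rng.normal(size=(8, 2)), rng.normal(size=(1, 8))
gc_mlp = geometric_complexity(lambda x: W2 @ np.tanh(W1 @ x), X)
print(gc_linear, gc_mlp)
```

Under this reading, a function that varies little over the data has a small Jacobian norm and therefore low geometric complexity, which matches the abstract's description of a measure of the model function's variability.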
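Previous work has shown that words are hyperarticulated on the dimensions of speech that distinguish them from a minimal pair competitor. This phenomenon has been termed contrastive hyperarticulation (CH). We present a dynamic neural field (DNF) model of voice onset time (VOT) planning that derives CH from an inhibitory influence of the minimal pair competitor. We test some of the model's predictions with a novel experiment investigating CH of voiceless stop consonants in pseudowords. The results demonstrate a CH effect in pseudowords, consistent with a basis for the effect in the real-time planning and production of speech. The scope and magnitude of CH in pseudowords were reduced compared to CH in real words, consistent with a role for interactive activation between lexical and phonological levels of planning. We discuss the potential of the model to unify an apparently disparate set of phenomena, from CH to phonological neighborhood effects to phonetic trace effects in speech errors.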
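Falls are a leading cause of fatal and non-fatal injuries, particularly for older people. Imbalance can result from causes internal to the body (such as illness) or from external causes (such as active or passive perturbations). An active perturbation results from an external force applied to a person, while a passive perturbation results from human motion interacting with a static obstacle. This work proposes a metric that allows monitoring of the torso and its correlation with active and passive perturbations. We show that large changes in torso sway can be strongly correlated with active perturbations. We also show that, by conditioning the expected path and torso sway on the past trajectory, torso motion, and the surrounding scene, we can reasonably predict the future path and the expected change in torso sway. This could have direct applications to fall prevention. The results demonstrate that torso sway is strongly correlated with perturbations, and that our model is able to use the visual cues present in the panorama and condition its predictions accordingly.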