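Despite the recent growth of computer vision applications, questions about their fairness and impartiality have yet to be explored. There is abundant evidence that biases present in the training data are reflected in the models, and even amplified. Many earlier methods for debiasing image datasets, including models based on augmenting the dataset, are computationally expensive to implement. In this study, we propose a fast and effective model that debiases an image dataset by reconstructing it while minimizing the statistical dependence between the intended variables. Our architecture consists of a U-Net that reconstructs images, combined with a pre-trained classifier that penalizes the statistical dependence between the target attribute and the protected attribute. We evaluate the proposed model on the CelebA dataset, compare the results with state-of-the-art debiasing methods, and show that the model achieves a promising fairness-accuracy combination.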
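The objective described in this abstract, reconstruction plus a penalty on dependence between target and protected attributes, can be illustrated with a minimal sketch. The abstract does not say which dependence measure is used, so squared Pearson correlation is swapped in here purely as a stand-in. The names `mse`, `squared_correlation`, `debias_loss`, and `weight` are hypothetical, and the attribute scores are assumed to come from a frozen classifier applied to the reconstructed images.

```python
def mse(a, b):
    """Mean squared error between two flattened images."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def squared_correlation(u, v):
    """Squared Pearson correlation: an assumed stand-in for the dependence measure."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        return 0.0  # a constant score carries no linear dependence
    return cov * cov / (var_u * var_v)


def debias_loss(originals, reconstructions, target_scores, protected_scores, weight=1.0):
    """Average reconstruction error plus a weighted dependence penalty (hypothetical form)."""
    recon = sum(mse(o, r) for o, r in zip(originals, reconstructions)) / len(originals)
    return recon + weight * squared_correlation(target_scores, protected_scores)
```

In a real training loop the reconstruction network would be updated to lower this loss while the classifier stays fixed; the sketch only shows how the two terms combine.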
The data used to train deep neural network (DNN) models in applications such as healthcare and finance typically contain sensitive information. A DNN model may suffer from overfitting, and overfitted models have been shown to be susceptible to query-based attacks such as membership inference attacks (MIAs). MIAs aim to determine whether a sample belongs to the dataset used to train a classifier (members) or not (nonmembers). Recently, a new class of label-based MIAs (LAB MIAs) was proposed, in which an adversary is only required to know the predicted labels of samples. Developing a defense against an adversary carrying out a LAB MIA on DNN models that cannot be retrained remains an open problem. We present LDL, a lightweight defense against LAB MIAs. LDL works by constructing a high-dimensional sphere around queried samples such that the model decision is unchanged for (noisy) variants of the sample within the sphere. This sphere of label-invariance creates ambiguity and prevents a querying adversary from correctly determining whether a sample is a member or a nonmember. We analytically characterize the success rate of an adversary carrying out a LAB MIA when LDL is deployed, and show that the formulation is consistent with experimental observations. We evaluate LDL on seven datasets -- CIFAR-10, CIFAR-100, GTSRB, Face, Purchase, Location, and Texas -- with varying sizes of training data. All of these datasets have been used by state-of-the-art LAB MIAs. Our experiments demonstrate that LDL reduces the success rate of an adversary carrying out a LAB MIA in each case. We empirically compare LDL with defenses against LAB MIAs that require retraining of DNN models, and show that LDL performs favorably despite not needing to retrain the DNNs.
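The idea of a label-invariant sphere can be illustrated with a toy sketch. This is not the construction used by LDL, which the abstract does not detail. It is one hypothetical way to make labels constant inside a ball around a queried sample, by answering any query within `radius` of an earlier query with that earlier label. `base_model`, `SphereDefense`, and `label_flip_rate` are illustrative names, and the classifier is a stand-in linear rule.

```python
import math
import random


def base_model(x):
    """Toy stand-in classifier: label 1 if the coordinates sum to a positive value."""
    return 1 if sum(x) > 0 else 0


class SphereDefense:
    """Hypothetical wrapper: queries near an earlier query receive that query's label."""

    def __init__(self, model, radius):
        self.model = model
        self.radius = radius
        self.centers = []  # (point, label) pairs for previously answered queries

    def predict(self, x):
        for center, label in self.centers:
            if math.dist(center, x) <= self.radius:
                return label  # inside an existing sphere: label is unchanged
        label = self.model(x)
        self.centers.append((tuple(x), label))
        return label


def label_flip_rate(predict, x, sigma, trials, rng):
    """Label-only probe: fraction of noisy copies of x whose label differs from x's."""
    reference = predict(x)
    flips = 0
    for _ in range(trials):
        noisy = [v + rng.gauss(0.0, sigma) for v in x]
        flips += predict(noisy) != reference
    return flips / trials
```

A label-only adversary typically uses the flip rate under noise as a proxy for distance to the decision boundary. With the wrapper in place, every noisy variant inside the sphere returns the same label, so the probe no longer separates samples by robustness.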
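The participation of consumers and producers in demand response programs has increased in smart grids, which reduces the investment and operating costs of power systems. At the same time, with the advent of renewable energy sources, the electricity market is becoming more complex and unpredictable. To implement demand response programs effectively, forecasting the future price of electricity is crucial for producers in the electricity market. Electricity prices are highly volatile and change under the influence of various factors such as temperature, wind speed, rainfall, and the intensity of commercial and daily activities. Taking these influencing factors into account as variables can therefore improve forecast accuracy. In this paper, an electricity price forecasting model based on gated recurrent units (GRU) is proposed. Electric load consumption is considered as an input variable of the model. Noise in electricity prices seriously reduces the efficiency and effectiveness of the analysis, so an adaptive noise reducer is integrated into the model to reduce the noise. SAE is then used to extract features from the electricity price. Finally, the extracted features are fed into the GRU to train the predictor. Results on a real dataset show that the proposed method performs effectively in forecasting electricity prices.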
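For reference, the update equations of a standard GRU cell are given below. The abstract does not state which variant the model uses, so this is the common textbook form, not necessarily the authors' exact formulation. The symbols are assumptions: $x_t$ is the input feature vector at step $t$, $h_t$ the hidden state, $\sigma$ the logistic sigmoid, and $\odot$ the elementwise product.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) && \text{(update gate)} \\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) && \text{(reset gate)} \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) && \text{(candidate state)} \\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{(new hidden state)}
\end{aligned}
```

Conventions differ on whether $z_t$ or $1 - z_t$ multiplies the previous state. The two forms are equivalent up to relabeling the gate.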
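Machine learning models in the wild have been shown to be vulnerable to Trojan attacks during training. Although many detection mechanisms have been proposed, strong adaptive attackers have been shown to be effective against them. In this paper, we aim to answer two questions considering an intelligent and adaptive adversary: (i) What is the minimal number of instances that a strong attacker needs to Trojan? (ii) Is it possible for such an attacker to bypass strong detection mechanisms? We provide an analytical characterization of the adversarial capability and of the strategic interaction between the adversary and the detection mechanism that take place in such models. We characterize adversary capability in terms of the fraction of the input dataset that can be embedded with a Trojan trigger. We show that the loss function has a submodular structure, which leads to the design of efficient algorithms to determine this fraction with provable bounds on optimality. We propose a Submodular Trojan algorithm to determine the minimal fraction of samples into which a Trojan trigger must be injected. To evade detection of the Trojaned model, we model the strategic interaction between the adversary and the Trojan detection mechanism as a two-player game. We show that the adversary wins the game with probability one, thus bypassing detection. We establish this by proving that the output probability distributions of a Trojan model and a clean model are identical when following the Min-Max (MM) Trojan algorithm. We perform extensive evaluations on the MNIST, CIFAR-10, and EuroSAT datasets. The results show that (i) with the Submodular Trojan algorithm, the adversary needs to embed a Trojan trigger into only a very small fraction of samples to achieve high accuracy on both Trojan and clean samples, and (ii) the MM Trojan algorithm yields a trained Trojan model that evades detection with probability 1.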
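Submodular structure matters because a simple greedy rule comes with provable approximation guarantees for monotone submodular objectives. The sketch below shows that generic greedy rule on a toy coverage function. It is not the paper's Submodular Trojan algorithm: the objective, the names `coverage` and `greedy_min_subset`, and the `target` threshold are all assumptions chosen for illustration.

```python
def coverage(selected, covers):
    """Toy monotone submodular objective: number of distinct items covered."""
    covered = set()
    for i in selected:
        covered |= covers[i]
    return len(covered)


def greedy_min_subset(covers, target):
    """Add the element with the largest marginal gain until the objective reaches target."""
    selected = []
    remaining = set(range(len(covers)))
    while remaining and coverage(selected, covers) < target:
        current = coverage(selected, covers)
        # Ties are broken in favor of the smallest index.
        best = max(sorted(remaining),
                   key=lambda i: coverage(selected + [i], covers) - current)
        if coverage(selected + [best], covers) == current:
            break  # no remaining element improves the objective
        selected.append(best)
        remaining.remove(best)
    return selected
```

The fraction of the ground set that the greedy rule selects, `len(selected) / len(covers)`, plays the role of the "minimal fraction" in this toy setting.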