In consequential decision-making applications, mitigating unwanted biases in machine learning models that yield systematic disadvantage to members of groups delineated by sensitive attributes such as race and gender is one key intervention to strive for equity. Focusing on demographic parity and equality of opportunity, in this paper we propose an algorithm that improves the fairness of a pre-trained classifier by simply dropping carefully selected training data points. We select instances based on their influence on the fairness metric of interest, computed using an infinitesimal jackknife-based approach. The dropping of training points is done in principle, but in practice does not require the model to be refit. Crucially, we find that such an intervention does not substantially reduce the predictive performance of the model but drastically improves the fairness metric. Through careful experiments, we evaluate the effectiveness of the proposed approach on diverse tasks and find that it consistently improves upon existing alternatives.
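The abstract names demographic parity and equality of opportunity but gives no formulas. As a minimal sketch, assuming the standard definitions for a binary classifier and a binary sensitive attribute (the function names and the toy data are hypothetical and not taken from the paper), the two gaps can be computed as follows:

```python
from typing import Sequence


def demographic_parity_gap(y_pred: Sequence[int], group: Sequence[int]) -> float:
    """|P(yhat=1 | A=0) - P(yhat=1 | A=1)| for group labels in {0, 1}."""
    rates = []
    for g in (0, 1):
        members = [p for p, a in zip(y_pred, group) if a == g]
        rates.append(sum(members) / len(members))
    return abs(rates[0] - rates[1])


def equal_opportunity_gap(
    y_true: Sequence[int], y_pred: Sequence[int], group: Sequence[int]
) -> float:
    """Absolute difference in true-positive rate between the two groups."""
    tprs = []
    for g in (0, 1):
        # Predictions restricted to truly positive members of group g.
        positives = [p for t, p, a in zip(y_true, y_pred, group) if a == g and t == 1]
        tprs.append(sum(positives) / len(positives))
    return abs(tprs[0] - tprs[1])


# Toy data (assumed for illustration only).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
dp_gap = demographic_parity_gap(y_pred, group)
eo_gap = equal_opportunity_gap(y_true, y_pred, group)
```

A data-dropping method such as the one described would aim to shrink these gaps; the influence computation itself is not sketched here because the abstract does not specify it.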
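The use of machine learning models in consequential decision-making often exacerbates societal inequity, in particular by yielding disparate impact on members of marginalized groups defined by race and gender. The area under the ROC curve (AUC) is widely used to evaluate the performance of scoring functions in machine learning, but it has been studied less in algorithmic fairness than other performance metrics. Because of the pairwise nature of the AUC, defining an AUC-based group fairness metric is pairwise-dependent and may involve both intra-group and inter-group AUCs. Importantly, considering only one category of AUCs is not sufficient to mitigate unfairness in AUC optimization. In this paper, we propose a minimax learning and bias mitigation framework that incorporates both intra-group and inter-group AUCs while maintaining utility. Based on this Rawlsian framework, we design an efficient stochastic optimization algorithm and prove its convergence to the minimum group-level AUC. We conduct numerical experiments on both synthetic and real-world datasets to validate the effectiveness of the minimax framework and of the proposed optimization algorithm.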
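To make the intra-group and inter-group distinction concrete, here is a minimal sketch. It assumes that the AUC for a pair of groups (a, b) is the fraction of (positive from a, negative from b) pairs that the scores rank correctly, with ties counted as one half; the function names and toy data are hypothetical, and the paper's exact definitions may differ.

```python
from itertools import product


def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    total = 0.0
    for p, n in product(pos_scores, neg_scores):
        total += 1.0 if p > n else 0.5 if p == n else 0.0
    return total / (len(pos_scores) * len(neg_scores))


def group_aucs(scores, labels, groups):
    """AUC for every (positive group a, negative group b) pair.

    a == b gives an intra-group AUC, a != b an inter-group AUC.
    """
    names = sorted(set(groups))
    pos = {g: [s for s, y, a in zip(scores, labels, groups) if a == g and y == 1]
           for g in names}
    neg = {g: [s for s, y, a in zip(scores, labels, groups) if a == g and y == 0]
           for g in names}
    return {(a, b): pairwise_auc(pos[a], neg[b]) for a in names for b in names}


# Toy data (assumed for illustration only).
scores = [0.9, 0.4, 0.3, 0.8, 0.6, 0.7]
labels = [1, 1, 0, 1, 0, 0]
groups = ["A", "A", "A", "B", "B", "B"]

aucs = group_aucs(scores, labels, groups)
worst_pair = min(aucs, key=aucs.get)   # the pair a minimax objective would target
worst_auc = aucs[worst_pair]
```

In this toy data every intra-group AUC equals 1.0 while the inter-group AUC for positives from A against negatives from B is 0.5, which illustrates why looking at only one category of AUCs can hide unfairness.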
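Stochastic gradient descent ascent (SGDA) and its variants have been the workhorse for solving minimax problems. However, in contrast to the well-studied stochastic gradient descent (SGD) with differential privacy (DP) constraints, there has been little work on understanding the generalization (utility) of SGDA with DP constraints. In this paper, we use the algorithmic stability approach to establish the generalization (utility) of DP-SGDA in different settings. In particular, for the convex-concave setting, we prove that DP-SGDA can achieve an optimal utility rate in terms of the weak primal-dual population risk in both the smooth and non-smooth cases. To the best of our knowledge, this is the first known result for DP-SGDA in the non-smooth case. We further provide a utility analysis in the nonconvex-strongly-concave setting, which is the first known result in terms of the primal population risk. The convergence and generalization results for this nonconvex setting are new even in the non-private setting. Finally, numerical experiments are conducted to demonstrate the effectiveness of DP-SGDA in both convex and nonconvex cases.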
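The abstract does not spell out the DP-SGDA update. A minimal sketch, assuming the usual recipe of per-sample gradient clipping plus Gaussian noise applied to both the descent and the ascent step, is shown below. The toy objective, the function name `dp_sgda`, and all parameter values are assumptions for illustration, and `sigma` is not calibrated to any (epsilon, delta) privacy budget.

```python
import math
import random


def dp_sgda(data, steps=500, lr=0.1, batch_size=4, clip=1.0, sigma=1.0, seed=0):
    """Noisy SGDA on f(w, v; z) = 0.5*(w - z)**2 + w*v - 0.5*v**2.

    w is the minimizing variable, v the maximizing variable.
    NOTE: sigma is an illustrative noise multiplier, not a calibrated one.
    """
    rng = random.Random(seed)
    w, v = 0.0, 0.0
    for _ in range(steps):
        batch = rng.sample(data, batch_size)  # requires batch_size <= len(data)
        sum_gw = sum_gv = 0.0
        for z in batch:
            gw = (w - z) + v  # df/dw for this sample
            gv = w - v        # df/dv for this sample
            # Clip the joint per-sample gradient to norm at most `clip`.
            norm = math.hypot(gw, gv)
            scale = min(1.0, clip / norm) if norm > 0 else 1.0
            sum_gw += gw * scale
            sum_gv += gv * scale
        # Gaussian noise scaled to the clipping threshold, added to the sums.
        noise_std = sigma * clip
        gw_bar = (sum_gw + rng.gauss(0.0, noise_std)) / batch_size
        gv_bar = (sum_gv + rng.gauss(0.0, noise_std)) / batch_size
        w -= lr * gw_bar  # descent step on w
        v += lr * gv_bar  # ascent step on v
    return w, v


# With noise switched off and clipping effectively disabled, the iterates
# approach the saddle point w = v = mean(data) / 2 of the averaged objective.
data = [1.0, 2.0, 3.0, 4.0]
w_star, v_star = dp_sgda(data, batch_size=4, clip=1e6, sigma=0.0)
```

Setting `sigma > 0` and a finite `clip` gives the noisy variant; how to choose those values for a target privacy level is outside what this sketch covers.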
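As a technical sub-field of artificial intelligence (AI), explainable AI (XAI) has produced a vast collection of algorithms, providing a toolbox for researchers and practitioners to build XAI applications. With the rich application opportunities, explainability has moved beyond a demand by data scientists or researchers to comprehend the models they develop, and has become an essential requirement for people to trust and adopt AI deployed in numerous domains. However, explainability is an inherently human-centric property, and the field is starting to embrace human-centered approaches. Human-computer interaction (HCI) research and user experience (UX) design in this area are becoming increasingly important. In this chapter, we begin with a high-level overview of the technical landscape of XAI algorithms, then selectively survey our own and other recent HCI work that takes human-centered approaches to design, evaluate, and provide conceptual and methodological tools for XAI. We ask the question "what are human-centered approaches doing for XAI" and highlight three roles they play in shaping XAI technologies by helping to navigate, assess, and expand the XAI toolbox: driving technical choices by users' explainability needs, uncovering pitfalls of existing XAI methods and informing new methods, and providing conceptual frameworks for human-compatible XAI.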
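Recently, invariant risk minimization (IRM) was proposed as a promising solution to out-of-distribution (OOD) generalization. However, it is unclear when IRM should be preferred over the widely used empirical risk minimization (ERM) framework. In this work, we analyze both frameworks from the perspective of sample complexity, thereby taking a firm step towards answering this important question. We find that, depending on the type of data-generating mechanism, the two approaches may have different finite-sample and asymptotic behavior. For example, in the covariate shift setting, the two approaches not only arrive at the same asymptotic solution but also have similar finite-sample behavior, with no clear winner. For other distribution shifts, however, such as those involving confounders or anti-causal variables, the two approaches arrive at different asymptotic solutions: IRM is guaranteed to be close to the desired OOD solution in the finite-sample regime, while ERM is biased even asymptotically. We further investigate how different factors (the number of environments, the complexity of the model, and the IRM penalty weight) affect the sample complexity of IRM in relation to its distance from the OOD solution.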
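The abstract refers to an IRM penalty weight without stating the objective. One common instantiation is the IRMv1 objective of Arjovsky et al., which adds to each environment's risk the squared gradient of that risk with respect to a dummy scalar multiplier fixed at 1.0. The sketch below assumes that formulation with squared loss and scalar predictions; whether the paper analyzes exactly this variant is an assumption, and the function name and toy environments are hypothetical.

```python
def irm_v1_objective(envs, predict, penalty_weight):
    """Sum over environments of risk + penalty_weight * (dummy-scale gradient)**2.

    envs: list of (xs, ys) pairs, one per environment.
    predict: function mapping an input x to a scalar prediction.
    """
    total = 0.0
    for xs, ys in envs:
        preds = [predict(x) for x in xs]
        n = len(xs)
        # Empirical squared-error risk in this environment.
        risk = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
        # d/ds of mean((s * p - y)**2), evaluated at s = 1.
        grad = sum(2.0 * (p - y) * p for p, y in zip(preds, ys)) / n
        total += risk + penalty_weight * grad ** 2
    return total


# Two toy environments where y = 2 * x holds in both (assumed data).
envs = [([1.0, 2.0], [2.0, 4.0]), ([1.0, 3.0], [2.0, 6.0])]

erm_value = irm_v1_objective(envs, lambda x: x, penalty_weight=0.0)        # pure ERM
irm_value = irm_v1_objective(envs, lambda x: x, penalty_weight=1.0)        # ERM + penalty
invariant_value = irm_v1_objective(envs, lambda x: 2.0 * x, penalty_weight=1.0)
```

With `penalty_weight=0.0` the objective reduces to the pooled ERM risk, which is why the weight is a natural factor to vary when comparing the two frameworks.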
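Recent work has shown that the GPT-3 model exhibits bias when prompted about Muslims, compared with Christians and Hindus. Two pre-registered replication attempts, one exact and one approximate, found only the weakest bias in the more recent Instruct series version of GPT-3, which was fine-tuned to eliminate biased and toxic outputs. Violent completions were rarely observed. However, additional pre-registered experiments showed that using common names associated with the religions in prompts significantly increased the rate of violent completions, which also revealed a second-order bias against Muslims. Names of Muslim celebrities from non-violent domains led to relatively few violent completions, suggesting that access to individualized information can steer the model away from using stereotypes. Nonetheless, content analysis revealed religion-specific violent themes containing highly offensive ideas, regardless of prompt format. Our results show the need for additional debiasing of large language models to address higher-order schemas and associations.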
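The invariant risk minimization (IRM) framework aims to learn invariant features from a set of environments in order to solve the out-of-distribution (OOD) generalization problem. The underlying assumption is that the causal components of the data-generating distributions remain constant across environments or, alternatively, that the data "overlap" across environments so that meaningful invariant features can be found. Consequently, when the "overlap" assumption does not hold, the set of truly invariant features may not be sufficient for optimal prediction performance. Such cases arise naturally in networked settings and hierarchical data-generating models, where IRM performance becomes suboptimal. To mitigate this failure case, we argue for a partial invariance framework. The key idea is to introduce flexibility into the IRM framework by partitioning the environments based on hierarchical differences, while enforcing invariance locally within the partitions. We motivate this framework in classification settings with causal distribution shifts across environments. Our results show the ability of partial invariant risk minimization to alleviate the trade-off between fairness and risk in certain settings.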
The recent increase in public and academic interest in preserving biodiversity has led to the growth of the field of conservation technology. This field involves designing and constructing tools that utilize technology to aid in the conservation of wildlife. In this article, we will use case studies to demonstrate the importance of designing conservation tools with human-wildlife interaction in mind and provide a framework for creating successful tools. These case studies include a range of complexities, from simple cat collars to machine learning and game theory methodologies. Our goal is to introduce and inform current and future researchers in the field of conservation technology and provide references for educating the next generation of conservation technologists. Conservation technology not only has the potential to benefit biodiversity but also has broader impacts on fields such as sustainability and environmental protection. By using innovative technologies to address conservation challenges, we can find more effective and efficient solutions to protect and preserve our planet's resources.
A Digital Twin (DT) is a simulation of a physical system that provides information to make decisions that add economic, social or commercial value. The behaviour of a physical system changes over time; a DT must therefore be continually updated with data from the physical system to reflect its changing behaviour. For resource-constrained systems, updating a DT is non-trivial because of challenges such as on-board learning and off-board data transfer. This paper presents a framework for updating data-driven DTs of resource-constrained systems geared towards system health monitoring. The proposed solution consists of: (1) an on-board system running a light-weight DT allowing the prioritisation and parsimonious transfer of data generated by the physical system; and (2) off-board robust updating of the DT and detection of anomalous behaviours. Two case studies are considered using a production gas turbine engine system to demonstrate the digital representation accuracy for real-world, time-varying physical systems.
We consider infinite horizon Markov decision processes (MDPs) with fast-slow structure, meaning that certain parts of the state space move "fast" (and in a sense, are more influential) while other parts transition more "slowly." Such structure is common in real-world problems where sequential decisions need to be made at high frequencies, yet information that varies at a slower timescale also influences the optimal policy. Examples include: (1) service allocation for a multi-class queue with (slowly varying) stochastic costs, (2) a restless multi-armed bandit with an environmental state, and (3) energy demand response, where both day-ahead and real-time prices play a role in the firm's revenue. Models that fully capture these problems often result in MDPs with large state spaces and large effective time horizons (due to frequent decisions), rendering them computationally intractable. We propose an approximate dynamic programming algorithmic framework based on the idea of "freezing" the slow states, solving a set of simpler finite-horizon MDPs (the lower-level MDPs), and applying value iteration (VI) to an auxiliary MDP that transitions on a slower timescale (the upper-level MDP). We also extend the technique to a function approximation setting, where a feature-based linear architecture is used. On the theoretical side, we analyze the regret incurred by each variant of our frozen-state approach. Finally, we give empirical evidence that the frozen-state approach generates effective policies using just a fraction of the computational cost, while illustrating that simply omitting slow states from the decision modeling is often not a viable heuristic.