Ethics is one of humanity's longest-standing intellectual endeavors. In recent years, the fields of AI and NLP have attempted to grapple with how learning systems that interact with humans should be constrained to behave ethically. One proposal in this vein is the construction of morality models that can take in arbitrary text and output a moral judgment about the situation described. In this work, we focus on a single case study of the recently proposed Delphi model and offer a critique of the project's proposed method of automating moral judgments. Through an audit of Delphi, we examine broader issues that apply to any similar attempt. We conclude with a discussion of how machine ethics could usefully proceed: by focusing on current and near-future uses of the technology in a way that centers transparency and democratic values and allows for straightforward accountability.
Strategic test allocation plays a major role in the control of both emerging and existing pandemics (e.g., COVID-19, HIV). Widespread testing supports effective epidemic control by (1) reducing transmission via identifying cases, and (2) tracking outbreak dynamics to inform targeted interventions. However, infectious disease surveillance presents unique statistical challenges. For instance, the true outcome of interest (one's infectious status) is often a latent variable. In addition, the presence of both network and temporal dependence reduces the data to a single observation. As testing entire populations regularly is neither efficient nor feasible, standard approaches recommend simple rule-based testing strategies (e.g., symptom-based testing, contact tracing) that do not take individual risk into account. In this work, we study an adaptive sequential design involving $n$ individuals over a period of $\tau$ time-steps, which allows for unspecified dependence among individuals and across time. Our causal target parameter is the mean latent outcome we would have obtained after one time-step if, starting at time $t$ given the observed past, we had carried out a stochastic intervention that maximizes the outcome under a resource constraint. We propose an Online Super Learner for adaptive sequential surveillance that learns the optimal choice of test strategies over time while adapting to the current state of the outbreak. Relying on a series of working models, the proposed method learns across samples, through time, or both, based on the underlying (unknown) structure in the data. We present an identification result for the latent outcome in terms of the observed data, and demonstrate the superior performance of the proposed strategy in a simulation modeling a residential university environment during the COVID-19 pandemic.
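As a rough formalization, with notation that is illustrative rather than the paper's own: write $\bar{O}(t-1)$ for the observed past, $Y^{g}_{t+1}(i)$ for individual $i$'s latent outcome one step ahead under intervention $g$, and $\mathcal{G}_K$ for the class of stochastic testing interventions that respect a budget of $K$ tests per time-step. The target parameter can then be sketched as

$$\theta_t = \max_{g \in \mathcal{G}_K} \mathbb{E}_{A_t \sim g(\cdot \mid \bar{O}(t-1))}\Big[ \frac{1}{n} \sum_{i=1}^{n} Y^{g}_{t+1}(i) \,\Big|\, \bar{O}(t-1) \Big],$$

i.e., the mean latent outcome one time-step ahead under the budget-constrained stochastic intervention that optimizes it.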
Reinforcement Learning (RL) algorithms are known to scale poorly to environments with many available actions, requiring numerous samples to learn an optimal policy. The traditional approach of considering the same fixed action space in every possible state implies that the agent, while also learning to maximize its reward, must learn to ignore irrelevant actions such as $\textit{inapplicable actions}$ (i.e., actions that have no effect on the environment when performed in a given state). Knowing this information can reduce the sample complexity of RL algorithms by masking inapplicable actions out of the policy distribution, so that only actions relevant to finding an optimal policy are explored. This is typically done in an ad-hoc manner, with hand-crafted domain logic added to the RL algorithm. In this paper, we propose a more systematic approach to introduce this knowledge into the algorithm. We (i) standardize the way knowledge can be manually specified to the agent; and (ii) present a new framework to autonomously learn these state-dependent action constraints jointly with the policy. We show experimentally that learning inapplicable actions greatly improves the sample efficiency of the algorithm by providing a reliable signal to mask out irrelevant actions. Moreover, we demonstrate that, thanks to the transferability of the knowledge acquired, it can be reused in other tasks to make the learning process more efficient.
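Independently of how the constraints are specified or learned, the masking step itself is mechanically simple. A minimal sketch (names and shapes are illustrative, not taken from the paper): inapplicable actions receive $-\infty$ logits before the softmax, so they get zero probability and are never sampled.

```python
import numpy as np

def masked_policy(logits, applicable_mask):
    """Renormalize a softmax policy over the applicable actions only."""
    masked = np.where(applicable_mask, logits, -np.inf)
    # Stabilized softmax; exp(-inf) contributes zero probability mass.
    z = np.exp(masked - masked[applicable_mask].max())
    return z / z.sum()

# Example: five actions; actions 1 and 3 are inapplicable in this state.
logits = np.array([0.2, 1.5, -0.3, 0.8, 0.1])
mask = np.array([True, False, True, False, True])
print(masked_policy(logits, mask))  # zero probability on actions 1 and 3
```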
Explainability has been widely stated as a cornerstone of the responsible and trustworthy use of machine learning models. With the ubiquitous use of Deep Neural Network (DNN) models expanding to risk-sensitive and safety-critical domains, many methods have been proposed to explain the decisions of these models. Recent years have also seen concerted efforts showing how such explanations can be distorted (attacked) by minor input perturbations. While many surveys review explainability methods themselves, there has been no effort hitherto to assimilate the different methods and metrics proposed to study the robustness of explanations of DNN models. In this work, we present a comprehensive survey of methods that study, understand, attack, and defend explanations of DNN models. We also present a detailed review of the metrics used to evaluate explanation methods, and describe attributional attack and defense methods. We conclude with lessons and takeaways for the community towards ensuring robust explanations of DNN model predictions.
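To make the object of study concrete, here is a minimal sketch, under assumed names and shapes, of a gradient-based attribution and a top-$k$ intersection score of the kind used to quantify explanation robustness; an attributional attack then searches for a small input perturbation that degrades this score while preserving the prediction.

```python
import torch

def saliency(model, x, target):
    """Input-gradient attribution for a single input x of shape (1, ...)."""
    x = x.clone().detach().requires_grad_(True)
    model(x)[0, target].backward()
    return x.grad.abs().squeeze(0)

def topk_overlap(attr_a, attr_b, k=100):
    """Fraction of shared indices among the top-k attributed features --
    one common robustness metric for explanations."""
    top_a = set(attr_a.flatten().topk(k).indices.tolist())
    top_b = set(attr_b.flatten().topk(k).indices.tolist())
    return len(top_a & top_b) / k

# An attributional attack seeks a small delta that keeps the prediction
# unchanged while minimizing, e.g.:
#   topk_overlap(saliency(model, x, y), saliency(model, x + delta, y))
```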
Real data from domains with high privacy requirements, such as medical intervention spaces, is scarce, and its acquisition is legally complex. This work therefore presents an approach for creating synthetic datasets for medical settings, using medical clothing as an example. The aim is to close the reality gap between synthetic and real data. To this end, methods based on 3D-scanned clothing and on designed clothing are compared using an Unreal Engine plugin and Unity. In addition, a green-screen dataset and a mixed-reality dataset from the target domain are used. Our experiments show that structured domain randomization of designed clothing, together with the mixed-reality data, provides a baseline that achieves 72.0% mAP on a test dataset from the clinical target domain. When 15% of the available target-domain training data is used, the gap to using 100% (660 images) of the target-domain training data (81.95% mAP) can be almost closed, at 80.05% mAP. Finally, we show that accuracy can be raised to 83.35% mAP when 100% of the target-domain training data is used.
Pose graph optimization (PGO) is a special case of the simultaneous localization and mapping problem in which the only variables to be estimated are pose variables and the only measurements are inter-pose constraints. The vast majority of PGO techniques are vertex-based (the variables are the robot poses), but recent work parameterizes the pose graph optimization problem in a relative fashion (the variables are the transformations between poses) and uses a minimum cycle basis to maximize the sparsity of the problem. We explore the construction of cycle bases in an incremental manner while maximizing sparsity. We validate an algorithm that incrementally constructs a sparse cycle basis and compare its performance with that of a minimum cycle basis. In addition, we present an algorithm to approximate the minimum cycle basis of two graphs that are joined together, as commonly occurs in multi-agent scenarios. Finally, the relative parameterization of pose graph optimization has been restricted to using rigid-body transformations on SE(2) or SE(3) as constraints between poses. We introduce a method that allows the use of lower-degree-of-freedom measurements in the relative pose graph optimization problem. We provide extensive validation of our algorithms on standard benchmarks, simulated datasets, and custom hardware.
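The constraint at the heart of such relative parameterizations is that the transformations composed around each cycle in the basis must multiply to the identity. A minimal SE(2) sketch of that cycle residual (illustrative, not the paper's implementation):

```python
import numpy as np

def se2(x, y, theta):
    """Homogeneous matrix for a rigid-body transform on SE(2)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, x],
                     [s,  c, y],
                     [0,  0, 1]])

def cycle_residual(transforms):
    """Compose relative transforms around one cycle of the pose graph.

    The composition around every cycle in the basis must be the identity;
    the deviation below is the residual an optimizer drives to zero.
    """
    T = np.eye(3)
    for Ti in transforms:
        T = T @ Ti
    return np.array([T[0, 2], T[1, 2], np.arctan2(T[1, 0], T[0, 0])])

# A consistent two-edge cycle has a (numerically) zero residual.
T1 = se2(1.0, 0.5, 0.3)
print(cycle_residual([T1, np.linalg.inv(T1)]))  # ~[0, 0, 0]
```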
This study examined how expected and actual interaction affect older adults' acceptance of a socially assistive robot (SAR). The study consisted of two parts: an online survey exploring expected interaction, in which the SAR was viewed by video, and an acceptance study in which older adults interacted with the robot. Both parts of the study were conducted with Gymmy, a robotic system developed in our laboratory for training the physical and cognitive abilities of older adults. The two parts of the study showed similar user responses, indicating that users' acceptance of a SAR can be predicted from expected interaction. Index terms: aging, human-robot interaction, older adults, quality evaluation, socially assistive robots, technology acceptance, technophobia, trust, user experience.
Physical activity is important for health and well-being, yet few people meet the World Health Organization's guidelines for physical activity. The development of robotic exercise coaches could help increase the accessibility of, and motivation for, training. Users' acceptance of and trust in such an assistive robot are crucial for its successful implementation, and may be affected by the transparency of the robotic system and by the robot's performance, particularly its failures. This study presents a preliminary investigation of transparency levels related to the task, human, robot, and interaction (T-HRI), with robot behavior adjusted accordingly. In part of the experiments, failures in robot performance allowed an analysis of the effect of T-HRI levels related to failure. Participants who experienced failures in robot performance exhibited lower acceptance and trust levels than those who did not. In addition, there were differences in acceptance measures between T-HRI levels and between participant groups, suggesting several directions for future research.
We study the improving multi-armed bandit (IMAB) problem, in which the reward obtained from an arm increases with the number of pulls it has received. This model provides an elegant abstraction for many real-world problems in domains such as education and employment, where decisions about the allocation of opportunities can affect the future capabilities of communities and the disparities between them. In such settings, a decision-maker must consider the impact of her decisions on future rewards, in addition to the standard objective of maximizing her cumulative reward at any time. In many of these applications, the decision-maker's time horizon is unknown, motivating the study of the IMAB problem in the technically more challenging horizon-unaware setting. We study the tension that arises in the horizon-unaware setting between two seemingly conflicting objectives: (a) maximizing the cumulative reward at any time based on the arms' current rewards, and (b) ensuring that arms with better long-term rewards receive sufficient opportunities even if their initial rewards are low. We show that, surprisingly, these two objectives are aligned in this setting. Our main contribution is an anytime algorithm for the IMAB problem that achieves the optimal cumulative reward while ensuring that arms realize their true potential given sufficient time. Our algorithm mitigates the initial disparity due to lack of opportunity and continues pulling an arm until it stops improving. We prove the optimality of our algorithm by showing that (a) any algorithm for the IMAB problem, no matter how utilitarian, must suffer $\Omega(T)$ policy regret and an $\Omega(k)$ competitive ratio with respect to the optimal offline policy, and (b) the competitive ratio of our algorithm is $O(k)$.
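As a hedged sketch of the abstract's central idea only (the paper's actual algorithm and analysis are more refined), the plateau rule can be written in a few lines: keep pulling the current arm while its reward improves, and move on once it stops.

```python
import numpy as np

def plateau_bandit(arms, horizon, tol=1e-9):
    """Keep pulling the current arm until it stops improving, then rotate.

    arms: list of callables; arms[i](n) returns the reward of arm i on its
          n-th pull, assumed non-decreasing in n.
    """
    k = len(arms)
    pulls = np.zeros(k, dtype=int)
    last = np.full(k, -np.inf)
    total, current = 0.0, 0
    for _ in range(horizon):
        pulls[current] += 1
        reward = arms[current](pulls[current])
        total += reward
        improved = reward > last[current] + tol
        last[current] = reward
        if not improved:                 # arm has plateaued (for now)
            current = (current + 1) % k  # give the next arm its chance
    return total, pulls

# Arm 0 plateaus early at 0.5; arm 1 starts lower but improves up to 0.9.
arms = [lambda n: min(0.5, 0.1 * n), lambda n: min(0.9, 0.02 * n)]
print(plateau_bandit(arms, horizon=500))
```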
This paper focuses on a proof-of-concept machine learning (ML) pipeline that extracts heart rate from pressure-sensor data acquired on a low-power edge device. The ML pipeline consists of an upsampler neural network, a signal quality classifier, and an optimized 1D convolutional neural network for efficient and accurate heart rate estimation. The models were designed so that the pipeline is smaller than 40 kB. In addition, a hybrid pipeline was developed, consisting of the upsampler and the classifier followed by a peak detection algorithm. The pipelines were deployed on an ESP32 edge device and benchmarked against signal processing to determine energy usage and inference time. The results show that, compared with the traditional algorithm, the proposed ML and hybrid pipelines reduce energy and time by 82% and 28%, respectively. The main trade-off of the ML pipeline is accuracy, with a mean absolute error (MAE) of 3.28, compared with 2.39 and 1.17 for the hybrid and signal-processing pipelines. ML models therefore show promise for deployment on energy- and computation-constrained devices. Moreover, the lower sampling rate and computational requirements of the ML pipeline could enable custom hardware solutions that reduce the cost and energy needs of wearable devices.
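The neural stages aside, the final peak-detection step of such a hybrid pipeline can be sketched as follows; the detrending window and the distance and prominence thresholds here are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from scipy.signal import find_peaks

def heart_rate_bpm(signal, fs):
    """Peak-detection stage of a hybrid pipeline (illustrative sketch).

    signal: 1-D pressure waveform, assumed already upsampled and
            quality-screened by the upstream neural networks.
    fs:     integer sampling rate in Hz.
    """
    # Subtract a 1-second moving average to remove the slow baseline.
    baseline = np.convolve(signal, np.ones(fs) / fs, mode="same")
    pulse = signal - baseline
    # Require peaks at least 0.33 s apart (caps the estimate near 180 bpm).
    peaks, _ = find_peaks(pulse, distance=int(0.33 * fs),
                          prominence=pulse.std())
    if len(peaks) < 2:
        return None  # too few beats detected to form an estimate
    mean_ibi = np.mean(np.diff(peaks)) / fs  # mean inter-beat interval (s)
    return 60.0 / mean_ibi
```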