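The large size and complex decision mechanisms of state-of-the-art text classifiers make it difficult for humans to understand their predictions, which leads to a lack of trust among users. These issues have led to the adoption of methods such as SHAP and Integrated Gradients, which explain classification decisions by assigning importance scores to input tokens. However, prior work using different randomization tests has shown that the explanations produced by these methods may not be robust. For example, models that make the same predictions on the test set may still yield different feature importance rankings. To address the lack of robustness of token-based interpretability, we explore explanations at higher semantic levels, such as sentences. We use computational metrics and human subject studies to compare the quality of sentence-based explanations against token-based ones. Our experiments show that higher-level feature attributions offer several advantages: 1) they are more robust as measured by the randomization tests, 2) they are more stable when using approximation-based methods such as SHAP, and 3) they are easier for humans to understand in situations where linguistic coherence resides at a higher level of granularity. Based on these findings, we show that token-based interpretability, while a convenient first choice given the input interfaces of ML models, is not the most effective choice in all situations.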
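With the increasing adoption of machine learning (ML) models and systems in high-stakes settings across different industries, guaranteeing a model's performance after deployment has become crucial. Monitoring models in production is a critical aspect of ensuring their continued performance and reliability. We present Amazon SageMaker Model Monitor, a fully managed service that continuously monitors the quality of machine learning models hosted on Amazon SageMaker. Our system automatically detects data, concept, bias, and feature attribution drift in models in real time and provides alerts so that model owners can take corrective action and thereby maintain high-quality models. We describe the key requirements obtained from customers, the system design and architecture, and the methodology for detecting different types of drift. Further, we provide quantitative evaluations, followed by use cases, insights, and lessons learned from more than 1.5 years of production deployment.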
Automated data-driven decision making systems are increasingly being used to assist, or even replace humans in many settings. These systems function by learning from historical decisions, often taken by humans. In order to maximize the utility of these systems (or, classifiers), their training involves minimizing the errors (or, misclassifications) over the given historical data. However, it is quite possible that the optimally trained classifier makes decisions for people belonging to different social groups with different misclassification rates (e.g., misclassification rates for females are higher than for males), thereby placing these groups at an unfair disadvantage. To account for and avoid such unfairness, in this paper, we introduce a new notion of unfairness, disparate mistreatment, which is defined in terms of misclassification rates. We then propose intuitive measures of disparate mistreatment for decision boundary-based classifiers, which can be easily incorporated into their formulation as convex-concave constraints. Experiments on synthetic as well as real world datasets show that our methodology is effective at avoiding disparate mistreatment, often at a small cost in terms of accuracy.
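To make the notion of group-wise misclassification rates concrete, the following is a minimal sketch that measures the gap in error rates between two groups. It is not the constraint-based training method the abstract proposes. The abstract speaks only of "misclassification rates"; the choice of false positive and false negative rates, the binary 0/1 encoding of labels and groups, and the function names `error_rates` and `mistreatment_gaps` are assumptions made for illustration.

```python
from typing import Dict, Sequence


def error_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return false positive and false negative rates for labels in {0, 1}."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    return {
        # Guard against empty classes to avoid division by zero.
        "fpr": fp / negatives if negatives else 0.0,
        "fnr": fn / positives if positives else 0.0,
    }


def mistreatment_gaps(
    y_true: Sequence[int], y_pred: Sequence[int], group: Sequence[int]
) -> Dict[str, float]:
    """Difference in error rates between group 0 and group 1 (hypothetical encoding)."""
    rates = {}
    for g in (0, 1):
        idx = [i for i, z in enumerate(group) if z == g]
        rates[g] = error_rates([y_true[i] for i in idx], [y_pred[i] for i in idx])
    return {name: rates[0][name] - rates[1][name] for name in ("fpr", "fnr")}


# Toy data: the classifier errs only on members of group 0.
y_true = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]
group = [0, 0, 0, 0, 1, 1, 1, 1]
print(mistreatment_gaps(y_true, y_pred, group))
```

A gap of zero in both rates would mean that, under this simplified measure, neither group is misclassified more often than the other.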
Algorithmic decision making systems are ubiquitous across a wide variety of online as well as offline services. These systems rely on complex learning methods and vast amounts of data to optimize the service functionality, satisfaction of the end user and profitability. However, there is a growing concern that these automated decisions can lead, even in the absence of intent, to a lack of fairness, i.e., their outcomes can disproportionately hurt (or, benefit) particular groups of people sharing one or more sensitive attributes (e.g., race, sex). In this paper, we introduce a flexible mechanism to design fair classifiers by leveraging a novel intuitive measure of decision boundary (un)fairness. We instantiate this mechanism with two well-known classifiers, logistic regression and support vector machines, and show on real-world data that our mechanism allows for a fine-grained control on the degree of fairness, often at a small cost in terms of accuracy. A Python implementation of our mechanism is available at fate-computing.mpi-sws.org
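Currently, action recognition is predominantly performed on video data processed by CNNs. We investigate whether the representation process of CNNs can also be leveraged for multimodal action recognition by incorporating image-based audio representations of actions into the task. To this end, we propose the Multimodal Audio-Image and Video Action Recognizer (MAIVAR), a CNN-based audio-image-to-video fusion model that uses both the video and audio modalities to achieve superior action recognition performance. MAIVAR extracts meaningful image representations of audio and fuses them with video representations, achieving better performance than either modality alone on a large-scale action recognition dataset.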
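The incorporation of data analytics in the healthcare industry has made significant progress, driven by the demand for efficient and effective big data analytics solutions. Knowledge graphs (KGs) have proven their utility in this field and underpin many healthcare applications, where they provide better data representation and knowledge inference. However, owing to the lack of a representative KG construction taxonomy, several existing approaches in this domain are inadequate and inferior. This paper is the first to provide a comprehensive taxonomy and a bird's-eye view of healthcare KG construction. In addition, it thoroughly examines the current state-of-the-art techniques drawn from academic work relevant to various healthcare contexts. These techniques are critically evaluated in terms of the methods used for knowledge extraction, the types of knowledge bases and sources, and the evaluation protocols they incorporate. Finally, several research findings and open issues in the literature are reported and discussed, opening horizons for future research in this vibrant area.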
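Energy demand has increased considerably due to population growth and globalization. Accurate energy consumption forecasting has therefore become an essential prerequisite for government planning, for reducing energy waste, and for the stable operation of energy management systems. In this work, we present a comparative analysis of major machine learning models for time series forecasting of household energy consumption. Specifically, we first use WEKA, a data mining tool, to apply models to hourly and daily household energy consumption datasets available from the Kaggle data science community. The models applied are: multilayer perceptron, K-nearest neighbor regression, support vector regression, linear regression, and Gaussian processes. Second, we also implemented the time series forecasting models ARIMA and VAR in Python to forecast the energy consumption of South Korean households with and without weather data. Our results show that the best method for forecasting energy consumption is support vector regression, followed by multilayer perceptron and Gaussian process regression.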
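Partial differential equations (PDEs) play a dominant role in the mathematical modeling of many complex dynamical processes. Solving these PDEs often requires prohibitively high computational costs, especially when multiple evaluations must be made for different parameters or conditions. After training, neural operators can provide PDE solutions significantly faster than traditional PDE solvers. In this work, we examine the invariance properties and computational complexity of two neural operators for a transport PDE of a scalar quantity. The neural operator based on the graph kernel network (GKN) operates on graph-structured data to incorporate nonlocal dependencies. Here we propose a modified formulation of GKN to achieve frame invariance. The vector cloud neural network (VCNN) is an alternative neural operator with embedded frame invariance that operates on point cloud data. The GKN-based neural operator demonstrates slightly better predictive performance than VCNN. However, GKN requires an excessively high computational cost that increases quadratically with the number of discretized objects, compared with a linear increase for VCNN.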
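Retinopathy represents a group of retinal diseases that, if not treated in time, can cause severe visual impairment or even blindness. Many researchers have developed autonomous systems that recognize retinopathy from fundus and optical coherence tomography (OCT) images. However, most of these frameworks employ conventional transfer learning and fine-tuning approaches, which require a good amount of well-annotated training data to produce accurate diagnostic performance. This paper presents a novel incremental cross-domain adaptation instrument that allows any deep classification model to progressively learn abnormal retinal pathologies in OCT and fundus images via few-shot training. Furthermore, unlike its competitors, the proposed instrument is driven by a Bayesian multi-objective function that not only forces the candidate classification network to retain its previously learned knowledge during incremental training, but also ensures that the network understands the structural and semantic relationships between previously learned pathologies and newly added disease categories, so that it can recognize them effectively at the inference stage. The proposed framework, evaluated on six public datasets acquired with three different scanners to screen thirteen retinal pathologies, outperforms state-of-the-art competitors, achieving an overall accuracy and F1 score of 0.9826 and 0.9846, respectively.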
Diabetic Retinopathy (DR) is considered a primary concern because of the vision loss it causes among people with diabetes globally. The severity of DR is mostly assessed manually by ophthalmologists from fundus photographs of the retina. This paper addresses the automated assessment of DR severity stages. In the literature, researchers have approached this automation using traditional machine learning algorithms and convolutional architectures. However, past work has rarely focused on the essential parts of the retinal image to improve model performance. In this paper, we adopt transformer-based learning models to capture the crucial features of retinal images and thereby better understand DR severity. We ensemble image transformers, adopting four models, namely ViT (Vision Transformer), BEiT (Bidirectional Encoder representation for image Transformer), CaiT (Class-Attention in Image Transformers), and DeiT (Data efficient image Transformers), to infer the degree of DR severity from fundus photographs. For experiments, we used the publicly available APTOS-2019 blindness detection dataset, on which the performance of the transformer-based models was quite encouraging.