Smart Paper Notes

Knee arthritis severity measurement using deep learning: a publicly available algorithm with a multi-institutional validation showing radiologist-level performance

Hanxue Gu , Keyu Li , Roy J. Colglazier , Jichen Yang , Michael Lebhar , Jonathan O'Donnell , William A. Jiranek , Richard C. Mather , Rob J. French , Nicholas Said

Categories: Computer Vision | Machine Learning

2022-03-16

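The assessment of knee osteoarthritis (KOA) severity on knee X-rays is a central criterion for the use of total knee arthroplasty. However, this assessment suffers from imprecise standards and remarkably high inter-reader variability. An algorithmic, automated assessment of KOA severity could improve the overall outcomes of knee replacement procedures by increasing the appropriateness of their use. We propose a novel deep learning-based five-step algorithm to automatically grade KOA from posterior-anterior (PA) views of radiographs: (1) image preprocessing, (2) localization of the knee joints in the image using the YOLO v3-Tiny model, (3) initial assessment of the severity of osteoarthritis using a convolutional neural network-based classifier, (4) segmentation of the joints and calculation of the joint space narrowing (JSN), and (5) a combination of the JSN and the initial assessment to determine a final Kellgren-Lawrence (KL) score. Furthermore, by displaying the segmentation masks used to make the assessment, our algorithm demonstrates a higher degree of transparency than typical "black box" deep learning classifiers. We perform a comprehensive evaluation using two public datasets and one dataset from our institution, and show that our algorithm reaches state-of-the-art performance. Moreover, we collected ratings from multiple radiologists at our institution and show that our algorithm performs at the radiologist level. The software has been made publicly available at https://github.com/maciejmazurowowski/osteoarthitis-classification.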
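Step (4) derives a joint space measurement from segmentation masks. The abstract does not say how that measurement is computed, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes two binary masks (hypothetical names `femur_mask` and `tibia_mask`) given as lists of rows, with the femur above the tibia in the image, and measures the vertical gap in pixels for every image column where both bones appear.

```python
def joint_space_widths(femur_mask, tibia_mask):
    """Return the vertical gap (in pixels) between femur and tibia per column.

    Assumption: both masks are equally sized lists of rows holding 0/1 values,
    and row indices increase downward, so the femur lies above the tibia.
    Columns where either bone is missing are skipped.
    """
    n_rows, n_cols = len(femur_mask), len(femur_mask[0])
    widths = []
    for col in range(n_cols):
        femur_rows = [r for r in range(n_rows) if femur_mask[r][col]]
        tibia_rows = [r for r in range(n_rows) if tibia_mask[r][col]]
        if femur_rows and tibia_rows:
            # Count background pixels between the lowest femur pixel
            # and the highest tibia pixel in this column.
            widths.append(min(tibia_rows) - max(femur_rows) - 1)
    return widths


# Toy 6x3 masks: femur occupies rows 0-1, tibia starts at row 4 (row 3 in column 2).
femur = [[1, 1, 1], [1, 1, 1], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
tibia = [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 1], [1, 1, 1], [1, 1, 1]]
widths = joint_space_widths(femur, tibia)
narrowest_gap = min(widths)
```

A smaller `narrowest_gap` would indicate a narrower joint space; how such a value maps to a KL grade is left open here because the abstract does not specify it.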

translated by Google Translate

Controlling Commercial Cooling Systems Using Reinforcement Learning

Jerry Luo , Cosmin Paduraru , Octavian Voicu , Yuri Chervonyi , Scott Munns , Jerry Li , Crystal Qian , Praneet Dutta , Jared Quincy Davis , Ningjia Wu

Categories: Machine Learning | Artificial Intelligence

2022-11-11

This paper is a technical overview of DeepMind and Google's recent work on reinforcement learning for controlling commercial cooling systems. Building on expertise that began with cooling Google's data centers more efficiently, we recently conducted live experiments on two real-world facilities in partnership with Trane Technologies, a building management system provider. These live experiments had a variety of challenges in areas such as evaluation, learning from offline data, and constraint satisfaction. Our paper describes these challenges in the hope that awareness of them will benefit future applied RL work. We also describe the way we adapted our RL system to deal with these challenges, resulting in energy savings of approximately 9% and 13% respectively at the two live experiment sites.

Preregistered protocol for: Articulatory changes in speech following treatment for oral or oropharyngeal cancer: a systematic review

Thomas B. Tienkamp , Teja Rebernik , Defne Abur , Rob J. J. H. van Son , Sebastiaan A. H. J. de Visscher , Max J. H. Witjes , Martijn Wieling

Categories: Natural Language Processing

2022-09-14

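This document outlines a PROSPERO pre-registered protocol for a systematic review of articulatory changes in speech following treatment for oral or oropharyngeal cancer. Treatment of tumours in the oral cavity may result in physiological changes that could lead to articulatory difficulties. The tongue becomes less mobile due to scar tissue and/or potential (postoperative) radiation therapy. Moreover, tissue loss may create a bypass for airflow or limit constriction possibilities. To gain a better understanding of the nature of the speech problems, information about the movement of the articulators is needed, since perceptual or acoustic information provides only indirect evidence of articulatory changes. Therefore, this systematic review will review studies that directly measured the articulatory movements of the tongue, jaw, and lips following treatment for oral or oropharyngeal cancer.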

Forecast combinations: an over 50-year review

Xiaoqian Wang , Rob J Hyndman , Feng Li , Yanfei Kang

Categories: Machine Learning (Statistics)

2022-05-09

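Forecast combinations have flourished remarkably in the forecasting community and, in recent years, have become part of the mainstream of forecasting research and activities. Combining multiple forecasts produced for a single (target) series is now widely used to improve accuracy by integrating information gleaned from different sources, thereby mitigating the risk of identifying a single "best" forecast. Combination schemes have evolved from simple combination methods without estimation to sophisticated methods involving time-varying weights, nonlinear combinations, correlations among components, and cross-learning. They include combining point forecasts and combining probabilistic forecasts. This paper provides an up-to-date review of the extensive literature on forecast combinations, together with references to available open-source software implementations. We discuss the potential and limitations of various methods and highlight how these ideas have developed over time. Some important issues concerning the utility of forecast combinations are also surveyed. Finally, we conclude with current research gaps and potential insights for future research.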
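The contrast between combination without estimation and combination with estimated weights can be illustrated with a small sketch. It shows an equal-weight average of point forecasts next to one common estimated scheme, weights proportional to the inverse of each model's past mean squared error. The function names and toy numbers are assumptions for illustration and are not taken from the paper or from any particular software package.

```python
from statistics import fmean


def simple_average(forecasts):
    """Equal-weight combination: no weights are estimated.

    `forecasts` is a list with one forecast path per model, all the same length.
    """
    return [fmean(step) for step in zip(*forecasts)]


def inverse_mse_weights(past_errors):
    """Weights proportional to 1 / MSE of each model's past forecast errors.

    Assumes every model has at least one non-zero past error (MSE > 0).
    The returned weights sum to 1.
    """
    inverse_mse = [1.0 / fmean(e * e for e in errors) for errors in past_errors]
    total = sum(inverse_mse)
    return [value / total for value in inverse_mse]


def weighted_combination(forecasts, weights):
    """Linear combination of the model forecasts at each horizon."""
    return [sum(w * f for w, f in zip(weights, step)) for step in zip(*forecasts)]


# Two hypothetical models forecasting two steps ahead.
forecasts = [[10.0, 12.0], [14.0, 16.0]]
past_errors = [[1.0, -1.0], [2.0, -2.0]]  # model 1 was more accurate historically

equal = simple_average(forecasts)
weights = inverse_mse_weights(past_errors)
combined = weighted_combination(forecasts, weights)
```

In this toy case the historically more accurate model receives the larger weight, so `combined` sits closer to its forecasts than `equal` does.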

LoMEF: A Framework to Produce Local Explanations for Global Model Time Series Forecasts

Dilini Rajapaksha , Christoph Bergmeir , Rob J Hyndman

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2021-11-13

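Global Forecasting Models (GFMs) that are trained across a set of multiple time series have shown superior results in many forecasting competitions and real-world applications compared with univariate forecasting approaches. One reason for the popularity of statistical forecasting models such as ETS and ARIMA is their relative simplicity and interpretability (in terms of relevant lags, trend, seasonality, and so on), whereas GFMs typically lack interpretability, especially with respect to particular time series. This reduces the trust and confidence of stakeholders who must make decisions based on the forecasts without being able to understand the predictions. To mitigate this problem, we propose a novel local model-agnostic interpretability approach to explain the forecasts from GFMs. We train simpler univariate surrogate models that are considered interpretable (e.g., ETS) on the predictions of the GFM for samples within a neighbourhood, which we obtain either through bootstrapping or straightforwardly as the one-step-ahead global black-box model forecasts of the time series that needs to be explained. We then evaluate the explanations for the forecasts of the global models in both qualitative and quantitative terms, such as accuracy, fidelity, stability, and comprehensibility, and are able to show the benefits of our approach.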

Geometric and Physical Quantities Improve E(3) Equivariant Message Passing

Johannes Brandstetter , Rob Hesselink , Elise van der Pol , Erik J Bekkers , Max Welling

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2021-10-06

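Including covariant information, such as position, force, velocity, or spin, is important in many tasks in computational physics and chemistry. We introduce Steerable E(3) Equivariant Graph Neural Networks (SEGNNs), which generalise equivariant graph networks so that node and edge attributes are not restricted to invariant scalars but can contain covariant information, such as vectors or tensors. The model, composed of steerable MLPs, is able to incorporate geometric and physical information in both the message and update functions. Through the definition of steerable node attributes, the MLPs provide a new class of activation functions for general use with steerable feature fields. We discuss our work and related work through the lens of equivariant non-linear convolutions, which further allows us to pinpoint the successful components of SEGNNs: non-linear message aggregation improves upon classic linear (steerable) point convolutions, and steerable messages improve upon recent equivariant graph networks that send invariant messages. We demonstrate the effectiveness of our method on several tasks in computational physics and chemistry and provide extensive ablation studies.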

A pragmatic approach to estimating average treatment effects from EHR data: the effect of prone positioning on mechanically ventilated COVID-19 patients

Adam Izdebski , Patrick J. Thoral , Robbert C. A. Lalisang , Dean M. McHugh , Diederik Gommers , Olaf L. Cremer , Rob J. Bosman , Sander Rigter , Evert-Jan Wils , Tim Frenzel

Categories: Machine Learning | Artificial Intelligence

2021-09-14

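Despite recent progress in the field of causal inference, to date there is no agreed-upon methodology for estimating treatment effects from observational data. The consequence for clinical practice is that, when results from a randomized trial are lacking, medical personnel are left without guidance on what seems to be effective in a real-world scenario. This article proposes a pragmatic methodology for obtaining preliminary but robust estimates of treatment effects from observational studies, to give front-line clinicians a degree of confidence in their treatment strategy. Our study design is applied to an open problem: estimating the treatment effect of the proning maneuver on COVID-19 intensive care patients.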

Reproducible radiomics through automated machine learning validated on twelve clinical applications

Martijn P. A. Starmans , Sebastian R. van der Voort , Thomas Phil , Milea J. M. Timbergen , Melissa Vos , Guillaume A. Padmos , Wouter Kessels , David Hanff , Dirk J. Grunhagen , Cornelis Verhoef

Categories: Computer Vision

2021-08-19

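Radiomics uses quantitative medical imaging features to predict clinical outcomes. Currently, in a new clinical application, finding the optimal radiomics method out of the wide range of available options has to be done manually through a heuristic trial-and-error process. In this study, we propose a framework for automatically optimizing the construction of radiomics workflows per application. To this end, we formulate radiomics as a modular workflow and include a large collection of common algorithms for each component. To optimize the workflow per application, we employ automated machine learning using a random search and ensembling. We evaluate our method in twelve different clinical applications, resulting in the following areas under the curve: 1) liposarcoma (0.83); 2) desmoid-type fibromatosis (0.82); 3) primary liver tumors (0.80); 4) gastrointestinal stromal tumors (0.77); 5) colorectal liver metastases (0.61); 6) melanoma metastases (0.45); 7) hepatocellular carcinoma (0.75); 8) mesenteric fibrosis (0.80); 9) prostate cancer (0.72); 10) glioma (0.71); 11) Alzheimer's disease (0.87); and 12) head and neck cancer (0.84). We show that our framework performs competitively with human experts, outperforms a radiomics baseline, and performs similarly or superior to Bayesian optimization and more advanced ensemble approaches. In conclusion, our method fully automatically optimizes the construction of radiomics workflows, thereby streamlining the search for radiomics biomarkers in new applications. To facilitate reproducibility and future research, we publicly release six datasets, the software implementation of our framework, and the code to reproduce this study.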
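The idea of optimizing a modular workflow with random search and then keeping several good candidates for an ensemble can be sketched in a few lines. This is a generic illustration rather than the framework's actual implementation: the component names in `SEARCH_SPACE` are hypothetical, and `score_fn` stands in for whatever validation metric (for example a cross-validated AUC) a real pipeline would compute.

```python
import random

# Hypothetical search space: one list of candidate algorithms per workflow component.
SEARCH_SPACE = {
    "scaler": ["none", "z_score", "min_max"],
    "feature_selection": ["none", "variance_threshold", "univariate"],
    "classifier": ["svm", "random_forest", "logistic_regression"],
}


def sample_workflow(rng):
    """Draw one workflow by picking a random option for every component."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def random_search(score_fn, n_trials, top_k, seed=0):
    """Sample `n_trials` workflows and return the `top_k` best by `score_fn`.

    The returned workflows are the candidates an ensemble would be built from.
    """
    rng = random.Random(seed)
    trials = [sample_workflow(rng) for _ in range(n_trials)]
    return sorted(trials, key=score_fn, reverse=True)[:top_k]


def ensemble_prediction(member_predictions):
    """Average the predicted probabilities of the ensemble members per sample."""
    return [sum(step) / len(step) for step in zip(*member_predictions)]


def toy_score(workflow):
    """Placeholder score; a real pipeline would train and validate the workflow."""
    return (workflow["classifier"] == "svm") + 0.5 * (workflow["scaler"] == "z_score")


best = random_search(toy_score, n_trials=50, top_k=3)
averaged = ensemble_prediction([[0.25, 0.75], [0.75, 0.25]])
```

Keeping the top few workflows instead of a single winner is what makes the ensembling step possible; the choice of `top_k` here is arbitrary.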

Large Language Models as Corporate Lobbyists

John J. Nay

Categories: Natural Language Processing

2023-01-03

We demonstrate a proof-of-concept of a large language model conducting corporate lobbying related activities. We use an autoregressive large language model (OpenAI's text-davinci-003) to determine if proposed U.S. Congressional bills are relevant to specific public companies and provide explanations and confidence levels. For the bills the model deems as relevant, the model drafts a letter to the sponsor of the bill in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of ground-truth labels of the relevance of a bill to a company to benchmark the performance of the model, which outperforms the baseline of predicting the most common outcome of irrelevance. However, we test the ability to determine the relevance of a bill with the previous OpenAI GPT-3 model (text-davinci-002), which was state-of-the-art on many language tasks until text-davinci-003 was released on November 28, 2022. The performance of text-davinci-002 is worse than simply always predicting that a bill is irrelevant to a company. These results suggest that, as large language models continue to improve core natural language understanding capabilities, performance on corporate lobbying related tasks will continue to improve. We then discuss why this could be problematic for societal-AI alignment.
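The benchmark described above compares a model against the baseline of always predicting the most common outcome. A minimal sketch of that comparison follows; the labels and predictions are made-up toy values, not data from the paper, and the function names are hypothetical.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most common label and the accuracy of always predicting it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Toy ground truth: True means the bill is relevant to the company.
labels = [False, False, False, True]
model_predictions = [False, False, True, True]

baseline_label, baseline_accuracy = majority_baseline(labels)
model_accuracy = accuracy(model_predictions, labels)
model_beats_baseline = model_accuracy > baseline_accuracy
```

When one class dominates, as irrelevance does in the abstract, this baseline can be hard to beat, which is why a model scoring below it is a meaningful negative result.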

Benchmarking common uncertainty estimation methods with histopathological images under domain shift and label noise

Hendrik A. Mehrtens , Alexander Kurz , Tabea-Clara Bucher , Titus J. Brinker

Categories: Computer Vision | Machine Learning

2023-01-03

In the past years, deep learning has seen an increase of usage in the domain of histopathological applications. However, while these approaches have shown great potential, in high-risk environments deep learning models need to be able to judge their own uncertainty and be able to reject inputs when there is a significant chance of misclassification. In this work, we conduct a rigorous evaluation of the most commonly used uncertainty and robustness methods for the classification of Whole-Slide-Images under domain shift using the H&E stained Camelyon17 breast cancer dataset. Although it is known that histopathological data can be subject to strong domain shift and label noise, to our knowledge this is the first work that compares the most common methods for uncertainty estimation under these aspects. In our experiments, we compare Stochastic Variational Inference, Monte-Carlo Dropout, Deep Ensembles, Test-Time Data Augmentation as well as combinations thereof. We observe that ensembles of methods generally lead to higher accuracies and better calibration and that Test-Time Data Augmentation can be a promising alternative when choosing an appropriate set of augmentations. Across methods, a rejection of the most uncertain tiles leads to a significant increase in classification accuracy on both in-distribution as well as out-of-distribution data. Furthermore, we conduct experiments comparing these methods under varying conditions of label noise. We observe that the border regions of the Camelyon17 dataset are subject to label noise and evaluate the robustness of the included methods against different noise levels. Lastly, we publish our code framework to facilitate further research on uncertainty estimation on histopathological data.
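Rejecting the most uncertain tiles and measuring accuracy on the rest can be sketched as follows. The sketch assumes predictive entropy as the uncertainty score, which is one common choice; the abstract does not state which score the authors use, and the function names and toy probabilities are hypothetical.

```python
import math


def predictive_entropy(probs):
    """Entropy of a predicted class distribution; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def accuracy_after_rejection(prob_rows, labels, reject_fraction):
    """Accuracy on the tiles kept after rejecting the most uncertain fraction.

    Tiles are ranked by predictive entropy and the top `reject_fraction`
    (rounded down) are discarded; at least one tile is always kept.
    """
    ranked = sorted(zip(prob_rows, labels), key=lambda pair: predictive_entropy(pair[0]))
    n_keep = max(1, len(ranked) - int(len(ranked) * reject_fraction))
    kept = ranked[:n_keep]
    correct = sum(max(range(len(p)), key=p.__getitem__) == y for p, y in kept)
    return correct / len(kept)


# Toy predictions for four tiles (two classes) with their true labels.
prob_rows = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.55, 0.45]]
labels = [0, 1, 1, 1]

accuracy_all = accuracy_after_rejection(prob_rows, labels, reject_fraction=0.0)
accuracy_half = accuracy_after_rejection(prob_rows, labels, reject_fraction=0.5)
```

In this toy data the two misclassified tiles are also the two most uncertain ones, so rejecting half of the tiles raises the accuracy; on real data the gain depends on how well the uncertainty score tracks the errors.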
