Smart Paper Notes

Federated Causal Inference in Heterogeneous Observational Data

Ruoxuan Xiong , Allison Koenecke , Michael Powell , Zhu Shen , Joshua T. Vogelstein , Susan Athey

Category: Machine Learning

2021-07-25

We are interested in estimating the effect of a treatment applied to individuals at multiple sites, where data is stored locally for each site. Due to privacy constraints, individual-level data cannot be shared across sites; the sites may also have heterogeneous populations and treatment assignment mechanisms. Motivated by these considerations, we develop federated methods to draw inference on the average treatment effects of combined data across sites. Our methods first compute summary statistics locally using propensity scores and then aggregate these statistics across sites to obtain point and variance estimators of average treatment effects. We show that these estimators are consistent and asymptotically normal. To achieve these asymptotic properties, we find that the aggregation schemes need to account for the heterogeneity in treatment assignments and in outcomes across sites. We demonstrate the validity of our federated methods through a comparative study of two large medical claims databases.

Translated by Google Translate
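The two-stage idea in the abstract above (compute propensity-score-based summary statistics locally, then aggregate them across sites) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, each site is assumed to know its own propensity scores, all sites are assumed to share one average treatment effect, and inverse-variance weighting stands in for the aggregation step. The abstract itself stresses that a valid aggregation scheme must account for heterogeneity across sites, which this toy version does not attempt.

```python
import random


def local_ipw_summary(outcomes, treatments, propensities):
    """Per-site summary: an IPW estimate of the ATE and its estimated variance.

    Only these two numbers leave the site; individual-level rows stay local.
    Assumes the propensity scores are known (an illustrative simplification).
    """
    n = len(outcomes)
    # Per-individual IPW score: Y*W/e - Y*(1-W)/(1-e)
    scores = [
        y * w / e - y * (1 - w) / (1 - e)
        for y, w, e in zip(outcomes, treatments, propensities)
    ]
    ate = sum(scores) / n
    # Sample variance of the scores, divided by n, estimates Var(ate)
    var = sum((s - ate) ** 2 for s in scores) / (n - 1) / n
    return ate, var


def aggregate_inverse_variance(summaries):
    """Pool site-level (estimate, variance) pairs with inverse-variance weights."""
    weights = [1.0 / var for _, var in summaries]
    total = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, summaries)) / total
    return pooled, 1.0 / total


def simulate_site(n, propensity, true_effect, rng):
    """Hypothetical data generator: Y = true_effect * W + standard normal noise."""
    outcomes, treatments = [], []
    for _ in range(n):
        w = 1 if rng.random() < propensity else 0
        outcomes.append(true_effect * w + rng.gauss(0.0, 1.0))
        treatments.append(w)
    return outcomes, treatments, [propensity] * n


rng = random.Random(0)
# Two sites with different treatment assignment mechanisms (0.5 vs. 0.2)
site_summaries = [
    local_ipw_summary(*simulate_site(5000, p, 2.0, rng)) for p in (0.5, 0.2)
]
pooled_ate, pooled_var = aggregate_inverse_variance(site_summaries)
print(f"pooled ATE estimate: {pooled_ate:.3f} (std. err. {pooled_var ** 0.5:.3f})")
```

In this sketch the site with the more balanced treatment assignment produces a lower-variance estimate and therefore receives more weight in the pooled result.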

Optimal Experimental Design for Staggered Rollouts

Ruoxuan Xiong , Susan Athey , Mohsen Bayati , Guido Imbens

Category: Machine Learning (Statistics)

2019-11-09

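In this paper, we study the problem of designing experiments that are conducted on a set of units, such as users or groups of users in an online marketplace, over multiple time periods, such as weeks or months. These experiments are particularly useful for studying treatments that have causal effects on both current and future outcomes (instantaneous and lagged effects). The design problem involves selecting a treatment time for each unit, before or during the experiment, in order to most precisely estimate the instantaneous and lagged effects after the experiment. This optimization of the treatment decisions can directly minimize the opportunity cost of the experiment by reducing its sample size requirement. The optimization is an NP-hard integer program, for which we provide a near-optimal solution when the design decisions are all made at the beginning (fixed-sample-size designs). Next, we study sequential experiments that allow adaptive decisions during the experiment and may also stop the experiment early, further reducing its cost. However, the sequential nature of these experiments complicates both the design phase and the estimation phase. We propose a new algorithm, PGAE, that addresses these challenges by adaptively making treatment decisions, estimating the treatment effects, and drawing valid post-experimentation inference. PGAE combines ideas from Bayesian statistics, dynamic programming, and sample splitting. Using synthetic experiments on real data sets from multiple domains, we demonstrate that, compared to benchmarks, our proposed solutions for fixed-sample-size and sequential experiments reduce the opportunity cost of the experiments by 50% and 70%, respectively.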

Estimation and Inference of Heterogeneous Treatment Effects using Random Forests

Stefan Wager , Susan Athey

Category:

2015-10-14

Many scientific and engineering challenges, ranging from personalized medicine to customized marketing recommendations, require an understanding of treatment effect heterogeneity. In this paper, we develop a non-parametric causal forest for estimating heterogeneous treatment effects that extends Breiman's widely used random forest algorithm. In the potential outcomes framework with unconfoundedness, we show that causal forests are pointwise consistent for the true treatment effect, and have an asymptotically Gaussian and centered sampling distribution. We also discuss a practical method for constructing asymptotic confidence intervals for the true treatment effect that are centered at the causal forest estimates. Our theoretical results rely on a generic Gaussian theory for a large family of random forest algorithms. To our knowledge, this is the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference. In experiments, we find causal forests to be substantially more powerful than classical methods based on nearest-neighbor matching, especially in the presence of irrelevant covariates.

HACA3: A Unified Approach for Multi-site MR Image Harmonization

Lianrui Zuo , Yihao Liu , Yuan Xue , Blake E. Dewey , Murat Bilgel , Ellen M. Mowry , Scott D. Newsome , Peter A. Calabresi , Susan M. Resnick , Jerry L. Prince

Category: Computer Vision

2022-12-12

The lack of standardization is a prominent issue in magnetic resonance (MR) imaging. This often causes undesired contrast variations due to differences in hardware and acquisition parameters. In recent years, MR harmonization using image synthesis with disentanglement has been proposed to compensate for the undesired contrast variations. Despite the success of existing methods, we argue that three major improvements can be made. First, most existing methods are built upon the assumption that multi-contrast MR images of the same subject share the same anatomy. This assumption is questionable since different MR contrasts are specialized to highlight different anatomical features. Second, these methods often require a fixed set of MR contrasts for training (e.g., both T1-weighted and T2-weighted images must be available), which limits their applicability. Third, existing methods generally are sensitive to imaging artifacts. In this paper, we present a novel approach, Harmonization with Attention-based Contrast, Anatomy, and Artifact Awareness (HACA3), to address these three issues. We first propose an anatomy fusion module that enables HACA3 to respect the anatomical differences between MR contrasts. HACA3 is also robust to imaging artifacts and can be trained and applied to any set of MR contrasts. Experiments show that HACA3 achieves state-of-the-art performance under multiple image quality metrics. We also demonstrate the applicability of HACA3 on downstream tasks with diverse MR datasets acquired from 21 sites with different field strengths, scanner platforms, and acquisition protocols.

A perspective on physical reservoir computing with nanomagnetic devices

Dan A Allwood , Matthew O A Ellis , David Griffin , Thomas J Hayward , Luca Manneschi , Mohammad F KH Musameh , Simon O'Keefe , Susan Stepney , Charles Swindells , Martin A Trefzer

Category: Machine Learning

2022-12-09

Neural networks have revolutionized the area of artificial intelligence and introduced transformative applications to almost every scientific field and industry. However, this success comes at a great price; the energy requirements for training advanced models are unsustainable. One promising way to address this pressing issue is by developing low-energy neuromorphic hardware that directly supports the algorithm's requirements. The intrinsic non-volatility, non-linearity, and memory of spintronic devices make them appealing candidates for neuromorphic devices. Here we focus on the reservoir computing paradigm, a recurrent network with a simple training algorithm suitable for computation with spintronic devices since they can provide the properties of non-linearity and memory. We review technologies and methods for developing neuromorphic spintronic devices and conclude with critical open issues to address before such devices become widely used.

Beyond Discrete Genres: Mapping News Items onto a Multidimensional Framework of Genre Cues

Zilin Lin , Kasper Welbers , Susan Vermeer , Damian Trilling

Category: Natural Language Processing

2022-12-08

In the contemporary media landscape, with its vast and diverse supply of news, it is increasingly challenging to study such an enormous number of items without a standardized framework. Although attempts have been made to organize and compare news items on the basis of news values, news genres receive little attention, especially the genres in a news consumer's perception. Yet, perceived news genres serve as an essential component in exploring how news has developed, as well as a precondition for understanding media effects. We approach this concept by conceptualizing and operationalizing a non-discrete framework for mapping news items in terms of genre cues. As a starting point, we propose a preliminary set of dimensions consisting of "factuality" and "formality". To automatically analyze a large number of news items, we deliver two computational models for predicting news sentences in terms of these two dimensions. Such predictions could then be used for locating news items within our framework. This proposed approach, which positions news items on a multidimensional grid, helps deepen our insight into the evolving nature of news genres.

FedUKD: Federated UNet Model with Knowledge Distillation for Land Use Classification from Satellite and Street Views

Renuga Kanagavelu , Kinshuk Dua , Pratik Garai , Susan Elias , Neha Thomas , Simon Elias , Qingsong Wei , Goh Siow Mong Rick , Liu Yong

Category: Computer Vision | Machine Learning

2022-12-05

Federated Deep Learning frameworks can be used strategically to monitor Land Use locally and infer environmental impacts globally. Distributed data from across the world would be needed to build a global model for Land Use classification. A Federated approach suits this application domain because it avoids transferring data from distributed locations and saves network bandwidth, reducing communication cost. We use a Federated UNet model for Semantic Segmentation of satellite and street view images. The novelty of the proposed architecture is the integration of Knowledge Distillation to reduce communication cost and response time. The accuracy obtained was above 95%, and we also achieved significant model compression, by over 17 times and 62 times for street view and satellite images, respectively. Our proposed framework has the potential to be a game-changer in real-time tracking of climate change across the planet.

Modeling Mobile Health Users as Reinforcement Learning Agents

Eura Shin , Siddharth Swaroop , Weiwei Pan , Susan Murphy , Finale Doshi-Velez

Category: Machine Learning | Artificial Intelligence

2022-12-01

Mobile health (mHealth) technologies empower patients to adopt and maintain healthy behaviors in their daily lives by providing interventions (e.g., push notifications) tailored to the user's needs. In these settings, without intervention, human decision making may be impaired (e.g., valuing near-term pleasure over one's own long-term goals). In this work, we formalize this relationship with a framework in which the user optimizes a (potentially impaired) Markov Decision Process (MDP) and the mHealth agent intervenes on the user's MDP parameters. We show that different types of impairments imply different types of optimal intervention. We also provide analytical and empirical explorations of these differences.

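The framing in the abstract above, where a user plans in a possibly impaired MDP and an agent intervenes on that MDP's parameters, can be made concrete with a toy Python sketch. The three-state chain, the reward values, the use of a low discount factor as the "impairment", and the `work_bonus` parameter as the "intervention" are all hypothetical choices made for illustration; none of them comes from the paper.

```python
GOAL = 2                      # absorbing "healthy habit" state (hypothetical)
ACTIONS = ("rest", "work")


def step(state, action, work_bonus=0.0):
    """Deterministic toy dynamics: return (reward, next_state)."""
    if state == GOAL:
        return 3.0, GOAL              # sustained long-term benefit
    if action == "rest":
        return 1.0, state             # small immediate pleasure, no progress
    return work_bonus, state + 1      # effort now, one step closer to the goal


def q_value(values, state, action, gamma, work_bonus):
    """One-step lookahead value of taking `action` in `state`."""
    reward, next_state = step(state, action, work_bonus)
    return reward + gamma * values[next_state]


def plan(gamma, work_bonus=0.0, sweeps=1000):
    """Run value iteration, then return the greedy action in each non-goal state."""
    values = [0.0, 0.0, 0.0]
    for _ in range(sweeps):
        # Synchronous update: the new list is built entirely from the old values
        values = [
            max(q_value(values, s, a, gamma, work_bonus) for a in ACTIONS)
            for s in range(GOAL + 1)
        ]
    return {
        s: max(ACTIONS, key=lambda a: q_value(values, s, a, gamma, work_bonus))
        for s in range(GOAL)
    }


print("myopic user (gamma=0.5):      ", plan(0.5))
print("far-sighted user (gamma=0.95):", plan(0.95))
print("myopic user with nudge:       ", plan(0.5, work_bonus=0.6))
```

With these numbers, the myopic user keeps resting in the starting state, while the far-sighted user works toward the goal. Adding a small immediate reward for effort is enough to change the myopic user's plan, which is one way to read "intervening on the user's MDP parameters" in this toy setting.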

Doubly robust nearest neighbors in factor models

Raaz Dwivedi , Katherine Tian , Sabina Tomkins , Predrag Klasnja , Susan Murphy , Devavrat Shah

Category: Machine Learning (Statistics) | Machine Learning

2022-11-25

In this technical note, we introduce an improved variant of nearest neighbors for counterfactual inference in panel data settings where multiple units are assigned multiple treatments over multiple time points, each sampled with constant probabilities. We call this estimator a doubly robust nearest neighbor estimator and provide a high probability non-asymptotic error bound for the mean parameter corresponding to each unit at each time. Our guarantee shows that the doubly robust estimator provides a (near-)quadratic improvement in the error compared to nearest neighbor estimators analyzed in prior work for these settings.

Deep Learning Based Detection of Enlarged Perivascular Spaces on Brain MRI

Tanweer Rashid , Hangfan Liu , Jeffrey B. Ware , Karl Li , Jose Rafael Romero , Elyas Fadaee , Ilya M. Nasrallah , Saima Hilal , R. Nick Bryan , Timothy M. Hughes

Category: Computer Vision | Machine Learning

2022-09-27

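Deep learning has proven effective in many neuroimaging applications. However, in many scenarios, the number of imaging sequences that capture information related to small vessel disease is insufficient to support data-driven techniques. In addition, cohort-based studies may not always have the optimal or required imaging sequences for accurate lesion detection. It is therefore necessary to determine which imaging sequences are essential for accurate detection. In this study, we aim to find the optimal combination of magnetic resonance imaging (MRI) sequences for deep-learning-based detection of enlarged perivascular spaces (ePVS). To this end, we implement an effective lightweight U-Net adapted for ePVS detection and comprehensively investigate different combinations of information from susceptibility-weighted imaging (SWI), fluid-attenuated inversion recovery (FLAIR), T1-weighted (T1w), and T2-weighted (T2w) MRI sequences. We conclude that T2w MRI is the most important for accurate ePVS detection, and that incorporating SWI, FLAIR, and T1w MRI into the deep neural network may yield only insignificant improvements in accuracy.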
