Smart Paper Notes

Causal Feature Selection via Orthogonal Search

Ashkan Soleymani , Anant Raj , Stefan Bauer , Bernhard Schölkopf , Michel Besserve

Categories: Machine Learning (Statistics) | Machine Learning

2020-07-06

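The problem of inferring the direct causal parents of a response variable among a large set of explanatory variables is of high practical importance in many disciplines. However, established approaches often scale at least exponentially with the number of explanatory variables, are difficult to extend to nonlinear relationships, and are difficult to extend to cyclic data. Inspired by debiased machine learning methods, we study a one-vs.-the-rest feature selection approach to discover the direct causal parents of the response. We propose an algorithm that works for purely observational data while also offering theoretical guarantees, including the case of partially nonlinear relationships, possibly in the presence of cycles. As it requires only one estimation for each variable, our approach is applicable even to large graphs. We demonstrate significant improvements compared to established approaches.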

translated by Google Translate

Instrumental Variables in Causal Inference and Machine Learning: A Survey

Anpeng Wu , Kun Kuang , Ruoxuan Xiong , Fei Wu

Categories: Machine Learning | Artificial Intelligence

2022-12-12

Causal inference is the process of using assumptions, study designs, and estimation strategies to draw conclusions about the causal relationships between variables based on data. This allows researchers to better understand the underlying mechanisms at work in complex systems and make more informed decisions. In many settings, we may not fully observe all the confounders that affect both the treatment and outcome variables, complicating the estimation of causal effects. To address this problem, a growing literature in both causal inference and machine learning proposes to use Instrumental Variables (IV). This paper serves as the first effort to systematically and comprehensively introduce and discuss IV methods and their applications in both causal inference and machine learning. First, we provide the formal definition of IVs and discuss the identification problem of IV regression methods under different assumptions. Second, we categorize the existing work on IV methods into three streams according to the focus of the proposed methods: two-stage least squares with IVs, control function with IVs, and evaluation of IVs. For each stream, we present both the classical causal inference methods and recent developments in the machine learning literature. Then, we introduce a variety of applications of IV methods in real-world scenarios and provide a summary of the available datasets and algorithms. Finally, we summarize the literature, discuss the open problems, and suggest promising future research directions for IV methods and their applications. We also develop a toolkit of the IV methods reviewed in this survey at https://github.com/causal-machine-learning-lab/mliv.
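To make the two-stage least squares stream concrete, here is a minimal sketch on simulated data. It is not taken from the survey or its toolkit: the data-generating process, the coefficient values, and every name (`cov`, `tsls_slope`, and so on) are assumptions chosen for illustration. With one instrument and one treatment, the second-stage slope reduces to cov(Z, Y) / cov(Z, X).

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / n

# Hypothetical simulation: U is an unobserved confounder, Z a valid instrument.
rng = random.Random(0)
n = 20000
true_effect = 2.0
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_effect * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

# Ordinary least squares of Y on X is biased because U drives both X and Y.
ols_slope = cov(x, y) / cov(x, x)

# Stage 1: regress the treatment X on the instrument Z and keep fitted values.
stage1_slope = cov(z, x) / cov(z, z)
mean_z, mean_x = sum(z) / n, sum(x) / n
x_hat = [mean_x + stage1_slope * (zi - mean_z) for zi in z]

# Stage 2: regress the outcome Y on the fitted values from stage 1.
tsls_slope = cov(x_hat, y) / cov(x_hat, x_hat)

print(f"OLS slope:  {ols_slope:.3f}")
print(f"2SLS slope: {tsls_slope:.3f}")
```

In this simulated setup the OLS slope drifts above the true effect of 2.0, while the 2SLS slope stays close to it, which is the motivation for using an instrument when confounders are unobserved.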


Causal discovery under a confounder blanket

David S. Watson , Ricardo Silva

Categories: Artificial Intelligence | Machine Learning (Statistics)

2022-05-11

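Inferring causal relationships from observational data is rarely straightforward, but the problem is especially difficult in high dimensions. For these applications, causal discovery algorithms typically require parametric restrictions or extreme sparsity constraints. We relax these assumptions and focus on an important but more specialized problem, namely recovering the causal order among a subgraph of variables known to descend from some (possibly large) set of confounding covariates, i.e. a $\textit{confounder blanket}$. This is useful in many settings, for example when studying a dynamic biomolecular subsystem with genetic data providing background information. Under a structural assumption called the $\textit{confounder blanket principle}$, which we argue is essential for tractable causal discovery in high dimensions, our method accommodates graphs of low or high sparsity while maintaining polynomial time complexity. We present a structure learning algorithm that is sound and complete with respect to a so-called $\textit{lazy oracle}$. We design inference procedures with finite-sample error control for linear and nonlinear systems, and demonstrate our approach on a range of simulated and real-world datasets. An accompanying $\texttt{R}$ package, $\texttt{cbl}$, is available from $\texttt{CRAN}$.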


Deep End-to-end Causal Inference

Tomas Geffner , Javier Antoran , Adam Foster , Wenbo Gong , Chao Ma , Emre Kiciman , Amit Sharma , Angus Lamb , Martin Kukla , Nick Pawlowski

Categories: Machine Learning (Statistics) | Machine Learning

2022-02-04

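Causal inference is essential for data-driven decision making across domains such as business engagement, medical treatment, and policy making. However, research on causal discovery has evolved separately from inference methods, preventing a straightforward combination of methods from both fields. In this work, we develop Deep End-to-end Causal Inference (DECI), a flow-based nonlinear additive noise model that takes in observational data and can perform both causal discovery and inference, including conditional average treatment effect (CATE) estimation. We provide a theoretical guarantee that DECI can recover the ground-truth causal graph under standard causal discovery assumptions. Motivated by application impact, we extend this model to heterogeneous, mixed-type data with missing values, allowing for both continuous and discrete treatment decisions. Our results show the competitive performance of DECI when compared to relevant baselines for both causal discovery and (C)ATE estimation in over a thousand experiments on both synthetic datasets and causal machine learning benchmarks, across data types and levels of missingness.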


A Causal Research Pipeline and Tutorial for Psychologists and Social Scientists

Matthew J. Vowels

Categories: Machine Learning (Statistics)

2022-06-10

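Causality is a fundamental part of the scientific endeavour to understand the world. Unfortunately, causality is still taboo in much of psychology and social science. Motivated by a growing number of recommendations on the importance of adopting causal approaches to research, we reformulate the typical approach to research in psychology to harmonize inevitably causal theories with the rest of the research pipeline. We present a new process which begins with the incorporation of techniques from the confluence of causal discovery and machine learning for the development, validation, and transparent formal specification of theories. We then present methods for reducing the complexity of the fully specified theoretical model into the fundamental submodel relevant to a given target hypothesis. From here, we establish whether or not the quantity of interest is estimable from the data and, if so, propose the use of semi-parametric machine learning methods for the estimation of causal effects. The overall goal is the presentation of a new research pipeline which can (a) facilitate scientific inquiry compatible with the desire to test causal theories, (b) encourage transparent representation of our theories as unambiguous mathematical objects, (c) tie our statistical models to specific attributes of the theory, thus reducing under-specification problems frequently resulting from the theory-to-model gap, and (d) yield results and estimates which are causally meaningful and reproducible. The process is demonstrated through didactic examples with real-world data, and we conclude with a summary and discussion.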


Characterization and Greedy Learning of Gaussian Structural Causal Models under Unknown Interventions

Juan L. Gamella , Armeen Taeb , Christina Heinze-Deml , Peter Bühlmann

Categories: Machine Learning (Statistics)

2022-11-27

We consider the problem of recovering the causal structure underlying observations from different experimental conditions when the targets of the interventions in each experiment are unknown. We assume a linear structural causal model with additive Gaussian noise and consider interventions that perturb their targets while maintaining the causal relationships in the system. Different models may entail the same distributions, offering competing causal explanations for the given observations. We fully characterize this equivalence class and offer identifiability results, which we use to derive a greedy algorithm called GnIES to recover the equivalence class of the data-generating model without knowledge of the intervention targets. In addition, we develop a novel procedure to generate semi-synthetic data sets with known causal ground truth but distributions closely resembling those of a real data set of choice. We leverage this procedure and evaluate the performance of GnIES on synthetic, real, and semi-synthetic data sets. Despite the strong Gaussian distributional assumption, GnIES is robust to an array of model violations and competitive in recovering the causal structure in small- to large-sample settings. We provide, in the Python packages "gnies" and "sempler", implementations of GnIES and our semi-synthetic data generation procedure.


Semiparametric Inference For Causal Effects In Graphical Models With Hidden Variables

Rohit Bhattacharya , Razieh Nabi , Ilya Shpitser

Categories: Machine Learning (Statistics) | Machine Learning

2020-03-27

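Identification theory for causal effects in causal models associated with hidden variable directed acyclic graphs (DAGs) is well studied. However, the corresponding algorithms are underused due to the complexity of estimating the identifying functionals they output. In this work, we bridge the gap between identification and estimation of population-level causal effects involving a single treatment and a single outcome. We derive influence function based estimators that exhibit double robustness for the identified effects in a large class of hidden variable DAGs where the treatment satisfies a simple graphical criterion; this class includes models yielding the adjustment and front-door functionals as special cases. We also provide necessary and sufficient conditions under which the statistical model of a hidden variable DAG is nonparametrically saturated and implies no equality constraints on the observed data distribution. Further, we derive an important class of hidden variable DAGs that imply observed data distributions observationally equivalent (up to equality constraints) to fully observed DAGs. In these classes of DAGs, we derive estimators that achieve the semiparametric efficiency bounds for the target of interest where the treatment satisfies our graphical criterion. Finally, we provide a complete identification algorithm that directly yields a weight-based estimation strategy for any identifiable effect in hidden variable causal models.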


Structural Agnostic Modeling: Adversarial Learning of Causal Graphs

Diviyan Kalainathan , Olivier Goudet , Isabelle Guyon , David Lopez-Paz , Michèle Sebag

Categories: Machine Learning (Statistics)

2018-03-13

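This paper presents a new causal discovery method, Structural Agnostic Modeling (SAM). Leveraging both conditional independencies and distributional asymmetries, SAM aims to find the underlying causal structure from observational data. The approach is based on a game between different players, each estimating one variable's distribution conditionally on the others as a neural net, and an adversary that aims to discriminate the generated data from the original data. A learning criterion combining distribution estimation, sparsity, and acyclicity constraints is used to enforce the optimization of the graph structure and parameters through stochastic gradient descent. SAM is experimentally validated on synthetic and real data.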


Causal Discovery in Linear Structural Causal Models with Deterministic Relations

Yuqin Yang , Mohamed Nafea , AmirEmad Ghassami , Negar Kiyavash

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2021-10-30

Linear structural causal models (SCMs), in which each observed variable is generated by a subset of the other observed variables as well as a subset of the exogenous sources, are pervasive in causal inference and causal discovery. However, for the task of causal discovery, existing work almost exclusively focuses on the submodel where each observed variable is associated with a distinct source with non-zero variance. This results in the restriction that no observed variable can deterministically depend on other observed variables or latent confounders. In this paper, we extend the results on structure learning by focusing on a subclass of linear SCMs which do not have this property, i.e., models in which observed variables can be causally affected by any subset of the sources, and are allowed to be a deterministic function of other observed variables or latent confounders. This allows for a more realistic modeling of influence or information propagation in systems. We focus on the task of causal discovery from observational data generated from a member of this subclass. We derive a set of necessary and sufficient conditions for unique identifiability of the causal structure. To the best of our knowledge, this is the first work that gives identifiability results for causal discovery under both latent confounding and deterministic relationships. Further, we propose an algorithm for recovering the underlying causal structure when the aforementioned conditions are satisfied. We validate our theoretical results on both synthetic and real datasets.


Feature selection in stratification estimators of causal effects: lessons from potential outcomes, causal diagrams, and structural equations

P. Richard Hahn , Andrew Herren

Categories: Machine Learning (Statistics)

2022-09-23

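What is the ideal regression (if any) for estimating average causal effects? We study this question in the setting of discrete covariates, deriving expressions for the finite-sample variance of various stratification estimators. This approach clarifies the fundamental statistical phenomena underlying many widely cited results. Our exposition combines insights from three distinct methodological traditions for studying causal effect estimation: potential outcomes, causal diagrams, and structural models with additive errors.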


The SKIM-FA Kernel: High-Dimensional Variable Selection and Nonlinear Interaction Discovery in Linear Time

Raj Agrawal , Tamara Broderick

Categories: Machine Learning (Statistics)

2021-06-23

Many scientific problems require identifying a small set of covariates that are associated with a target response and estimating their effects. Often, these effects are nonlinear and include interactions, so linear and additive methods can lead to poor estimation and variable selection. Unfortunately, methods that simultaneously express sparsity, nonlinearity, and interactions are computationally intractable -- with runtime at least quadratic in the number of covariates, and often worse. In the present work, we solve this computational bottleneck. We show that suitable interaction models have a kernel representation, namely there exists a "kernel trick" to perform variable selection and estimation in $O$(# covariates) time. Our resulting fit corresponds to a sparse orthogonal decomposition of the regression function in a Hilbert space (i.e., a functional ANOVA decomposition), where interaction effects represent all variation that cannot be explained by lower-order effects. On a variety of synthetic and real data sets, our approach outperforms existing methods used for large, high-dimensional data sets while remaining competitive (or being orders of magnitude faster) in runtime.


Localized Debiased Machine Learning: Efficient Inference on Quantile Treatment Effects and Beyond

Nathan Kallus , Xiaojie Mao , Masatoshi Uehara

Categories: Machine Learning (Statistics) | Machine Learning

2019-12-30

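We consider estimating a low-dimensional parameter in an estimating equation involving high-dimensional nuisances that depend on the parameter. A central example is the efficient estimating equation for the (local) quantile treatment effect ((L)QTE) in causal inference, which involves as a nuisance the covariate-conditional cumulative distribution function evaluated at the quantile to be estimated. Debiased machine learning (DML) is a data-splitting approach to estimating high-dimensional nuisances using flexible machine learning methods, but applying it to problems with parameter-dependent nuisances is impractical. For the (L)QTE, DML requires that we learn the whole covariate-conditional cumulative distribution function. We instead propose localized debiased machine learning (LDML), which avoids this burdensome step and only needs to estimate the nuisances at a single initial rough guess for the parameter. For the (L)QTE, LDML involves learning just two regression functions, a standard task for machine learning methods. We prove that, under lax rate conditions, our estimator has the same favorable asymptotic behavior as the infeasible estimator that uses the unknown true nuisances. Thus, LDML notably enables practically feasible, efficient estimation of important quantities such as (L)QTEs when we must control for many covariates and/or flexible relationships.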


Invariant Policy Learning: A Causal Perspective

Sorawit Saengkyongam , Nikolaj Thams , Jonas Peters , Niklas Pfister

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2021-06-01

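Contextual bandit and reinforcement learning algorithms have been successfully used in various interactive learning systems such as online advertising, recommender systems, and dynamic pricing. However, they have yet to be widely adopted in high-stakes application domains such as healthcare. One reason may be that existing approaches assume that the underlying mechanisms are static, in the sense that they do not change across different environments. In many real-world systems, however, the mechanisms are subject to shifts across environments, which may invalidate the static environment assumption. In this paper, we take a step toward tackling the problem of environmental shifts within the framework of offline contextual bandits. We view the environmental shift problem through the lens of causality and propose multi-environment contextual bandits that allow for changes in the underlying mechanisms. We adopt the concept of invariance from the causality literature and introduce the notion of policy invariance. We argue that policy invariance is only relevant if unobserved variables are present and show that, in that case, an optimal invariant policy is guaranteed to generalize across environments under suitable assumptions. Our results establish concrete connections among causality, invariance, and contextual bandits.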


Treatment Effect Estimation from Observational Network Data using Augmented Inverse Probability Weighting and Machine Learning

Corinne Emmenegger , Meta-Lina Spohn , Peter Bühlmann

Categories: Machine Learning (Statistics)

2022-06-29

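Causal inference methods for treatment effect estimation usually assume independent experimental units. However, this assumption is often questionable because experimental units may interact. We develop augmented inverse probability weighting (AIPW) for estimation and inference of causal treatment effects on dependent observational data. Our framework covers very general cases of spillover effects induced by units interacting in networks. We use plug-in machine learning to estimate infinite-dimensional nuisance components, leading to a consistent treatment effect estimator that converges at the parametric rate and asymptotically follows a Gaussian distribution.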
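For orientation, the sketch below shows the classical AIPW estimator of the average treatment effect for independent units. It is not the paper's network estimator, which additionally handles spillover effects and dependence. The simulated data, the single binary covariate, and the cell-mean nuisance estimates (standing in for machine learning fits) are all assumptions made for illustration.

```python
import random

def mean(values):
    return sum(values) / len(values)

# Hypothetical i.i.d. data: binary covariate X confounds treatment A and outcome Y.
rng = random.Random(1)
n = 20000
true_ate = 1.0
data = []
for _ in range(n):
    x = 1 if rng.random() < 0.5 else 0
    propensity = 0.7 if x == 1 else 0.3
    a = 1 if rng.random() < propensity else 0
    y = true_ate * a + 2.0 * x + rng.gauss(0, 1)
    data.append((x, a, y))

# Nuisance estimates by cell means; a flexible learner would be plugged in here.
prop = {s: mean([a for x, a, _ in data if x == s]) for s in (0, 1)}
mu = {(s, t): mean([y for x, a, y in data if x == s and a == t])
      for s in (0, 1) for t in (0, 1)}

def aipw_term(x, a, y):
    """Outcome-model contrast plus inverse-probability-weighted residual corrections."""
    e = prop[x]
    m1, m0 = mu[(x, 1)], mu[(x, 0)]
    return (m1 - m0) + a * (y - m1) / e - (1 - a) * (y - m0) / (1 - e)

aipw_ate = mean([aipw_term(x, a, y) for x, a, y in data])

# Unadjusted difference in means, which ignores the confounder X.
naive_diff = (mean([y for _, a, y in data if a == 1])
              - mean([y for _, a, y in data if a == 0]))

print(f"naive difference: {naive_diff:.3f}")
print(f"AIPW estimate:    {aipw_ate:.3f}")
```

The augmentation term corrects the outcome-model contrast with weighted residuals, so the estimate stays close to the simulated effect of 1.0 while the unadjusted difference in means does not.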


DoubleML -- An Object-Oriented Implementation of Double Machine Learning in R

Philipp Bach , Victor Chernozhukov , Malte S. Kurz , Martin Spindler

Categories: Machine Learning (Statistics) | Machine Learning

2021-03-17

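The R package DoubleML implements the double/debiased machine learning framework of Chernozhukov et al. (2018). It provides functionalities to estimate parameters in causal models based on machine learning methods. The double machine learning framework consists of three key ingredients: Neyman orthogonality, high-quality machine learning estimation, and sample splitting. Estimation of nuisance components can be performed by various state-of-the-art machine learning methods that are available in the mlr3 ecosystem. DoubleML makes it possible to perform inference in a variety of causal models, including partially linear and interactive regression models and their extensions to instrumental variable estimation. The object-oriented implementation of DoubleML provides high flexibility for the model specification and makes it easily extendable. This paper serves as an introduction to the double machine learning framework and the R package DoubleML. In reproducible code examples with simulated and real data sets, we demonstrate how DoubleML users can perform valid inference based on machine learning methods.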
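The three ingredients can be illustrated with a small hand-rolled sketch for the partially linear model Y = θD + g(X) + ε. This is plain Python, not the DoubleML package API: the simulated data, the two-fold split, and the simple linear fit used as a stand-in for a machine learning learner are all assumptions. The residual-on-residual estimate solves the Neyman-orthogonal "partialling-out" score, and the nuisance fits are always evaluated on the held-out fold (sample splitting with cross-fitting).

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope; stands in for any ML nuisance learner."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def dml_plr(x, d, y, n_folds=2):
    """Cross-fitted partialling-out estimate of theta in Y = theta*D + g(X) + eps."""
    n = len(x)
    res_d, res_y = [0.0] * n, [0.0] * n
    for k in range(n_folds):
        test = [i for i in range(n) if i % n_folds == k]
        train = [i for i in range(n) if i % n_folds != k]
        # Fit nuisances E[D|X] and E[Y|X] on the training folds only.
        a_d, b_d = fit_line([x[i] for i in train], [d[i] for i in train])
        a_y, b_y = fit_line([x[i] for i in train], [y[i] for i in train])
        # Residualize the held-out fold with the out-of-fold fits.
        for i in test:
            res_d[i] = d[i] - (a_d + b_d * x[i])
            res_y[i] = y[i] - (a_y + b_y * x[i])
    # Solve the orthogonal score: sum((res_y - theta * res_d) * res_d) = 0.
    return sum(u * v for u, v in zip(res_d, res_y)) / sum(u * u for u in res_d)

# Hypothetical data: X confounds the treatment D and the outcome Y.
rng = random.Random(2)
n = 4000
x = [rng.gauss(0, 1) for _ in range(n)]
d = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [1.5 * di + 2.0 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]

theta_hat = dml_plr(x, d, y)
naive_slope = fit_line(d, y)[1]  # regression of Y on D that ignores X

print(f"naive slope: {naive_slope:.3f}")
print(f"DML theta:   {theta_hat:.3f}")
```

Swapping `fit_line` for a more flexible learner leaves the rest of the procedure unchanged, which is the point of separating nuisance estimation from the orthogonal score.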


Reframed GES with a Neural Conditional Dependence Measure

Xinwei Shen , Shengyu Zhu , Jiji Zhang , Shoubo Hu , Zhitang Chen

Categories: Machine Learning (Statistics) | Machine Learning

2022-06-17

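In a nonparametric setting, the causal structure is often identifiable only up to Markov equivalence, and for the purpose of causal inference it is useful to learn a graphical representation of the Markov equivalence class (MEC). In this paper, we revisit the Greedy Equivalence Search (GES) algorithm, which is widely cited as a score-based algorithm for learning the MEC of the underlying causal structure. We observe that in order to make the GES algorithm consistent in a nonparametric setting, it is not necessary to design a scoring metric that evaluates graphs. Instead, it suffices to plug in a consistent estimator of a measure of conditional dependence to guide the search. We therefore present a reframing of the GES algorithm, which is more flexible than the standard score-based version and readily lends itself to the nonparametric setting with a general measure of conditional dependence. In addition, we propose a neural conditional dependence (NCD) measure, which utilizes the expressive power of deep neural networks to characterize conditional independence in a nonparametric manner. We establish the optimality of the reframed GES algorithm under standard assumptions and the consistency of using our NCD estimator to decide conditional independence. Together these results justify the proposed approach. Experimental results demonstrate the effectiveness of our method in causal discovery, as well as the advantages of using our NCD measure over kernel-based measures.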


Causal Structure Learning: a Combinatorial Perspective

Chandler Squires , Caroline Uhler

Categories: Machine Learning

2022-06-02

In this review, we discuss approaches for learning causal structure from data, also called causal discovery. In particular, we focus on approaches for learning directed acyclic graphs (DAGs) and various generalizations which allow for some variables to be unobserved in the available data. We devote special attention to two fundamental combinatorial aspects of causal structure learning. First, we discuss the structure of the search space over causal graphs. Second, we discuss the structure of equivalence classes over causal graphs, i.e., sets of graphs which represent what can be learned from observational data alone, and how these equivalence classes can be refined by adding interventional data.


Active Invariant Causal Prediction: Experiment Selection through Stability

Juan L. Gamella , Christina Heinze-Deml

Categories: Machine Learning (Statistics)

2020-06-10

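A fundamental difficulty of causal learning is that causal models can generally not be fully identified based on observational data only. Interventional data, that is, data originating from different experimental environments, improves identifiability. However, the improvement depends on the target and nature of the interventions carried out in each experiment. Since experiments tend to be costly in real applications, there is a need to perform the right interventions such that as few as possible are required. In this work we propose a new active learning (i.e. experiment selection) framework (A-ICP) based on Invariant Causal Prediction (ICP) (Peters et al., 2016). For general structural causal models, we characterize the effect of interventions on so-called stable sets, a notion introduced by Pfister et al. (2019). We leverage these results to propose several intervention selection policies for A-ICP which quickly reveal the direct causes of a response variable in the causal graph while maintaining the error control inherent in ICP. Empirically, we analyze the performance of the proposed policies in both population and finite-sample experiments.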


Exploiting Independent Instruments: Identification and Distribution Generalization

Sorawit Saengkyongam , Leonard Henckel , Niklas Pfister , Jonas Peters

Categories: Machine Learning (Statistics) | Machine Learning

2022-02-03

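Instrumental variable models allow us to identify a causal function between covariates $X$ and a response $Y$, even in the presence of unobserved confounding. Most existing estimators assume that the error term in the response $Y$ and the hidden confounders are uncorrelated with the instruments $Z$. This is often motivated by a graphical separation, an argument that also justifies independence. Positing an independence restriction, however, leads to strictly stronger identifiability results. We connect to the existing literature in econometrics and provide a practical method called HSIC-X for exploiting independence that can be combined with any gradient-based learning procedure. We see that even in identifiable settings, taking into account higher moments may yield better finite-sample results. Furthermore, we exploit the independence for distribution generalization. We prove that the proposed estimator is invariant to distributional shifts on the instruments and worst-case optimal whenever these shifts are sufficiently strong. These results hold even in the under-identified case where the instruments are not sufficiently rich to identify the causal function.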


The Dual PC Algorithm for Structure Learning

Enrico Giudice , Jack Kuipers , Giusi Moffa

Categories: Machine Learning (Statistics) | Machine Learning

2021-12-16

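Learning the graphical structure of Bayesian networks from observational data is key to describing and helping understand data-generating processes in complex applications, but the task poses considerable challenges due to its computational complexity. The directed acyclic graph (DAG) representing a Bayesian network model is generally not identifiable from observational data, and a variety of methods exist to estimate its equivalence class instead. Under certain assumptions, the popular PC algorithm can consistently recover the correct equivalence class by testing for conditional independence (CI), starting from marginal independence relationships and progressively expanding the conditioning set. Here, we propose a novel scheme to carry out the CI tests within the PC algorithm by leveraging the inverse relationship between covariance and precision matrices. Notably, the elements of the precision matrix correspond to partial correlations for Gaussian data. Our algorithm then exploits block matrix inversions on the covariance and precision matrices to simultaneously perform tests on partial correlations of complementary (or dual) conditioning sets. The multiple CI tests of the dual PC algorithm therefore proceed by first considering marginal and full-order CI relationships and progressively moving to central-order ones. Simulation studies indicate that the dual PC algorithm outperforms the classical PC algorithm both in terms of run time and in recovering the underlying network structure.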
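The covariance/precision duality behind the first step can be sketched in a few lines. This is only an illustration of the standard Gaussian identity ρ(i, j | rest) = -P_ij / sqrt(P_ii · P_jj), not the authors' implementation. The helper names and the example covariance matrix (the population covariance of a hypothetical chain X1 → X2 → X3 with unit-variance noise) are assumptions.

```python
import math

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def marginal_corr(cov, i, j):
    """Correlation given the empty conditioning set, read off the covariance matrix."""
    return cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])

def full_order_partial_corr(cov, i, j):
    """Partial correlation given all other variables, read off the precision matrix."""
    prec = invert(cov)
    return -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])

# Population covariance of the chain X1 -> X2 -> X3 with unit-variance noise terms.
sigma = [[1, 1, 1],
         [1, 2, 2],
         [1, 2, 3]]

rho_marginal = marginal_corr(sigma, 0, 2)           # X1 and X3 are marginally dependent
rho_partial = full_order_partial_corr(sigma, 0, 2)  # but independent given X2

print(f"marginal corr(X1, X3):     {rho_marginal:.4f}")
print(f"partial corr(X1, X3 | X2): {rho_partial:.4f}")
```

Here the covariance matrix answers the lowest-order question and its inverse answers the highest-order one, which are the two ends from which the dual PC algorithm starts before moving toward intermediate conditioning-set sizes.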
