Causal phenomena associated with rare events frequently occur across a wide range of engineering and mathematical problems, such as risk-sensitive safety analysis, accident analysis and prevention, and extreme value theory. However, current methods for causal discovery are often unable to uncover causal links between random variables that manifest only when the variables first experience low-probability realizations. To address this issue, we introduce a novel algorithm that performs statistical independence tests on data collected from time-invariant dynamical systems in which rare but consequential events occur. We seek to understand if the state of the dynamical system causally affects the likelihood of the rare event. In particular, we exploit the time-invariance of the underlying data to superimpose the occurrences of rare events, thus creating a new dataset, with rare events are better represented, on which conditional independence tests can be more efficiently performed. We provide non-asymptotic bounds for the consistency of our algorithm, and validate the performance of our algorithm across various simulated scenarios, with applications to traffic accidents.
translated by 谷歌翻译
数学模型是动态控制系统设计中的基本构件。随着控制系统变得越来越复杂和网络,基于第一原理的方法达到了限制。数据驱动的方法提供了替代方案。但是,在没有结构知识的情况下,这些方法很容易在训练数据中找到虚假的相关性,这可能会妨碍所获得的模型的概括能力。当系统暴露于未知情况时,这可以显着降低控制和预测性能。先前的因果鉴定可以防止这种陷阱。在本文中,我们提出了一种识别控制系统因果结构的方法。我们根据可控性概念设计实验,该概念提供了一种系统的方法来计算输入轨迹,该输入轨迹将系统引导到其状态空间中的特定区域。然后,我们分析从因果推理中利用强大技术的结果数据,并将其扩展到控制系统。此外,我们得出了保证发现系统真正因果结构的条件。在机器人臂上的实验表明,来自现实世界数据和增强的概括能力的可靠因果鉴定。
translated by 谷歌翻译
A common approach to modeling networks assigns each node to a position on a low-dimensional manifold where distance is inversely proportional to connection likelihood. More positive manifold curvature encourages more and tighter communities; negative curvature induces repulsion. We consistently estimate manifold type, dimension, and curvature from simply connected, complete Riemannian manifolds of constant curvature. We represent the graph as a noisy distance matrix based on the ties between cliques, then develop hypothesis tests to determine whether the observed distances could plausibly be embedded isometrically in each of the candidate geometries. We apply our approach to data-sets from economics and neuroscience.
translated by 谷歌翻译
Network-based analyses of dynamical systems have become increasingly popular in climate science. Here we address network construction from a statistical perspective and highlight the often ignored fact that the calculated correlation values are only empirical estimates. To measure spurious behaviour as deviation from a ground truth network, we simulate time-dependent isotropic random fields on the sphere and apply common network construction techniques. We find several ways in which the uncertainty stemming from the estimation procedure has major impact on network characteristics. When the data has locally coherent correlation structure, spurious link bundle teleconnections and spurious high-degree clusters have to be expected. Anisotropic estimation variance can also induce severe biases into empirical networks. We validate our findings with ERA5 reanalysis data. Moreover we explain why commonly applied resampling procedures are inappropriate for significance evaluation and propose a statistically more meaningful ensemble construction framework. By communicating which difficulties arise in estimation from scarce data and by presenting which design decisions increase robustness, we hope to contribute to more reliable climate network construction in the future.
translated by 谷歌翻译
因果和归因研究对于地球科学发现至关重要,对于为气候,生态和水政策提供信息至关重要。但是,当前的方法需要与科学和利益相关者挑战的复杂性以及数据可用性以及数据驱动方法的充分性相结合。除非通过物理学进行仔细的通知,否则它们会冒着将相关性与因果关系相关或因估计不准确而淹没的风险。鉴于自然实验,对照试验,干预措施和反事实检查通常是不切实际的,因此已经开发了信息理论方法,并在地球科学中不断完善。在这里,我们表明,基于转移熵的因果图最近在具有备受瞩目的发现的地球科学中变得流行,即使增强具有统计学意义,也可能是虚假的。我们开发了一种基于子样本的合奏方法,用于鲁棒性因果分析。模拟数据以及气候和生态水文中的观察表明,这种方法的鲁棒性和一致性。
translated by 谷歌翻译
Classical asymptotic theory for statistical inference usually involves calibrating a statistic by fixing the dimension $d$ while letting the sample size $n$ increase to infinity. Recently, much effort has been dedicated towards understanding how these methods behave in high-dimensional settings, where $d$ and $n$ both increase to infinity together. This often leads to different inference procedures, depending on the assumptions about the dimensionality, leaving the practitioner in a bind: given a dataset with 100 samples in 20 dimensions, should they calibrate by assuming $n \gg d$, or $d/n \approx 0.2$? This paper considers the goal of dimension-agnostic inference; developing methods whose validity does not depend on any assumption on $d$ versus $n$. We introduce an approach that uses variational representations of existing test statistics along with sample splitting and self-normalization to produce a new test statistic with a Gaussian limiting distribution, regardless of how $d$ scales with $n$. The resulting statistic can be viewed as a careful modification of degenerate U-statistics, dropping diagonal blocks and retaining off-diagonal blocks. We exemplify our technique for some classical problems including one-sample mean and covariance testing, and show that our tests have minimax rate-optimal power against appropriate local alternatives. In most settings, our cross U-statistic matches the high-dimensional power of the corresponding (degenerate) U-statistic up to a $\sqrt{2}$ factor.
translated by 谷歌翻译
我们提出了一种基于最大平均差异(MMD)的新型非参数两样本测试,该测试是通过具有不同核带宽的聚合测试来构建的。这种称为MMDAGG的聚合过程可确保对所使用的内核的收集最大化测试能力,而无需持有核心选择的数据(这会导致测试能力损失)或任意内核选择,例如中位数启发式。我们在非反应框架中工作,并证明我们的聚集测试对Sobolev球具有最小自适应性。我们的保证不仅限于特定的内核,而是符合绝对可集成的一维翻译不变特性内核的任何产品。此外,我们的结果适用于流行的数值程序来确定测试阈值,即排列和野生引导程序。通过对合成数据集和现实世界数据集的数值实验,我们证明了MMDAGG优于MMD内核适应的替代方法,用于两样本测试。
translated by 谷歌翻译
我们在右审查的生存时间和协变量之间介绍一般的非参数独立测试,这可能是多变量的。我们的测试统计数据具有双重解释,首先是潜在无限的重量索引日志秩检验的超级索引,具有属于函数的再现内核HILBERT空间(RKHS)的重量函数;其次,作为某些有限措施的嵌入差异的规范,与Hilbert-Schmidt独立性标准(HSIC)测试统计类似。我们研究了测试的渐近性质,找到了足够的条件,以确保我们的测试在任何替代方案下正确拒绝零假设。可以直截了当地计算测试统计,并且通过渐近总体的野外自注程序进行拒绝阈值。对模拟和实际数据的广泛调查表明,我们的测试程序通常比检测复杂的非线性依赖的竞争方法更好。
translated by 谷歌翻译
Testing the significance of a variable or group of variables $X$ for predicting a response $Y$, given additional covariates $Z$, is a ubiquitous task in statistics. A simple but common approach is to specify a linear model, and then test whether the regression coefficient for $X$ is non-zero. However, when the model is misspecified, the test may have poor power, for example when $X$ is involved in complex interactions, or lead to many false rejections. In this work we study the problem of testing the model-free null of conditional mean independence, i.e. that the conditional mean of $Y$ given $X$ and $Z$ does not depend on $X$. We propose a simple and general framework that can leverage flexible nonparametric or machine learning methods, such as additive models or random forests, to yield both robust error control and high power. The procedure involves using these methods to perform regressions, first to estimate a form of projection of $Y$ on $X$ and $Z$ using one half of the data, and then to estimate the expected conditional covariance between this projection and $Y$ on the remaining half of the data. While the approach is general, we show that a version of our procedure using spline regression achieves what we show is the minimax optimal rate in this nonparametric testing problem. Numerical experiments demonstrate the effectiveness of our approach both in terms of maintaining Type I error control, and power, compared to several existing approaches.
translated by 谷歌翻译
我们研究了小组测试问题,其目标是根据合并测试的结果,确定一组k感染的人,这些k含有稀有疾病,这些人在经过测试中至少有一个受感染的个体时返回阳性的结果。团体。我们考虑将个人分配给测试的两个不同的简单随机过程:恒定柱设计和伯努利设计。我们的第一组结果涉及基本统计限制。对于恒定柱设计,我们给出了一个新的信息理论下限,这意味着正确识别的感染者的比例在测试数量越过特定阈值时会经历急剧的“全或全或无所不包”的相变。对于Bernoulli设计,我们确定解决相关检测问题所需的确切测试数量(目的是区分小组测试实例和纯噪声),改善Truong,Aldridge和Scarlett的上限和下限(2020)。对于两个小组测试模型,我们还研究了计算有效(多项式时间)推理程序的能力。我们确定了解决检测问题的低度多项式算法所需的精确测试数量。这为在少量稀疏度的检测和恢复问题中都存在固有的计算统计差距提供了证据。值得注意的是,我们的证据与Iliopoulos和Zadik(2021)相反,后者预测了Bernoulli设计中没有计算统计差距。
translated by 谷歌翻译
随着混凝剂的数量增加,因果推理越来越复杂。给定护理$ x $,混淆器$ z $和结果$ y $,我们开发一个非参数方法来测试\ texit {do-null}假设$ h_0:\; p(y | \ text {\它do}(x = x))= p(y)$违反替代方案。在Hilbert Schmidt独立性标准(HSIC)上进行边缘独立性测试,我们提出了后门 - HSIC(BD-HSIC)并证明它被校准,并且在大量混淆下具有二元和连续治疗的力量。此外,我们建立了BD-HSIC中使用的协方差运算符的估计的收敛性质。我们研究了BD-HSIC对参数测试的优点和缺点以及与边缘独立测试或有条件独立测试相比使用DO-NULL测试的重要性。可以在\超链接{https:/github.com/mrhuff/kgformula} {\ texttt {https://github.com/mrhuff/kgformula}}完整的实现。
translated by 谷歌翻译
稀有事件仿真技术,如重要采样(是),构成强大的工具,以加速罕见灾难性事件的具有挑战性的估算。这些技术经常利用底层系统结构的知识和分析,以赋予赋予理想的效率保证。然而,黑匣子问题,特别是来自最近AI驱动的物理系统的安全关键型应用的问题,可以从根本上破坏他们的效率担保,并在没有诊断地检测的情况下导致危险的估计。我们提出了一个框架,称为深度概率加速评估(Deep-Prae)来设计统计保障是通过转换多功能的黑匣子采样器,但可能缺乏保证,以便我们称之为放松的效率证明,允许准确估计界限。论罕见事件概率。我们介绍了深度PRAE理论,将主导点概念与稀有事件集合通过深度神经网络分类器进行了学习,并证明了其在数值例子中的有效性,包括智能驾驶算法的安全测试。
translated by 谷歌翻译
我们提出了对学度校正随机块模型(DCSBM)的合适性测试。该测试基于调整后的卡方统计量,用于测量$ n $多项式分布的组之间的平等性,该分布具有$ d_1,\ dots,d_n $观测值。在网络模型的背景下,多项式的数量($ n $)的数量比观测值数量($ d_i $)快得多,与节点$ i $的度相对应,因此设置偏离了经典的渐近学。我们表明,只要$ \ {d_i \} $的谐波平均值生长到无穷大,就可以使统计量在NULL下分配。顺序应用时,该测试也可以用于确定社区数量。该测试在邻接矩阵的压缩版本上进行操作,因此在学位上有条件,因此对大型稀疏网络具有高度可扩展性。我们结合了一个新颖的想法,即在测试$ K $社区时根据$(k+1)$ - 社区分配来压缩行。这种方法在不牺牲计算效率的情况下增加了顺序应用中的力量,我们证明了它在恢复社区数量方面的一致性。由于测试统计量不依赖于特定的替代方案,因此其效用超出了顺序测试,可用于同时测试DCSBM家族以外的各种替代方案。特别是,我们证明该测试与具有社区结构的潜在可变性网络模型的一般家庭一致。
translated by 谷歌翻译
We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS), and is called the maximum mean discrepancy (MMD). We present two distributionfree tests based on large deviation bounds for the MMD, and a third test based on the asymptotic distribution of this statistic. The MMD can be computed in quadratic time, although efficient linear time approximations are available. Our statistic is an instance of an integral probability metric, and various classical metrics on distributions are obtained when alternative function classes are used in place of an RKHS. We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests.
translated by 谷歌翻译
我们介绍了学习然后测试,校准机器学习模型的框架,使其预测满足明确的,有限样本统计保证,无论底层模型如何和(未知)数据生成分布。框架地址,以及在其他示例中,在多标签分类中的错误发现速率控制,在实例分割中交叉联盟控制,以及同时控制分类或回归中的异常检测和置信度覆盖的类型误差。为实现这一目标,我们解决了一个关键的技术挑战:控制不一定单调的任意风险。我们的主要洞察力是将风险控制问题重新构建为多个假设检测,使技术和数学论据不同于先前文献中的技术。我们使用我们的框架为多个核心机器学习任务提供新的校准方法,在计算机视觉中具有详细的工作示例。
translated by 谷歌翻译
收购数据是机器学习的许多应用中的一项艰巨任务,只有一个人希望并且预期人口风险在单调上汇率增加(更好的性能)。事实证明,甚至对于最小化经验风险的最大限度的算法,甚至不令人惊讶的情况。在训练中的风险和不稳定的非单调行为表现出并出现在双重血统描述中的流行深度学习范式中。这些问题突出了目前对学习算法和泛化的理解缺乏了解。因此,追求这种行为的表征是至关重要的,这是至关重要的。在本文中,我们在弱假设下获得了一致和风险的单调算法,从而解决了一个打开问题Viering等。 2019关于如何避免风险曲线的非单调行为。我们进一步表明,风险单调性不一定以更糟糕的风险率的价格出现。为实现这一目标,我们推出了持有某些非I.I.D的独立利益的新经验伯恩斯坦的浓度不等式。鞅差异序列等进程。
translated by 谷歌翻译
在制定政策指南时,随机对照试验(RCT)代表了黄金标准。但是,RCT通常是狭窄的,并且缺乏更广泛的感兴趣人群的数据。这些人群中的因果效应通常是使用观察数据集估算的,这可能会遭受未观察到的混杂和选择偏见。考虑到一组观察估计(例如,来自多项研究),我们提出了一个试图拒绝偏见的观察性估计值的元偏值。我们使用验证效应,可以从RCT和观察数据中推断出的因果效应。在拒绝未通过此测试的估计器之后,我们对RCT中未观察到的亚组的外推性效应产生了保守的置信区间。假设至少一个观察估计量在验证和外推效果方面是渐近正常且一致的,我们为我们算法输出的间隔的覆盖率概率提供了保证。为了促进在跨数据集的因果效应运输的设置中,我们给出的条件下,即使使用灵活的机器学习方法用于估计滋扰参数,群体平均治疗效应的双重稳定估计值也是渐近的正常。我们说明了方法在半合成和现实世界数据集上的特性,并表明它与标准的荟萃分析技术相比。
translated by 谷歌翻译
我们建议在线变更点检测的新过程。我们的方法扩展了一个想法,即在变更前和变换后分布之间最大化差异度量。这将导致一个适合参数和非参数场景的灵活过程。我们证明了程序的平均运行长度及其预期的检测延迟。通过关于合成和现实世界数据集的数值实验来说明该算法的效率。
translated by 谷歌翻译
最大信息系数(MIC)是一个强大的统计量,可以识别变量之间的依赖性。但是,它可以应用于敏感数据,并且发布可能会泄漏私人信息。作为解决方案,我们提出算法以提供差异隐私的方式近似麦克风。我们表明,经典拉普拉斯机制的自然应用产生的精度不足。因此,我们介绍了MICT统计量,这是一种新的MIC近似值,与差异隐私更加兼容。我们证明MICS是麦克风的一致估计器,我们提供了两个差异性私有版本。我们对各种真实和合成数据集进行实验。结果表明,私人微统计数据极大地超过了拉普拉斯机制的直接应用。此外,对现实世界数据集的实验显示出准确性,当样本量至少适中时可用。
translated by 谷歌翻译
我们提出了置信度序列 - 置信区间序列,其均匀地随时间均匀 - 用于基于I.I.D的流的完整,完全有序集中的任何分布的量级。观察。我们提供用于跟踪固定定量的方法并同时跟踪所有定量。具体而言,我们提供具有小常数的明确表达式,其宽度以尽可能快的$ \ SQRT {t} \ log \ log t} $率,以及实证分布函数的非渐近浓度不等式以相同的速率均匀地持续持续。后者加强了Smirnov迭代对数的实证过程法,延长了DVORETZKY-KIEFER-WOLFOITZ不等式以均匀地保持一段时间。我们提供了一种新的算法和样本复杂性,用于在多武装强盗框架中选择具有大约最佳定量的臂。在仿真中,我们的方法需要比现有方法更少五到五十的样品。
translated by 谷歌翻译