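We investigate properties of goodness-of-fit tests based on the Kernel Stein Discrepancy (KSD). We introduce a strategy to construct a test, called KSDAgg, which aggregates multiple tests with different kernels. KSDAgg avoids splitting the data to perform kernel selection (which leads to a loss in test power), and instead maximises the test power over a collection of kernels. We provide theoretical guarantees on the power of KSDAgg: we show that it achieves the smallest separation rate of the collection, up to a logarithmic term. KSDAgg can be computed exactly in practice, as it relies either on a parametric bootstrap or on a wild bootstrap to estimate the quantiles and the level corrections. In particular, for the crucial choice of the bandwidth of a fixed kernel, it avoids resorting to arbitrary heuristics (such as the median or standard deviation) or to data splitting. We find on both synthetic and real-world data that KSDAgg outperforms other adaptive KSD-based goodness-of-fit testing procedures.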
Translated by Google Translate
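The wild bootstrap mentioned in the KSD abstract above can be illustrated with a minimal sketch. It assumes a one-dimensional standard normal model (score function s(x) = -x), a Gaussian base kernel with a single fixed bandwidth, and Rademacher signs; the function names and parameters are hypothetical, and the aggregation over several kernels with level correction that defines KSDAgg is not reproduced here.

```python
import math
import random


def stein_kernel(x, y, bandwidth=1.0):
    """Stein kernel for the assumed model N(0, 1) with a Gaussian base kernel."""
    diff = x - y
    inv_bw2 = 1.0 / bandwidth ** 2
    k = math.exp(-0.5 * diff * diff * inv_bw2)
    score_x, score_y = -x, -y  # score of N(0, 1) is s(x) = -x
    return k * (
        score_x * score_y                       # s(x) s(y) k
        + score_x * diff * inv_bw2              # s(x) * dk/dy
        - score_y * diff * inv_bw2              # s(y) * dk/dx
        + inv_bw2 - diff * diff * inv_bw2 ** 2  # d^2 k / dx dy
    )


def ksd_wild_bootstrap(samples, bandwidth=1.0, num_bootstrap=200, seed=0):
    """Return the KSD U-statistic and a wild bootstrap p-value."""
    rng = random.Random(seed)
    n = len(samples)
    h = [[stein_kernel(samples[i], samples[j], bandwidth) for j in range(n)]
         for i in range(n)]

    def u_stat(signs):
        # Average of signs[i] * signs[j] * h[i][j] over all pairs i != j.
        total = 0.0
        for i in range(n):
            for j in range(n):
                if i != j:
                    total += signs[i] * signs[j] * h[i][j]
        return total / (n * (n - 1))

    observed = u_stat([1.0] * n)
    exceed = 0
    for _ in range(num_bootstrap):
        signs = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        if u_stat(signs) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + num_bootstrap)


data_rng = random.Random(1)
null_sample = [data_rng.gauss(0.0, 1.0) for _ in range(60)]
shifted_sample = [data_rng.gauss(1.5, 1.0) for _ in range(60)]
ksd_null, p_null = ksd_wild_bootstrap(null_sample)
ksd_shift, p_shift = ksd_wild_bootstrap(shifted_sample)
```

The Stein kernel matrix is computed once and reused for every bootstrap draw, so each resample costs only the sign-weighted sum over pairs.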
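We propose a series of computationally efficient, nonparametric tests for the two-sample, independence and goodness-of-fit problems, using the Maximum Mean Discrepancy (MMD), Hilbert Schmidt Independence Criterion (HSIC), and Kernel Stein Discrepancy (KSD), respectively. Our test statistics are incomplete $U$-statistics, with a computational cost that interpolates between linear time in the number of samples and the quadratic time associated with classical $U$-statistic tests. The three proposed tests aggregate over several kernel bandwidths to detect departures from the null on various scales: we call the resulting tests MMDAggInc, HSICAggInc and KSDAggInc. For the test thresholds, we derive a quantile bound for wild bootstrapped incomplete $U$-statistics, which is of independent interest. We derive uniform separation rates for MMDAggInc and HSICAggInc, and quantify exactly the trade-off between computational efficiency and the attainable rates: to our knowledge, this result is novel for tests based on incomplete $U$-statistics. We further show that in the quadratic-time case, the wild bootstrap incurs no penalty in test power relative to the more widespread permutation-based approaches, since both attain the same minimax optimal rates (which in turn match the rates that use oracle quantiles). We support our claims with numerical experiments on the trade-off between computational efficiency and test power. In all three testing frameworks, we observe that our proposed linear-time aggregated tests obtain higher power than current state-of-the-art linear-time kernel tests.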
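To make the idea of an incomplete $U$-statistic concrete, here is a minimal sketch for the MMD case. It assumes scalar, paired samples of equal size and a uniformly random design of index pairs; the design, the function names and the parameters are illustrative assumptions and are not taken from the paper. With `num_pairs` proportional to the sample size the cost is linear, and with all pairs it coincides with the complete $U$-statistic.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd_core(xs, ys, i, j, bandwidth=1.0):
    """MMD core h(z_i, z_j) for paired observations z_i = (x_i, y_i)."""
    return (gaussian_kernel(xs[i], xs[j], bandwidth)
            + gaussian_kernel(ys[i], ys[j], bandwidth)
            - gaussian_kernel(xs[i], ys[j], bandwidth)
            - gaussian_kernel(xs[j], ys[i], bandwidth))


def incomplete_mmd(xs, ys, num_pairs, seed=0, bandwidth=1.0):
    """Average the MMD core over `num_pairs` distinct random pairs i < j."""
    rng = random.Random(seed)
    n = len(xs)
    num_pairs = min(num_pairs, n * (n - 1) // 2)
    design = set()
    while len(design) < num_pairs:
        i, j = rng.sample(range(n), 2)  # two distinct indices
        design.add((min(i, j), max(i, j)))
    return sum(mmd_core(xs, ys, i, j, bandwidth) for i, j in design) / num_pairs


data_rng = random.Random(2)
n = 200
xs = [data_rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [data_rng.gauss(2.0, 1.0) for _ in range(n)]
# Roughly linear cost: only 2n of the n(n-1)/2 available pairs are used.
estimate_shifted = incomplete_mmd(xs, ys, num_pairs=2 * n)
```

Increasing `num_pairs` trades computation for a lower-variance estimate, which is the trade-off the abstract refers to.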
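We propose a novel nonparametric two-sample test based on the Maximum Mean Discrepancy (MMD), which is constructed by aggregating tests with different kernel bandwidths. This aggregation procedure, called MMDAgg, ensures that test power is maximised over the collection of kernels used, without requiring held-out data for kernel selection (which results in a loss of test power) or arbitrary kernel choices such as the median heuristic. We work in the non-asymptotic framework, and prove that our aggregated test is minimax adaptive over Sobolev balls. Our guarantees are not restricted to a specific kernel, but hold for any product of one-dimensional translation-invariant characteristic kernels that are absolutely integrable. Moreover, our results apply to popular numerical procedures for determining the test threshold, namely permutations and the wild bootstrap. Through numerical experiments on both synthetic and real-world datasets, we demonstrate that MMDAgg outperforms alternative approaches to MMD kernel adaptation for two-sample testing.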
Over the last decade, an approach that has gained a lot of popularity to tackle non-parametric testing problems on general (i.e., non-Euclidean) domains is based on the notion of reproducing kernel Hilbert space (RKHS) embedding of probability distributions. The main goal of our work is to understand the optimality of two-sample tests constructed based on this approach. First, we show that the popular MMD (maximum mean discrepancy) two-sample test is not optimal in terms of the separation boundary measured in Hellinger distance. Second, we propose a modification to the MMD test based on spectral regularization by taking into account the covariance information (which is not captured by the MMD test) and prove the proposed test to be minimax optimal with a smaller separation boundary than that achieved by the MMD test. Third, we propose an adaptive version of the above test which involves a data-driven strategy to choose the regularization parameter and show the adaptive test to be almost minimax optimal up to a logarithmic factor. Moreover, our results hold for the permutation variant of the test where the test threshold is chosen elegantly through the permutation of the samples. Through numerical experiments on synthetic and real-world data, we demonstrate the superior performance of the proposed test in comparison to the MMD test.
Classical asymptotic theory for statistical inference usually involves calibrating a statistic by fixing the dimension $d$ while letting the sample size $n$ increase to infinity. Recently, much effort has been dedicated towards understanding how these methods behave in high-dimensional settings, where $d$ and $n$ both increase to infinity together. This often leads to different inference procedures, depending on the assumptions about the dimensionality, leaving the practitioner in a bind: given a dataset with 100 samples in 20 dimensions, should they calibrate by assuming $n \gg d$, or $d/n \approx 0.2$? This paper considers the goal of dimension-agnostic inference; developing methods whose validity does not depend on any assumption on $d$ versus $n$. We introduce an approach that uses variational representations of existing test statistics along with sample splitting and self-normalization to produce a new test statistic with a Gaussian limiting distribution, regardless of how $d$ scales with $n$. The resulting statistic can be viewed as a careful modification of degenerate U-statistics, dropping diagonal blocks and retaining off-diagonal blocks. We exemplify our technique for some classical problems including one-sample mean and covariance testing, and show that our tests have minimax rate-optimal power against appropriate local alternatives. In most settings, our cross U-statistic matches the high-dimensional power of the corresponding (degenerate) U-statistic up to a $\sqrt{2}$ factor.
The kernel Maximum Mean Discrepancy (MMD) is a popular multivariate distance metric between distributions that has found utility in two-sample testing. The usual kernel-MMD test statistic is a degenerate U-statistic under the null, and thus it has an intractable limiting distribution. Hence, to design a level-$\alpha$ test, one usually selects the rejection threshold as the $(1-\alpha)$-quantile of the permutation distribution. The resulting nonparametric test has finite-sample validity but suffers from large computational cost, since every permutation takes quadratic time. We propose the cross-MMD, a new quadratic-time MMD test statistic based on sample-splitting and studentization. We prove that under mild assumptions, the cross-MMD has a limiting standard Gaussian distribution under the null. Importantly, we also show that the resulting test is consistent against any fixed alternative, and when using the Gaussian kernel, it has minimax rate-optimal power against local alternatives. For large sample sizes, our new cross-MMD provides a significant speedup over the MMD, for only a slight loss in power.
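The permutation calibration that this abstract takes as its baseline can be sketched as follows. This is a minimal illustration, assuming scalar data, a Gaussian kernel with a fixed bandwidth, and the biased (V-statistic) MMD estimate; all names are hypothetical. It shows the standard permutation test, not the cross-MMD proposed in the paper. The Gram matrix of the pooled sample is computed once, yet every permutation still needs a quadratic-time pass over it.

```python
import math
import random


def pooled_gram(points, bandwidth=1.0):
    """Gaussian Gram matrix of the pooled sample (scalar data)."""
    n = len(points)
    return [[math.exp(-((points[i] - points[j]) ** 2) / (2.0 * bandwidth ** 2))
             for j in range(n)] for i in range(n)]


def mmd2_biased(gram, idx_x, idx_y):
    """Biased squared-MMD estimate from Gram-matrix blocks."""
    def mean_block(rows, cols):
        return sum(gram[i][j] for i in rows for j in cols) / (len(rows) * len(cols))
    return (mean_block(idx_x, idx_x) + mean_block(idx_y, idx_y)
            - 2.0 * mean_block(idx_x, idx_y))


def permutation_p_value(xs, ys, num_permutations=200, seed=0):
    """Permutation p-value for the MMD two-sample test."""
    rng = random.Random(seed)
    pooled = list(xs) + list(ys)
    gram = pooled_gram(pooled)
    m = len(xs)
    indices = list(range(len(pooled)))
    observed = mmd2_biased(gram, indices[:m], indices[m:])
    exceed = 0
    for _ in range(num_permutations):
        rng.shuffle(indices)  # random relabelling of the pooled sample
        if mmd2_biased(gram, indices[:m], indices[m:]) >= observed:
            exceed += 1
    # Counting the observed statistic itself keeps the test valid at finite samples.
    return (1 + exceed) / (1 + num_permutations)


data_rng = random.Random(3)
xs = [data_rng.gauss(0.0, 1.0) for _ in range(40)]
ys_same = [data_rng.gauss(0.0, 1.0) for _ in range(40)]
ys_shift = [data_rng.gauss(2.0, 1.0) for _ in range(40)]
p_same = permutation_p_value(xs, ys_same)
p_shift = permutation_p_value(xs, ys_shift)
```

Rejecting when the p-value is at most $\alpha$ is equivalent to comparing the statistic with the $(1-\alpha)$-quantile of the permutation distribution.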
We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS), and is called the maximum mean discrepancy (MMD). We present two distribution-free tests based on large deviation bounds for the MMD, and a third test based on the asymptotic distribution of this statistic. The MMD can be computed in quadratic time, although efficient linear time approximations are available. Our statistic is an instance of an integral probability metric, and various classical metrics on distributions are obtained when alternative function classes are used in place of an RKHS. We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests.
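As a rough illustration of the quadratic-time computation mentioned above, the following sketch evaluates the standard unbiased estimate of the squared MMD. The Gaussian kernel, the bandwidth value and the function names are assumptions made for the example; the abstract does not fix any of them.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased quadratic-time estimate of the squared MMD."""
    m, n = len(xs), len(ys)
    # Within-sample sums exclude the diagonal terms i == j.
    k_xx = sum(gaussian_kernel(xs[i], xs[j], bandwidth)
               for i in range(m) for j in range(m) if i != j)
    k_yy = sum(gaussian_kernel(ys[i], ys[j], bandwidth)
               for i in range(n) for j in range(n) if i != j)
    k_xy = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)


def sample_gaussian(rng, mean, size):
    """Draw `size` two-dimensional points with independent N(mean, 1) coordinates."""
    return [(rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)) for _ in range(size)]


data_rng = random.Random(4)
base = sample_gaussian(data_rng, 0.0, 100)
same = sample_gaussian(data_rng, 0.0, 100)
shifted = sample_gaussian(data_rng, 1.5, 100)
mmd_same = mmd2_unbiased(base, same)        # fluctuates around zero
mmd_shifted = mmd2_unbiased(base, shifted)  # clearly positive
```

Because the estimate is unbiased, it can be slightly negative when the two samples come from the same distribution.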
Testing the significance of a variable or group of variables $X$ for predicting a response $Y$, given additional covariates $Z$, is a ubiquitous task in statistics. A simple but common approach is to specify a linear model, and then test whether the regression coefficient for $X$ is non-zero. However, when the model is misspecified, the test may have poor power, for example when $X$ is involved in complex interactions, or lead to many false rejections. In this work we study the problem of testing the model-free null of conditional mean independence, i.e. that the conditional mean of $Y$ given $X$ and $Z$ does not depend on $X$. We propose a simple and general framework that can leverage flexible nonparametric or machine learning methods, such as additive models or random forests, to yield both robust error control and high power. The procedure involves using these methods to perform regressions, first to estimate a form of projection of $Y$ on $X$ and $Z$ using one half of the data, and then to estimate the expected conditional covariance between this projection and $Y$ on the remaining half of the data. While the approach is general, we show that a version of our procedure using spline regression achieves what we show is the minimax optimal rate in this nonparametric testing problem. Numerical experiments demonstrate the effectiveness of our approach both in terms of maintaining Type I error control, and power, compared to several existing approaches.
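We introduce a general nonparametric independence test between right-censored survival times and covariates, which may be multivariate. Our test statistic has a dual interpretation: first, as the supremum of a potentially infinite collection of weight-indexed log-rank tests, with weight functions belonging to a reproducing kernel Hilbert space (RKHS) of functions; and second, as the norm of the difference of embeddings of certain finite measures into the RKHS, similar to the Hilbert-Schmidt Independence Criterion (HSIC) test statistic. We study the asymptotic properties of the test, finding sufficient conditions to ensure that our test correctly rejects the null hypothesis under any alternative. The test statistic can be computed straightforwardly, and the rejection threshold is obtained via an asymptotically consistent wild bootstrap procedure. Extensive investigations on both simulated and real data suggest that our testing procedure generally performs better than competing approaches in detecting complex non-linear dependence.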
In nonparametric independence testing, we observe i.i.d. data $\{(X_i,Y_i)\}_{i=1}^n$, where $X \in \mathcal{X}, Y \in \mathcal{Y}$ lie in any general spaces, and we wish to test the null that $X$ is independent of $Y$. Modern test statistics such as the kernel Hilbert-Schmidt Independence Criterion (HSIC) and Distance Covariance (dCov) have intractable null distributions due to the degeneracy of the underlying U-statistics. Thus, in practice, one often resorts to using permutation testing, which provides a nonasymptotic guarantee at the expense of recalculating the quadratic-time statistics (say) a few hundred times. This paper provides a simple but nontrivial modification of HSIC and dCov (called xHSIC and xdCov, pronounced "cross" HSIC/dCov) so that they have a limiting Gaussian distribution under the null, and thus do not require permutations. This requires building on the newly developed theory of cross U-statistics by Kim and Ramdas (2020), and in particular developing several nontrivial extensions of the theory in Shekhar et al. (2022), which developed an analogous permutation-free kernel two-sample test. We show that our new tests, like the originals, are consistent against fixed alternatives, and minimax rate optimal against smooth local alternatives. Numerical simulations demonstrate that compared to the full dCov or HSIC, our variants have the same power up to a $\sqrt 2$ factor, giving practitioners a new option for large problems or data-analysis pipelines where computation, not sample size, could be the bottleneck.
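Kernel mean embeddings are a powerful tool to represent probability distributions over arbitrary spaces as single points in a Hilbert space. Yet, the cost of computing and storing such embeddings prohibits their direct use in large-scale settings. We propose an efficient approximation procedure based on the Nyström method, which exploits a small random subset of the dataset. Our main result is an upper bound on the approximation error of this procedure. It yields sufficient conditions on the subsample size to obtain the standard $n^{-1/2}$ rate while reducing computational costs. We discuss applications of this result to the approximation of the maximum mean discrepancy and of quadrature rules, and illustrate our theoretical findings with numerical experiments.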
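We propose a new adaptive hypothesis test for polyhedral cone (e.g., monotonicity, convexity) and equality (e.g., parametric, semiparametric) restrictions on a structural function in a nonparametric instrumental variables (NPIV) model. Our test statistic is based on a modified leave-one-out sample analog of a quadratic distance between the restricted and unrestricted sieve estimators. We provide computationally simple, data-driven choices of sieve tuning parameters and adjusted chi-squared critical values. Our test adapts to the unknown smoothness of alternative functions in the presence of unknown degree of endogeneity and unknown strength of the instruments. It attains the adaptive minimax rate of testing in $L^2$. That is, the sum of its type I error uniformly over the composite null and its type II error uniformly over nonparametric alternative models cannot be improved by any other hypothesis test for NPIV models of unknown regularities. Data-driven confidence sets in $L^2$ are obtained by inverting the adaptive test. Simulations confirm that our adaptive test controls size and that its finite-sample power greatly exceeds that of existing non-adaptive tests for monotonicity and parametric restrictions in NPIV models. Empirical applications to testing shape restrictions on differentiated products demand and on Engel curves are presented.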
We develop an online kernel Cumulative Sum (CUSUM) procedure, which consists of a parallel set of kernel statistics with different window sizes to account for the unknown change-point location. Compared with many existing sliding window-based kernel change-point detection procedures, which correspond to the Shewhart chart-type procedure, the proposed procedure is more sensitive to small changes. We further present a recursive computation of detection statistics, which is crucial for online procedures to achieve a constant computational and memory complexity, such that we do not need to calculate and remember the entire Gram matrix, which can be a computational bottleneck otherwise. We obtain precise analytic approximations of the two fundamental performance metrics, the Average Run Length (ARL) and Expected Detection Delay (EDD). Furthermore, we establish the optimal window size on the order of $\log ({\rm ARL})$ such that there is nearly no power loss compared with an oracle procedure, which is analogous to the classic result for window-limited Generalized Likelihood Ratio (GLR) procedure. We present extensive numerical experiments to validate our theoretical results and the competitive performance of the proposed method.
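Despite the ubiquity of U-statistics in modern probability and statistics, their non-asymptotic analysis in a dependent framework may have been overlooked. In a recent work, a new concentration inequality for U-statistics of order two for uniformly ergodic Markov chains has been proved. In this paper, we put this theoretical breakthrough into action by pushing further the current state of knowledge in three different fields of research. First, we establish a new exponential inequality for the estimation of spectra of trace-class integral operators with MCMC methods. The novelty is that this result holds for kernels with positive and negative eigenvalues, which is new as far as we know. In addition, we investigate the generalization performance of online algorithms working with pairwise loss functions and Markov chain samples. We provide an online-to-batch conversion result by showing how to extract a low-risk hypothesis from the sequence of hypotheses generated by any online learner. Finally, we give a non-asymptotic analysis of a goodness-of-fit test on the density of the invariant measure of a Markov chain. We identify some classes of alternatives over which our test based on the $L_2$ distance has a prescribed power.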
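Learning to differentiate model distributions from observed data is a fundamental problem in statistics and machine learning, and high-dimensional data remains a challenging setting for such problems. Metrics that quantify the disparity between probability distributions, such as the Stein discrepancy, play an important role in high-dimensional statistical testing. In this paper, we consider the setting where one wishes to distinguish between data sampled from an unknown probability distribution and a nominal model distribution. While recent studies have shown that the optimal $L^2$-regularized Stein critic equals the difference of the score functions of the two probability distributions up to a multiplicative constant, we investigate the role of $L^2$ regularization when training a neural network Stein discrepancy critic function. Motivated by the Neural Tangent Kernel theory of training neural networks, we develop a novel staging procedure for the weight of regularization over training time. This leverages the advantages of highly regularized training at early times while also delaying overfitting. Theoretically, we relate the training dynamics with a large regularization weight to the kernel regression optimization of the "lazy training" regime at early training times. The benefit of the staged $L^2$ regularization is demonstrated on simulated high-dimensional distribution drift data and on an application to evaluating generative models of image data.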
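Discrepancy measures between probability distributions, often termed statistical distances, are ubiquitous in probability theory, statistics and machine learning. To combat the curse of dimensionality when estimating these distances from data, recent work has proposed smoothing out local irregularities in the measured distributions via convolution with a Gaussian kernel. Motivated by the scalability of this framework to high dimensions, we investigate the structural and statistical behavior of the Gaussian-smoothed $p$-Wasserstein distance $\mathsf{W}_p^{(\sigma)}$, for arbitrary $p \geq 1$. After establishing basic metric and topological properties of $\mathsf{W}_p^{(\sigma)}$, we explore the asymptotic statistical properties of $\mathsf{W}_p^{(\sigma)}(\hat{\mu}_n,\mu)$, where $\hat{\mu}_n$ is the empirical distribution of $n$ independent observations from $\mu$. We prove that $\mathsf{W}_p^{(\sigma)}$ enjoys a parametric empirical convergence rate of $n^{-1/2}$, which contrasts with the $n^{-1/d}$ rate for unsmoothed $\mathsf{W}_p$ when $d \geq 3$. Our proof relies on controlling $\mathsf{W}_p^{(\sigma)}$ by a $p$th-order smooth Sobolev distance $\mathsf{d}_p^{(\sigma)}$ and deriving the limit distribution of $\sqrt{n}\,\mathsf{d}_p^{(\sigma)}(\hat{\mu}_n,\mu)$, for all dimensions $d$. As applications, we provide asymptotic guarantees for two-sample testing and minimum distance estimation using $\mathsf{W}_p^{(\sigma)}$, with experiments for $p=2$ using $\mathsf{d}_2^{(\sigma)}$.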
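Kernel-based tests provide a simple yet effective framework that uses the theory of reproducing kernel Hilbert spaces to design nonparametric testing procedures. In this paper, we propose new theoretical tools that can be used to study the asymptotic behaviour of kernel-based tests in several data scenarios and in many different testing problems. Unlike current approaches, our methods avoid the lengthy $U$- and $V$-statistic expansions and limit theorems that commonly appear in the literature, and work directly with random functionals on Hilbert spaces. Therefore, our framework leads to a much simpler and cleaner analysis of kernel tests, requiring only mild regularity conditions. Furthermore, we show that, in general, our analysis cannot be improved, by proving that the regularity conditions required by our methods are both sufficient and necessary. To illustrate the effectiveness of our approach, we present a new kernel test for the conditional independence testing problem, as well as new analyses of already known kernel-based tests.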
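Recent years have witnessed an upsurge of interest in employing flexible machine learning models for instrumental variable (IV) regression, but the development of uncertainty quantification methodology is still lacking. In this work, we present a novel quasi-Bayesian procedure for IV regression, building upon the recently developed kernelized IV models and the dual/minimax formulation of IV regression. We analyze the frequentist behavior of the proposed method by establishing minimax optimal contraction rates in $L_2$ and Sobolev norms, and by discussing the frequentist validity of credible balls. We further derive a scalable inference algorithm that can be extended to work with wide neural network models. Empirical evaluation shows that our method produces informative uncertainty estimates on complex high-dimensional problems.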
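This paper studies the statistical properties of Principal Components Regression with Laplacian Eigenmaps (PCR-LE), a method for nonparametric regression based on Laplacian Eigenmaps (LE). PCR-LE works by projecting a vector of observed responses ${\bf Y} = (Y_1,\ldots,Y_n)$ onto a subspace spanned by certain eigenvectors of a neighborhood graph Laplacian. We show that PCR-LE achieves minimax rates of convergence for random design regression over Sobolev spaces. Under sufficient smoothness conditions on the design density $p$, PCR-LE achieves the optimal rates for both estimation (where the optimal rate in squared $L^2$ norm is known to be $n^{-2s/(2s+d)}$) and goodness-of-fit testing ($n^{-4s/(4s+d)}$). We also show that PCR-LE is manifold adaptive: that is, we consider the situation where the design is supported on a manifold of small intrinsic dimension $m$, and give upper bounds establishing that PCR-LE achieves the faster minimax estimation ($n^{-2s/(2s+m)}$) and testing ($n^{-4s/(4s+m)}$) rates of convergence. Interestingly, these rates are almost always much faster than the known rates of convergence of graph Laplacian eigenvectors; in other words, for this problem, regression with estimated features appears to be much easier, statistically speaking, than estimating the features themselves. We support these theoretical results with empirical evidence.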
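In this paper, we study the problem of high-dimensional conditional independence testing, a key building block in statistics and machine learning. We propose an inferential procedure based on double generative adversarial networks (GANs). Specifically, we first introduce a double GANs framework to learn two generators of the conditional distributions. We then integrate the two generators to construct a test statistic, which takes the form of the maximum of generalized covariance measures of multiple transformation functions. We also employ data splitting and cross-fitting to minimize the conditions on the generators needed to achieve the desired asymptotic properties, and employ the multiplier bootstrap to obtain the corresponding $p$-value. We show that the constructed test statistic is doubly robust, and that the resulting test both controls the type I error and has power approaching one asymptotically. Also notably, we establish these theoretical guarantees under much weaker and practically more feasible conditions than existing tests, and our proposal gives a concrete example of how to utilize state-of-the-art deep learning tools, such as GANs, to help address a classical but challenging statistical problem. We demonstrate the efficacy of our test through both simulations and an application to an anti-cancer drug dataset. A Python implementation of the proposed procedure is available at https://github.com/tianlinxu312/dgcit.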