Smart Paper Notes

Fast variable selection makes Karhunen-Loève decomposed Gaussian process BSS-ANOVA a speedy and accurate choice for dynamic systems identification

David S. Mebane , Kyle Hayes , Ali Baheri

Categories: Machine Learning | Machine Learning (Statistics)

2022-05-26

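Many approaches to scalable GPs have focused on using a subset of the data as inducing points. Another promising approach is the Karhunen-Loève (KL) decomposition, in which the GP kernel is represented by a set of basis functions that are the eigenfunctions of the kernel operator. Such kernels have the potential to be very fast, and they do not depend on the selection of a reduced set of inducing points. However, KL decompositions lead to high dimensionality, so variable selection becomes paramount. This paper reports a new method of forward variable selection, enabled by the ordered nature of the basis functions in the KL expansion of the Bayesian Smoothing Spline ANOVA (BSS-ANOVA) kernel, coupled with fast Gibbs sampling in a fully Bayesian approach. The new algorithm determines how high the orders of the included terms should reach, balancing model fidelity against model complexity using the $L^0$ penalties inherent in the Bayesian and Akaike information criteria. The inference speed and accuracy make the method especially useful for dynamic systems identification: the derivative in a dynamic system is modelled as a static problem, and the learned dynamics are then integrated using a high-order scheme. The methods are demonstrated on two dynamic datasets: a "Susceptible, Infected, Recovered" (SIR) toy problem, with the transmissibility used as the forcing function, and the experimental "Cascaded Tanks" benchmark dataset. Comparisons on the static prediction of derivatives are made with a random forest (RF), a residual neural network (ResNet), and the Orthogonal Additive Kernel (OAK) inducing-point scalable GP, while for the time-series prediction comparisons are made with LSTM and GRU recurrent neural networks (RNNs).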

translated by Google Translate
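The "model the derivative as a static problem, then integrate" idea from the Mebane, Hayes and Baheri abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the known SIR right-hand side stands in for a learned derivative model, classical fourth-order Runge-Kutta is assumed as the "high-order scheme" (the abstract does not name one), and the names `sir_derivative`, `rk4_step`, `integrate` and the parameter values are hypothetical.

```python
import numpy as np


def sir_derivative(state, beta, gamma=0.1):
    """Stand-in for a learned static model: (state, forcing) -> d(state)/dt.

    In the setting described above this map would be a fitted regression
    model; here the textbook SIR equations are used instead (assumption).
    """
    s, i, r = state
    infection = beta * s * i
    recovery = gamma * i
    return np.array([-infection, infection - recovery, recovery])


def rk4_step(f, state, beta, dt):
    """One classical RK4 step, holding the forcing constant over the step."""
    k1 = f(state, beta)
    k2 = f(state + 0.5 * dt * k1, beta)
    k3 = f(state + 0.5 * dt * k2, beta)
    k4 = f(state + dt * k3, beta)
    return state + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def integrate(f, state0, betas, dt):
    """Roll the derivative model forward, one forcing value per time step."""
    trajectory = [np.asarray(state0, dtype=float)]
    for beta in betas:
        trajectory.append(rk4_step(f, trajectory[-1], beta, dt))
    return np.array(trajectory)


# Time-varying transmissibility used as the forcing function (made-up values).
betas = 0.3 + 0.1 * np.sin(np.linspace(0.0, 6.0, 200))
trajectory = integrate(sir_derivative, [0.99, 0.01, 0.0], betas, dt=0.5)
```

Swapping `sir_derivative` for any fitted regressor with the same signature leaves the integration loop unchanged, which is the point of treating the derivative as a static prediction target.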

A Latent Restoring Force Approach to Nonlinear System Identification

Timothy J. Rogers , Tobias Friis

Categories: Machine Learning (Statistics) | Machine Learning

2021-09-22

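The identification of nonlinear dynamic systems remains a significant challenge across engineering. This work proposes an approach based on Bayesian filtering to extract and identify the contribution of an unknown nonlinear term in the system, which can be seen as an alternative viewpoint on restoring-force-surface type approaches. To achieve this identification, the contribution of the nonlinear restoring force is initially modelled as a Gaussian process. That Gaussian process is converted into a state-space model and combined with the linear dynamic component of the system. Then, by inferring the filtering and smoothing distributions, the internal states of the system and the nonlinear restoring force can be extracted. Given these states, a nonlinear model can be constructed. The approach is shown to be effective both in a simulated case study and on an experimental benchmark dataset.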


Hida-Matérn Kernel

Matthew Dowling , Piotr Sokół , Il Memming Park

Categories: Machine Learning (Statistics) | Machine Learning

2021-07-15

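We present the class of Hida-Matérn kernels, which is the canonical family of covariance functions over the entire space of stationary Gauss-Markov processes. It extends the Matérn kernels by allowing flexible construction of processes with oscillatory components. Any stationary kernel, including the widely used squared-exponential and spectral mixture kernels, is either directly within this class or an appropriate asymptotic limit of it, which demonstrates the generality of the class. Taking advantage of its Markovian nature, we show how to represent such processes as state-space models using only the kernel and its derivatives. In turn, this allows us to perform Gaussian process inference more efficiently and to sidestep the usual computational burdens. We also show how special properties of the state-space representation can be exploited to further reduce computational complexity.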


Scalable mixed-domain Gaussian processes

Juho Timonen , Harri Lähdesmäki

Categories: Machine Learning

2021-11-03

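Gaussian process (GP) models that combine both categorical and continuous input variables have found use in, for example, longitudinal data analysis and computer experiments. However, standard inference for these models has the typical cubic scaling, and common scalable approximation schemes for GPs cannot be applied because the covariance function is non-continuous. In this work, we derive a basis function approximation scheme for mixed-domain covariance functions that scales linearly with respect to the number of observations and the total number of basis functions. The proposed approach is naturally applicable to Bayesian GP regression with arbitrary observation models. We demonstrate the approach in a longitudinal data modelling context and show that it approximates the exact GP model accurately while requiring only a fraction of the runtime needed to fit the corresponding exact model.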


Sparse Gaussian Process Hyperparameters: Optimize or Integrate?

Vidhi Lalchand , Wessel P. Bruinsma , David R. Burt , Carl E. Rasmussen

Categories: Machine Learning (Statistics) | Machine Learning

2022-11-04

The kernel function and its hyperparameters are the central model selection choice in a Gaussian process (Rasmussen and Williams, 2006). Typically, the hyperparameters of the kernel are chosen by maximising the marginal likelihood, an approach known as Type-II maximum likelihood (ML-II). However, ML-II does not account for hyperparameter uncertainty, and it is well known that this can lead to severely biased estimates and an underestimation of predictive uncertainty. While several works employ a fully Bayesian characterisation of GPs, relatively few propose such approaches for the sparse GP paradigm. In this work we propose an algorithm for sparse Gaussian process regression which leverages MCMC to sample from the hyperparameter posterior within the variational inducing point framework of Titsias (2009). This work is closely related to Hensman et al. (2015b) but side-steps the need to sample the inducing points, thereby significantly improving sampling efficiency in the Gaussian likelihood case. We compare this scheme against natural baselines in the literature and against stochastic variational GPs (SVGPs), and we provide an extensive computational analysis.
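For readers unfamiliar with the ML-II baseline mentioned in this abstract, the sketch below evaluates the exact GP log marginal likelihood and picks the lengthscale that maximises it. It illustrates only the point-estimate approach the abstract contrasts with, not the paper's MCMC scheme. The RBF kernel, the fixed noise variance, the grid search (used instead of a gradient-based optimiser) and all function names are assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(x1, x2, lengthscale, variance=1.0):
    """Squared-exponential kernel on 1-D inputs."""
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)


def log_marginal_likelihood(x, y, lengthscale, noise_var):
    """log p(y | x, hyperparameters) for a zero-mean GP with Gaussian noise."""
    n = len(x)
    K = rbf_kernel(x, x, lengthscale) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    # alpha = K^{-1} y via two triangular solves on the Cholesky factor.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    data_fit = -0.5 * y @ alpha
    complexity = -np.sum(np.log(np.diag(L)))  # equals -0.5 * log|K|
    return data_fit + complexity - 0.5 * n * np.log(2.0 * np.pi)


# Synthetic data (made up for the example).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 40)
y = np.sin(x) + 0.1 * rng.standard_normal(40)

# ML-II as a grid search over the lengthscale, with the noise variance fixed.
grid = np.logspace(-1, 1, 50)
scores = np.array([log_marginal_likelihood(x, y, ls, 0.01) for ls in grid])
best_lengthscale = grid[np.argmax(scores)]
```

The result is a single hyperparameter value with no uncertainty attached, which is the limitation the abstract describes; a fully Bayesian treatment would instead sample lengthscales in proportion to `exp(scores)` times a prior.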


Shallow and Deep Nonparametric Convolutions for Gaussian Processes

Thomas M. McDonald , Magnus Ross , Michael T. Smith , Mauricio A. Álvarez

Categories: Machine Learning (Statistics) | Machine Learning

2022-06-17

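A key challenge in the practical application of Gaussian processes (GPs) is selecting a proper covariance function. The moving-average, or process convolution, construction of GPs allows some additional flexibility, but it still requires choosing a proper smoothing kernel, which is non-trivial. Previous approaches have built covariance functions by placing GP priors over the smoothing kernel, and by extension over the covariance, as a way to bypass the need to specify it in advance. However, such models are limited in several ways: they are restricted to single-dimensional inputs such as time; they only allow modelling of single outputs; and they do not scale to large datasets, since inference is not straightforward. In this paper, we introduce a nonparametric process convolution formulation for GPs that alleviates these weaknesses by using a functional sampling approach based on Matheron's rule to perform fast sampling with interdomain inducing variables. Furthermore, we propose a composition of these nonparametric convolutions that serves as an alternative to classic deep GP models and allows the covariance functions of the intermediate layers to be inferred from the data. We test the performance of our model on benchmarks for single-output GPs, multiple-output GPs and deep GPs, and find that in many cases our approach provides improvements over standard GP models.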


Deep Kernel Learning

Andrew Gordon Wilson , Zhiting Hu , Ruslan Salakhutdinov , Eric P. Xing

Categories:

2015-11-06

We introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods. Specifically, we transform the inputs of a spectral mixture base kernel with a deep architecture, using local kernel interpolation, inducing points, and structure exploiting (Kronecker and Toeplitz) algebra for a scalable kernel representation. These closed-form kernels can be used as drop-in replacements for standard kernels, with benefits in expressive power and scalability. We jointly learn the properties of these kernels through the marginal likelihood of a Gaussian process. Inference and learning cost O(n) for n training points, and predictions cost O(1) per test point. On a large and diverse collection of applications, including a dataset with 2 million examples, we show improved performance over scalable Gaussian processes with flexible kernel learning models, and stand-alone deep architectures.


Additive Gaussian Processes Revisited

Xiaoyu Lu , Alexis Boukouvalas , James Hensman

Categories: Machine Learning (Statistics) | Machine Learning

2022-06-20

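Gaussian process (GP) models are a class of flexible non-parametric models with rich representational power. By using a Gaussian process with additive structure, complex responses can be modelled while retaining interpretability. Previous work showed that additive Gaussian process models require high-dimensional interaction terms. We propose the orthogonal additive kernel (OAK), which imposes an orthogonality constraint on the additive functions, enabling an identifiable, low-dimensional representation of the functional relationship. We connect the OAK kernel to the functional ANOVA decomposition and show convergence-rate results for sparse computation methods. With only a small number of additive low-dimensional terms, the OAK model achieves similar or better predictive performance than black-box models while retaining interpretability.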


The SKIM-FA Kernel: High-Dimensional Variable Selection and Nonlinear Interaction Discovery in Linear Time

Raj Agrawal , Tamara Broderick

Categories: Machine Learning (Statistics)

2021-06-23

Many scientific problems require identifying a small set of covariates that are associated with a target response and estimating their effects. Often, these effects are nonlinear and include interactions, so linear and additive methods can lead to poor estimation and variable selection. Unfortunately, methods that simultaneously express sparsity, nonlinearity, and interactions are computationally intractable -- with runtime at least quadratic in the number of covariates, and often worse. In the present work, we solve this computational bottleneck. We show that suitable interaction models have a kernel representation, namely there exists a "kernel trick" to perform variable selection and estimation in $O$(# covariates) time. Our resulting fit corresponds to a sparse orthogonal decomposition of the regression function in a Hilbert space (i.e., a functional ANOVA decomposition), where interaction effects represent all variation that cannot be explained by lower-order effects. On a variety of synthetic and real data sets, our approach outperforms existing methods used for large, high-dimensional data sets while remaining competitive (or being orders of magnitude faster) in runtime.


Fast emulation of density functional theory simulations using approximate Gaussian processes

Steven Stetzler , Michael Grosskopf , Earl Lawrence

Categories: Machine Learning (Statistics) | Machine Learning

2022-08-24

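Fitting a theoretical model to experimental data in a Bayesian manner using Markov chain Monte Carlo typically requires evaluating the model thousands (or millions) of times. When the model is a slow-to-compute physics simulation, Bayesian model fitting becomes infeasible. To remedy this, a second statistical model that predicts the simulation output (an emulator) can be used in place of the full simulation during model fitting. A typical emulator of choice is the Gaussian process (GP), a flexible, non-linear model that provides both a predictive mean and a variance at each input point. Gaussian process regression works well for small amounts of training data ($n < 10^3$), but it becomes slow to train and to use for prediction when the dataset grows large. Various methods can be used to speed up the Gaussian process in the medium-to-large dataset regime ($n > 10^5$), trading away predictive accuracy for drastically reduced runtime. This work examines the accuracy-runtime trade-off of several approximate Gaussian process models (the sparse variational GP, the stochastic variational GP, and the deep kernel learned GP) when emulating the predictions of density functional theory (DFT) models. Additionally, we use the emulators to calibrate the DFT model parameters in a Bayesian manner using observed data, resolving the computational barrier imposed by the dataset size, and we compare the calibration results with previous work. The utility of these calibrated DFT models is to make predictions, based on observed data, about the properties of nuclides of experimental interest, such as super-heavy nuclei.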


Non-Gaussian Process Regression

Yaman Kındap , Simon Godsill

Categories: Machine Learning (Statistics) | Machine Learning

2022-09-07

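Standard GPs offer a flexible modelling tool for well-behaved processes. However, deviations from Gaussianity are expected to appear in real-world datasets, where structural outliers and shocks are routinely observed. In these cases GPs can fail to model uncertainty adequately and may over-smooth. Here we extend the GP framework to a new class of time-changed GPs that allow straightforward modelling of heavy-tailed non-Gaussian behaviour, while retaining a tractable conditional GP structure through a representation as an infinite mixture of non-homogeneous GPs. The conditional GP structure is obtained by conditioning the observations on a latent transformed input space, and the random evolution of the latent transformation is modelled using a Lévy process, which allows Bayesian inference on both the posterior predictive density and the latent transformation function. We present Markov chain Monte Carlo inference procedures for this model and demonstrate the potential benefits compared with a standard GP.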


Fast and Scalable Spike and Slab Variable Selection in High-Dimensional Gaussian Processes

Hugh Dance , Brooks Paige

Categories: Machine Learning (Statistics) | Machine Learning

2021-11-08

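Variable selection in Gaussian processes (GPs) is typically undertaken by thresholding the inverse lengthscales of "automatic relevance determination" kernels, but in high-dimensional datasets this approach can be unreliable. A more probabilistically principled alternative is to use spike and slab priors and infer a posterior probability of variable inclusion. However, existing implementations in GPs are either very costly to run on high-dimensional and large-$n$ datasets or intractable for most kernels. We therefore develop a fast and scalable variational inference algorithm for the spike and slab GP that is tractable with arbitrary differentiable kernels. We improve the algorithm's ability to adapt to the sparsity of relevant variables by Bayesian model averaging over hyperparameters, and we achieve substantial speed-ups using zero-temperature posterior restrictions, dropout pruning and nearest-neighbour minibatching. In experiments, our method consistently outperforms vanilla and sparse variational GPs while retaining similar runtimes (even when $n = 10^6$), and it performs competitively with a spike and slab GP using MCMC while running up to 1000 times faster.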


Battery Degradation Long-term Forecast Using Gaussian Process Dynamical Models and Knowledge Transfer

Ziyang Zhang , Akeel Shah , Wei W. Xing

Categories: Machine Learning

2022-12-03

Batteries play an essential role in the modern energy ecosystem and are widely used in daily applications such as cell phones and electric vehicles. For many applications, the health status of batteries plays a critical role in the performance of the system by indicating efficient maintenance and on-time replacement. Directly modeling an individual battery using a computational model based on physical rules can be inefficient, given the difficulty of building such a model and the computational effort of tuning and running it, especially on the edge. With the rapid development of sensor technology (to provide more insight into the system) and machine learning (to build capable yet fast models), it is now possible to build a data-driven model of battery health status directly from historical battery data (possibly both local and remote) and to predict future local battery health status accurately. Nevertheless, most data-driven methods are trained on the local battery data and lack the ability to extract common properties, such as generations and degradation, in the life span of other remote batteries. In this paper, we utilize a Gaussian process dynamical model (GPDM) to build a data-driven model of battery health status and propose a knowledge transfer method to extract common properties in the life span of all batteries, so as to accurately predict the battery health status with and without features extracted from the local battery. For modern benchmark problems, the proposed method outperforms the state-of-the-art methods by significant margins in terms of accuracy and is able to accurately predict the regeneration process.


Sensitivity Prewarping for Local Surrogate Modeling

Nathan Wycoff , Mickaël Binois , Robert B. Gramacy

Categories: Machine Learning (Statistics) | Machine Learning

2021-01-15

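In the continual effort to improve product quality and decrease operating costs, computational modeling is increasingly being deployed to determine the feasibility of product designs or configurations. Surrogate modeling of these computer experiments via local models, which induce sparsity by considering only short-range interactions, can tackle huge analyses of complicated input-output relationships. However, narrowing the focus to the local scale means that global trends must be re-learned over and over again. In this article, we propose a framework for incorporating information from a global sensitivity analysis into the surrogate model as an input rotation and rescaling preprocessing step. We discuss the relationship between several sensitivity analysis methods based on kernel regression before describing how they give rise to a transformation of the input variables. Specifically, we perform an input warping such that the "warped simulator" is equally sensitive to all input directions, freeing local models to focus on local dynamics. Numerical experiments on observational data and benchmark test functions, including a high-dimensional computer simulator from the automotive industry, provide empirical validation.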


Dynamic Bayesian Learning and Calibration of Spatiotemporal Mechanistic System

Ian Frankenburg , Sudipto Banerjee

Categories: Machine Learning (Statistics)

2022-08-12

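We develop an approach for fully Bayesian learning and calibration of spatiotemporal dynamical mechanistic models based on noisy observations. Calibration is achieved by melding information from observed data with simulated computer experiments from the mechanistic system. The joint melding makes use of both Gaussian and non-Gaussian state-space methods as well as Gaussian process regression. Assuming the dynamical system is controlled by a finite collection of inputs, Gaussian process regression learns the effect of these parameters through a number of training runs, driving the stochastic innovations of the spatiotemporal state-space component. This enables efficient modeling of the dynamics over space and time. Through reduced-rank Gaussian processes and a conjugate model specification, our methodology is applicable to large-scale calibration and inverse problems. Our method is general, extensible, and capable of learning a wide range of dynamical systems with potential model misspecification. We demonstrate this flexibility by solving inverse problems arising in the analysis of ordinary and partial nonlinear differential equations and, in addition, by applying it to a black-box computer model generating spatiotemporal dynamics across a network.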


Discovering and forecasting extreme events via active learning in neural operators

Ethan Pickering , Stephen Guth , George Em Karniadakis , Themistoklis P. Sapsis

Categories: Machine Learning | Machine Learning (Statistics)

2022-04-05

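Extreme events in society and nature, such as pandemic spikes, rogue waves, or structural failures, can have catastrophic consequences. Characterizing extremes is difficult because they occur rarely, arise from seemingly benign conditions, and belong to complex and often unknown infinite-dimensional systems. Such challenges render attempts at characterizing them moot. We address these difficulties by combining novel training schemes in Bayesian experimental design (BED) with an ensemble of deep neural operators (DNOs). This model-agnostic framework pairs a BED scheme that actively selects data for quantifying extreme events with an ensemble of DNOs that approximate infinite-dimensional nonlinear operators. We find that this framework not only clearly beats Gaussian processes (GPs), but also that 1) shallow ensembles of just two members perform best; 2) extremes are uncovered regardless of the state of the initial data (i.e., with or without extremes); 3) our method eliminates "double-descent" phenomena; 4) using batches of suboptimal acquisition points, rather than step-by-step global optima, does not hinder BED performance; and 5) Monte Carlo acquisition outperforms standard optimizers in high dimensions. Together, these conclusions form the foundation of an AI-assisted experimental infrastructure that can efficiently infer and pinpoint critical situations across many domains, from physical to societal systems.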


Valid prediction intervals for regression problems

Nicolas Dewolf , Bernard De Baets , Willem Waegeman

Categories: Machine Learning (Statistics) | Machine Learning

2021-07-01

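Over the last few decades, various methods have been proposed for estimating prediction intervals in regression settings, including Bayesian methods, ensemble methods, direct interval estimation methods and conformal prediction methods. An important issue is the calibration of these methods: the generated prediction intervals should have a predefined coverage level without being overly conservative. In this work, we review these four classes of methods from a conceptual and an experimental point of view. Results on benchmark datasets from various domains highlight large fluctuations in performance from one dataset to another. These observations can be attributed to violations of certain assumptions that are inherent to some classes of methods. We illustrate how conformal prediction can be used as a general calibration procedure for methods that otherwise lack a calibration step.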
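The use of conformal prediction as a calibration step, as described in the Dewolf, De Baets and Waegeman abstract, can be illustrated with split conformal prediction, one standard variant. The sketch below is an assumption-laden illustration rather than the procedure evaluated in the paper: the base model is a least-squares line, the data are synthetic, the nonconformity score is the absolute residual, and the function name `split_conformal_interval` is hypothetical.

```python
import numpy as np


def split_conformal_interval(residuals_cal, y_pred_new, alpha=0.1):
    """Symmetric prediction intervals from held-out calibration residuals.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest absolute residual as
    the interval half-width, which gives marginal coverage of at least
    1 - alpha when calibration and test points are exchangeable.
    """
    scores = np.sort(np.abs(np.asarray(residuals_cal, dtype=float)))
    n = len(scores)
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    # Too few calibration points for the requested level: infinite interval.
    half_width = scores[k - 1] if k <= n else np.inf
    y_pred_new = np.asarray(y_pred_new, dtype=float)
    return y_pred_new - half_width, y_pred_new + half_width


# Synthetic regression data, split into fit / calibration / test parts.
rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, 3000)
y = 2.0 * x + rng.standard_normal(3000)
x_fit, y_fit = x[:1000], y[:1000]
x_cal, y_cal = x[1000:2000], y[1000:2000]
x_test, y_test = x[2000:], y[2000:]

# Any point predictor works here; a fitted line keeps the example small.
coef = np.polyfit(x_fit, y_fit, deg=1)
lower, upper = split_conformal_interval(
    y_cal - np.polyval(coef, x_cal), np.polyval(coef, x_test), alpha=0.1
)
coverage = np.mean((y_test >= lower) & (y_test <= upper))
```

Because the procedure only needs residuals on held-out data, it can wrap any point predictor, which is what makes it usable as a general-purpose calibration layer.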


Deep Neural Networks as Point Estimates for Deep Gaussian Processes

Vincent Dutordoir , James Hensman , Mark van der Wilk , Carl Henrik Ek , Zoubin Ghahramani , Nicolas Durrande

Categories: Machine Learning (Statistics) | Machine Learning

2021-05-10

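Neural networks and Gaussian processes are complementary in their strengths and weaknesses. A better understanding of their relationship comes with the promise of letting each method benefit from the strengths of the other. In this work, we establish an equivalence between the forward passes of neural networks and (deep) sparse Gaussian process models. The theory we develop is based on interpreting activation functions as interdomain inducing features, through a rigorous analysis of the interplay between activation functions and kernels. This results in models that can be seen either as neural networks with improved uncertainty prediction or as deep Gaussian processes with increased prediction accuracy. These claims are supported by experimental results on regression and classification datasets.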


Noise Estimation in Gaussian Process Regression

Siavash Ameli , Shawn C. Shadden

Categories: Machine Learning | Machine Learning (Statistics)

2022-06-20

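We develop a computational procedure to estimate the covariance hyperparameters of semiparametric Gaussian process regression models with additive noise. Namely, the presented method can be used to efficiently estimate the variance of the correlated error and the variance of the noise by maximizing a marginal likelihood function. Our method involves suitably reducing the dimensionality of the hyperparameter space, which simplifies the estimation procedure to a univariate root-finding problem. Moreover, we derive bounds and asymptotes of the marginal likelihood function and its derivatives, which are useful for narrowing the initial range of the hyperparameter search. Using numerical examples, we demonstrate the computational advantages and robustness of the presented approach compared with traditional parameter optimization.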


Fully Bayesian inference for latent variable Gaussian process models

Suraj Yerramilli , Akshay Iyer , Wei Chen , Daniel W. Apley

Categories: Machine Learning (Statistics) | Machine Learning

2022-11-04

Real engineering and scientific applications often involve one or more qualitative inputs. Standard Gaussian processes (GPs), however, cannot directly accommodate qualitative inputs. The recently introduced latent variable Gaussian process (LVGP) overcomes this issue by first mapping each qualitative factor to underlying latent variables (LVs) and then using any standard GP covariance function over these LVs. The LVs are estimated similarly to the other GP hyperparameters through maximum likelihood estimation, and then plugged into the prediction expressions. However, this plug-in approach does not account for uncertainty in the estimation of the LVs, which can be significant, especially with limited training data. In this work, we develop a fully Bayesian approach for the LVGP model and for visualizing the effects of the qualitative inputs via their LVs. We also develop approximations for scaling up LVGPs and fully Bayesian inference for the LVGP hyperparameters. We conduct numerical studies comparing plug-in inference against fully Bayesian inference over a few engineering models and material design applications. In contrast to previous studies on standard GP modeling, which have largely concluded that a fully Bayesian treatment offers limited improvements, our results show that for LVGP modeling it offers significant improvements in prediction accuracy and uncertainty quantification over the plug-in approach.
