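Leave-one-out cross-validation (LOO-CV) is a popular method for estimating out-of-sample predictive accuracy. However, computing LOO-CV criteria can be computationally expensive because the model must be fitted multiple times. In the Bayesian setting, importance sampling offers a possible solution, but classical approaches can easily produce estimators with infinite variance, which makes them potentially unreliable. Here we propose and analyze a novel mixture estimator for computing Bayesian LOO-CV criteria. Our method retains the simplicity and computational convenience of classical approaches while guaranteeing that the resulting estimators have finite variance. Both theoretical and numerical results are provided to illustrate the improved robustness and efficiency. The computational benefits are particularly significant in high-dimensional problems, making it possible to perform Bayesian LOO-CV for a broader range of models. The proposed method can be easily implemented in standard probabilistic programming software and has a computational cost roughly equivalent to fitting the original model once.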
translated by Google Translate
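As a rough illustration of the classical importance-sampling approach that the abstract contrasts itself with (not the proposed mixture estimator), the sketch below uses the identity p(y_i | y_{-i}) = 1 / E_posterior[1 / p(y_i | θ)] on a toy conjugate model. The model (y_i ~ N(θ, 1) with a N(0, prior_var) prior), the data grid, and all function names are assumptions chosen so that the exact answer is available for comparison.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def posterior_params(ys, prior_var):
    """Posterior mean and variance of theta for y_i ~ N(theta, 1), theta ~ N(0, prior_var)."""
    precision = len(ys) + 1.0 / prior_var
    return sum(ys) / precision, 1.0 / precision


def exact_loo_density(ys, i, prior_var):
    """Exact p(y_i | y_{-i}), available in closed form for this toy model."""
    rest = ys[:i] + ys[i + 1:]
    mean, var = posterior_params(rest, prior_var)
    return normal_pdf(ys[i], mean, var + 1.0)


def is_loo_density(ys, i, draws):
    """Classical importance-sampling estimate from full-posterior draws.

    Weights are 1 / p(y_i | theta); the estimate is the harmonic mean of the
    likelihood values. In general these weights may have infinite variance.
    """
    inverse_likelihoods = [1.0 / normal_pdf(ys[i], theta, 1.0) for theta in draws]
    return len(draws) / sum(inverse_likelihoods)


def run_demo(num_draws=20000, prior_var=100.0, seed=0):
    """Return (exact, estimated) LOO predictive densities for each observation."""
    rng = random.Random(seed)
    ys = [-2.0 + 4.0 * k / 19 for k in range(20)]  # fixed illustrative data
    mean, var = posterior_params(ys, prior_var)
    draws = [rng.gauss(mean, math.sqrt(var)) for _ in range(num_draws)]
    return [
        (exact_loo_density(ys, i, prior_var), is_loo_density(ys, i, draws))
        for i in range(len(ys))
    ]
```

In this toy setting the posterior is concentrated enough for the weights to have finite variance, so the estimate tracks the exact value; the abstract's concern is with models where that is not the case.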
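We study the class of first-order locally-balanced Metropolis-Hastings algorithms introduced in Livingstone & Zanella (2021). To choose a specific algorithm within the class, the user must select a balancing function $g: \mathbb{R} \to \mathbb{R}$ satisfying $g(t) = t g(1/t)$, and a noise distribution for the proposal increment. Popular choices within the class are the Metropolis-adjusted Langevin algorithm and the recently introduced Barker proposal. We first establish a universal limiting optimal acceptance rate of 57%, together with the corresponding scaling in the dimension $n$ as $n$ tends to infinity, among all members of the class, under mild smoothness assumptions on $g$ and when the target distribution of the algorithm is of product form. In particular, we obtain an explicit expression for the asymptotic efficiency of an arbitrary algorithm in the class, as measured by expected squared jumping distance. We then consider how to optimize this expression under various constraints. We derive an optimal choice of noise distribution for the Barker proposal, an optimal choice of balancing function under a Gaussian noise distribution, and an optimal choice of first-order locally-balanced algorithm among the entire class, which turns out to depend on the specific target distribution. Numerical simulations confirm our theoretical findings and show in particular that a bimodal choice of noise distribution in the Barker proposal gives rise to a practical algorithm that is consistently more efficient than the original Gaussian version.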
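The balancing condition $g(t) = t g(1/t)$ stated in the abstract is easy to check numerically. The sketch below does so for three candidate functions; $g(t) = t/(1+t)$ is the function associated with the Barker proposal, while the other two, the function names, and the test points are illustrative assumptions rather than anything taken from the paper.

```python
import math


def barker_balance(t):
    """g(t) = t / (1 + t), the balancing function behind the Barker proposal."""
    return t / (1.0 + t)


def sqrt_balance(t):
    """g(t) = sqrt(t), another function satisfying the balancing identity."""
    return math.sqrt(t)


def min_balance(t):
    """g(t) = min(1, t), a Metropolis-type balancing function."""
    return min(1.0, t)


def square(t):
    """g(t) = t^2, included as a function that does NOT satisfy the identity."""
    return t * t


def is_balancing(g, points, rel_tol=1e-12):
    """Check g(t) == t * g(1/t) at each positive test point, up to rounding."""
    return all(
        math.isclose(g(t), t * g(1.0 / t), rel_tol=rel_tol)
        for t in points
    )


# Hypothetical test points; any positive values other than t = 1 are informative.
TEST_POINTS = [0.01, 0.5, 1.0, 2.0, 37.5]
```

For example, `is_balancing(barker_balance, TEST_POINTS)` holds, whereas `is_balancing(square, TEST_POINTS)` does not, since $t \cdot (1/t)^2 = 1/t \neq t^2$ away from $t = 1$.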
Structural Health Monitoring (SHM) describes a process for inferring quantifiable metrics of structural condition, which can serve as input to support decisions on the operation and maintenance of infrastructure assets. Given the long lifespan of critical structures, this problem can be cast as a sequential decision-making problem over prescribed horizons. Partially Observable Markov Decision Processes (POMDPs) offer a formal framework to solve the underlying optimal planning task. However, two issues can undermine POMDP solutions: first, the need for a model that can adequately describe the evolution of the structural condition under deterioration or corrective actions; and second, the non-trivial task of recovering the observation process parameters from available monitoring data. Despite these potential challenges, the adopted POMDP models do not typically account for uncertainty in the model parameters, leading to solutions which can be unrealistically confident. In this work, we address both key issues. We present a framework to estimate POMDP transition and observation model parameters directly from available data, via Markov Chain Monte Carlo (MCMC) sampling of a Hidden Markov Model (HMM) conditioned on actions. The MCMC inference estimates distributions of the involved model parameters. We then form and solve the POMDP problem by exploiting the inferred distributions, to derive solutions that are robust to model uncertainty. We successfully apply our approach to maintenance planning for railway track assets on the basis of a "fractal value" indicator, which is computed from actual railway monitoring data.
Polynomial kernels are widely used in machine learning, where they are one of the default choices for developing kernel-based classification and regression models. However, they are rarely used or considered in numerical analysis because they are not strictly positive definite. In particular, they do not enjoy the usual property of unisolvency for arbitrary point sets, which is one of the key properties used to build kernel-based interpolation methods. This paper is devoted to establishing some initial results for the study of these kernels, and their related interpolation algorithms, in the context of approximation theory. We will first prove necessary and sufficient conditions on point sets which guarantee the existence and uniqueness of an interpolant. We will then study the Reproducing Kernel Hilbert Spaces (or native spaces) of these kernels and their norms, and provide inclusion relations between spaces corresponding to different kernel parameters. With these spaces at hand, it will be further possible to derive generic error estimates which apply to sufficiently smooth functions, thus escaping the native space. Finally, we will show how to apply an efficient and stable algorithm to these kernels to obtain accurate interpolants, and we will test them in some numerical experiments. After this analysis several computational and theoretical aspects remain open, and we will outline possible further research directions in a concluding section. This work builds some bridges between kernel and polynomial interpolation, two topics to which the authors, to different extents, have been introduced under the supervision or through the work of Stefano De Marchi. For this reason, they wish to dedicate this work to him on the occasion of his 60th birthday.
The proliferation of deep learning techniques has led to a wide range of advanced analytics applications in important business areas such as predictive maintenance or product recommendation. However, as the effectiveness of advanced analytics naturally depends on the availability of sufficient data, an organization's ability to exploit the benefits might be restricted by limited data or limited data access. These challenges could force organizations to spend substantial amounts of money on data or to accept constrained analytics capacities, and could even become a showstopper for analytics projects. Against this backdrop, recent advances in deep learning to generate synthetic data may help to overcome these barriers. Despite its great potential, however, synthetic data are rarely employed. Therefore, we present a taxonomy highlighting the various facets of deploying synthetic data for advanced analytics systems. Furthermore, we identify typical application scenarios for synthetic data to assess the current state of adoption and thereby unveil missed opportunities to pave the way for further research.
To make machine learning (ML) sustainable and apt to run on the diverse devices where relevant data is, it is essential to compress ML models as needed, while still meeting the required learning quality and time performance. However, how much and when an ML model should be compressed, and where its training should be executed, are hard decisions to make, as they depend on the model itself, the resources of the available nodes, and the data such nodes own. Existing studies focus on each of those aspects individually; they do not account for how such decisions can be made jointly and adapted to one another. In this work, we model the network system focusing on the training of DNNs, formalize the above multi-dimensional problem, and, given its NP-hardness, formulate an approximate dynamic programming problem that we solve through the PACT algorithmic framework. Importantly, PACT leverages a time-expanded graph representing the learning process, and a data-driven and theoretical approach for the prediction of the loss evolution to be expected as a consequence of training decisions. We prove that PACT's solutions can get as close to the optimum as desired, at the cost of an increased time complexity, and that, in any case, such complexity is polynomial. Numerical results also show that, even under the most disadvantageous settings, PACT outperforms state-of-the-art alternatives and closely matches the optimal energy cost.
In the past few years, Deep Reinforcement Learning (DRL) has become a valuable solution to automatically learn efficient resource management strategies in complex networks. In many scenarios, the learning task is performed in the Cloud, while experience samples are generated directly by edge nodes or users. Therefore, the learning task involves some data exchange which, in turn, subtracts a certain amount of transmission resources from the system. This creates a friction between the need to speed up convergence towards an effective strategy, which requires the allocation of resources to transmit learning samples, and the need to maximize the amount of resources used for data plane communication, maximizing users' Quality of Service (QoS), which requires the learning process to be efficient, i.e., minimize its overhead. In this paper, we investigate this trade-off and propose a dynamic balancing strategy between the learning and data planes, which allows the centralized learning agent to quickly converge to an efficient resource allocation strategy while minimizing the impact on QoS. Simulation results show that the proposed method outperforms static allocation methods, converging to the optimal policy (i.e., maximum efficacy and minimum overhead of the learning plane) in the long run.
Learned Bloom Filters, i.e., models induced from data via machine learning techniques and solving the approximate set membership problem, have recently been introduced with the aim of enhancing the performance of standard Bloom Filters, with special focus on space occupancy. Unlike in the classical case, the "complexity" of the data used to build the filter might heavily impact its performance. Therefore, here we propose the first in-depth analysis, to the best of our knowledge, for the performance assessment of a given Learned Bloom Filter, in conjunction with a given classifier, on a dataset of a given classification complexity. Indeed, we propose a novel methodology, supported by software, for designing, analyzing and implementing Learned Bloom Filters as a function of specific constraints on their multi-criteria nature (that is, constraints involving space efficiency, false positive rate, and reject time). Our experiments show that the proposed methodology and the supporting software are valid and useful: we find that only two classifiers have desirable properties in relation to problems with different data complexity and, interestingly, neither of them has been considered so far in the literature. We also show experimentally that the Sandwiched variant of Learned Bloom Filters is the most robust to data complexity and classifier performance variability, and is also the variant that usually has smaller reject times. The software can be readily used to test new Learned Bloom Filter proposals, which can be compared with the best ones identified here.
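For readers unfamiliar with the construction, the following is a minimal sketch of the basic (non-Sandwiched) Learned Bloom Filter idea: a classifier score with a threshold answers most queries, and a small backup Bloom Filter stores the keys the classifier would miss, so that no false negatives occur. The class names, the hashing scheme, and the toy scoring function are all assumptions made for illustration; they are not the software or the classifiers studied in the paper.

```python
import hashlib


class BloomFilter:
    """Plain Bloom Filter; one byte per bit position is used for clarity."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with an index.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for position in self._positions(item):
            self.bits[position] = 1

    def __contains__(self, item):
        return all(self.bits[position] for position in self._positions(item))


class LearnedBloomFilter:
    """Classifier with threshold, plus a backup filter for low-scoring keys."""

    def __init__(self, keys, score, threshold, num_bits, num_hashes):
        self.score = score
        self.threshold = threshold
        self.backup = BloomFilter(num_bits, num_hashes)
        for key in keys:
            # Keys the classifier would reject go into the backup filter,
            # which guarantees the absence of false negatives.
            if score(key) < threshold:
                self.backup.add(key)

    def __contains__(self, item):
        return self.score(item) >= self.threshold or item in self.backup


def toy_score(item):
    """Hypothetical stand-in for a trained classifier: favors multiples of 3."""
    return 1.0 if item % 3 == 0 else 0.0


# Key set: mostly multiples of 3, plus two keys the toy classifier scores low.
KEYS = list(range(0, 300, 3)) + [1000, 1001]
FILTER = LearnedBloomFilter(KEYS, toy_score, threshold=0.5, num_bits=256, num_hashes=3)
```

Every element of `KEYS` is reported as present, while a non-key such as 303 is a false positive caused by the classifier. This is the kind of interaction between classifier quality and data complexity that the abstract sets out to analyze.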
Every automaton can be decomposed into a cascade of basic automata. This is the Prime Decomposition Theorem by Krohn and Rhodes. We show that cascades allow for describing the sample complexity of automata in terms of their components. In particular, we show that the sample complexity is linear in the number of components and the maximum complexity of a single component, modulo logarithmic factors. This opens up the possibility of learning automata representing large dynamical systems consisting of many parts interacting with each other. It is in sharp contrast with the established understanding of the sample complexity of automata, described in terms of the overall number of states and input letters, which implies that it is only possible to learn automata where the number of states is linear in the amount of data available. Instead, our results show that one can learn automata with a number of states that is exponential in the amount of data available.
In this work, we apply a kinetic version of a bounded confidence consensus model to biomedical segmentation problems. In the presented approach, time-dependent information on the microscopic state of each particle/pixel includes its space position and a feature representing a static characteristic of the system, i.e. the gray level of each pixel. From the introduced microscopic model we derive a kinetic formulation of the model. The large time behavior of the system is then computed with the aid of a surrogate Fokker-Planck approach that can be obtained in the quasi-invariant scaling. We exploit the computational efficiency of direct simulation Monte Carlo methods for the obtained Boltzmann-type description of the problem for parameter identification tasks. Based on a suitable loss function measuring the distance between the ground truth segmentation mask and the evaluated mask, we minimize the introduced segmentation metric for a relevant set of 2D gray-scale images. Applications to biomedical segmentation concentrate on different imaging research contexts.