This paper analyzes $\ell_1$-regularized linear regression under the challenging scenario of having only adversarially corrupted data for training. We use the primal-dual witness paradigm to provide provable performance guarantees that the support of the estimated regression parameter vector matches the support of the actual parameter vector. Our theoretical analysis shows the counter-intuitive result that an adversary can influence sample complexity by corrupting the irrelevant features, i.e., those corresponding to zero coefficients of the regression parameter vector, which consequently do not affect the dependent variable. As any adversarially robust algorithm has its limitations, our theoretical analysis identifies the regimes under which the learning algorithm and the adversary can dominate each other. This allows us to analyze these fundamental limits and to address the key scientific question of which parameters (such as mutual incoherence, the maximum and minimum eigenvalues of the covariance matrix, and the budget of adversarial perturbation) govern the probability of success of the LASSO algorithm. Moreover, the derived sample complexity is logarithmic in the size of the regression parameter vector, and our theoretical claims are validated by empirical analysis on synthetic and real-world datasets.
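As a minimal illustration of the support-recovery setting in this abstract (our own sketch on clean synthetic data, not the paper's adversarial analysis; all names and parameter choices are our assumptions):

```python
# Sketch: LASSO support recovery on synthetic data with a sparse
# regression parameter vector. Only the first k coefficients are nonzero;
# the remaining features are "irrelevant" in the sense used above.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, k = 200, 50, 5                  # samples, features, true support size
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = 1.0                        # true support: first k coordinates
y = X @ beta + 0.1 * rng.standard_normal(n)

model = Lasso(alpha=0.1).fit(X, y)
support_hat = np.flatnonzero(np.abs(model.coef_) > 1e-3)
print(sorted(support_hat))            # ideally recovers {0, ..., k-1}
```

With enough samples relative to $\log p$, the estimated support coincides with the true one; the paper's contribution is characterizing how adversarial corruption of the training data changes this sample requirement.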
Bayesian methods, distributionally robust optimization methods, and regularization methods are three pillars of trustworthy machine learning hedging against distributional uncertainty, e.g., the uncertainty of an empirical distribution compared to the true underlying distribution. This paper investigates the connections among the three frameworks and, in particular, explores why these frameworks tend to have smaller generalization errors. Specifically, first, we suggest a quantitative definition of "distributional robustness", propose the concept of a "robustness measure", and formalize several philosophical concepts in distributionally robust optimization. Second, we show that Bayesian methods are distributionally robust in the probably approximately correct (PAC) sense; in addition, by constructing a Dirichlet-process-like prior in Bayesian nonparametrics, we prove that any regularized empirical risk minimization method is equivalent to a Bayesian method. Third, we show that the generalization errors of machine learning models can be characterized using the distributional uncertainty of the nominal distribution and the robustness measures of these models. This offers a new perspective for bounding generalization errors and thereby explains why distributionally robust machine learning models, Bayesian models, and regularized models tend to have smaller generalization errors.
In this paper, we consider the meta-learning problem of estimating the graphs associated with high-dimensional Ising models, using $\ell_1$-regularized logistic regression for neighborhood selection of each node. Our goal is to use the information learned from the auxiliary tasks to reduce the sufficient sample complexity of learning a novel task. To this end, we propose a novel generative model as well as an improper estimation method. In our setting, all tasks are \emph{similar} in their \emph{random} model parameters and supports. By pooling all the samples from the auxiliary tasks to \emph{improperly} estimate a single parameter vector, we can recover the true support union, assumed to be small in size, with high probability with a sufficient sample complexity of $\Omega(1)$ per task, for $K = \Omega(d^3 \log p)$ tasks of Ising models with $p$ nodes and maximum neighborhood size $d$. Then, with the support of the novel task restricted to the estimated support union, we prove that consistent neighborhood selection for the novel task can be obtained with a reduced sufficient sample complexity of $\Omega(d^3 \log d)$.
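The per-node estimator referenced above can be sketched as follows (an illustrative toy, not the paper's generative model or pooled estimator; the coupling values, regularization strength, and threshold are our own assumptions):

```python
# Sketch: neighborhood selection for one node of an Ising model via
# l1-regularized logistic regression of that node's spin on the others.
# Node 0's conditional follows a logistic model in its true neighbors.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 5000, 8
X = rng.choice([-1.0, 1.0], size=(n, p - 1))   # spins of the other p-1 nodes
theta = np.zeros(p - 1)
theta[:2] = 1.0                                 # node 0's true neighbors
prob = 1.0 / (1.0 + np.exp(-2.0 * X @ theta))   # Ising conditional of node 0
y = np.where(rng.random(n) < prob, 1.0, -1.0)

clf = LogisticRegression(penalty="l1", C=0.02, solver="liblinear").fit(X, y)
neighbours = np.flatnonzero(np.abs(clf.coef_[0]) > 0.5)
print(sorted(neighbours))                       # estimated neighborhood of node 0
```

Running this per node and taking unions over tasks mirrors, at toy scale, the pooled improper estimation of the support union described in the abstract.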
In this work, we propose a robust framework that employs adversarially robust training to safeguard machine learning models against perturbed test data. We achieve this by incorporating the worst-case additive adversarial error, within a fixed budget for each sample, during model estimation. Our main focus is to provide a plug-and-play solution that can be incorporated into existing machine learning algorithms with minimal changes. To that end, we derive closed-form ready-to-use solutions for several widely used loss functions, with a variety of norm constraints on the adversarial perturbation. Finally, we validate our approach by showing significant performance improvements on real-world datasets for supervised problems such as regression and classification, as well as for unsupervised problems such as matrix completion and learning graphical models, with very little computational overhead.
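For a linear model with squared loss and an $\ell_2$-bounded perturbation, the worst-case loss admits a standard closed form, which gives a flavor of the "closed-form ready-to-use solutions" mentioned above (a generic textbook identity, not necessarily the paper's derivation; the variable names and budget are our assumptions):

```python
# Worst-case squared loss of a linear model under an l2 perturbation budget:
#   max_{||d||_2 <= eps} (y - (x + d) @ w)^2 = (|y - x @ w| + eps * ||w||_2)^2
import numpy as np

def adversarial_sq_loss(w, x, y, eps):
    """Closed-form worst-case squared loss for an l2-bounded additive attack."""
    return (abs(y - x @ w) + eps * np.linalg.norm(w)) ** 2

rng = np.random.default_rng(1)
w = rng.standard_normal(5)
x = rng.standard_normal(5)
y, eps = 0.7, 0.3

closed = adversarial_sq_loss(w, x, y, eps)

# The maximizing perturbation is d* = -eps * sign(y - x @ w) * w / ||w||_2;
# plugging it in attains the closed-form value exactly.
d_star = -eps * np.sign(y - x @ w) * w / np.linalg.norm(w)
exact = (y - (x + d_star) @ w) ** 2

# Sanity check: no sampled perturbation on the budget sphere does better.
deltas = rng.standard_normal((20000, 5))
deltas = eps * deltas / np.linalg.norm(deltas, axis=1, keepdims=True)
brute = np.max((y - (x + deltas) @ w) ** 2)
print(closed, exact, brute)
```

Because the inner maximization collapses to a penalty on $\|w\|$, such losses can be minimized with the same optimizers as their non-robust counterparts, which is what makes a plug-and-play treatment possible.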
We study meta-learning for support recovery (i.e., recovering the set of non-zero entries) in high-dimensional principal component analysis. We reduce the sufficient sample complexity in a novel task using information learned from auxiliary tasks. We assume each task to be a different random principal component (PC) matrix with a possibly different support, and that the union of the supports of the PC matrices is small. We then pool the data from all the tasks to perform an improper estimation of a single PC matrix by maximizing the $\ell_1$-regularized predictive covariance, establishing that, with high probability, the true support union can be recovered, provided a sufficient number of tasks $m$ and a sufficient number of samples $O\left(\frac{\log(p)}{m}\right)$ for each task, for $p$-dimensional vectors. Then, for a novel task, we prove that maximizing the $\ell_1$-regularized predictive covariance with the additional constraint that the support is a subset of the estimated support union reduces the sufficient sample complexity of successful support recovery to $O(\log |J|)$, where $J$ is the support union recovered from the auxiliary tasks. Typically, $|J|$ is much smaller than $p$ for sparse matrices. Finally, we demonstrate the validity of our theory through numerical simulations.
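A bare-bones picture of support recovery for a sparse principal component (our own illustration using simple eigenvector thresholding in a spiked model, not the paper's $\ell_1$-regularized estimator; the spike strength and threshold are assumptions):

```python
# Sketch: recover the support of a sparse leading principal component
# by thresholding the leading eigenvector of the sample covariance.
import numpy as np

rng = np.random.default_rng(0)
p, k, n = 40, 4, 500
v = np.zeros(p)
v[:k] = 1.0 / np.sqrt(k)                  # sparse true PC, unit norm
cov = np.eye(p) + 4.0 * np.outer(v, v)    # spiked covariance model
X = rng.multivariate_normal(np.zeros(p), cov, size=n)

S = X.T @ X / n                           # sample covariance
eigvals, eigvecs = np.linalg.eigh(S)
u = eigvecs[:, -1]                        # leading sample eigenvector
support_hat = np.flatnonzero(np.abs(u) > 0.5 / np.sqrt(k))
print(sorted(support_hat))                # ideally {0, ..., k-1}
```

The meta-learning results above concern how pooling tasks and constraining the novel task's support to the recovered union $J$ shrinks the per-task sample requirement for this kind of recovery.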
We study the problem of consistently recovering the sparsity pattern of a regression parameter vector, using the Lasso, from correlated observations governed by deterministic missing-data patterns. We consider the case in which the observed dataset is censored by a deterministic, non-uniform filter. Recovering the sparsity pattern from a dataset with a deterministic missing structure is arguably more challenging than recovery under the uniformly-at-random scenario. In this paper, we propose an efficient algorithm for imputing the missing values by exploiting the topological properties of the censorship filter. We then provide novel theoretical results for exactly recovering the sparsity pattern using the proposed imputation strategy. Our analysis shows that, under certain statistical and topological conditions, the hidden sparsity pattern can be recovered consistently in polynomial time and with logarithmic sample complexity.
We propose the framework of dual convexified convolutional neural networks (DCCNNs). In this framework, we first introduce a primal learning problem motivated by convexified convolutional neural networks (CCNNs), and then construct the dual convex training program through careful analysis of the Karush-Kuhn-Tucker (KKT) conditions and Fenchel conjugates. Our approach reduces the computational overhead of constructing a large kernel matrix and, more importantly, eliminates the ambiguity of factorizing the matrix. Due to the low-rank structure in CCNNs and the related subdifferential of nuclear norms, there is no closed-form expression for recovering the primal solution from the dual solution. To overcome this, we propose a novel weight recovery algorithm, which takes the dual solution and the kernel information as input and recovers the linear weights and the output of the convolutional layer, instead of the weight parameters. Furthermore, our recovery algorithm exploits the low-rank structure and imposes a small number of filters indirectly, which reduces the parameter size. As a result, DCCNNs inherit all the statistical benefits of CCNNs, while enjoying a more formal and efficient workflow.
Stochastic high-dimensional bandit problems with low-dimensional structure are useful for different applications such as online advertising and drug discovery. In this work, we propose a simple unified algorithm for such problems and present a general analysis framework for the regret upper bound of our algorithm. We show that, under some mild unified assumptions, our algorithm can be applied to different high-dimensional bandit problems. Our framework utilizes the low-dimensional structure to guide the parameter estimation in the problem; as a result, our algorithm achieves regret bounds comparable to existing ones in the LASSO bandit, as well as novel bounds in the low-rank matrix bandit, the group-sparse matrix bandit, and a new problem: the multi-agent LASSO bandit.
We show that the error probability of reconstructing kernel matrices from random Fourier features of the Gaussian kernel function is at most $\mathcal{O}(R^{2/3} \exp(-D))$, where $D$ is the number of random features and $R$ is the diameter of the data domain. We also provide an information-theoretic, method-agnostic lower bound of $\Omega((1 - \exp(-R^2)) \exp(-D))$. Compared to prior work, we are the first to show that the error probability of random Fourier features is independent of the dimensionality of the data points. As applications of our theory, we obtain dimension-independent bounds for kernel ridge regression and support vector machines.
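The kernel-matrix reconstruction being analyzed above can be sketched with the classic random Fourier features construction of Rahimi and Recht (our own minimal implementation for the unit-bandwidth Gaussian kernel; sizes and seeds are illustrative assumptions):

```python
# Sketch: approximate the Gaussian kernel k(x, z) = exp(-||x - z||^2 / 2)
# with D random Fourier features; the reconstruction error of the kernel
# matrix shrinks as D grows.
import numpy as np

def rff_features(X, D, rng):
    """Map X of shape (n, d) to D random Fourier features."""
    n, d = X.shape
    W = rng.standard_normal((d, D))        # frequencies ~ N(0, I)
    b = rng.uniform(0.0, 2.0 * np.pi, D)   # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-sq / 2.0)                      # exact Gaussian kernel matrix

err = {}
for D in (10, 1000):
    Z = rff_features(X, D, rng)
    err[D] = np.max(np.abs(Z @ Z.T - K))   # worst-case entrywise error
print(err)
```

The abstract's bounds quantify how fast this error probability decays in $D$, and, notably, that the decay does not depend on the data dimension $d$.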
Generalisation to unseen contexts remains a challenge for embodied navigation agents. In the context of semantic audio-visual navigation (SAVi) tasks, the notion of generalisation should include both generalising to unseen indoor visual scenes as well as generalising to unheard sounding objects. However, previous SAVi task definitions do not include evaluation conditions on truly novel sounding objects, resorting instead to evaluating agents on unheard sound clips of known objects; meanwhile, previous SAVi methods do not include explicit mechanisms for incorporating domain knowledge about object and region semantics. These weaknesses limit the development and assessment of models' abilities to generalise their learned experience. In this work, we introduce the use of knowledge-driven scene priors in the semantic audio-visual embodied navigation task: we combine semantic information from our novel knowledge graph that encodes object-region relations, spatial knowledge from dual Graph Encoder Networks, and background knowledge from a series of pre-training tasks -- all within a reinforcement learning framework for audio-visual navigation. We also define a new audio-visual navigation sub-task, where agents are evaluated on novel sounding objects, as opposed to unheard clips of known objects. We show improvements over strong baselines in generalisation to unseen regions and novel sounding objects, within the Habitat-Matterport3D simulation environment, under the SoundSpaces task.