This paper revisits a fundamental problem in statistical inference from a non-asymptotic theoretical viewpoint – the construction of confidence sets. We establish a finite-sample bound for the estimator, characterizing its asymptotic behavior in a non-asymptotic fashion. An important feature of our bound is that its dimension dependency is captured by the effective dimension – the trace of the limiting sandwich covariance – which can be much smaller than the parameter dimension in some regimes. We then illustrate how the bound can be used to obtain a confidence set whose shape is adapted to the optimization landscape induced by the loss function. Unlike previous works that rely heavily on the strong convexity of the loss function, we only assume the Hessian is lower-bounded at the optimum and allow it to gradually become degenerate. This property is formalized by the notion of generalized self-concordance, which originates from convex optimization. Moreover, we demonstrate how the effective dimension can be estimated from data and characterize its estimation accuracy. We apply our results to maximum likelihood estimation with generalized linear models, score matching with exponential families, and hypothesis testing with Rao's score test.
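As a rough numerical illustration (not the paper's estimator, just a plug-in sketch under an assumed logistic-regression model), the effective dimension can be approximated by the trace of an empirical sandwich covariance built from per-sample score vectors and the empirical Hessian:

```python
import numpy as np

def sandwich_effective_dimension(grads, hess):
    """Plug-in estimate of the effective dimension: the trace of the
    sandwich covariance H^{-1} G H^{-1}, where G is the empirical
    second moment of per-sample scores and H the empirical Hessian."""
    G = grads.T @ grads / len(grads)
    H_inv = np.linalg.inv(hess)
    return float(np.trace(H_inv @ G @ H_inv))

# Toy logistic-regression example evaluated at theta = 0.
rng = np.random.default_rng(0)
n, d = 500, 5
X = rng.normal(size=(n, d))
y = rng.integers(0, 2, size=n)
theta = np.zeros(d)
p = 1.0 / (1.0 + np.exp(-X @ theta))
grads = (p - y)[:, None] * X              # per-sample log-loss gradients
H = (X.T * (p * (1 - p))) @ X / n         # empirical Hessian of the log-loss
d_eff = sandwich_effective_dimension(grads, H)
```

In this toy setup the score covariance and the Hessian coincide up to scaling, so the estimate is roughly four times the parameter dimension; the point of the effective dimension is that in other regimes it can be far smaller than the ambient dimension.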
Generative AI has matured to a point where large-scale models can generate text that seems indistinguishable from human-written text and remarkably photorealistic images. Automatically measuring how close the distribution of generated data is to the target real data distribution is a key step in diagnosing existing models and developing better ones. We present MAUVE, a family of comparison measures between pairs of distributions such as those encountered in the generative modeling of text or images. These scores are statistical summaries of divergence frontiers capturing two types of errors in generative modeling. We explore four approaches to statistically estimate these scores: vector quantization, non-parametric estimation, classifier-based estimation, and parametric Gaussian approximations. We provide statistical bounds for the vector quantization approach. Empirically, we find that the proposed scores, paired with a range of $f$-divergences and statistical estimation methods, can quantify the gaps between the distributions of human-written text and those of modern neural language models by correlating with human judgments and identifying known properties of the generated texts. We conclude the paper by demonstrating MAUVE's applications to other AI domains and discussing practical recommendations.
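To make the vector-quantization approach concrete, here is a deliberately simplified sketch: centers are drawn at random from the pooled sample instead of fitted by k-means, and the frontier is summarized by a plain average of the two KL divergences to the mixture rather than MAUVE's actual summary. `p_feats` and `q_feats` stand in for feature embeddings of real and generated samples.

```python
import numpy as np

def kl(p, q):
    """KL divergence between two discrete histograms (no zero entries)."""
    return float(np.sum(p * np.log(p / q)))

def frontier_summary(p_feats, q_feats, k=8, n_lambda=19, seed=0):
    """Crude divergence-frontier summary via vector quantization:
    quantize both samples into k cells, form smoothed histograms P and Q,
    and average KL(P || R) + KL(Q || R) over mixtures R = lam*P + (1-lam)*Q."""
    rng = np.random.default_rng(seed)
    joint = np.vstack([p_feats, q_feats])
    centers = joint[rng.choice(len(joint), size=k, replace=False)]
    def hist(x):
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        counts = np.bincount(d2.argmin(1), minlength=k) + 1.0  # Laplace smoothing
        return counts / counts.sum()
    P, Q = hist(p_feats), hist(q_feats)
    lams = np.linspace(0.05, 0.95, n_lambda)
    vals = [kl(P, lam * P + (1 - lam) * Q) + kl(Q, lam * P + (1 - lam) * Q)
            for lam in lams]
    return float(np.mean(vals))
```

Identical samples give a summary of zero, and the summary grows as the two feature distributions drift apart; the real estimator replaces the random centers with k-means and the features with language-model embeddings.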
Spectral risk objectives – also called $L$-risks – allow learning systems to interpolate between optimizing average-case performance (as in empirical risk minimization) and worst-case performance on a task. We develop stochastic algorithms to optimize these quantities by characterizing their subdifferential and addressing challenges such as the bias of subgradient estimates and the non-smoothness of the objective. We show theoretically and experimentally that out-of-the-box approaches such as stochastic subgradient and dual averaging are hindered by bias and that our approach outperforms them.
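For intuition, a spectral risk weights the sorted losses by a nondecreasing spectrum: a uniform spectrum recovers empirical risk minimization, while a spectrum concentrated on the largest losses recovers the superquantile (CVaR). A minimal sketch of the objective and of the per-sample weights of one subgradient:

```python
import numpy as np

def spectral_risk(losses, sigma):
    """Spectral risk (L-risk): dot the ascending-sorted losses with a
    nondecreasing spectrum sigma that sums to 1."""
    return float(np.sort(losses) @ sigma)

def subgradient_weights(losses, sigma):
    """Per-sample weights of one subgradient of the spectral risk:
    sample i receives the spectrum weight of its rank among the losses,
    so the full subgradient is sum_i w_i * grad l_i(theta)."""
    ranks = np.argsort(np.argsort(losses))
    return sigma[ranks]

n = 10
losses = np.linspace(0.0, 1.0, n)
uniform = np.full(n, 1.0 / n)                  # ERM: plain average
cvar = np.zeros(n); cvar[-n // 2:] = 2.0 / n   # superquantile at level 0.5
```

With the uniform spectrum the risk equals the mean loss; with the CVaR spectrum it equals the mean of the worst half, which is exactly the average-case-to-worst-case interpolation the abstract describes.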
Influence diagnostics such as influence functions and approximate maximum influence perturbations are popular in machine learning and AI applications. Influence diagnostics are powerful statistical tools for identifying influential datapoints or subsets of datapoints. We establish finite-sample statistical bounds, as well as computational complexity bounds, for influence functions and approximate maximum influence perturbations using efficient inverse-Hessian-vector product implementations. We illustrate our results with generalized linear models and large attention-based models on synthetic and real data.
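A minimal sketch of the influence-function computation, touching the Hessian only through matrix-vector products (conjugate gradients for the inverse-Hessian-vector product), here for ridge-regularized least squares; sign and $1/n$-scaling conventions vary across the literature, and this is an illustration rather than the paper's implementation:

```python
import numpy as np

def influence_on_params(X, y, theta_hat, i, lam=1e-3, iters=200, tol=1e-10):
    """Influence of training point i on a ridge least-squares fit:
    -H^{-1} g_i (up to 1/n scaling), with the inverse-Hessian-vector
    product computed by conjugate gradients so the Hessian is only
    accessed through matvecs."""
    n, d = X.shape
    g_i = (X[i] @ theta_hat - y[i]) * X[i]        # per-sample gradient
    def hvp(v):                                    # Hessian-vector product
        return X.T @ (X @ v) / n + lam * v
    # Plain CG on H v = g_i (H is positive definite thanks to the ridge).
    v = np.zeros(d)
    r = g_i - hvp(v)
    p = r.copy()
    for _ in range(iters):
        Ap = hvp(p)
        alpha = (r @ r) / (p @ Ap)
        v += alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) < tol:
            break
        p = r_new + (r_new @ r_new) / (r @ r) * p
        r = r_new
    return -v
```

On a low-dimensional problem CG matches a direct solve after a handful of matvecs; the point of the iHVP formulation is that it scales to models where forming or inverting the Hessian explicitly is infeasible.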
Labeling medical images depends on expert knowledge, so it is difficult to acquire a large number of high-quality annotated medical images in a short time. Making full use of the limited labeled samples in a small dataset to build a high-performance model is therefore key to medical image classification. In this paper, we propose a deeply supervised Layer-Selective Attention Network (LSANet), which comprehensively exploits label information through both feature-level and prediction-level supervision. For feature-level supervision, in order to better fuse low-level and high-level features, we propose a novel visual attention module, Layer-Selective Attention (LSA), which focuses on feature selection across different layers. LSA introduces a weight-allocation scheme that dynamically adjusts the weighting factor of each auxiliary branch throughout training, further strengthening deeply supervised learning and ensuring its generalization. For prediction-level supervision, we adopt a knowledge-synergy strategy that promotes hierarchical information interaction among all supervised branches via pairwise knowledge matching. Using the public dataset MedMNIST, a large-scale benchmark for biomedical image classification covering diverse medical specialties, we evaluate LSANet on multiple mainstream CNN architectures and various visual attention modules. The experimental results show that our proposed method substantially improves on its corresponding counterparts, demonstrating that LSANet can provide a promising solution for label-efficient learning in the field of medical image classification.
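The abstract does not spell out LSA's weight-allocation schedule; purely for illustration, one common way to dynamically weight auxiliary deep-supervision branches during training is a decay over epochs (the linear schedule below is an assumption, not the paper's scheme):

```python
def deep_supervision_loss(main_loss, aux_losses, epoch, total_epochs):
    """Illustrative dynamically weighted deep-supervision objective:
    auxiliary-branch losses are down-weighted linearly over training,
    so deep supervision guides early epochs without dominating late ones.
    (The actual LSA weighting scheme is learned/adaptive; this linear
    decay is a hypothetical stand-in.)"""
    w = max(0.0, 1.0 - epoch / float(total_epochs))
    return main_loss + w * sum(aux_losses)
```

Early in training every auxiliary branch contributes fully to the total loss; by the final epoch only the main branch's loss remains.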
In this paper, we propose a probabilistic continuous-time visual-inertial odometry (VIO) system for rolling-shutter cameras. The continuous-time trajectory formulation naturally facilitates the fusion of asynchronous high-frequency IMU data and motion-distorted rolling-shutter images. To keep the computational load tractable, the proposed VIO is sliding-window and keyframe-based. We propose to probabilistically marginalize the control points to keep a constant number of keyframes in the sliding window. Furthermore, the line exposure time difference (line delay) of the rolling-shutter camera can be calibrated online within our continuous-time VIO. To thoroughly examine the performance of our continuous-time VIO, experiments are conducted on the publicly available WHU-RSVI, TUM-RSVI, and SenseTime-RSVI rolling-shutter datasets. The results demonstrate that the proposed continuous-time VIO significantly outperforms existing state-of-the-art VIO methods. The codebase of this paper will also be open-sourced at \url{https://github.com/april-zju/ctrl-vio}.
With the popularity of deep learning, hardware implementation platforms for deep learning have attracted growing interest. Unlike general-purpose devices such as CPUs or GPUs, which execute deep learning algorithms at the software level, neural network hardware accelerators execute the algorithms directly, improving energy efficiency and performance. However, as deep learning algorithms evolve frequently, the engineering effort and cost of designing hardware accelerators have increased significantly. To improve design quality while saving cost, design automation for neural network accelerators was proposed, in which design space exploration algorithms are used to automatically search for optimized accelerator designs within a design space. Nevertheless, the increasing complexity of neural network accelerators brings ever-growing dimensionality to the design space. As a result, previous design space exploration algorithms are no longer effective enough to find optimized designs. In this work, we propose a neural network accelerator design automation framework named GANDSE, in which we rethink the problem of design space exploration and propose a novel approach based on generative adversarial networks (GANs) to support optimized exploration of high-dimensional, large design spaces. Experiments show that, compared with approaches including multilayer perceptrons and deep reinforcement learning, GANDSE is able to find more optimized designs in negligible time.
Not everyone is equipped with professional photography skills and sufficient shooting time, and tilted images occasionally result. In this paper, we propose a new practical task, named Rotation Correction, to automatically correct the tilt with high content fidelity under the condition that the rotation angle is unknown. This task can easily be integrated into image editing applications, allowing users to correct rotated images without any manual operation. To this end, we leverage a neural network to predict the optical flows that can warp the tilted image to be perceptually horizontal. Nevertheless, pixel-wise optical flow estimation from a single image is severely unstable, especially in large-angle tilted images. To enhance its robustness, we propose a simple but effective prediction strategy that forms a robust elastic warp. In particular, we first regress the mesh deformation, which can be transformed into a reliable initial optical flow. We then estimate residual optical flows to give our network the flexibility of pixel-wise deformation, further correcting the details of the tilted image. To establish an evaluation benchmark and train the learning framework, a comprehensive rotation correction dataset is presented with a large diversity in scenes and rotation angles. Extensive experiments demonstrate that, even without the angle prior, our algorithm can outperform other state-of-the-art solutions that require this prior. The code and dataset will be available at https://github.com/nie-lang/RotationCorrection.
Orthogonal statistical learning and double machine learning have emerged as general frameworks for two-stage statistical prediction in the presence of a nuisance component. We establish non-asymptotic bounds on the excess risk of orthogonal statistical learning methods with a loss function satisfying a self-concordance property. Our bounds improve upon existing bounds by a dimension factor while lifting the assumption of strong convexity. We illustrate the results with examples from multiple treatment effect estimation and generalized partial linear modeling.
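As background, the orthogonal (Neyman-orthogonal) two-stage recipe can be sketched for the partially linear model $y = \theta d + g(x) + \varepsilon$: cross-fit nuisance regressions of $y$ and $d$ on $x$, then regress residuals on residuals. The sketch below uses plain least squares as a stand-in for the flexible nuisance learners a real application would use:

```python
import numpy as np

def dml_plm(y, d, X, n_folds=2, seed=0):
    """Double/orthogonal ML for the partially linear model
    y = theta*d + g(X) + eps: cross-fit nuisance regressions of y and d
    on X, then estimate theta from the orthogonal (residual-on-residual)
    score, which is first-order insensitive to nuisance errors."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = rng.permutation(n) % n_folds
    ry, rd = np.empty(n), np.empty(n)
    for k in range(n_folds):
        tr, te = folds != k, folds == k
        # Fit nuisances on the training folds, residualize the held-out fold.
        by = np.linalg.lstsq(X[tr], y[tr], rcond=None)[0]
        bd = np.linalg.lstsq(X[tr], d[tr], rcond=None)[0]
        ry[te] = y[te] - X[te] @ by
        rd[te] = d[te] - X[te] @ bd
    return float((rd @ ry) / (rd @ rd))
```

On simulated data with a linear nuisance, the estimator recovers the target coefficient despite the nuisance component; the paper's contribution is the non-asymptotic excess-risk analysis of such two-stage procedures under self-concordant losses, not this estimator itself.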
We propose Distribution Embedding Networks (DEN) for classification with small data. In the spirit of meta-learning, DEN learns from a diverse set of training tasks with the goal of generalizing to unseen target tasks. Unlike existing approaches, which require the inputs of training and target tasks to have the same dimension with possibly similar distributions, DEN allows training and target tasks to live in heterogeneous input spaces. This is especially useful for tabular-data tasks where labeled data from related tasks are scarce. DEN uses a three-block architecture: a covariate transformation block followed by a distribution embedding block and then a classification block. We provide theoretical insights showing that this architecture allows the embedding and classification blocks to be fixed after pre-training on a diverse set of tasks; only the covariate transformation block, with relatively few parameters, needs to be fine-tuned for each new task. To facilitate training, we also propose an approach to synthesizing binary classification tasks, and demonstrate in numerical studies that DEN outperforms existing methods on a number of synthetic and real tasks.