Spectral methods provide consistent estimators for community detection in dense graphs. However, their performance deteriorates as the graphs become sparser. In this work we consider a random graph model that can produce graphs at different levels of sparsity, and we show that graph neural networks can outperform spectral methods on sparse graphs. We illustrate the results with numerical examples in both synthetic and real graphs.
Translated by Google Translate
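Graph neural networks (GNNs) are deep convolutional architectures consisting of layers composed of graph convolutions and pointwise nonlinearities. Due to their invariance and stability properties, GNNs have proven successful at learning representations from network data. However, training them requires matrix computations that can be expensive for large graphs. To address this limitation, we investigate the ability of GNNs to be transferred across graphs. We consider graphons, which are both graph limits and generative models for weighted and stochastic graphs, to define limit objects of graph convolutions and GNNs (graphon convolutions and graphon neural networks, WNNs), which we use as generative models for graph convolutions and GNNs. We show that these graphon filters and WNNs can be approximated by the graph filters and GNNs sampled from them on weighted and stochastic graphs. Using these results, we then derive error bounds for transferring graph filters and GNNs across such graphs. These bounds show that transferability increases with graph size, and reveal a tradeoff between transferability and spectral discriminability that, in GNNs, is alleviated by the pointwise nonlinearities. These findings are further verified empirically in numerical experiments on movie recommendation and decentralized robot control.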
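The use of a graphon as a generative model for stochastic graphs can be illustrated with a minimal sketch. It assumes the standard construction in which each node receives a latent point in [0, 1] and each edge is drawn independently with probability W(u_i, u_j). The function names and the particular graphon `example_graphon` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math
import random


def example_graphon(x, y):
    # Hypothetical graphon W: [0,1]^2 -> [0,1], chosen only for illustration.
    # It is symmetric and decays with the distance between latent points.
    return math.exp(-2.0 * abs(x - y))


def sample_graph(graphon, n, seed=0):
    """Sample an n-node undirected stochastic graph from a graphon."""
    rng = random.Random(seed)
    # Latent node positions drawn uniformly from [0, 1]; sorting them is
    # optional and only makes the adjacency pattern easier to inspect.
    u = sorted(rng.random() for _ in range(n))
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Edge (i, j) is present with probability W(u_i, u_j).
            if rng.random() < graphon(u[i], u[j]):
                adj[i][j] = adj[j][i] = 1
    return u, adj


if __name__ == "__main__":
    points, adjacency = sample_graph(example_graphon, n=8, seed=1)
    for row in adjacency:
        print(row)
```

Sampling graphs of different sizes from the same graphon yields a family of graphs that share one limit object, which is the setting in which transferring a filter or GNN across graphs is studied.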
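Deep learning models generalize well to in-distribution data but struggle to generalize compositionally, i.e., to combine a set of learned primitives to solve more complex tasks. In sequence-to-sequence (seq2seq) learning, transformers are often unable to predict correct outputs for examples longer than those seen during training. This paper introduces iterative decoding, an alternative to seq2seq that (i) improves transformer compositional generalization on the PCFG and Cartesian product datasets and (ii) provides evidence that, on these datasets, seq2seq transformers do not learn iterations that are not unrolled. In iterative decoding, training examples are broken down into a sequence of intermediate steps that the transformer learns iteratively. At inference time, the intermediate outputs are fed back to the transformer until an end-of-iteration token is predicted. We conclude by illustrating some limitations of iterative decoding on the CFQ dataset.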
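The inference loop described above can be sketched as follows. This is only an illustration under stated assumptions: `toy_step` is a hand-written stand-in for a trained transformer that performs one intermediate step, and the token `<eoi>` and the `max_iters` safeguard are hypothetical choices, not details from the paper.

```python
EOI = "<eoi>"  # hypothetical end-of-iteration token


def toy_step(tokens):
    """Stand-in for a trained model: performs a single intermediate step.

    The toy task is summing a list of integers; each step folds the first
    two numbers into their sum. Once nothing is left to fold, the step
    emits the end-of-iteration token.
    """
    if len(tokens) <= 1:
        return tokens + [EOI]
    return [tokens[0] + tokens[1]] + tokens[2:]


def iterative_decode(step_fn, tokens, max_iters=100):
    """Feed intermediate outputs back as inputs until EOI is predicted."""
    for _ in range(max_iters):
        out = step_fn(tokens)
        if out and out[-1] == EOI:
            return out[:-1]  # final answer, without the EOI marker
        tokens = out  # intermediate output becomes the next input
    raise RuntimeError("no end-of-iteration token within max_iters")


if __name__ == "__main__":
    print(iterative_decode(toy_step, [1, 2, 3, 4]))
```

Because each call only has to perform one step, the number of steps can grow with the input length at inference time even if such long inputs were never seen during training.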
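Graph neural networks (GNNs) use graph convolutions to exploit network invariances and learn meaningful feature representations from network data. However, on large-scale graphs, convolutions incur a high computational cost, leading to scalability limitations. In this paper, we consider the problem of learning a graphon neural network (WNN), the limit object of a GNN, by training GNNs on graphs sampled from the graphon. Under smoothness conditions, we show that: (i) the expected distance between the learning steps on the GNN and on the WNN decreases asymptotically with the size of the graph, and (ii) when training on a sequence of growing graphs, gradient descent follows the learning direction of the WNN. Inspired by these results, we propose a novel algorithm to learn GNNs on large-scale graphs that, starting from a moderate number of nodes, successively increases the size of the graph during training. This algorithm is further benchmarked on a decentralized control problem, where it retains performance comparable to its large-scale counterpart at a reduced computational cost.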
Training large, deep neural networks to convergence can be prohibitively expensive. As a result, often only a small selection of popular, dense models are reused across different contexts and tasks. Increasingly, sparsely activated models, which seek to decouple model size from computation costs, are becoming an attractive alternative to dense models. Although more efficient in terms of quality and computation cost, sparse models remain data-hungry and costly to train from scratch in the large scale regime. In this work, we propose sparse upcycling -- a simple way to reuse sunk training costs by initializing a sparsely activated Mixture-of-Experts model from a dense checkpoint. We show that sparsely upcycled T5 Base, Large, and XL language models and Vision Transformer Base and Large models, respectively, significantly outperform their dense counterparts on SuperGLUE and ImageNet, using only ~50% of the initial dense pretraining sunk cost. The upcycled models also outperform sparse models trained from scratch on 100% of the initial dense pretraining computation budget.
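The initialization step can be sketched roughly as below. This is a sketch under assumptions rather than the authors' implementation: it assumes that each expert starts as an exact copy of the dense model's MLP block and that the router, which has no dense counterpart, is initialized randomly. The dictionary layout, the function name `upcycle_mlp`, and the router scale of 0.02 are hypothetical.

```python
import copy
import random


def upcycle_mlp(dense_mlp, num_experts, d_model, seed=0):
    """Build a Mixture-of-Experts layer from one dense MLP block.

    dense_mlp: dict of weight matrices (nested lists) from a dense checkpoint.
    Returns a dict with `num_experts` independent copies of the MLP and a
    freshly initialized router of shape (d_model, num_experts).
    """
    rng = random.Random(seed)
    # Assumption: every expert starts as an exact copy of the dense MLP,
    # so the upcycled layer reuses the dense model's training.
    experts = [copy.deepcopy(dense_mlp) for _ in range(num_experts)]
    # Assumption: the router is new, so it gets small random weights.
    router = [[rng.gauss(0.0, 0.02) for _ in range(num_experts)]
              for _ in range(d_model)]
    return {"experts": experts, "router": router}


if __name__ == "__main__":
    dense = {"w_in": [[0.1, 0.2], [0.3, 0.4]],
             "w_out": [[0.5, 0.6], [0.7, 0.8]]}
    moe = upcycle_mlp(dense, num_experts=4, d_model=2)
    print(len(moe["experts"]), len(moe["router"]), len(moe["router"][0]))
```

Deep copies are used so that the experts can diverge from one another during subsequent training without modifying the dense checkpoint's weights.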
Most benchmarks for studying surgical interventions focus on a specific challenge instead of leveraging the intrinsic complementarity among different tasks. In this work, we present a new experimental framework towards holistic surgical scene understanding. First, we introduce the Phase, Step, Instrument, and Atomic Visual Action recognition (PSI-AVA) Dataset. PSI-AVA includes annotations for both long-term (Phase and Step recognition) and short-term reasoning (Instrument detection and novel Atomic Action recognition) in robot-assisted radical prostatectomy videos. Second, we present Transformers for Action, Phase, Instrument, and steps Recognition (TAPIR) as a strong baseline for surgical scene understanding. TAPIR leverages our dataset's multi-level annotations as it benefits from the learned representation on the instrument detection task to improve its classification capacity. Our experimental results in both PSI-AVA and other publicly available databases demonstrate the adequacy of our framework to spur future research on holistic surgical scene understanding.
Modern deep neural networks tend to be evaluated on static test sets. One shortcoming of this is the fact that these deep neural networks cannot be easily evaluated for robustness issues with respect to specific scene variations. For example, it is hard to study the robustness of these networks to variations of object scale, object pose, scene lighting and 3D occlusions. The main reason is that collecting real datasets with fine-grained naturalistic variations of sufficient scale can be extremely time-consuming and expensive. In this work, we present Counterfactual Simulation Testing, a counterfactual framework that allows us to study the robustness of neural networks with respect to some of these naturalistic variations by building realistic synthetic scenes that allow us to ask counterfactual questions to the models, ultimately providing answers to questions such as "Would your classification still be correct if the object were viewed from the top?" or "Would your classification still be correct if the object were partially occluded by another object?". Our method allows for a fair comparison of the robustness of recently released, state-of-the-art Convolutional Neural Networks and Vision Transformers, with respect to these naturalistic variations. We find evidence that ConvNext is more robust to pose and scale variations than Swin, that ConvNext generalizes better to our simulated domain and that Swin handles partial occlusion better than ConvNext. We also find that robustness for all networks improves with network scale and with data scale and variety. We release the Naturalistic Variation Object Dataset (NVD), a large simulated dataset of 272k images of everyday objects with naturalistic variations such as object pose, scale, viewpoint, lighting and occlusions. Project page: https://counterfactualsimulation.github.io
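Scientific machine learning (SciML) is a field of growing interest in several different application areas. In an optimization context, SciML-based tools have enabled the development of more efficient optimization methods. However, the use of SciML tools for optimization must be rigorously evaluated and carried out with caution. This work proposes the deduction of a robustness test that guarantees the robustness of multiphysics SciML-based optimization by showing that its results respect the universal approximation theorem. The test is applied within the framework of a novel methodology, which is evaluated on a series of benchmarks to illustrate its consistency. Moreover, the results of the proposed methodology are compared with the feasible regions of rigorous optimization, which requires a higher computational effort. Hence, this work provides a robustness test for the guaranteed application of SciML tools in multi-objective optimization with a lower computational effort than the existing alternative.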
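Flavor is the focal point of the flavor industry, which follows social trends and behaviors. The research and development of new flavoring agents and molecules are essential in this field. At the same time, the development of natural flavors plays a critical role in modern society. In light of this, the present work proposes a novel framework based on scientific machine learning to address an emerging problem in flavor engineering and industry. This work thus brings an innovative methodology for designing new natural flavor molecules. The molecules are evaluated with respect to their synthetic accessibility, number of atoms, and likeness to natural or pseudo-natural products.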
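Minimally invasive surgery is highly operator-dependent and involves lengthy procedure times, causing fatigue and risks to patients. To mitigate these risks, real-time systems can help surgeons navigate and track tools by providing a clear understanding of the scene and avoiding miscalculations during the operation. Although several efforts have been made in this direction, the lack of diverse datasets, together with highly dynamic scenes and their variability across patients, remains a significant obstacle to achieving robust systems. In this work, we present a systematic review of recent machine-learning-based approaches, covering surgical tool localization, segmentation, tracking, and 3D scene perception. Furthermore, we present the current gaps and directions of these approaches and provide the rationale behind their clinical integration.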