We propose a recipe on how to build a general, powerful, scalable (GPS) graph Transformer with linear complexity and state-of-the-art results on a diverse set of benchmarks. Graph Transformers (GTs) have gained popularity in the field of graph representation learning with a variety of recent publications, but they lack a common foundation about what constitutes a good positional or structural encoding, and what differentiates them. In this paper, we summarize the different types of encodings with a clearer definition and categorize them as being $\textit{local}$, $\textit{global}$ or $\textit{relative}$. Further, GTs remain constrained to small graphs with a few hundred nodes, and we propose the first architecture with a complexity linear in the number of nodes and edges, $O(N+E)$, by decoupling the local real-edge aggregation from the fully-connected Transformer. We argue that this decoupling does not negatively affect expressivity, with our architecture being a universal function approximator for graphs. Our GPS recipe consists of choosing 3 main ingredients: (i) positional/structural encoding, (ii) local message-passing mechanism, and (iii) global attention mechanism. We build and open-source a modular framework $\textit{GraphGPS}$ that supports multiple types of encodings and that provides efficiency and scalability on both small and large graphs. We test our architecture on 11 benchmarks and show very competitive results on all of them, showcasing the empirical benefits gained by the modularity and the combination of different strategies.
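To make the recipe concrete, the following is a minimal NumPy sketch of one such layer, not the GraphGPS implementation; the names `gps_layer`, `w_local`, `w_q`, `w_k` and `w_v` are hypothetical. It combines the local edge-list aggregation with a global attention update via a residual sum; the dense softmax attention shown here costs $O(N^2)$, and reaching $O(N+E)$ requires substituting a linear attention module such as Performer, as the paper does.

```python
import numpy as np

def gps_layer(x, edge_index, w_local, w_q, w_k, w_v):
    """x: (n, d) float node features; edge_index: (2, e) int array of edges.
    Ingredient (i), positional/structural encodings, is assumed to already
    be added into x before this layer is applied."""
    src, dst = edge_index
    # Ingredient (ii): local mean aggregation over the real edges, O(e).
    agg = np.zeros_like(x)
    np.add.at(agg, dst, x[src])
    deg = np.bincount(dst, minlength=x.shape[0]).clip(min=1)[:, None]
    local = (agg / deg) @ w_local
    # Ingredient (iii): global self-attention over all node pairs.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    return x + local + attn @ v  # residual sum of both branches
```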
Graph Neural Networks (GNNs) have shown great potential in the field of graph representation learning. Standard GNNs define a local message-passing mechanism which propagates information over the whole graph domain by stacking multiple layers. This paradigm suffers from two major limitations, over-squashing and poor long-range dependencies, that can be solved using global attention but significantly increases the computational cost to quadratic complexity. In this work, we propose an alternative approach to overcome these structural limitations by leveraging the ViT/MLP-Mixer architectures introduced in computer vision. We introduce a new class of GNNs, called Graph MLP-Mixer, that holds three key properties. First, they capture long-range dependency and mitigate the issue of over-squashing as demonstrated on the Long Range Graph Benchmark (LRGB) and the TreeNeighbourMatch datasets. Second, they offer better speed and memory efficiency with a complexity linear to the number of nodes and edges, surpassing the related Graph Transformer and expressive GNN models. Third, they show high expressivity in terms of graph isomorphism as they can distinguish at least 3-WL non-isomorphic graphs. We test our architecture on 4 simulated datasets and 7 real-world benchmarks, and show highly competitive results on all of them.
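As a rough illustration of the idea (not the paper's code), the sketch below pools nodes into patch tokens and alternates token and channel mixing; the paper derives its patches with METIS graph partitioning, whereas here a patch assignment is assumed to be given, and the names `graph_mlp_mixer`, `w_token`, `w_channel` are hypothetical.

```python
import numpy as np

def graph_mlp_mixer(x, patch_assign, n_patches, w_token, w_channel):
    """x: (n, d) node features; patch_assign: (n,) patch id per node."""
    d = x.shape[1]
    # Pool node features into patch tokens, shape (p, d).
    tokens = np.zeros((n_patches, d))
    np.add.at(tokens, patch_assign, x)
    counts = np.bincount(patch_assign, minlength=n_patches).clip(min=1)[:, None]
    tokens /= counts
    # Token mixing: a (single-layer) mixing step across the patch dimension.
    tokens = tokens + w_token @ tokens                      # w_token: (p, p)
    # Channel mixing: a (single-layer) step across the feature dimension.
    tokens = tokens + np.maximum(tokens @ w_channel, 0.0)   # w_channel: (d, d)
    return tokens.mean(axis=0)                              # graph-level readout
```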
Graph Neural Networks (GNNs) based on the message passing (MP) paradigm exchange information between 1-hop neighbors to build node representations at each layer. In principle, such networks cannot capture the long-range interactions (LRI) that may be desired or necessary for learning a given task on graphs. Recently, there has been growing interest in developing Transformer-based methods for graphs that can consider full node connectivity beyond the original sparse structure, thus enabling the modeling of LRI. However, MP-GNNs that simply rely on 1-hop message passing often fare better on several existing graph benchmarks when combined with positional feature representations, thereby limiting the perceived utility and ranking of Transformer-like architectures. Here, we present the Long Range Graph Benchmark (LRGB) with 5 graph learning datasets: PascalVOC-SP, COCO-SP, PCQM-Contact, Peptides-func and Peptides-struct, which arguably require LRI reasoning to achieve strong performance on a given task. We benchmark both baseline GNNs and Graph Transformer networks to verify that models which capture long-range dependencies perform significantly better on these tasks. These datasets are therefore suitable for benchmarking and exploration of MP-GNN and Graph Transformer architectures that intend to capture LRI.
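For reference, recent releases of PyTorch Geometric ship a loader for these datasets; a minimal usage sketch, assuming the installed version provides the `LRGBDataset` class:

```python
# Hedged sketch: assumes torch_geometric >= 2.2, which includes LRGBDataset
# with the five dataset names listed in the abstract.
from torch_geometric.datasets import LRGBDataset

train = LRGBDataset(root="data/lrgb", name="Peptides-func", split="train")
print(len(train), train[0])  # number of graphs and a sample Data object
```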
In the last few years, graph neural networks (GNNs) have become the standard toolkit for analyzing and learning from data on graphs. This emerging field has witnessed an extensive growth of promising techniques that have been applied with success to computer science, mathematics, biology, physics and chemistry. But for any successful field to become mainstream and reliable, benchmarks must be developed to quantify progress. This led us in March 2020 to release a benchmark framework that i) comprises a diverse collection of mathematical and real-world graphs, ii) enables fair model comparison with the same parameter budget to identify key architectures, iii) has an open-source, easy-to-use and reproducible code infrastructure, and iv) is flexible for researchers to experiment with new theoretical ideas. As of December 2022, the GitHub repository has reached 2,000 stars and 380 forks, which demonstrates the utility of the proposed open-source framework through the wide usage by the GNN community. In this paper, we present an updated version of our benchmark with a concise presentation of the aforementioned framework characteristics, an additional medium-sized molecular dataset AQSOL, similar to the popular ZINC, but with a real-world measured chemical target, and discuss how this framework can be leveraged to explore new GNN designs and insights. As a proof of value of our benchmark, we study the case of graph positional encoding (PE) in GNNs, which was introduced with this benchmark and has since spurred interest in exploring more powerful PE for Transformers and GNNs in a robust experimental setting.
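As a concrete example of graph PE in this setting, below is a minimal NumPy/NetworkX sketch of the Laplacian eigenvector encoding studied with this benchmark; the function name `laplacian_pe` is hypothetical, and the snippet is illustrative rather than the benchmark's own code.

```python
import numpy as np
import networkx as nx

def laplacian_pe(g: nx.Graph, k: int) -> np.ndarray:
    """Return the k non-trivial eigenvectors of the normalized Laplacian."""
    lap = nx.normalized_laplacian_matrix(g).toarray()
    eigvals, eigvecs = np.linalg.eigh(lap)  # ascending eigenvalues
    # Skip the trivial eigenvector (eigenvalue 0). Eigenvector signs are
    # arbitrary, so implementations typically flip them randomly in training.
    return eigvecs[:, 1:k + 1]

pe = laplacian_pe(nx.cycle_graph(8), k=2)  # (8, 2) positional features
```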
The Transformer architecture has recently gained attention in graph representation learning, as it naturally overcomes several limitations of graph neural networks (GNNs) by avoiding their strict structural inductive biases and instead only encoding the graph structure via positional encoding. Here, we show that the node representations generated by a Transformer with positional encoding do not necessarily capture structural similarity between them. To address this issue, we propose the Structure-Aware Transformer, a class of simple and flexible graph Transformers built upon a new self-attention mechanism. This new self-attention incorporates structural information by extracting a subgraph representation rooted at each node before computing the attention. We propose several methods for automatically generating the subgraph representation and show theoretically that the resulting representations are at least as expressive as the subgraph representations. Empirically, our method achieves state-of-the-art performance on five graph prediction benchmarks. Our structure-aware framework can leverage any existing GNN to extract the subgraph representation, and we show that it systematically improves performance relative to the base GNN model, successfully combining the advantages of GNNs and Transformers. Our code is available at https://github.com/borgwardtlab/sat.
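A minimal sketch of the structure-aware attention idea follows, with a plain k-hop mean summary standing in for the learned GNN subgraph extractor the paper proposes; `khop_summaries` and `structure_aware_attention` are hypothetical names.

```python
import numpy as np
import networkx as nx

def khop_summaries(g: nx.Graph, x: np.ndarray, k: int) -> np.ndarray:
    """Summarize each node's k-hop rooted subgraph (nodes labeled 0..n-1)."""
    out = np.zeros_like(x, dtype=float)
    for v in g.nodes:
        hood = nx.single_source_shortest_path_length(g, v, cutoff=k)
        out[v] = x[list(hood)].mean(axis=0)
    return out

def structure_aware_attention(g: nx.Graph, x: np.ndarray, k: int = 2):
    s = khop_summaries(g, x, k)                 # structure-aware queries/keys
    scores = s @ s.T / np.sqrt(s.shape[1])
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ x                             # values stay the raw features
```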
The Transformer architecture has become a dominant choice in many domains, such as natural language processing and computer vision. Yet, it has not achieved competitive performance on popular leaderboards of graph-level prediction compared to mainstream GNN variants. Therefore, it remains a mystery how Transformers could perform well for graph representation learning. In this paper, we solve this mystery by presenting Graphormer, which is built upon the standard Transformer architecture and can attain excellent results on a broad range of graph representation learning tasks, especially on the recent OGB Large-Scale Challenge. Our key insight for utilizing Transformers on graphs is the necessity of effectively encoding the structural information of a graph into the model. To this end, we propose several simple yet effective structural encoding methods to help Graphormer better model graph-structured data. Besides, we mathematically characterize the expressive power of Graphormer and show that, with our ways of encoding the structural information of graphs, many popular GNN variants can be covered as special cases of Graphormer.
Graph Neural Networks (GNNs) extend the success of neural networks to graph-structured data by accounting for its intrinsic geometry. While extensive research has been done on developing GNN models with superior performance according to a collection of graph representation learning benchmarks, it is currently not well understood what aspects of a given model are probed by them. For example, to what extent do they test the ability of a model to leverage graph structure vs. node features? Here, we develop a principled approach to taxonomize benchmarking datasets according to a $\textit{sensitivity profile}$ that is based on how much GNN performance changes due to a collection of graph perturbations. Our data-driven analysis provides a deeper understanding of which benchmarking data characteristics are leveraged by GNNs. Consequently, our taxonomy can aid in the selection and development of adequate graph benchmarks and in better-informed evaluation of future GNN methods. Finally, our approach and implementation in the $\texttt{GTaxoGym}$ package are extendable to multiple graph prediction task types and future datasets.
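A hedged sketch of the sensitivity-profile computation, with two illustrative perturbations and a hypothetical `evaluate` callable standing in for training and scoring a GNN; the actual perturbation collection used by the taxonomy is richer than what is shown here.

```python
import copy
import networkx as nx

def perturb(g: nx.Graph, mode: str) -> nx.Graph:
    if mode == "no_features":                  # hide all node features
        h = copy.deepcopy(g)
        for v in h.nodes:
            h.nodes[v].clear()
        return h
    if mode == "rewire":                       # keep degrees, destroy structure
        return nx.expected_degree_graph([d for _, d in g.degree()],
                                        selfloops=False)
    return g

def sensitivity_profile(evaluate, graphs, modes=("no_features", "rewire")):
    """evaluate: hypothetical callable mapping a list of graphs to a score."""
    base = evaluate(graphs)
    return {m: base - evaluate([perturb(g, m) for g in graphs]) for m in modes}
```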
The Transformer architecture has become a dominant choice in many domains, such as natural language processing and computer vision. Yet, it has not achieved competitive performance on popular leaderboards of graph-level prediction compared to mainstream GNN variants. Therefore, it remains a mystery how Transformers could perform well for graph representation learning. In this paper, we solve this mystery by presenting Graphormer, which is built upon the standard Transformer architecture, and could attain excellent results on a broad range of graph representation learning tasks, especially on the recent OGB Large-Scale Challenge. Our key insight to utilizing Transformer in the graph is the necessity of effectively encoding the structural information of a graph into the model. To this end, we propose several simple yet effective structural encoding methods to help Graphormer better model graph-structured data. Besides, we mathematically characterize the expressive power of Graphormer and exhibit that with our ways of encoding the structural information of graphs, many popular GNN variants could be covered as the special cases of Graphormer. The code and models of Graphormer will be made publicly available at https://github.com/Microsoft/Graphormer.
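To illustrate two of these structural encodings, here is a minimal NumPy/NetworkX sketch, assuming node degree as the centrality signal and a per-distance scalar table `dist_bias` as the spatial encoding; `graphormer_attention` is a hypothetical name, and this is not the released Graphormer code.

```python
import numpy as np
import networkx as nx

def graphormer_attention(g: nx.Graph, x, w_q, w_k, dist_bias):
    """Nodes assumed labeled 0..n-1; dist_bias: (max_dist,) learnable scalars."""
    n = g.number_of_nodes()
    deg = np.array([g.degree(v) for v in g.nodes], dtype=float)[:, None]
    h = x + deg                                    # centrality encoding
    spd = dict(nx.all_pairs_shortest_path_length(g))
    bias = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            d = spd[i].get(j, len(dist_bias) - 1)  # unreachable -> last bucket
            bias[i, j] = dist_bias[min(d, len(dist_bias) - 1)]
    q, k = h @ w_q, h @ w_k
    scores = q @ k.T / np.sqrt(k.shape[1]) + bias  # spatial encoding as bias
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    return attn / attn.sum(axis=1, keepdims=True)  # attention weights
```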
Graph classification is an important area in both modern research and industry. Multiple applications, especially in chemistry and novel drug discovery, encourage rapid development of machine learning models in this area. To keep up with the pace of new research, proper experimental design, fair evaluation, and independent benchmarks are essential. Design of strong baselines is an indispensable element of such works. In this thesis, we explore multiple approaches to graph classification. We focus on Graph Neural Networks (GNNs), which emerged as a de facto standard deep learning technique for graph representation learning. Classical approaches, such as graph descriptors and molecular fingerprints, are also addressed. We design a fair experimental evaluation protocol and choose a proper collection of datasets. This allows us to perform numerous experiments and rigorously analyze modern approaches. We arrive at many conclusions, which shed new light on the performance and quality of novel algorithms. We investigate the application of the Jumping Knowledge GNN architecture to graph classification, which proves to be an efficient tool for improving base graph neural network architectures. Multiple improvements to baseline models are also proposed and experimentally verified, which constitutes an important contribution to the field of fair model comparison.
A current goal in the graph neural network literature is to enable transformers to operate on graph-structured data, given their success on language and vision tasks. Since the transformer's original sinusoidal positional encodings (PEs) are not applicable to graphs, recent work has focused on developing graph PEs, rooted in spectral graph theory or various spatial features of a graph. In this work, we introduce a new graph PE, Graph Automaton PE (GAPE), based on weighted graph-walking automata (a novel extension of graph-walking automata). We compare the performance of GAPE with other PE schemes on both machine translation and graph-structured tasks, and we show that it generalizes several other PEs. An additional contribution of this study is a theoretical and controlled experimental comparison of many recent PEs in graph transformers, independent of the use of edge features.
Graph Neural Networks (GNNs) have emerged as a powerful technique for learning on relational data. Owing to the relatively limited number of message passing steps they perform, and hence a smaller receptive field, there has been significant interest in improving their expressivity by incorporating structural aspects of the underlying graph. In this paper, we explore the use of affinity measures as features in graph neural networks, in particular measures arising from random walks, including effective resistance, hitting times, and commute times. We propose message passing networks based on these features and evaluate their performance on a variety of node and graph property prediction tasks. Our architecture has low computational complexity, while our features are invariant to the permutations of the underlying graph. The measures we compute allow the network to exploit the connectivity properties of the graph, thereby allowing us to outperform relevant benchmarks on a wide variety of tasks, often with significantly fewer message passing steps. On OGB-LSC-PCQM4Mv1, one of the largest public graph regression datasets, we obtain the best known single-model validation MAE at the time of writing.
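One of these affinity measures is easy to compute directly: below is a minimal sketch of pairwise effective resistance from the pseudoinverse of the graph Laplacian (commute time is proportional to it, $C(u,v) = 2|E| \cdot R(u,v)$); the function name `effective_resistance` is hypothetical.

```python
import numpy as np
import networkx as nx

def effective_resistance(g: nx.Graph) -> np.ndarray:
    """R[u, v] = L+_{uu} + L+_{vv} - 2 L+_{uv}, with L+ the pseudoinverse."""
    lap = nx.laplacian_matrix(g).toarray().astype(float)
    pinv = np.linalg.pinv(lap)
    d = np.diag(pinv)
    return d[:, None] + d[None, :] - pinv - pinv.T

r = effective_resistance(nx.path_graph(4))
print(r[0, 3])  # a path of 3 unit-resistance edges -> 3.0
```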
This technical report presents GPS++, the first-place solution to the Open Graph Benchmark Large-Scale Challenge (OGB-LSC 2022) for the PCQM4Mv2 molecular property prediction task. Our approach implements several key principles from the prior literature. At its core, our GPS++ method is a hybrid MPNN/Transformer model that incorporates 3D atom positions and an auxiliary denoising task. The effectiveness of GPS++ is demonstrated by achieving 0.0719 mean absolute error on the independent test-challenge PCQM4Mv2 split. Thanks to Graphcore IPU acceleration, GPS++ scales to deep architectures (16 layers), training at 3 minutes per epoch, and to a large ensemble (112 models), completing the final predictions in 1 hour 32 minutes, well under the 4 hour inference budget allocated. Our implementation is publicly available at: https://github.com/graphcore/ogb-lsc-pcqm4mv2.
Most graph neural network models rely on a particular message passing paradigm, where the idea is to iteratively propagate node representations of a graph to each node in the direct neighborhood. While very prominent, this paradigm leads to information propagation bottlenecks, as information is repeatedly compressed at intermediary node representations, which causes loss of information, making it practically impossible to gather meaningful signals from distant nodes. To address this issue, we propose shortest path message passing neural networks, where the node representations of a graph are propagated to each node in the shortest path neighborhoods. In this setting, nodes can directly communicate with each other even if they are not neighbors, breaking the information bottleneck and hence leading to more adequately learned representations. Theoretically, our framework generalizes message passing neural networks, resulting in provably more expressive models, and we show that some recent state-of-the-art models are special instances of this framework. Empirically, we verify the capacity of a basic model of this framework on dedicated synthetic experiments as well as real-world graph classification and regression benchmarks, obtaining state-of-the-art results.
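A minimal sketch of the scheme under simplifying assumptions (mean aggregation, one weight matrix per shortest-path distance ring, nodes labeled 0..n-1); `sp_message_passing` is a hypothetical name.

```python
import numpy as np
import networkx as nx

def sp_message_passing(g: nx.Graph, x, weights):
    """weights[k]: (d, d) matrix for the distance-(k+1) neighborhood."""
    out = np.array(x, dtype=float)
    spd = dict(nx.all_pairs_shortest_path_length(g))
    for v in g.nodes:
        for k, w in enumerate(weights, start=1):
            ring = [u for u, dist in spd[v].items() if dist == k]
            if ring:                       # one message per distance ring
                out[v] += x[ring].mean(axis=0) @ w
    return out
```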
Graph Neural Networks (GNNs) rely on the graph structure to define an aggregation strategy, where each node updates its representation by combining information from its neighbors. A known limitation of GNNs is that, as the number of layers increases, information gets smoothed and squashed and node embeddings become indistinguishable, negatively affecting performance. Therefore, practical GNN models employ few layers and can only exploit the graph structure in a limited neighborhood around each node. Inevitably, practical GNNs do not capture information that depends on the global structure of the graph. While there have been several works studying the limitations and expressivity of GNNs, the question of whether practical applications on graph-structured data require global structural knowledge remains unanswered. In this work, we address this question empirically by providing global information to several GNN models and observing its impact on downstream performance. Our results show that global information can in fact provide significant benefits for common graph-related tasks. We further identify a novel regularization strategy that leads to an average accuracy improvement of more than 5% on all considered tasks.
In recent years, algorithms and neural architectures based on the Weisfeiler-Leman algorithm, a well-known heuristic for the graph isomorphism problem, have emerged as a powerful tool for machine learning with graphs and relational data. Here, we give a comprehensive overview of the algorithm's use in a machine learning setting, focusing on the supervised regime. We discuss the theoretical background, show how to use it for supervised graph and node representation learning, discuss recent extensions, and outline the algorithm's connection to (permutation-)equivariant neural architectures. Moreover, we give an overview of current applications and future directions to stimulate further research.
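For reference, the heuristic itself fits in a few lines; a minimal sketch of colour refinement (1-WL), using Python's built-in `hash` as the colour compression function:

```python
import networkx as nx

def wl_colours(g: nx.Graph, rounds: int = 3):
    colour = {v: 0 for v in g.nodes}
    for _ in range(rounds):
        # Re-hash each node's colour with the multiset of neighbour colours.
        colour = {v: hash((colour[v], tuple(sorted(colour[u] for u in g[v]))))
                  for v in g.nodes}
    return colour

# A classic failure case: C6 and two disjoint triangles are both 2-regular,
# so 1-WL assigns them identical colour multisets.
c6 = sorted(wl_colours(nx.cycle_graph(6)).values())
c3c3 = sorted(wl_colours(nx.disjoint_union(nx.cycle_graph(3),
                                           nx.cycle_graph(3))).values())
print(c6 == c3c3)  # True
```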
Recent works on Transformer-based graph models have demonstrated the inadequacy of the vanilla Transformer for graph representation learning. To understand this inadequacy, one needs to investigate whether spectral analysis of the Transformer can reveal insights into its expressive power. Similar studies have already established that spectral analysis of Graph Neural Networks (GNNs) provides additional perspectives on their expressiveness. In this work, we systematically study and establish the connection between the spatial and spectral domains in the realm of Transformers. We further provide a theoretical analysis and prove that the spatial attention mechanism in the Transformer cannot effectively capture the desired frequency response, and hence inherently limits its expressiveness in spectral space. Therefore, we propose FeTA, a framework that performs attention over the full graph spectrum (i.e., the actual frequency components of the graph), analogous to attention in spatial space. Empirical results show that FeTA provides homogeneous performance gains over the vanilla Transformer across all tasks on standard benchmarks, and can easily be extended to GNN-based models with low-pass characteristics (e.g., GAT).
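A minimal sketch of the underlying operation, attention computed over the graph's frequency components via the Laplacian eigenbasis and mapped back to the spatial domain; `spectral_attention` is a hypothetical name, and this is not the FeTA implementation.

```python
import numpy as np
import networkx as nx

def spectral_attention(g: nx.Graph, x):
    lap = nx.normalized_laplacian_matrix(g).toarray()
    _, u = np.linalg.eigh(lap)                 # columns: graph Fourier basis
    xf = u.T @ x                               # graph Fourier transform, (n, d)
    scores = xf @ xf.T / np.sqrt(x.shape[1])   # attention over frequencies
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    return u @ (attn @ xf)                     # back to the spatial domain
```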
The expressive power of Graph Neural Networks (GNNs) is limited, they struggle with long-range interactions, and they lack a principled way to model higher-order structures. These problems can be attributed to the strong coupling between the computational graph and the input graph structure. The recently proposed Message Passing Simplicial Networks naturally decouple these elements by performing message passing on the clique complex of the graph. Nevertheless, these models can be severely constrained by the rigid combinatorial structure of Simplicial Complexes (SCs). In this work, we extend recent theoretical results on SCs to regular cell complexes, topological objects that flexibly subsume SCs and graphs. We show that this generalization provides a powerful set of graph "lifting" transformations, each leading to a unique hierarchical message passing procedure. The resulting methods, which we collectively call CW Networks (CWNs), are strictly more powerful than the WL test and not less powerful than the 3-WL test. In particular, we demonstrate the effectiveness of one such scheme, based on rings, when applied to molecular graph problems. The proposed architecture benefits from provably larger expressivity than commonly used GNNs, principled modeling of higher-order signals, and compression of the distances between nodes. We demonstrate that our model achieves state-of-the-art results on a variety of molecular datasets.
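To give a feel for the ring-based lifting on molecules, here is a minimal NetworkX sketch that attaches one 2-cell per small cycle, using the cycle basis as a stand-in for the paper's chordless-ring enumeration; `ring_lift` is a hypothetical name.

```python
import networkx as nx

def ring_lift(g: nx.Graph, max_ring: int = 6):
    """Return the boundary edges of each new ring 2-cell."""
    cells = []
    for cycle in nx.cycle_basis(g):
        if len(cycle) <= max_ring:
            edges = [tuple(sorted((cycle[i], cycle[(i + 1) % len(cycle)])))
                     for i in range(len(cycle))]
            cells.append(edges)
    return cells

print(ring_lift(nx.cycle_graph(6)))  # one hexagonal 2-cell, as in benzene
```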
Pre-publication draft of a book to be published by Morgan & Claypool publishers. Unedited version released with permission. All relevant copyrights held by the author and publisher extend to this pre-publication draft.
Message-passing neural networks (MPNNs) are the leading architecture for deep learning on graph-structured data, in large part due to their simplicity and scalability. Unfortunately, it was shown that these architectures are limited in their expressive power. This paper proposes a novel framework called Equivariant Subgraph Aggregation Networks (ESAN) to address this problem. Our main observation is that while two graphs may not be distinguishable by an MPNN, they often contain distinguishable subgraphs. Thus, we propose to represent each graph as a set of subgraphs derived by some predefined policy, and to process it using a suitable equivariant architecture. We develop novel variants of the 1-dimensional Weisfeiler-Leman (1-WL) test for graph isomorphism, and prove lower bounds on the expressiveness of ESAN in terms of these new WL variants. We further prove that our approach increases the expressive power of both MPNNs and more expressive architectures. Moreover, we provide theoretical results that describe how design choices such as the subgraph selection policy and the equivariant neural architecture affect our architecture's expressive power. To deal with the increased computational cost, we propose a subgraph sampling scheme, which can be viewed as a stochastic version of our framework. A comprehensive set of experiments on real and synthetic datasets demonstrates that our framework improves the expressive power and overall performance of popular GNN architectures.
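A minimal sketch of the bag-of-subgraphs representation under one of the predefined policies (node deletion); the resulting set of subgraphs would then be fed to an equivariant set architecture, which is omitted here, and `node_deleted_bag` is a hypothetical name.

```python
import networkx as nx

def node_deleted_bag(g: nx.Graph):
    """Represent g as the set of subgraphs obtained by deleting one node."""
    return [g.subgraph(set(g.nodes) - {v}).copy() for v in g.nodes]

bag = node_deleted_bag(nx.cycle_graph(5))
print(len(bag), [h.number_of_edges() for h in bag])  # 5 subgraphs, 3 edges each
```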
Graph kernels have attracted a lot of attention during the last decade, and have evolved into a rapidly developing branch of learning on structured data. During the past 20 years, the considerable research activity in the field resulted in the development of dozens of graph kernels, each focusing on specific structural properties of graphs. Graph kernels have been applied successfully in a wide range of domains, from social networks to bioinformatics. The goal of this survey is to provide a unifying view of the literature on graph kernels. In particular, we present an overview of a wide range of graph kernels. Furthermore, we perform an experimental evaluation of several of those kernels on publicly available datasets, and provide a comparative study. Finally, we discuss key applications of graph kernels, and outline some challenges that remain to be addressed.
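As one concrete member of this family, here is a minimal sketch of the Weisfeiler-Leman subtree kernel, which sums, per refinement round, the dot product of the two graphs' colour histograms; `wl_subtree_kernel` is a hypothetical name.

```python
from collections import Counter

import networkx as nx

def wl_subtree_kernel(g1: nx.Graph, g2: nx.Graph, rounds: int = 3) -> int:
    k, col = 0, [{v: 0 for v in g.nodes} for g in (g1, g2)]
    for _ in range(rounds + 1):
        h1, h2 = (Counter(c.values()) for c in col)
        k += sum(h1[c] * h2[c] for c in h1)  # shared colours this round
        # One round of colour refinement on both graphs.
        col = [{v: hash((c[v], tuple(sorted(c[u] for u in g[v]))))
                for v in g.nodes} for g, c in zip((g1, g2), col)]
    return k

print(wl_subtree_kernel(nx.path_graph(4), nx.star_graph(3)))
```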