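Spiking neural networks (SNNs) mimic the brain's computational strategies and show substantial capability in spatiotemporal information processing. As a fundamental component of human perception, visual attention refers to the dynamic process of selecting salient regions in biological vision systems. Although visual attention mechanisms have achieved great success in computer vision, they have rarely been introduced into SNNs. Inspired by experimental observations of predictive attentional remapping, we propose a new spatial-channel-temporal-fused attention (SCTFA) module that guides SNNs to efficiently capture underlying target regions by using historically accumulated spatial-channel information. Through systematic evaluation on three event-stream datasets (DVS Gesture, SL-Animals-DVS, and MNIST-DVS), we demonstrate that the SNN with the SCTFA module (SCTFA-SNN) not only significantly outperforms the baseline SNN (BL-SNN) and two other SNN models with degenerated attention modules, but also achieves accuracy competitive with existing state-of-the-art methods. Additionally, our detailed analysis shows that the proposed SCTFA-SNN model is strongly robust to noise and highly stable, while maintaining acceptable complexity and efficiency. Overall, these findings indicate that appropriately incorporating cognitive mechanisms of the brain may provide a promising approach to elevating the capabilities of SNNs.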
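Spiking neural networks (SNNs) have demonstrated excellent capabilities in various intelligent scenarios. Most existing methods for training SNNs are based on the concept of synaptic plasticity; however, learning in the real brain also exploits intrinsic non-synaptic mechanisms of neurons. The spike threshold of biological neurons is a critical intrinsic neuronal feature that exhibits rich dynamics on a millisecond timescale and has been proposed as an underlying mechanism that facilitates neural information processing. In this study, we develop a novel synergistic learning approach that simultaneously trains synaptic weights and spike thresholds in SNNs. SNNs trained with synapse-threshold synergistic learning (STL-SNNs) achieve significantly higher accuracy on various static and neuromorphic datasets than SNNs trained with either of two single-learning models, synaptic learning (SL) and threshold learning (TL). During training, the synergistic learning approach optimizes neural thresholds, providing the network with stable signal transmission via appropriate firing rates. Further analysis indicates that STL-SNNs are robust to noisy data and exhibit low energy consumption for deep network structures. Additionally, the performance of STL-SNNs can be further improved by introducing a generalized joint decision framework (JDF). Overall, our findings indicate that biologically plausible synergies between synaptic and intrinsic non-synaptic mechanisms may provide a promising approach for developing highly efficient SNN learning methods.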
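To make the link between spike threshold and firing rate concrete, here is a minimal sketch of a leaky integrate-and-fire neuron. It is not the model from the abstract: the hard reset to zero, the `decay` parameter, and the function name `lif_spike_count` are assumptions chosen for illustration only.

```python
def lif_spike_count(inputs, threshold, decay=0.5):
    """Count spikes of a leaky integrate-and-fire neuron (illustrative sketch).

    Assumptions: the membrane potential leaks by a constant factor `decay`
    each step, and it is reset to 0 after every spike.
    """
    potential = 0.0
    spikes = 0
    for current in inputs:
        # Leak the old potential, then integrate the new input.
        potential = decay * potential + current
        if potential >= threshold:
            spikes += 1
            potential = 0.0  # hard reset after firing
    return spikes


# Same input, different thresholds: the threshold alone changes the firing rate.
constant_input = [0.5] * 10
low_threshold_spikes = lif_spike_count(constant_input, threshold=0.75)
high_threshold_spikes = lif_spike_count(constant_input, threshold=1.0)
```

With a constant input, the lower threshold makes the neuron fire regularly, while the higher one silences it entirely. This is why a threshold that is tuned alongside the weights can keep firing rates in a useful range.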
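A latent assumption of the recent federated learning (FL) paradigm is that local models usually share the same network architecture as the global model, which becomes impractical for mobile and IoT devices with different hardware and infrastructure. A scalable federated learning framework should address heterogeneous clients equipped with different computation and communication capabilities. To this end, this paper proposes a novel federated model compression framework that distributes heterogeneous low-rank models to clients and then aggregates them into a global full-rank model. Our solution enables the training of heterogeneous local models with varying computational complexity and aggregates them into a single global model. Furthermore, FedHM not only reduces the computational complexity on devices but also reduces communication cost by using low-rank models. Extensive experimental results show that the proposed FedHM outperforms current pruning-based FL methods in test top-1 accuracy (a 4.6% accuracy gain on average), with smaller model sizes (1.5x smaller on average), under various heterogeneous FL settings.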
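The saving from low-rank models can be illustrated with a short parameter count. This is a sketch under stated assumptions, not the framework's actual factorization or aggregation rule: it assumes a weight matrix W of shape rows x cols is stored as two factors U (rows x rank) and V (rank x cols), and all function names are hypothetical.

```python
def dense_param_count(rows, cols):
    """Number of weights in a full-rank (rows x cols) layer."""
    return rows * cols


def low_rank_param_count(rows, cols, rank):
    """Number of weights when W is stored as U (rows x rank) times V (rank x cols)."""
    if not 1 <= rank <= min(rows, cols):
        raise ValueError("rank must be between 1 and min(rows, cols)")
    return rank * (rows + cols)


def compose(u, v):
    """Recover the full matrix W = U V from its factors (plain nested lists)."""
    inner = len(v)
    return [
        [sum(u_row[k] * v[k][j] for k in range(inner)) for j in range(len(v[0]))]
        for u_row in u
    ]


# A 512 x 512 layer at rank 32 needs 8x fewer weights than the dense layer.
compression = dense_param_count(512, 512) / low_rank_param_count(512, 512, 32)
```

Clients with less compute or bandwidth could be given a smaller `rank`. The factors can still be multiplied back into a matrix of the full shape, which is what makes it possible to combine models of different ranks on the server.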
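In multi-objective optimization, a set of scalable test problems with a variety of features allows researchers to investigate and evaluate the abilities of different optimization algorithms, and thus helps them design and develop more effective and efficient approaches. Existing test problem suites mainly focus on situations where all objectives fully conflict with each other. In such cases, an $M$-objective optimization problem has an $(M-1)$-dimensional Pareto front in the objective space. However, in some optimization problems, there may be unexpected characteristics among objectives, such as redundancy. The redundancy of some objectives can lead to a multi-objective problem with a degenerate Pareto front, i.e., the dimension of the Pareto front of the $M$-objective problem is less than $(M-1)$. In this paper, we systematically study degenerate multi-objective problems. We abstract three general characteristics of degenerate problems that have not been formulated and systematically studied in the literature. Based on these characteristics, we present a set of test problems to support the investigation of multi-objective optimization algorithms in situations with redundant objectives. To the best of our knowledge, this work is the first to explicitly formulate the three characteristics of degenerate problems, which gives the resulting test problems a generality that existing test problems designed for specific purposes (e.g., visualization) lack.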
Knowledge graph embedding (KGE), which maps entities and relations in a knowledge graph into continuous vector spaces, has achieved great success in predicting missing links in knowledge graphs. However, knowledge graphs often contain incomplete triples that are difficult for KGEs to infer inductively. To address this challenge, we resort to analogical inference and propose AnKGE, a novel and general self-supervised framework that enhances KGE models with analogical inference capability. We propose an analogical object retriever that retrieves appropriate analogical objects at the entity, relation, and triple levels. In AnKGE, we train an analogy function for each level of analogical inference; it takes the original element embedding from a well-trained KGE model as input and outputs the analogical object embedding. To combine the inductive inference capability of the original KGE model with the analogical inference capability added by AnKGE, we interpolate the analogy score with the base model score and introduce adaptive weights in the score function for prediction. Through extensive experiments on the FB15k-237 and WN18RR datasets, we show that AnKGE achieves competitive results on the link prediction task and performs analogical inference well.
Temporal sentence grounding (TSG) aims to identify the temporal boundary of a specific segment in an untrimmed video, given a sentence query. All existing works first use a sparse sampling strategy to extract a fixed number of video frames and then conduct multi-modal interactions with the query sentence for reasoning. However, we argue that these methods have overlooked two indispensable issues: 1) Boundary-bias: The annotated target segment generally refers to two specific frames as its start and end timestamps. The video downsampling process may lose these two frames and take the adjacent irrelevant frames as new boundaries. 2) Reasoning-bias: Such incorrect new boundary frames also lead to reasoning bias during frame-query interaction, reducing the generalization ability of the model. To alleviate the above limitations, we propose a novel Siamese Sampling and Reasoning Network (SSRN) for TSG, which introduces a siamese sampling mechanism to generate additional contextual frames that enrich and refine the new boundaries. Specifically, a reasoning strategy is developed to learn the inter-relationship among these frames and generate soft labels on boundaries for more accurate frame-query reasoning. Such a mechanism can also supply the missing consecutive visual semantics to the sparsely sampled frames for fine-grained activity understanding. Extensive experiments demonstrate the effectiveness of SSRN on three challenging datasets.
Normalizing flow is a class of deep generative models for efficient sampling and density estimation. In practice, the flow often appears as a chain of invertible neural network blocks; to facilitate training, existing works have regularized flow trajectories and designed special network architectures. The current paper develops a neural ODE flow network inspired by the Jordan-Kinderlehrer-Otto (JKO) scheme, which allows efficient block-wise training of the residual blocks and avoids inner loops of score matching or variational learning. As the JKO scheme unfolds the dynamics of gradient flow, the proposed model naturally stacks residual network blocks one by one, reducing the memory load and difficulty of performing end-to-end training of deep flow networks. We also develop adaptive time reparameterization of the flow network with a progressive refinement of the trajectory in probability space, which improves the model training efficiency and accuracy in practice. Using numerical experiments with synthetic and real data, we show that the proposed JKO-iFlow model achieves similar or better performance in generating new samples compared with existing flow and diffusion models at a significantly reduced computational and memory cost.
Score-based diffusion models have captured widespread attention and fueled rapid progress in recent vision generative tasks. In this paper, we focus on the diffusion model backbone, which has been much neglected before. We systematically explore vision Transformers as diffusion learners for various generative tasks. With our improvements, the performance of the vanilla ViT-based backbone (IU-ViT) is boosted to be on par with traditional U-Net-based methods. We further provide a hypothesis on the implication of disentangling the generative backbone as an encoder-decoder structure and show proof-of-concept experiments verifying the effectiveness of a stronger encoder for generative tasks with ASymmetriC ENcoder Decoder (ASCEND). Our improvements achieve competitive results on CIFAR-10, CelebA, LSUN, CUB Bird and large-resolution text-to-image tasks. To the best of our knowledge, we are the first to successfully train a single diffusion model on a text-to-image task beyond 64x64 resolution. We hope this will motivate people to rethink the modeling choices and the training pipelines for diffusion-based generative models.
This paper studies the distribution estimation of contaminated data by the MoM-GAN method, which combines generative adversarial net (GAN) and median-of-mean (MoM) estimation. We use a deep neural network (DNN) with a ReLU activation function to model the generator and discriminator of the GAN. Theoretically, we derive a non-asymptotic error bound for the DNN-based MoM-GAN estimator measured by integral probability metrics with the $b$-smoothness H\"{o}lder class. The error bound decreases essentially as $n^{-b/p}\vee n^{-1/2}$, where $n$ and $p$ are the sample size and the dimension of input data. We give an algorithm for the MoM-GAN method and implement it through two real applications. The numerical results show that the MoM-GAN outperforms other competitive methods when dealing with contaminated data.
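For readers unfamiliar with median-of-means estimation, the following minimal sketch shows the basic estimator for a scalar mean. It is not the MoM-GAN algorithm from the abstract, which applies the idea inside GAN training. The function name `median_of_means`, the shuffling step, and the strided block split are illustrative assumptions.

```python
import random
import statistics


def median_of_means(samples, num_blocks, seed=0):
    """Median-of-means estimate of the mean of `samples` (illustrative sketch)."""
    if not 1 <= num_blocks <= len(samples):
        raise ValueError("num_blocks must be between 1 and len(samples)")
    shuffled = list(samples)
    # Shuffle so that contaminated points are spread over blocks at random.
    random.Random(seed).shuffle(shuffled)
    # Strided slicing yields num_blocks blocks of nearly equal size.
    blocks = [shuffled[i::num_blocks] for i in range(num_blocks)]
    block_means = [statistics.fmean(block) for block in blocks]
    # The median ignores the few block means corrupted by outliers.
    return statistics.median(block_means)


# 99 clean points plus one gross outlier.
contaminated = [1.0] * 99 + [1000.0]
robust_estimate = median_of_means(contaminated, num_blocks=5)
plain_mean = statistics.fmean(contaminated)
```

A single outlier can corrupt at most one block mean, so the median of the block means stays at the clean value while the plain mean is pulled far away from it.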
Currently, most deep learning methods cannot solve the problem of scarcity of industrial product defect samples and significant differences in characteristics. This paper proposes an unsupervised defect detection algorithm based on a reconstruction network, which is realized using only a large number of easily obtained defect-free sample data. The network includes two parts: image reconstruction and surface defect area detection. The reconstruction network is designed as a fully convolutional autoencoder with a lightweight structure. Only a small number of normal samples are used for training so that the reconstruction network can generate a defect-free reconstructed image. A function combining structural loss and $L_1$ loss is proposed as the loss function of the reconstruction network to solve the problem of poor detection of irregular texture surface defects. Further, the residual between the reconstructed image and the image to be tested is used as the possible region of the defect, and conventional image operations can then locate the defect. The proposed unsupervised defect detection algorithm is evaluated on multiple defect image sample sets. Compared with other similar algorithms, the results show that it has strong robustness and accuracy.
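The residual-based localization step can be sketched in a few lines. The abstract does not specify which image operations are used, so this sketch assumes grayscale images stored as nested lists, a fixed threshold on the absolute residual, and a bounding box as the output. The function name `defect_bounding_box` is hypothetical.

```python
def defect_bounding_box(image, reconstruction, threshold):
    """Locate a defect from the residual between an image and its reconstruction.

    Returns (top, left, bottom, right) with inclusive indices, or None when
    no pixel's absolute residual exceeds `threshold`.
    """
    rows, cols = [], []
    for r, (image_row, recon_row) in enumerate(zip(image, reconstruction)):
        for c, (pixel, recon_pixel) in enumerate(zip(image_row, recon_row)):
            # A large residual means the network could not reproduce this pixel.
            if abs(pixel - recon_pixel) > threshold:
                rows.append(r)
                cols.append(c)
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


# A 4 x 4 test image with a bright 2 x 2 defect; the reconstruction is defect-free.
test_image = [
    [0, 0, 0, 0],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 0, 0],
]
defect_free = [[0] * 4 for _ in range(4)]
box = defect_bounding_box(test_image, defect_free, threshold=5)
```

Because the network is trained only on defect-free samples, it is expected to reconstruct normal texture well and defects poorly, which is why the residual highlights the defective region.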