Lifelong learning agents should be able to learn continually from potentially infinite streams of patterned sensory data. A major historical difficulty in building agents that adapt in this way is that neural systems struggle to retain previously acquired knowledge when learning from new samples. This problem is known as catastrophic forgetting (interference), and it remains unsolved in the machine learning domain to this day. While forgetting has been studied for decades in the context of feedforward networks, far less work has examined alternative architectures such as the venerable self-organizing map (SOM), an unsupervised neural model often used for tasks such as clustering and dimensionality reduction. Although the competition among its internal neurons might carry the potential to improve memory retention, we observe that a fixed-size SOM trained on task-incremental data, i.e., one that receives data points associated with specific classes at certain temporal increments, experiences significant forgetting. In this study, we propose the continual SOM (C-SOM), a model capable of reducing its own forgetting as it processes information.
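As context for the setting this abstract describes, here is a minimal sketch of a standard fixed-size SOM trained online on a task-incremental stream. This is the baseline that exhibits forgetting, not the proposed C-SOM; the grid size, learning-rate schedule, and stand-in task data are illustrative assumptions.

```python
# Minimal sketch of a standard fixed-size SOM on a task-incremental stream
# (baseline only, NOT the proposed C-SOM; constants are illustrative).
import numpy as np

rng = np.random.default_rng(0)
grid_h, grid_w, dim = 10, 10, 784             # 10x10 map over flattened images
weights = rng.random((grid_h * grid_w, dim))  # codebook vectors
coords = np.array([(i, j) for i in range(grid_h) for j in range(grid_w)], float)

def som_step(x, t, lr0=0.5, sigma0=3.0, decay=5000.0):
    """One online SOM update: find the best-matching unit, then pull it and
    its grid neighbours toward the sample x."""
    lr = lr0 * np.exp(-t / decay)
    sigma = sigma0 * np.exp(-t / decay)
    bmu = np.argmin(np.linalg.norm(weights - x, axis=1))      # best-matching unit
    grid_dist = np.linalg.norm(coords - coords[bmu], axis=1)  # distance on the map
    h = np.exp(-grid_dist**2 / (2 * sigma**2))                # neighbourhood kernel
    weights[:] += lr * h[:, None] * (x - weights)

# Task-incremental stream: each "task" only presents samples from its own classes,
# so later tasks keep overwriting codebook vectors tuned to earlier classes.
t = 0
for task_data in [rng.random((200, dim)) for _ in range(5)]:  # stand-in for real tasks
    for x in task_data:
        som_step(x, t)
        t += 1
```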
The ability of a model to learn continually can be empirically assessed in different continual learning scenarios. Each scenario defines the constraints and the opportunities of the learning environment. Here, we challenge the current trend in the continual learning literature of experimenting mainly on class-incremental scenarios, in which classes present in one experience are never revisited. We argue that an excessive focus on this setting may be limiting for future research on continual learning, since class-incremental scenarios artificially exacerbate catastrophic forgetting at the expense of other important objectives such as forward transfer and computational efficiency. In many real-world environments, in fact, repetition of previously encountered concepts occurs naturally and contributes to softening the disruption of previous knowledge. We advocate a more in-depth study of alternative continual learning scenarios, in which repetition is integrated by design into the stream of incoming information. Starting from already existing proposals, we describe the advantages that such class-incremental-with-repetition scenarios could offer for a more comprehensive assessment of continual learning models.
One of the biggest obstacles for lifelong learning systems based on artificial neural networks is their inability to retain old knowledge as new information is encountered. This phenomenon is known as catastrophic forgetting. In this paper, we propose a new kind of connectionist architecture, the Sequential Neural Coding Network, that is robust to forgetting when learning from streams of data points and, unlike the networks of today, does not learn via the popular back-propagation of errors. Grounded in the neurocognitive theory of predictive processing, our model adapts its synapses in a biologically plausible fashion, while a complementary neural system learns to direct and control this cortex-like structure, mimicking some of the task-continual control functionality of the basal ganglia. In our experiments, we demonstrate that our self-organizing system experiences significantly less forgetting than standard neural models, outperforming previously proposed methods, including rehearsal/data-buffer-based methods, on both standard benchmarks (Split MNIST, etc.) and custom benchmarks, even when trained in a stream-like fashion. Our work offers evidence that emulating mechanisms found in real neuronal systems, e.g., local learning and lateral competition, can yield new directions and possibilities for tackling the grand challenge of lifelong machine learning.
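To make the "no back-propagation, local synaptic adaptation" idea concrete, here is a generic predictive-coding-style local update: a latent layer iteratively settles to reduce its prediction error on the input, and the weights are then adjusted with a purely local rule. This is only a sketch of that family of updates, not the paper's full architecture or its basal-ganglia-like controller; all sizes and rates are assumptions.

```python
# Generic predictive-coding-style local learning (illustrative, not the paper's model).
import numpy as np

rng = np.random.default_rng(0)
in_dim, latent_dim = 64, 16
W = rng.normal(scale=0.1, size=(in_dim, latent_dim))   # prediction/generative weights

def settle_and_learn(x, n_steps=20, beta=0.1, lr=0.01):
    z = np.zeros(latent_dim)
    for _ in range(n_steps):
        error = x - W @ z            # "check": mismatch between prediction and input
        z += beta * (W.T @ error)    # settle: move the latent state to reduce the error
    dW = lr * np.outer(error, z)     # local rule: uses only pre/post activity, no backprop
    return error, dW

for x in rng.random((200, in_dim)):
    err, dW = settle_and_learn(x)
    W += dW
```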
Toward energy-efficient computation on specialized neuromorphic hardware, we present Spiking Neural Coding, an instantiation of a family of artificial neural models grounded in the theory of predictive coding. The model operates through a never-ending process of "guess-and-check", in which neurons predict one another's activity values and then adjust their own activities to make better future predictions. The interactive, iterative nature of our system fits naturally into a continuous-time formulation of sensory stream prediction and, as we show, the model's structure yields a local synaptic update rule that can be used to complement, or serve as an alternative to, online spike-timing-dependent plasticity. In this paper, we present an instantiation of the model consisting of leaky integrate-and-fire units; however, the framework in which our system is situated can naturally incorporate more complex neurons such as the Hodgkin-Huxley model. Our experimental results on pattern recognition demonstrate the potential of the model when binary spike trains are the primary paradigm for inter-neuron communication. Notably, Spiking Neural Coding is competitive in terms of classification performance and exhibits reduced forgetting when learning from a sequence of tasks, offering a more economical, biologically plausible alternative to popular artificial neural networks.
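For readers unfamiliar with the unit type named here, below is a minimal leaky integrate-and-fire simulation. It covers only the neuron model, not the Spiking Neural Coding system or its local update rule; the time constants and threshold are illustrative assumptions.

```python
# Minimal leaky integrate-and-fire (LIF) neuron (constants are illustrative).
import numpy as np

def simulate_lif(input_current, dt=1.0, tau=20.0, v_rest=0.0,
                 v_thresh=1.0, v_reset=0.0, resistance=1.0):
    """Integrate a 1-D input current over time; return membrane trace and spikes."""
    v = v_rest
    voltages, spikes = [], []
    for i_t in input_current:
        # Leaky integration of the driving current.
        v += (dt / tau) * (-(v - v_rest) + resistance * i_t)
        if v >= v_thresh:          # threshold crossing -> emit a binary spike
            spikes.append(1)
            v = v_reset            # reset the membrane after the spike
        else:
            spikes.append(0)
        voltages.append(v)
    return np.array(voltages), np.array(spikes)

volts, spks = simulate_lif(np.full(200, 1.2))   # constant supra-threshold drive
print("spike count:", spks.sum())
```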
Continual learning in computational systems is challenging due to catastrophic forgetting. We discovered a two-layer neural circuit in the fruit fly olfactory system that addresses this challenge by uniquely combining sparse coding and associative learning. In the first layer, odors are encoded using sparse, high-dimensional representations, which reduces memory interference by activating non-overlapping populations of neurons for different odors. In the second layer, only the synapses between odor-active neurons and the output neuron associated with the odor are modified during learning; the remaining weights are frozen to prevent unrelated memories from being overwritten. We show empirically and analytically that this simple and lightweight algorithm significantly boosts continual learning performance. The fly's associative learning algorithm is strikingly similar to the classic perceptron learning algorithm, albeit with two modifications that we show are critical for reducing catastrophic forgetting. Overall, fruit flies evolved an efficient lifelong learning algorithm, and circuit mechanisms from neuroscience can be translated to improve machine computation.
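The two ingredients described, a sparse high-dimensional expansion and an associative update restricted to currently active synapses, can be sketched very compactly. The sketch below is only illustrative: the layer sizes, activity fraction, and learning rate are assumptions, and the fly circuit's exact plasticity rule is not reproduced.

```python
# Hedged sketch: sparse expansion + associative learning on active synapses only.
import numpy as np

rng = np.random.default_rng(1)
n_in, n_expand, n_classes, k_active = 50, 2000, 10, 100   # ~5% of expanded units active

proj = (rng.random((n_expand, n_in)) < 0.1).astype(float)  # fixed sparse random projection
out_w = np.zeros((n_classes, n_expand))                    # plastic output synapses

def sparse_code(x):
    """Expand to high dimension, then keep only the top-k most active units."""
    h = proj @ x
    code = np.zeros(n_expand)
    code[np.argsort(h)[-k_active:]] = 1.0                  # winner-take-all mask
    return code

def learn(x, label, lr=0.1):
    """Associative update: only synapses from active units onto the target
    class change; every other weight stays frozen."""
    out_w[label] += lr * sparse_code(x)

def predict(x):
    return int(np.argmax(out_w @ sparse_code(x)))

for x, y in zip(rng.random((100, n_in)), rng.integers(0, n_classes, 100)):
    learn(x, y)
```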
The innate capacity of humans and other animals to learn a diverse, and often interfering, range of knowledge and skills throughout their lifespan is a hallmark of natural intelligence, with clear evolutionary motivations. In parallel, the ability of artificial neural networks (ANNs) to learn across a range of tasks and domains, combining and re-using learned representations where required, is an explicit goal of artificial intelligence. This capacity, widely described as continual learning, has become a prolific subfield of machine learning research. Despite the numerous successes of deep learning in recent years, across domains ranging from image recognition to machine translation, such continual task learning has proved challenging. Neural networks trained on sequences of tasks with stochastic gradient descent often suffer from representational interference, whereby the weights learned for a given task effectively overwrite those of previous tasks in a process termed catastrophic forgetting. This represents a major impediment to the development of more general artificial learning systems capable of accumulating knowledge over time and across the task space, in a manner analogous to humans. An accompanying repository of selected papers and implementations can be found at https://github.com/mccaffary/continual-learning.
Real-world applications often require learning continuously from a stream of data under ever-changing conditions. When trying to learn from such non-stationary data, deep neural networks (DNNs) undergo catastrophic forgetting of previously learned information. Among the common approaches to avoid catastrophic forgetting, rehearsal-based methods have proven effective. However, they are still prone to forgetting due to task-interference as all parameters respond to all tasks. To counter this, we take inspiration from sparse coding in the brain and introduce dynamic modularity and sparsity (Dynamos) for rehearsal-based general continual learning. In this setup, the DNN learns to respond to stimuli by activating relevant subsets of neurons. We demonstrate the effectiveness of Dynamos on multiple datasets under challenging continual learning evaluation protocols. Finally, we show that our method learns representations that are modular and specialized, while maintaining reusability by activating subsets of neurons with overlaps corresponding to the similarity of stimuli.
Memory replay may be key to learning in biological brains, which manage to learn new tasks continually without catastrophically interfering with previous knowledge. Artificial neural networks, on the other hand, suffer from catastrophic forgetting and tend to perform well only on the tasks they were most recently trained on. In this work, we explore the application of latent-space-based memory replay with artificial neural networks. We are able to preserve good performance on previous tasks by storing only a small fraction of the original data in a compressed latent-space version.
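A minimal sketch of the latent-replay idea follows: a frozen encoder compresses inputs, a small buffer keeps a fraction of those latent codes, and the classifier head is trained on a mix of new-task latents and replayed ones. The "encoder" here is a fixed random projection and the storage fraction is an assumption; this is not the paper's architecture.

```python
# Hedged sketch of latent-space memory replay (encoder is a stand-in projection).
import numpy as np

rng = np.random.default_rng(0)
in_dim, latent_dim, n_classes = 784, 64, 10
encoder = rng.normal(size=(latent_dim, in_dim)) / np.sqrt(in_dim)  # stand-in frozen encoder
head = np.zeros((n_classes, latent_dim))
buffer_z, buffer_y = [], []

def head_step(z, y, lr=0.1):
    """One softmax-regression SGD step on a latent code."""
    logits = head @ z
    p = np.exp(logits - logits.max()); p /= p.sum()
    p[y] -= 1.0                                 # gradient of cross-entropy w.r.t. logits
    head[:] -= lr * np.outer(p, z)

def train_task(xs, ys, store_fraction=0.05):
    for x, y in zip(xs, ys):
        z = encoder @ x
        head_step(z, y)
        if rng.random() < store_fraction:       # keep only a small latent subset
            buffer_z.append(z); buffer_y.append(y)
        if buffer_z:                            # interleave one replayed latent per step
            j = rng.integers(len(buffer_z))
            head_step(buffer_z[j], buffer_y[j])

train_task(rng.random((500, in_dim)), rng.integers(0, 5, 500))    # "task 1": classes 0-4
train_task(rng.random((500, in_dim)), rng.integers(5, 10, 500))   # "task 2": classes 5-9
```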
Online Class Incremental learning (CIL) is a challenging setting in Continual Learning (CL), wherein data of new tasks arrive in incoming streams and online learning models need to handle incoming data streams without revisiting previous ones. Existing works used a single centroid adapted with incoming data streams to characterize a class. This approach possibly exposes limitations when the incoming data stream of a class is naturally multimodal. To address this issue, in this work, we first propose an online mixture model learning approach based on nice properties of the mature optimal transport theory (OT-MM). Specifically, the centroids and covariance matrices of the mixture model are adapted incrementally according to incoming data streams. The advantages are two-fold: (i) we can characterize more accurately complex data streams and (ii) by using centroids for each class produced by OT-MM, we can estimate the similarity of an unseen example to each class more reasonably when doing inference. Moreover, to combat the catastrophic forgetting in the CIL scenario, we further propose Dynamic Preservation. Particularly, after performing the dynamic preservation technique across data streams, the latent representations of the classes in the old and new tasks become more condensed themselves and more separate from each other. Together with a contraction feature extractor, this technique facilitates the model in mitigating the catastrophic forgetting. The experimental results on real-world datasets show that our proposed method can significantly outperform the current state-of-the-art baselines.
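The incremental bookkeeping the abstract describes, per-component centroids and covariance matrices adapted as data streams in, can be sketched with running statistics. The sketch below is not the OT-MM procedure or the Dynamic Preservation technique; it keeps one component per class, uses Welford-style updates, and classifies with a plain nearest-centroid rule purely for illustration.

```python
# Hedged sketch: incremental per-component mean/covariance + nearest-centroid inference.
import numpy as np

class RunningComponent:
    """Incremental mean/covariance for one mixture component (Welford-style)."""
    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.m2 = np.zeros((dim, dim))   # running sum of outer products of deviations

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += np.outer(delta, x - self.mean)

    @property
    def cov(self):
        return self.m2 / max(self.n - 1, 1)

components = {}                           # one component per class label here

def observe(x, label):
    comp = components.setdefault(label, RunningComponent(len(x)))
    comp.update(x)

def classify(x):
    # Nearest-centroid rule over the incrementally maintained means.
    return min(components, key=lambda c: np.linalg.norm(x - components[c].mean))

rng = np.random.default_rng(0)
for label in (0, 1):
    for _ in range(200):
        observe(rng.normal(loc=3.0 * label, size=8), label)
print(classify(rng.normal(loc=3.0, size=8)))   # expected: 1
```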
Humans learn continually throughout their lifespan by accumulating diverse knowledge and fine-tuning it for future tasks. When presented with a similar goal, neural networks suffer from catastrophic forgetting if the data distributions across sequential tasks are not stationary over the course of learning. An effective approach to addressing such continual learning (CL) problems is to use a hypernetwork that generates task-dependent weights for a target network. However, the continual learning performance of existing hypernetwork-based approaches is limited by the assumption of independence between the weights of different layers, made in order to maintain parameter efficiency. To address this limitation, we propose a novel approach that uses a dependency-preserving hypernetwork to generate weights for the target network while still maintaining parameter efficiency. We propose a recurrent neural network (RNN) based hypernetwork that can generate layer weights efficiently while allowing for dependencies across them. In addition, we propose novel regularization and network-growth techniques for the RNN-based hypernetwork to further improve continual learning performance. To demonstrate the effectiveness of the proposed methods, we conduct experiments on several image classification continual learning tasks and settings. We find that the proposed RNN-hypernetwork-based methods outperform the baselines in all of these CL settings and tasks.
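To illustrate the general idea of an RNN hypernetwork, the sketch below feeds a task embedding into a simple recurrent cell and projects the hidden state at step l into the weight matrix of layer l of a target MLP, so later layers depend on earlier ones through the recurrence. All sizes are assumptions, and the paper's regularization and network-growth techniques are not shown.

```python
# Hedged sketch of an RNN hypernetwork generating per-layer weights for a target MLP.
import numpy as np

rng = np.random.default_rng(0)
task_dim, hidden_dim = 16, 64
layer_shapes = [(32, 64), (64, 64), (64, 10)]             # target MLP layer shapes

W_xh = rng.normal(scale=0.1, size=(hidden_dim, task_dim))
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
# one output projection per layer, mapping the hidden state to that layer's weights
W_out = [rng.normal(scale=0.01, size=(np.prod(s), hidden_dim)) for s in layer_shapes]

def generate_weights(task_embedding):
    """Unroll the recurrence once per target layer; each step emits one weight matrix."""
    h = np.zeros(hidden_dim)
    weights = []
    for l, shape in enumerate(layer_shapes):
        h = np.tanh(W_xh @ task_embedding + W_hh @ h)     # recurrent dependency across layers
        weights.append((W_out[l] @ h).reshape(shape))
    return weights

def target_forward(x, weights):
    for W in weights[:-1]:
        x = np.maximum(x @ W, 0.0)                        # ReLU hidden layers
    return x @ weights[-1]

task_emb = rng.normal(size=task_dim)                      # learned per task in practice
logits = target_forward(rng.random(32), generate_weights(task_emb))
```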
Lifelong machine learning, or continual learning, models attempt to learn incrementally by accumulating knowledge across a sequence of tasks; as a result, these models learn better and faster. They are used in various intelligent systems that must interact with humans or with dynamic environments, e.g., chatbots and self-driving cars. Memory-less approaches, which accommodate incoming information from new tasks within the network architecture itself, are more commonly used with deep neural networks and allow the models to perform well on all seen tasks. Such models suffer from semantic drift, or the plasticity-stability dilemma. Existing models use Minkowski distance measures to decide which nodes to freeze, update, or duplicate. These distance metrics do not provide good separation of nodes because they are susceptible to high-dimensional sparse vectors. In our proposed approach, we instead use angular distance to evaluate the semantic drift of individual nodes, which provides better separation of nodes and thereby a better balance between stability and plasticity. The proposed approach outperforms state-of-the-art models by maintaining higher accuracy on standard datasets.
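The contrast between a Minkowski (here Euclidean) drift measure and an angular one is easy to show directly: if a node's weight vector is mostly rescaled rather than rotated, Euclidean distance reports large drift while the angle barely changes. The example below is illustrative only; how the paper thresholds these values to freeze, update, or duplicate nodes is not shown.

```python
# Euclidean vs. angular drift of a node's weight vector before/after an update.
import numpy as np

def euclidean_drift(w_old, w_new):
    return np.linalg.norm(w_new - w_old)

def angular_drift(w_old, w_new, eps=1e-12):
    """Angle (radians) between the two weight vectors; insensitive to the norm
    changes that dominate Euclidean distance for high-dimensional sparse vectors."""
    cos = np.dot(w_old, w_new) / (np.linalg.norm(w_old) * np.linalg.norm(w_new) + eps)
    return np.arccos(np.clip(cos, -1.0, 1.0))

rng = np.random.default_rng(0)
w_before = rng.random(512)
w_after = 1.5 * w_before + 0.01 * rng.random(512)   # mostly a rescaling, little rotation

print(euclidean_drift(w_before, w_after))  # large, even though the direction barely moved
print(angular_drift(w_before, w_after))    # small: the node's "meaning" has not drifted much
```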
As intelligent agents become autonomous over longer periods of time, they may eventually become lifelong counterparts to specific people. If so, it may be common for a user to want the agent to master a task temporarily but later to forget it due to privacy concerns. Making the agent forget what the user specified, without degrading the rest of its knowledge, is, however, a challenging problem. To address this challenge, this paper formalizes the continual learning and private unlearning (CLPU) problem. The paper further introduces a straightforward but exactly private solution, CLPU-DER++, as a first step toward solving the CLPU problem, along with a set of carefully designed benchmark problems to evaluate the effectiveness of the proposed solution. The code is available at https://github.com/cranial-xix/continual-learning-private-unlearning.
We motivate Energy-Based Models (EBMs) as a promising model class for continual learning problems. Instead of tackling continual learning via the use of external memory, growing models, or regularization, EBMs change the underlying training objective to cause less interference with previously learned information. Our proposed version of EBMs for continual learning is simple, efficient, and outperforms baseline methods by a large margin on several benchmarks. Moreover, our proposed contrastive divergence-based training objective can be combined with other continual learning methods, resulting in substantial boosts in their performance. We further show that EBMs are adaptable to a more general continual learning setting where the data distribution changes without the notion of explicitly delineated tasks. These observations point towards EBMs as a useful building block for future continual learning methods.
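To make the energy-based framing concrete, the sketch below implements a generic energy-based classifier: an energy E(x, y) scores each (input, label) pair, inference picks the lowest-energy label, and training lowers the energy of the observed pair relative to the competing labels. This is a generic contrastive-style objective under a linear energy, not the paper's proposed contrastive-divergence objective; all sizes and rates are assumptions.

```python
# Hedged sketch of energy-based classification with a simple contrastive-style update.
import numpy as np

rng = np.random.default_rng(0)
dim, n_classes = 32, 5
label_emb = rng.normal(scale=0.1, size=(n_classes, dim))   # per-class parameters

def energy(x, y):
    return -float(label_emb[y] @ x)          # low energy = good match

def predict(x):
    return int(np.argmin([energy(x, y) for y in range(n_classes)]))

def train_step(x, y, lr=0.05):
    # Softmin over energies gives the model's current belief; gradient descent on the
    # negative log-likelihood lowers E(x, y_true) and raises E(x, y) for other labels.
    e = np.array([energy(x, c) for c in range(n_classes)])
    p = np.exp(-e - (-e).max()); p /= p.sum()
    for c in range(n_classes):
        target = 1.0 if c == y else 0.0
        # gradient of the NLL w.r.t. label_emb[c] works out to (p_c - target) * x
        label_emb[c] -= lr * (p[c] - target) * x

for x, y in zip(rng.random((300, dim)), rng.integers(0, n_classes, 300)):
    train_step(x, y)
```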
A key challenge for AI is to build embodied systems that operate in dynamically changing environments. Such systems must adapt to changing task contexts and learn continuously. Although standard deep learning systems achieve state-of-the-art results on static benchmarks, they often struggle in dynamic scenarios. In these settings, error signals from multiple contexts can interfere with one another, ultimately leading to a phenomenon known as catastrophic forgetting. In this paper, we investigate biologically inspired architectures as a solution to these problems. Specifically, we show that the biophysical properties of dendrites and local inhibitory systems enable networks to dynamically restrict and route information in a context-specific manner. Our key contributions are as follows. First, we propose a novel artificial neural network architecture that incorporates active dendrites and sparse representations into the standard deep learning framework. Next, we study the performance of this architecture on two separate benchmarks requiring task-based adaptation: Meta-World, a multi-task reinforcement learning environment in which a robotic agent must learn to solve a variety of manipulation tasks simultaneously; and a continual learning benchmark in which the model's prediction task changes throughout training. Analysis on both benchmarks demonstrates the emergence of overlapping yet distinct, sparse subnetworks, allowing the system to fluidly learn multiple tasks with minimal forgetting. Our neural implementation marks the first time a single architecture has achieved competitive results in both multi-task and continual learning settings. Our research sheds light on how the biological properties of neurons can inform deep learning systems to address dynamic scenarios that are typically impossible for traditional ANNs to solve.
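The two mechanisms named here, dendritic segments that gate each unit based on a context signal and a sparse (k-winner-take-all) activation, can be sketched as a single layer. The gate form, sizes, and k below are assumptions rather than the paper's exact architecture.

```python
# Hedged sketch of context-gated units ("active dendrites") followed by k-WTA sparsity.
import numpy as np

rng = np.random.default_rng(0)
in_dim, n_units, n_segments, ctx_dim, k = 128, 256, 8, 32, 25

W = rng.normal(scale=0.05, size=(n_units, in_dim))                       # feedforward weights
segments = rng.normal(scale=0.05, size=(n_units, n_segments, ctx_dim))   # dendritic segments

def active_dendrite_layer(x, context):
    feed = W @ x                                      # standard feedforward drive
    # each unit's best-matching dendritic segment against the task-context vector
    seg_act = np.max(segments @ context, axis=1)
    gated = feed * (1.0 / (1.0 + np.exp(-seg_act)))   # per-unit sigmoid gate
    out = np.zeros_like(gated)
    winners = np.argsort(gated)[-k:]                  # keep only the top-k units (sparse output)
    out[winners] = gated[winners]
    return out

y = active_dendrite_layer(rng.random(in_dim), rng.random(ctx_dim))
```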
Continual Learning (CL) is a field dedicated to devising algorithms able to achieve lifelong learning. Overcoming the knowledge disruption of previously acquired concepts, a drawback affecting deep learning models that goes by the name of catastrophic forgetting, is a hard challenge. Currently, deep learning methods can attain impressive results when the modeled data does not undergo a considerable distributional shift in subsequent learning sessions, but whenever we expose such systems to this incremental setting, performance drops very quickly. Overcoming this limitation is fundamental as it would allow us to build truly intelligent systems showing both stability and plasticity. Secondly, it would allow us to overcome the onerous limitation of retraining these architectures from scratch with newly updated data. In this thesis, we tackle the problem from multiple directions. In a first study, we show that in rehearsal-based techniques (systems that use a memory buffer), the quantity of data stored in the rehearsal buffer is a more important factor than the quality of the data. Secondly, we present one of the early works on incremental learning with ViT architectures, comparing functional, weight, and attention regularization approaches and proposing an effective novel asymmetric loss. We conclude with a study on pretraining and how it affects performance in Continual Learning, raising some questions about the effective progression of the field, followed by some future directions and closing remarks.
Many modern computer vision algorithms suffer from two major bottlenecks: scarcity of data and learning new tasks incrementally. While training the model with new batches of data, the model loses its ability to classify the previous data judiciously, which is termed catastrophic forgetting. Conventional methods have tried to mitigate catastrophic forgetting of the previously learned data while compromising training at the current session. The state-of-the-art generative-replay-based approaches use complicated structures such as generative adversarial networks (GANs) to deal with catastrophic forgetting. Additionally, training a GAN with few samples may lead to instability. In this work, we present a novel method to deal with these two major hurdles. Our method identifies a better embedding space with an improved contrastive loss to make classification more robust. Moreover, our approach is able to retain previously acquired knowledge in the embedding space even when trained with new classes. We update previous-session class prototypes during training in such a way that they are able to represent the true class mean. This is of prime importance as our classification rule is based on the nearest class mean classification strategy. We demonstrate our results by showing that the embedding space remains intact after training the model with new classes, and we show that our method performs better than the existing state-of-the-art algorithms in terms of accuracy across different sessions.
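A minimal nearest-class-mean (NCM) classifier with running prototypes, the classification rule named above, looks as follows. The embedding function here is a stand-in random projection, and the paper's contrastive embedding and prototype-update scheme are not shown.

```python
# Hedged sketch of nearest-class-mean classification with running class prototypes.
import numpy as np

rng = np.random.default_rng(0)
embed = rng.normal(size=(64, 784)) / np.sqrt(784)     # stand-in for a learned encoder
proto_sum, proto_count = {}, {}

def add_example(x, label):
    z = embed @ x
    proto_sum[label] = proto_sum.get(label, np.zeros(64)) + z
    proto_count[label] = proto_count.get(label, 0) + 1

def class_mean(label):
    return proto_sum[label] / proto_count[label]

def classify(x):
    z = embed @ x
    return min(proto_sum, key=lambda c: np.linalg.norm(z - class_mean(c)))

# New classes can be added in later sessions without touching older prototypes.
for x, y in zip(rng.random((300, 784)), rng.integers(0, 5, 300)):   # session 1: classes 0-4
    add_example(x, y)
for x, y in zip(rng.random((300, 784)), rng.integers(5, 10, 300)):  # session 2: classes 5-9
    add_example(x, y)
print(classify(rng.random(784)))
```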
In this report, we consider the following problem: given a trained model, can we correct its behavior without having to train the model from scratch? In other words, can we "debug" neural networks in a way similar to how we fix bugs in our mathematical models and standard computer code? We base our approach on the hypothesis that debugging can be treated as a two-task continual learning problem. In particular, we employ a modified version of a continual learning algorithm called Orthogonal Gradient Descent (OGD) to demonstrate, via two simple experiments on the MNIST dataset, that we can in fact unlearn the undesirable behavior while retaining the general performance of the model, and that we can additionally relearn the appropriate behavior, both without having to train the model from scratch.
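The projection at the heart of Orthogonal Gradient Descent can be sketched directly: gradients stored for the behavior to keep span a subspace, and every update for the new objective (here, the behavior being unlearned or relearned) is projected to be orthogonal to it. The tiny quadratic objective below is a stand-in, not the report's MNIST setup.

```python
# Hedged sketch of the OGD projection step.
import numpy as np

rng = np.random.default_rng(0)
dim = 20
theta = rng.normal(size=dim)

# Gradient directions collected on the task whose behaviour must be preserved.
keep_grads = [rng.normal(size=dim) for _ in range(5)]

def orthonormal_basis(vectors):
    """Gram-Schmidt over the stored gradients."""
    basis = []
    for v in vectors:
        for b in basis:
            v = v - (v @ b) * b
        norm = np.linalg.norm(v)
        if norm > 1e-10:
            basis.append(v / norm)
    return basis

basis = orthonormal_basis(keep_grads)

def project_orthogonal(g):
    for b in basis:
        g = g - (g @ b) * b     # remove any component that would disturb the kept task
    return g

def ogd_step(grad_new_task, lr=0.1):
    global theta
    theta = theta - lr * project_orthogonal(grad_new_task)

for _ in range(100):
    ogd_step(2.0 * (theta - 1.0))   # stand-in gradient for the behaviour being corrected
```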
A notable shortcoming of machine learning is that models are unable to solve new problems more quickly without forgetting previously acquired knowledge. To better understand this issue, continual learning has emerged to systematically investigate learning protocols in which a model sequentially observes samples generated by a series of tasks. First, we propose an optimality principle that facilitates the trade-off between learning and forgetting. We derive this principle from an information-theoretic formulation of bounded rationality and show connections to other continual learning approaches. Second, based on this principle, we propose a neural network layer for continual learning, the Mixture-of-Variational-Experts (MoVE) layer, that alleviates forgetting while enabling the beneficial transfer of knowledge to new tasks. Our experiments on variants of the MNIST and CIFAR10 datasets demonstrate the competitive performance of MoVE layers compared to state-of-the-art approaches.
Classical machine learners are designed to solve only a single task, without the ability to accommodate newly emerging tasks or classes, whereas such a capability is more practical and human-like in the real world. To address this shortcoming, continual machine learners are developed to learn streams of tasks with domain and class shifts between different tasks. In this paper, we propose a feature-propagation-based contrastive continual learning method that is capable of handling multiple continual learning scenarios. Specifically, we align the current and previous representation spaces by means of feature propagation and contrastive representation learning to bridge the domain shifts between distinct tasks. To further mitigate class-wise shifts of the feature representation, a supervised contrastive loss is exploited to make the embeddings of examples from the same class closer than those of examples from different classes. Extensive experimental results demonstrate the outstanding performance of the proposed method on six continual learning benchmarks compared to a group of cutting-edge continual learning methods.
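For reference, a common form of the supervised contrastive loss over a batch of embeddings is sketched below: for each anchor, embeddings sharing its label are pulled together and all others pushed apart. This is a generic formulation, not necessarily the paper's exact loss, and the feature-propagation alignment is not shown; the temperature is an assumption.

```python
# A common supervised contrastive loss over L2-normalised embeddings (illustrative).
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature
    n = len(labels)
    loss, counted = 0.0, 0
    for i in range(n):
        mask = np.arange(n) != i                      # exclude the anchor itself
        positives = mask & (labels == labels[i])      # same-class entries in the batch
        if not positives.any():
            continue
        log_denom = np.log(np.exp(sim[i][mask]).sum())
        loss += -(sim[i][positives] - log_denom).mean()
        counted += 1
    return loss / max(counted, 1)

rng = np.random.default_rng(0)
emb = rng.normal(size=(16, 32))
lab = rng.integers(0, 4, 16)
print(supervised_contrastive_loss(emb, lab))
```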
The ability to learn tasks in a sequential fashion is crucial to the development of artificial intelligence. Neural networks are not, in general, capable of this and it has been widely thought that catastrophic forgetting is an inevitable feature of connectionist models. We show that it is possible to overcome this limitation and train networks that can maintain expertise on tasks which they have not experienced for a long time. Our approach remembers old tasks by selectively slowing down learning on the weights important for those tasks. We demonstrate our approach is scalable and effective by solving a set of classification tasks based on the MNIST hand written digit dataset and by learning several Atari 2600 games sequentially.
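The core mechanism described, selectively slowing learning on weights important to old tasks, can be sketched as a quadratic penalty weighted by a per-parameter importance estimate (a diagonal Fisher approximation built from squared gradients). The toy loss functions and constants below are stand-ins, not the paper's MNIST or Atari experiments.

```python
# Hedged sketch of an elastic-weight-consolidation-style quadratic penalty.
import numpy as np

rng = np.random.default_rng(0)
theta = rng.normal(size=10)
lam, lr = 10.0, 0.05

def grad_old_task(t, x):
    return 2.0 * (x @ t - 1.0) * x          # stand-in squared-error loss, target 1.0

# --- learn the old task, then store its weights and a diagonal importance estimate ---
old_data = rng.random((200, 10))
for x in old_data:
    theta -= lr * grad_old_task(theta, x)
theta_star = theta.copy()
fisher = np.mean([grad_old_task(theta_star, x) ** 2 for x in old_data], axis=0)

def ewc_grad(new_task_grad):
    """Gradient of: new-task loss + (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2."""
    return new_task_grad + lam * fisher * (theta - theta_star)

# --- train on the new task: parameters with large F_i are pulled back toward theta_star ---
for x in rng.random((200, 10)):
    new_grad = 2.0 * (x @ theta + 1.0) * x  # stand-in new-task gradient, target -1.0
    theta -= lr * ewc_grad(new_grad)
```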