本文介绍了一种拟合有限维欧几里得空间的沉浸式亚策略的方法。从环境空间到所需的子策略的重建映射是作为编码器的组成而实现的固定初始点。编码器为流量提供时间。编码器二进制图是通过经验风险最小化获得的,并且相对于给定的Encoder-Decoder映射的最小预期重建误差,多余的风险给出了高概率。拟议的方法是对苏斯曼的轨道定理的基本使用,该定理保证了重建图的图像确实包含在沉浸式的子手机中。
translated by 谷歌翻译
我们考虑以下学习问题:给定由未知非线性系统生成的输入和输出信号对(哪个未被假定是因果或时间不变),我们希望找到具有双曲线切线激活的连续反复性神经网络函数大致再现底层的I / O行为,高度置信度。利用较早的工作与匹配的输出衍生品达到给定的有限顺序,我们以熟悉的系统理论语言重构学习问题,并导出了在学习模型的超标范围内的超标范围的定量保证,样本大小,匹配的衍生数的数量,以及输入,输出和未知I / O映射的规律性属性。
translated by 谷歌翻译
With an ever-growing number of parameters defining increasingly complex networks, Deep Learning has led to several breakthroughs surpassing human performance. As a result, data movement for these millions of model parameters causes a growing imbalance known as the memory wall. Neuromorphic computing is an emerging paradigm that confronts this imbalance by performing computations directly in analog memories. On the software side, the sequential Backpropagation algorithm prevents efficient parallelization and thus fast convergence. A novel method, Direct Feedback Alignment, resolves inherent layer dependencies by directly passing the error from the output to each layer. At the intersection of hardware/software co-design, there is a demand for developing algorithms that are tolerable to hardware nonidealities. Therefore, this work explores the interrelationship of implementing bio-plausible learning in-situ on neuromorphic hardware, emphasizing energy, area, and latency constraints. Using the benchmarking framework DNN+NeuroSim, we investigate the impact of hardware nonidealities and quantization on algorithm performance, as well as how network topologies and algorithm-level design choices can scale latency, energy and area consumption of a chip. To the best of our knowledge, this work is the first to compare the impact of different learning algorithms on Compute-In-Memory-based hardware and vice versa. The best results achieved for accuracy remain Backpropagation-based, notably when facing hardware imperfections. Direct Feedback Alignment, on the other hand, allows for significant speedup due to parallelization, reducing training time by a factor approaching N for N-layered networks.
translated by 谷歌翻译
Understanding the relationship between structure and sentiment is essential in highlighting future operations with online social networks. More specifically, within popular conversation on Twitter. This paper provides a development on the relationship between the two variables: structure, defined as the composition of a directed network, and sentiment, a quantified value of the positive/negative connotations of a conversation. We highlight thread sentiment to be inversely proportional to the strength and connectivity of a network. The second portion of this paper highlights differences in query types, specifically how the aforementioned behavior differs within four key query types. This paper focuses on topical, event-based, geographic, and individual queries as orientations which have differing behavior. Using cross-query analysis, we see that the relationship between structure and sentiment, though still inversely proportional, differs greatly across query types. We find this relationship to be the most clear within the individual queries and the least prevalent within the event-based queries. This paper provides a sociological progression in our understanding of opinion and networks, while providing a methodological advancement for future studies on similar subjects.
translated by 谷歌翻译
We present temporally layered architecture (TLA), a biologically inspired system for temporally adaptive distributed control. TLA layers a fast and a slow controller together to achieve temporal abstraction that allows each layer to focus on a different time-scale. Our design is biologically inspired and draws on the architecture of the human brain which executes actions at different timescales depending on the environment's demands. Such distributed control design is widespread across biological systems because it increases survivability and accuracy in certain and uncertain environments. We demonstrate that TLA can provide many advantages over existing approaches, including persistent exploration, adaptive control, explainable temporal behavior, compute efficiency and distributed control. We present two different algorithms for training TLA: (a) Closed-loop control, where the fast controller is trained over a pre-trained slow controller, allowing better exploration for the fast controller and closed-loop control where the fast controller decides whether to "act-or-not" at each timestep; and (b) Partially open loop control, where the slow controller is trained over a pre-trained fast controller, allowing for open loop-control where the slow controller picks a temporally extended action or defers the next n-actions to the fast controller. We evaluated our method on a suite of continuous control tasks and demonstrate the advantages of TLA over several strong baselines.
translated by 谷歌翻译
Data deprivation, or the lack of easily available and actionable information on the well-being of individuals, is a significant challenge for the developing world and an impediment to the design and operationalization of policies intended to alleviate poverty. In this paper we explore the suitability of data derived from OpenStreetMap to proxy for the location of two crucial public services: schools and health clinics. Thanks to the efforts of thousands of digital humanitarians, online mapping repositories such as OpenStreetMap contain millions of records on buildings and other structures, delineating both their location and often their use. Unfortunately much of this data is locked in complex, unstructured text rendering it seemingly unsuitable for classifying schools or clinics. We apply a scalable, unsupervised learning method to unlabeled OpenStreetMap building data to extract the location of schools and health clinics in ten countries in Africa. We find the topic modeling approach greatly improves performance versus reliance on structured keys alone. We validate our results by comparing schools and clinics identified by our OSM method versus those identified by the WHO, and describe OSM coverage gaps more broadly.
translated by 谷歌翻译
We present a new algorithm for automatically bounding the Taylor remainder series. In the special case of a scalar function $f: \mathbb{R} \mapsto \mathbb{R}$, our algorithm takes as input a reference point $x_0$, trust region $[a, b]$, and integer $k \ge 0$, and returns an interval $I$ such that $f(x) - \sum_{i=0}^k \frac {f^{(i)}(x_0)} {i!} (x - x_0)^i \in I (x - x_0)^{k+1}$ for all $x \in [a, b]$. As in automatic differentiation, the function $f$ is provided to the algorithm in symbolic form, and must be composed of known elementary functions. At a high level, our algorithm has two steps. First, for a variety of commonly-used elementary functions (e.g., $\exp$, $\log$), we derive sharp polynomial upper and lower bounds on the Taylor remainder series. We then recursively combine the bounds for the elementary functions using an interval arithmetic variant of Taylor-mode automatic differentiation. Our algorithm can make efficient use of machine learning hardware accelerators, and we provide an open source implementation in JAX. We then turn our attention to applications. Most notably, we use our new machinery to create the first universal majorization-minimization optimization algorithms: algorithms that iteratively minimize an arbitrary loss using a majorizer that is derived automatically, rather than by hand. Applied to machine learning, this leads to architecture-specific optimizers for training deep networks that converge from any starting point, without hyperparameter tuning. Our experiments show that for some optimization problems, these hyperparameter-free optimizers outperform tuned versions of gradient descent, Adam, and AdaGrad. We also show that our automatically-derived bounds can be used for verified global optimization and numerical integration, and to prove sharper versions of Jensen's inequality.
translated by 谷歌翻译
A typical product or place often has hundreds of reviews, and summarization of these texts is an important and challenging problem. Recent progress on abstractive summarization in domains such as news has been driven by supervised systems trained on hundreds of thousands of news articles paired with human-written summaries. However for opinion texts, such large scale datasets are rarely available. Unsupervised methods, self-training, and few-shot learning approaches bridge that gap. In this work, we present a novel self-training approach, OpineSum, for abstractive opinion summarization. The summaries in this approach are built using a novel application of textual entailment and capture the consensus of opinions across the various reviews for an item. This method can be used to obtain silver-standard summaries on a large scale and train both unsupervised and few-shot abstractive summarization systems. OpineSum achieves state-of-the-art performance in both settings.
translated by 谷歌翻译
The applicability of computational models to the biological world is an active topic of debate. We argue that a useful path forward results from abandoning hard boundaries between categories and adopting an observer-dependent, pragmatic view. Such a view dissolves the contingent dichotomies driven by human cognitive biases (e.g., tendency to oversimplify) and prior technological limitations in favor of a more continuous, gradualist view necessitated by the study of evolution, developmental biology, and intelligent machines. Efforts to re-shape living systems for biomedical or bioengineering purposes require prediction and control of their function at multiple scales. This is challenging for many reasons, one of which is that living systems perform multiple functions in the same place at the same time. We refer to this as "polycomputing" - the ability of the same substrate to simultaneously compute different things. This ability is an important way in which living things are a kind of computer, but not the familiar, linear, deterministic kind; rather, living things are computers in the broad sense of computational materials as reported in the rapidly-growing physical computing literature. We argue that an observer-centered framework for the computations performed by evolved and designed systems will improve the understanding of meso-scale events, as it has already done at quantum and relativistic scales. Here, we review examples of biological and technological polycomputing, and develop the idea that overloading of different functions on the same hardware is an important design principle that helps understand and build both evolved and designed systems. Learning to hack existing polycomputing substrates, as well as evolve and design new ones, will have massive impacts on regenerative medicine, robotics, and computer engineering.
translated by 谷歌翻译
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets. Despite promising results, current models still suffer from generating factually inconsistent summaries, reducing their utility for real-world application. Several recent efforts attempt to address this by devising models that automatically detect factual inconsistencies in machine generated summaries. However, they focus exclusively on English, a language with abundant resources. In this work, we leverage factual consistency evaluation models to improve multilingual summarization. We explore two intuitive approaches to mitigate hallucinations based on the signal provided by a multilingual NLI model, namely data filtering and controlled generation. Experimental results in the 45 languages from the XLSum dataset show gains over strong baselines in both automatic and human evaluation.
translated by 谷歌翻译