Chain event graphs are a family of probabilistic graphical models that generalise Bayesian networks and have been successfully applied to a wide range of domains. Unlike Bayesian networks, these models can encode context-specific conditional independencies as well as asymmetric developments within the evolution of a process. More recently, new model classes belonging to the chain event graph family have been developed for modelling time-to-event data to study the temporal dynamics of a process. However, existing model selection algorithms for chain event graphs and their variants rely on all parameters having conjugate priors. This is unrealistic for many real-world applications. In this paper, we propose a mixture modelling approach to model selection in chain event graphs that does not rely on conjugacy. We also show that this methodology is more amenable to being robustly scaled than the existing model selection algorithms used for this family. We demonstrate our techniques on simulated datasets.
Computational units in artificial neural networks follow a simplified model of biological neurons. In the biological model, the output signal of a neuron runs down the axon, splits following the many branches at its end, and passes identically to all the downward neurons of the network. Each of the downward neurons uses its copy of this signal as one of its many dendritic inputs, integrates them all, and fires an output if the result exceeds some threshold. In the artificial neural network, this translates to the fact that the nonlinear filtering of the signal is performed in the upward neuron, meaning that in practice the same activation is shared between all the downward neurons that use that signal as their input. Dendrites thus play a passive role. We propose a slightly more complex model for the biological neuron, where dendrites play an active role: the activation in the output of the upward neuron becomes optional, and instead the signals going through each dendrite undergo independent nonlinear filterings before the linear combination. We implement this new model in a ReLU computational unit and discuss its biological plausibility. We compare this new computational unit with the standard one and describe it from a geometrical point of view. We provide a Keras implementation of this unit in fully connected and convolutional layers and estimate the resulting change in FLOPs and number of weights. We then use these layers in ResNet architectures on CIFAR-10, CIFAR-100, Imagenette, and Imagewoof, obtaining performance improvements over standard ResNets of up to 1.73%. Finally, we prove a universal representation theorem for continuous functions on compact sets and show that this new unit has more representational power than its standard counterpart.
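The contrast between a passive and an active dendrite can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' Keras implementation: the abstract does not give the exact form of the per-dendrite filtering, so the per-connection threshold `T` and the function names below are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def standard_layer(x, W, b):
    """Standard unit: linear combination first, then one ReLU on the output."""
    return relu(x @ W + b)

def dendritic_layer(x, W, T, b):
    """Hypothetical dendritic unit (assumed form, not taken from the paper).

    Each connection (i, j) filters input i with its own threshold T[i, j]
    before the weighted sum into unit j. No activation is applied to the sum.
    x: (batch, n_in); W, T: (n_in, n_out); b: (n_out,)
    """
    filtered = relu(x[:, :, None] - T[None, :, :])  # (batch, n_in, n_out)
    return np.einsum("bio,io->bo", filtered, W) + b

x = np.array([[1.0, -2.0]])
W = np.array([[1.0], [1.0]])
T = np.array([[0.5], [-3.0]])
b = np.array([0.0])
print(standard_layer(x, W, b))      # shared activation on the summed input
print(dendritic_layer(x, W, T, b))  # independent filtering per dendrite
```

In this sketch the dendritic layer stores one extra `(n_in, n_out)` array of thresholds, which is one way the weight count could grow relative to the standard layer.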
Iterative regularization is a classic idea in regularization theory that has recently become popular in machine learning. On the one hand, it allows the design of efficient algorithms that control numerical and statistical accuracy at the same time. On the other hand, it sheds light on the learning curves observed while training neural networks. In this paper, we focus on iterative regularization in the context of classification. After contrasting this setting with that of regression and inverse problems, we develop an iterative regularization approach based on the use of the hinge loss function. More precisely, we consider a diagonal approach for a family of algorithms for which we prove convergence as well as rates of convergence. Our approach compares favorably with other alternatives, as also confirmed in numerical simulations.
Recent advances in language modeling have enabled new conversational systems. In particular, it is often desirable for people to make choices among specified options when using such systems. We address the problem of reference resolution, when people use natural expressions to choose between real-world entities. For example, given the choice `Should we make a Simnel cake or a Pandan cake?' a natural response from a non-expert may be indirect: `let's make the green one'. Reference resolution has been little studied with natural expressions; robustly understanding such language therefore has large potential for improving naturalness in dialog, recommendation, and search systems. We create AltEntities (Alternative Entities), a new public dataset of entity pairs and utterances, and develop models for the disambiguation problem. Consisting of 42K indirect referring expressions across three domains, it enables for the first time the study of how large language models can be adapted to this task. We find they achieve 82%-87% accuracy in realistic settings, which, while reasonable, also invites further advances.
In this work, we apply a kinetic version of a bounded confidence consensus model to biomedical segmentation problems. In the presented approach, time-dependent information on the microscopic state of each particle/pixel includes its space position and a feature representing a static characteristic of the system, i.e. the gray level of each pixel. From the introduced microscopic model we derive a kinetic formulation of the model. The large time behavior of the system is then computed with the aid of a surrogate Fokker-Planck approach that can be obtained in the quasi-invariant scaling. We exploit the computational efficiency of direct simulation Monte Carlo methods for the obtained Boltzmann-type description of the problem for parameter identification tasks. Based on a suitable loss function measuring the distance between the ground truth segmentation mask and the evaluated mask, we minimize the introduced segmentation metric for a relevant set of 2D gray-scale images. Applications to biomedical segmentation concentrate on different imaging research contexts.
The detection of state-sponsored trolls acting in information operations is an unsolved and critical challenge for the research community, with repercussions that go beyond the online realm. In this paper, we propose a novel AI-based solution for the detection of state-sponsored troll accounts, which consists of two steps. The first step aims at classifying trajectories of accounts' online activities as belonging to either a state-sponsored troll or an organic user account. In the second step, we exploit the classified trajectories to compute a metric, namely the "troll score", which allows us to quantify the extent to which an account behaves like a state-sponsored troll. As a case study, we consider the troll accounts involved in the Russian interference campaign during the 2016 US Presidential election, identified as Russian trolls by the US Congress. Experimental results show that our approach identifies accounts' trajectories with an AUC close to 99% and, accordingly, classifies Russian trolls and organic users with an AUC of 97%. Finally, we evaluate whether the proposed solution can be generalized to different contexts (e.g., discussions about Covid-19) and generic misbehaving users, showing promising results that will be further expanded in our future endeavors.
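The second step can be illustrated with a minimal sketch. The abstract does not define how the troll score is computed, so the version below assumes, purely for illustration, that it is the fraction of an account's trajectories that the first-step classifier labeled as troll-like. The function name `troll_score` and the 0/1 label encoding are hypothetical.

```python
from typing import Sequence

def troll_score(trajectory_labels: Sequence[int]) -> float:
    """Assumed definition: share of an account's trajectories labeled 1 (troll-like).

    trajectory_labels holds one 0/1 label per classified trajectory,
    as produced by some first-step classifier (not shown here).
    """
    if not trajectory_labels:
        raise ValueError("account has no classified trajectories")
    return sum(trajectory_labels) / len(trajectory_labels)

# An account with three of four trajectories labeled troll-like.
print(troll_score([1, 1, 0, 1]))  # 0.75
```

Under this assumption, thresholding the score would turn the per-trajectory decisions into a per-account decision.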
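We introduce and discuss a runtime architecture that integrates sensory data and classifiers with a logic-based decision-making system, in the context of an e-Health system for the rehabilitation of children with neuromotor disorders. In this application, children perform rehabilitation tasks in the form of games. The main aim of the system is to derive, from the available sensors and classifiers (e.g., eye trackers, motion sensors, emotion recognition techniques), a set of parameters describing the child's current cognitive and behavioral performance (e.g., engagement, attention, task accuracy) and to take decisions accordingly. These decisions are typically aimed at improving the child's performance by triggering appropriate re-engagement stimuli when attention is low, or by changing the game or increasing its difficulty when the child is losing interest in the task because it is too easy. Alongside state-of-the-art techniques for emotion recognition and head pose estimation, we use a runtime variant of a probabilistic and epistemic logic programming dialect of the Event Calculus, known as the Epistemic Probabilistic Event Calculus. In particular, the probabilistic component of this symbolic framework allows for a natural interface with machine learning techniques. We give an overview of the architecture and its components, and show some of its characteristics through a discussion of a running example and experiments. Under consideration for publication in Theory and Practice of Logic Programming (TPLP).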
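There is great demand for the robotization of manufacturing processes that involve monotonous labor. Some manufacturing tasks requiring specific skills (welding, painting, etc.) suffer from a shortage of workers. Robots have been used in these tasks, but their flexibility is limited because they are still difficult for non-experts to program or re-program, which makes them inaccessible to most companies. Robot offline programming (OLP) is reliable. However, paths generated directly from CAD/CAM do not include relevant parameters representing human skills, such as the orientation and velocity of the robot end-effector. This paper presents an intuitive robot programming system that captures human manufacturing skills and transforms them into robot programs. Demonstrations from skilled human workers are recorded using a magnetic tracking system attached to the work tools. The collected data include the orientation and velocity of the working paths. Positional data are extracted from CAD/CAM, because the error is significant when position is captured by the magnetic tracker. Path poses are transformed in Cartesian space and validated in a simulation environment. Robot programs are generated and transferred to the real robot. Experiments on a glass adhesive application process demonstrate the intuitiveness of use and the effectiveness of the proposed framework in capturing human skills and transferring them to the robot.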
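Neuromuscular diseases, such as spinal muscular atrophy (SMA) and Duchenne muscular dystrophy (DMD), cause progressive muscular degeneration and loss of motor function in 1 in 6,000 children. Traditional upper limb motor function assessments do not quantitatively measure patient-performed motions, which makes it difficult to track incremental changes in progress. Assessing motor function in children with neuromuscular diseases is particularly challenging because they may be nervous or excited during experiments, or simply too young to follow precise instructions. These challenges translate into confounding factors, such as performing different parts of an arm curl slower or faster (phase variability), which affect the assessed motion quality. This paper uses curve registration and shape analysis to temporally align trajectories while simultaneously extracting a mean reference shape. Distances from this mean shape are used to assess the quality of motion. The proposed metric is invariant to confounding factors such as phase variability, while suggesting several clinically relevant insights. First, there are statistically significant differences between the functional scores of the control and patient populations ($p = 0.0213 \le 0.05$). Next, several patients in the patient cohort are able to perform motions on par with the healthy cohort, and vice versa. Our metric, which is computed from a wearable device, is related to the Brooke's score ($p = 0.00063 \le 0.05$), as well as to motor function assessments based on dynamometry ($p = 0.0006 \le 0.05$). These results show promise towards ubiquitous motion quality assessment in daily life.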
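Computational fluid dynamics (CFD) can be used to simulate vascular haemodynamics and analyse potential treatment options. CFD has been shown to be beneficial in improving patient outcomes. However, CFD has yet to be implemented in practice. Barriers to CFD include high computational resource requirements, the specialist experience needed to design simulation set-ups, and long processing times. The aim of this study was to explore the use of machine learning (ML) to replicate conventional aortic CFD with automatic and fast regression models. The data used to train and test the model consisted of 3,000 CFD simulations performed on synthetically generated 3D aortic shapes. These subjects were generated from a statistical shape model (SSM) built on real patient-specific aortas (N = 67). Inference performed on 200 test shapes resulted in average errors of 6.01% +/- 3.12 SD and 3.99% +/- 0.93 SD for pressure and velocity, respectively. Our ML-based model performs CFD in ~0.075 seconds (4,000x faster than the solver). This study shows that the results of conventional vascular CFD can be reproduced using ML at a much faster rate, in an automatic process, and with high accuracy.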