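Graph neural networks (GNNs) are a popular class of machine learning models. Inspired by the learning to explain (L2X) paradigm, we propose L2XGNN, a framework for explainable GNNs that provides faithful explanations by design. L2XGNN learns a mechanism for selecting explanatory subgraphs (motifs), which are used exclusively in the GNN's message-passing operations. For each input graph, L2XGNN can select a subgraph with specific properties, such as being sparse and connected. Imposing such constraints on the motifs often leads to explanations that are easier to interpret and more effective. Experiments on several datasets show that L2XGNN achieves the same classification accuracy as baseline methods that use the entire input graph, while ensuring that only the provided explanations are used to make predictions. Moreover, we show that L2XGNN is able to identify the motifs responsible for the graph properties it is intended to predict.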
Translated by Google Translate
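This paper describes the models developed by the AILAB-UDINE team for the SMM4H 22 shared task. We explore the limits of Transformer-based models on text classification, entity extraction, and entity normalization, tackling Tasks 1, 2, 5, 6, and 10. Our findings highlight the use of different architectures in ensemble learning and the great potential of generative models for term normalization.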
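Over the past decade, a growing number of users have started reporting adverse drug events (ADEs) on social media platforms, blogs, and health forums. Given the large volume of reports, pharmacovigilance has focused on ways to use natural language processing (NLP) techniques to rapidly examine these large collections of text, detecting mentions of drug-related adverse reactions in order to trigger medical investigations. However, despite the growing interest in the task and the advances in NLP, the robustness of these models to linguistic phenomena such as negation and speculation remains an open research question. Negation and speculation are pervasive in natural language and can severely hamper the ability of an automated system to distinguish factual from non-factual statements in text. In this paper, we consider four state-of-the-art systems for ADE detection on social media texts. We introduce Snax, a benchmark that tests their performance on samples containing negated and speculated ADEs, and show their fragility against these phenomena. We then introduce two possible strategies for increasing the robustness of these models and show that both bring significant performance gains, lowering the number of spurious entities predicted by the models by 60% for negation and 80% for speculation.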
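Every hour, huge amounts of visual content are posted on social media and user-generated content platforms. To find relevant videos by means of a natural language query, text-video retrieval methods have received increasing attention over the past few years. Data augmentation techniques were introduced to improve performance on unseen test examples by creating new training samples through semantics-preserving transformations, such as color space or geometric transformations on images. However, these techniques are usually applied to raw data, which leads to more resource-demanding solutions and also requires the raw data to be shareable. This may not always be the case, for example because of copyright issues with clips from movies or TV series. To address this shortcoming, we propose a multimodal data augmentation technique that works in the feature space and creates new videos and captions by mixing semantically similar samples. We evaluate our solution on a large public dataset (Epic-Kitchens-100), achieve considerable improvements over a baseline method, improve on state-of-the-art performance, and carry out multiple ablation studies. We release code and pretrained models on GitHub at https://github.com/aranciokov/fsmmda_videoretrieval.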
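As a rough illustration of mixing in feature space, the sketch below blends the precomputed features of two semantically similar video-caption pairs. The abstract does not specify the mixing rule, so the convex combination used here, the shared coefficient `lam`, and the function names `mix_features` and `augment_pair` are all assumptions made for illustration, not the authors' method.

```python
import random
from typing import List, Sequence, Tuple


def mix_features(a: Sequence[float], b: Sequence[float], lam: float) -> List[float]:
    """Return the convex combination lam * a + (1 - lam) * b (assumed mixing rule)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def augment_pair(
    video_a: Sequence[float],
    caption_a: Sequence[float],
    video_b: Sequence[float],
    caption_b: Sequence[float],
    rng: random.Random,
) -> Tuple[List[float], List[float]]:
    """Create one new (video, caption) feature pair from two similar samples.

    The same coefficient is applied to both modalities so that the mixed
    video and the mixed caption stay aligned (a hypothetical design choice).
    """
    lam = rng.random()
    return mix_features(video_a, video_b, lam), mix_features(caption_a, caption_b, lam)


if __name__ == "__main__":
    rng = random.Random(0)
    new_video, new_caption = augment_pair([1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.2, 0.8], rng)
    print(new_video, new_caption)
```

Because only feature vectors are combined, a sketch like this needs no access to the raw clips, which is the property the abstract emphasizes.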
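Through human-centered research (HCR), we can steer research activities so that the research outcome benefits human stakeholders, such as end users. But what exactly makes research human-centered? We address this question by providing a working definition and by describing how a research pipeline can be split into different stages in which human-centered components can be added. In addition, we discuss existing NLP work with HCR components and define a series of guiding questions that can serve as a starting point for researchers interested in exploring human-centered research approaches. We hope that this work will inspire researchers to refine the proposed definition and to pose other questions that are meaningful for achieving HCR.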
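This report presents the technical details of our submission to the Epic-kitchens-100 Multi-Instance Retrieval Challenge 2022. To participate in the challenge, we designed an ensemble of different models trained with two recently developed relevance-augmented versions of the widely used triplet loss. Our submission, visible on the public leaderboard, obtains an average score of 61.02% nDCG and 49.77% mAP.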
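For readers unfamiliar with nDCG, the following is a minimal sketch of the standard textbook definition, with a logarithmic discount and normalization by the ideal ordering. The exact variant and relevance function used by the challenge are not stated in the report summary above, so this is an assumption rather than the official metric; the function names are illustrative.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain of relevance scores listed in ranked order."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances: Sequence[float]) -> float:
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(relevances, reverse=True))
    if ideal == 0.0:
        return 0.0  # no relevant items: define nDCG as 0 to avoid division by zero
    return dcg(relevances) / ideal


if __name__ == "__main__":
    # Relevance of the retrieved items, in the order the system ranked them.
    print(ndcg([0.0, 1.0, 0.5]))
```

A ranking that already lists items by decreasing relevance scores 1.0, and any misordering lowers the value.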
Background: Image analysis applications in digital pathology include various methods for segmenting regions of interest. Identifying these regions is one of the most complex steps, which makes robust methods that do not necessarily rely on a machine learning (ML) approach of great interest. Method: A fully automatic and optimized segmentation process for different datasets is a prerequisite for classifying and diagnosing Indirect ImmunoFluorescence (IIF) raw data. This study describes a deterministic computational neuroscience approach for identifying cells and nuclei. It departs from the conventional neural network approach, yet it matches the quantitative and qualitative performance of such networks and is also resilient to adverse noise. The method is robust, is based on formally correct functions, and does not require tuning on specific datasets. Results: This work demonstrates the robustness of the method to variability in parameters such as image size, mode, and signal-to-noise ratio. We validated the method on two datasets (Neuroblastoma and NucleusSegData) using images annotated by independent medical doctors. Conclusions: Defining deterministic and formally correct methods, from both a functional and a structural point of view, guarantees optimized and functionally correct results. The performance of our deterministic method (NeuronalAlg) in segmenting cells and nuclei from fluorescence images was measured with quantitative indicators and compared with that of three published ML approaches.
The widespread use of mobile devices, the sensitivity of the information they contain, and the shortcomings of current mobile user authentication methods call for novel, secure, and unobtrusive solutions to verify users' identity. In this article, we propose TypeFormer, a novel Transformer architecture that models free-text keystroke dynamics performed on mobile devices for the purpose of user authentication. The proposed model consists of Temporal and Channel Modules enclosing two Long Short-Term Memory (LSTM) recurrent layers, Gaussian Range Encoding (GRE), a multi-head Self-Attention mechanism, and a Block-Recurrent structure. In experiments on one of the largest public databases to date, the Aalto mobile keystroke database, TypeFormer outperforms current state-of-the-art systems, achieving an Equal Error Rate (EER) of 3.25% using only 5 enrolment sessions of 50 keystrokes each. In this way, we contribute to reducing the traditional performance gap between the challenging mobile free-text scenario and its desktop and fixed-text counterparts. Additionally, we analyse the behaviour of the model under different experimental configurations, such as the length of the keystroke sequences and the number of enrolment sessions, showing margin for improvement with more enrolment data. Finally, a cross-database evaluation is carried out, demonstrating the robustness of the features extracted by TypeFormer in comparison with existing approaches.
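The Equal Error Rate quoted above is the operating point at which the false acceptance rate equals the false rejection rate. A minimal sketch of how such a value can be estimated from verification scores follows; it assumes that higher scores mean "more likely the genuine user" and approximates the crossing point by a threshold sweep. The function name and the example scores are hypothetical and are not taken from TypeFormer.

```python
from typing import Sequence


def equal_error_rate(genuine: Sequence[float], impostor: Sequence[float]) -> float:
    """Approximate the EER by sweeping every observed score as a threshold.

    A trial is accepted when its score is >= threshold. The function returns
    the mean of FAR and FRR at the threshold where the two rates are closest.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    best_gap = float("inf")
    eer = 1.0
    for threshold in sorted(set(genuine) | set(impostor)):
        far = sum(s >= threshold for s in impostor) / len(impostor)  # impostors accepted
        frr = sum(s < threshold for s in genuine) / len(genuine)     # genuine users rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


if __name__ == "__main__":
    genuine_scores = [0.9, 0.8, 0.7, 0.4]
    impostor_scores = [0.6, 0.3, 0.2, 0.1]
    print(equal_error_rate(genuine_scores, impostor_scores))
```

With few scores the sweep yields only a coarse estimate; evaluations on large databases typically interpolate between neighbouring thresholds.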
Digital media have enabled access to unprecedented literary knowledge. Authors, readers, and scholars are now able to discover and share an increasing amount of information about books and their authors. Nevertheless, digital archives are still unbalanced: writers from non-Western countries are less represented, and this condition perpetuates old forms of discrimination. In this paper, we present the Under-Represented Writers Knowledge Graph (URW-KG), a resource designed to explore and possibly amend this lack of representation by gathering and mapping information about works and authors from Wikidata and three other sources: Open Library, Goodreads, and Google Books. Experiments based on KG embeddings show that the integrated information encoded in the graph exposes scholars and users to non-Western literary works and authors more easily than Wikidata alone. This opens the way to the development of fairer and more effective tools for author discovery and exploration.
Content-Controllable Summarization generates summaries focused on given controlling signals. Because large-scale training corpora for the task are lacking, we propose RelAttn, a plug-and-play module that adapts any general summarizer to the content-controllable summarization task. RelAttn first identifies the relevant content in the source documents and then makes the model attend to the right context by directly steering the attention weights. We further apply an unsupervised online adaptive parameter searching algorithm to determine the degree of control in the zero-shot setting, while such parameters are learned in the few-shot setting. Experiments applying the module to three backbone summarization models show that our method effectively improves all the summarizers and outperforms the prefix-based method and a widely used plug-and-play model in both zero- and few-shot settings. Tellingly, the benefit is larger in scenarios where more control is needed.