我们提出了一种具有有限目标语言数据的交叉语言内容标记的新颖框架,这在预测性能方面显着优于现有的工作。该框架基于最近的邻居架构。它是Vanilla K-最近邻模型的现代实例化,因为我们在所有组件中使用变压器表示。我们的框架可以适应新的源语言实例,而无需从头开始侦察。与基于邻域的方法的事先工作不同,我们基于查询邻的交互对邻居信息进行编码。我们提出了两个编码方案,并使用定性和定量分析显示其有效性。我们的评估结果是来自两个不同数据集的八种语言,用于滥用语言检测,在强大的基线上,可以在F1中显示最多9.5(对于意大利语)的大量改进。平均水平,我们在拼图式多语言数据集中的三种语言中实现了3.6的F1改进,2.14在WUL数据集的F1中的改进。
translated by 谷歌翻译
与汽车和其他公路车辆相比,公共汽车和重型车辆由于其尺寸较大而具有更多的盲点。因此,这些重型车辆造成的事故更具致命性,并给其他道路使用者造成严重伤害。这些可能的盲点碰撞可以使用基于视觉的对象检测方法来尽早确定。然而,现有的基于最新视觉的对象检测模型在很大程度上依赖于单个功能描述符来做出决策。在这项研究中,提出了基于高级功能描述符的两个卷积神经网络(CNN)的设计,并提出了它们与更快的R-CNN的集成,以检测重型车辆的盲点碰撞。此外,提出了一种融合方法,以整合两个预训练的网络(即Resnet 50和Resnet 101),用于提取高水平的特征以进行盲点车辆检测。功能的融合显着提高了更快的R-CNN的性能,并优于现有的最新方法。两种方法均在公共汽车的自我录制的盲点车辆检测数据集和用于车辆检测的在线LISA数据集上进行了验证。对于两种提出的方​​法,对于自记录的数据集,可获得3.05%和3.49%的虚假检测率(FDR),使这些方法适用于实时应用。
translated by 谷歌翻译
深度强化学习(DRL)使用多样化的非结构化数据,并使RL能够在高维环境中学习复杂的策略。基于自动驾驶汽车(AVS)的智能运输系统(ITS)为基于政策的DRL提供了绝佳的操场。深度学习体系结构解决了传统算法的计算挑战,同时帮助实现了AV的现实采用和部署。 AVS实施的主要挑战之一是,即使不是可靠和有效地管理的道路上的交通拥堵可能会加剧交通拥堵。考虑到每辆车的整体效果并使用高效和可靠的技术可以真正帮助优化交通流量管理和减少拥堵。为此,我们提出了一个智能的交通管制系统,该系统处理在交叉路口和交叉点后面的复杂交通拥堵场景。我们提出了一个基于DRL的信号控制系统,该系统根据当前交叉点的当前拥塞状况动态调整交通信号。为了应对交叉路口后面的道路上的拥堵,我们使用重新穿线技术来加载道路网络上的车辆。为了实现拟议方法的实际好处,我们分解了数据筒仓,并将所有来自传感器,探测器,车辆和道路结合使用的数据结合起来,以实现可持续的结果。我们使用Sumo微型模拟器进行模拟。我们提出的方法的重要性从结果中体现出来。
translated by 谷歌翻译
Diabetic Retinopathy (DR) is considered one of the primary concerns due to its effect on vision loss among most people with diabetes globally. The severity of DR is mostly comprehended manually by ophthalmologists from fundus photography-based retina images. This paper deals with an automated understanding of the severity stages of DR. In the literature, researchers have focused on this automation using traditional machine learning-based algorithms and convolutional architectures. However, the past works hardly focused on essential parts of the retinal image to improve the model performance. In this paper, we adopt transformer-based learning models to capture the crucial features of retinal images to understand DR severity better. We work with ensembling image transformers, where we adopt four models, namely ViT (Vision Transformer), BEiT (Bidirectional Encoder representation for image Transformer), CaiT (Class-Attention in Image Transformers), and DeiT (Data efficient image Transformers), to infer the degree of DR severity from fundus photographs. For experiments, we used the publicly available APTOS-2019 blindness detection dataset, where the performances of the transformer-based models were quite encouraging.
translated by 谷歌翻译
This paper presents our solutions for the MediaEval 2022 task on DisasterMM. The task is composed of two subtasks, namely (i) Relevance Classification of Twitter Posts (RCTP), and (ii) Location Extraction from Twitter Texts (LETT). The RCTP subtask aims at differentiating flood-related and non-relevant social posts while LETT is a Named Entity Recognition (NER) task and aims at the extraction of location information from the text. For RCTP, we proposed four different solutions based on BERT, RoBERTa, Distil BERT, and ALBERT obtaining an F1-score of 0.7934, 0.7970, 0.7613, and 0.7924, respectively. For LETT, we used three models namely BERT, RoBERTa, and Distil BERTA obtaining an F1-score of 0.6256, 0.6744, and 0.6723, respectively.
translated by 谷歌翻译
Objective: Despite numerous studies proposed for audio restoration in the literature, most of them focus on an isolated restoration problem such as denoising or dereverberation, ignoring other artifacts. Moreover, assuming a noisy or reverberant environment with limited number of fixed signal-to-distortion ratio (SDR) levels is a common practice. However, real-world audio is often corrupted by a blend of artifacts such as reverberation, sensor noise, and background audio mixture with varying types, severities, and duration. In this study, we propose a novel approach for blind restoration of real-world audio signals by Operational Generative Adversarial Networks (Op-GANs) with temporal and spectral objective metrics to enhance the quality of restored audio signal regardless of the type and severity of each artifact corrupting it. Methods: 1D Operational-GANs are used with generative neuron model optimized for blind restoration of any corrupted audio signal. Results: The proposed approach has been evaluated extensively over the benchmark TIMIT-RAR (speech) and GTZAN-RAR (non-speech) datasets corrupted with a random blend of artifacts each with a random severity to mimic real-world audio signals. Average SDR improvements of over 7.2 dB and 4.9 dB are achieved, respectively, which are substantial when compared with the baseline methods. Significance: This is a pioneer study in blind audio restoration with the unique capability of direct (time-domain) restoration of real-world audio whilst achieving an unprecedented level of performance for a wide SDR range and artifact types. Conclusion: 1D Op-GANs can achieve robust and computationally effective real-world audio restoration with significantly improved performance. The source codes and the generated real-world audio datasets are shared publicly with the research community in a dedicated GitHub repository1.
translated by 谷歌翻译
Uncertainty quantification is crucial to inverse problems, as it could provide decision-makers with valuable information about the inversion results. For example, seismic inversion is a notoriously ill-posed inverse problem due to the band-limited and noisy nature of seismic data. It is therefore of paramount importance to quantify the uncertainties associated to the inversion process to ease the subsequent interpretation and decision making processes. Within this framework of reference, sampling from a target posterior provides a fundamental approach to quantifying the uncertainty in seismic inversion. However, selecting appropriate prior information in a probabilistic inversion is crucial, yet non-trivial, as it influences the ability of a sampling-based inference in providing geological realism in the posterior samples. To overcome such limitations, we present a regularized variational inference framework that performs posterior inference by implicitly regularizing the Kullback-Leibler divergence loss with a CNN-based denoiser by means of the Plug-and-Play methods. We call this new algorithm Plug-and-Play Stein Variational Gradient Descent (PnP-SVGD) and demonstrate its ability in producing high-resolution, trustworthy samples representative of the subsurface structures, which we argue could be used for post-inference tasks such as reservoir modelling and history matching. To validate the proposed method, numerical tests are performed on both synthetic and field post-stack seismic data.
translated by 谷歌翻译
In recent years distributional reinforcement learning has produced many state of the art results. Increasingly sample efficient Distributional algorithms for the discrete action domain have been developed over time that vary primarily in the way they parameterize their approximations of value distributions, and how they quantify the differences between those distributions. In this work we transfer three of the most well-known and successful of those algorithms (QR-DQN, IQN and FQF) to the continuous action domain by extending two powerful actor-critic algorithms (TD3 and SAC) with distributional critics. We investigate whether the relative performance of the methods for the discrete action space translates to the continuous case. To that end we compare them empirically on the pybullet implementations of a set of continuous control tasks. Our results indicate qualitative invariance regarding the number and placement of distributional atoms in the deterministic, continuous action setting.
translated by 谷歌翻译
Automated synthesis of histology images has several potential applications in computational pathology. However, no existing method can generate realistic tissue images with a bespoke cellular layout or user-defined histology parameters. In this work, we propose a novel framework called SynCLay (Synthesis from Cellular Layouts) that can construct realistic and high-quality histology images from user-defined cellular layouts along with annotated cellular boundaries. Tissue image generation based on bespoke cellular layouts through the proposed framework allows users to generate different histological patterns from arbitrary topological arrangement of different types of cells. SynCLay generated synthetic images can be helpful in studying the role of different types of cells present in the tumor microenvironmet. Additionally, they can assist in balancing the distribution of cellular counts in tissue images for designing accurate cellular composition predictors by minimizing the effects of data imbalance. We train SynCLay in an adversarial manner and integrate a nuclear segmentation and classification model in its training to refine nuclear structures and generate nuclear masks in conjunction with synthetic images. During inference, we combine the model with another parametric model for generating colon images and associated cellular counts as annotations given the grade of differentiation and cell densities of different cells. We assess the generated images quantitatively and report on feedback from trained pathologists who assigned realism scores to a set of images generated by the framework. The average realism score across all pathologists for synthetic images was as high as that for the real images. We also show that augmenting limited real data with the synthetic data generated by our framework can significantly boost prediction performance of the cellular composition prediction task.
translated by 谷歌翻译
Autonomous mobile agents such as unmanned aerial vehicles (UAVs) and mobile robots have shown huge potential for improving human productivity. These mobile agents require low power/energy consumption to have a long lifespan since they are usually powered by batteries. These agents also need to adapt to changing/dynamic environments, especially when deployed in far or dangerous locations, thus requiring efficient online learning capabilities. These requirements can be fulfilled by employing Spiking Neural Networks (SNNs) since SNNs offer low power/energy consumption due to sparse computations and efficient online learning due to bio-inspired learning mechanisms. However, a methodology is still required to employ appropriate SNN models on autonomous mobile agents. Towards this, we propose a Mantis methodology to systematically employ SNNs on autonomous mobile agents to enable energy-efficient processing and adaptive capabilities in dynamic environments. The key ideas of our Mantis include the optimization of SNN operations, the employment of a bio-plausible online learning mechanism, and the SNN model selection. The experimental results demonstrate that our methodology maintains high accuracy with a significantly smaller memory footprint and energy consumption (i.e., 3.32x memory reduction and 2.9x energy saving for an SNN model with 8-bit weights) compared to the baseline network with 32-bit weights. In this manner, our Mantis enables the employment of SNNs for resource- and energy-constrained mobile agents.
translated by 谷歌翻译