The Sliced Wasserstein (SW) distance has been widely used in different application scenarios, since it scales to a large number of supports without suffering from the curse of dimensionality. The value of the sliced Wasserstein distance is the average of the transportation costs between one-dimensional representations (projections) of the original measures, obtained via the Radon Transform (RT). Despite its efficiency in the number of supports, estimating the sliced Wasserstein distance still requires a relatively large number of projections in high-dimensional settings. Therefore, for applications where the number of supports is relatively small compared with the dimension, e.g., several deep learning applications where mini-batch approaches are utilized, the matrix multiplication in the Radon Transform becomes the main computational bottleneck. To address this issue, we propose to derive projections by linearly and randomly combining a smaller number of projections, called bottleneck projections. We explain the usage of these projections by introducing the Hierarchical Radon Transform (HRT), which is constructed by applying Radon Transform variants recursively. We then formulate the approach as a new metric between measures, named the Hierarchical Sliced Wasserstein (HSW) distance. By proving the injectivity of the HRT, we derive the metricity of HSW. Moreover, we investigate the theoretical properties of HSW, including its connection to SW variants and its computational and sample complexities. Finally, we compare the computational cost and generative quality of HSW with the conventional SW on the task of deep generative modeling, using various benchmark datasets including CIFAR10, CelebA, and Tiny ImageNet.
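For concreteness, below is a minimal NumPy sketch of the bottleneck-projection idea as described in the abstract: the L slicing directions are random linear combinations of k << L bottleneck directions, so the samples are multiplied once by a d x k matrix and then by a cheap k x L matrix. The function name, the one-level recursion depth, and the equal-sample-size assumption are illustrative choices, not the paper's exact HRT construction.

```python
# A sketch of sliced Wasserstein with bottleneck projections (assumptions noted above).
import numpy as np

def hierarchical_sliced_wasserstein(X, Y, n_projections=100, n_bottleneck=10, p=2, rng=None):
    """One-level hierarchical SW between equal-size empirical measures X, Y of shape (n, d)."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    # k bottleneck directions, each a unit vector in R^d.
    U = rng.standard_normal((d, n_bottleneck))
    U /= np.linalg.norm(U, axis=0, keepdims=True)
    # L final directions as random linear combinations of the bottleneck directions.
    W = rng.standard_normal((n_bottleneck, n_projections))
    # Project both samples through the bottleneck first: O(ndk + nkL) instead of O(ndL).
    X_proj = (X @ U) @ W
    Y_proj = (Y @ U) @ W
    # Rescale so every combined direction has unit norm (this cost is independent of n).
    norms = np.linalg.norm(U @ W, axis=0)
    X_proj /= norms
    Y_proj /= norms
    # Average one-dimensional Wasserstein-p distances over the L directions.
    X_sorted = np.sort(X_proj, axis=0)
    Y_sorted = np.sort(Y_proj, axis=0)
    return np.mean(np.abs(X_sorted - Y_sorted) ** p) ** (1.0 / p)
```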
The conventional sliced Wasserstein distance is defined between two probability measures whose realizations are vectors. When comparing two probability measures over images, practitioners first need to vectorize the images and then project them to a one-dimensional space, using matrix multiplication between the sample matrix and a projection matrix. After that, the sliced Wasserstein distance is evaluated by averaging over the corresponding pairs of one-dimensional projected probability measures. However, this approach has two limitations. The first limitation is that the spatial structure of images is not captured efficiently by the vectorization step, so the subsequent slicing process has more difficulty gathering the discrepancy information. The second limitation is memory inefficiency, since each slicing direction is a vector of the same dimension as the images. To address these limitations, we propose novel slicing methods for the sliced Wasserstein distance between probability measures over images, based on convolution operators. We derive the Convolution Sliced Wasserstein (CSW) distance and its variants by incorporating stride, dilation, and non-linear activation functions into the convolution operators. We investigate the metricity of CSW as well as its sample complexity, its computational complexity, and its connection to conventional sliced Wasserstein distances. Finally, we demonstrate the favorable performance of CSW over the conventional sliced Wasserstein distance in comparing probability measures over images and in training deep generative models on images.
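Below is a minimal PyTorch sketch of the convolution-slicing idea: instead of a dense d-dimensional direction per slice, each slice is a chain of small random convolution kernels (here 2x2 with stride 2) that maps a (c, 32, 32) image down to a single scalar, so each slicer needs far less memory than a full-dimension vector. The kernel sizes, chain depth, and input resolution are illustrative assumptions; the paper's CSW variants additionally use dilation and non-linear activations.

```python
# A sketch of convolution slicing for images (assumptions noted above).
import torch
import torch.nn.functional as F

def conv_slicer(images, kernels):
    """Project a batch of images (n, c, 32, 32) to scalars with a chain of random convs."""
    x = images
    for k in kernels:                          # each k has unit Frobenius norm
        x = F.conv2d(x, k, stride=2)           # halves the spatial resolution
    return x.flatten(1)                        # (n, 1)

def convolution_sliced_wasserstein(X, Y, n_projections=50, p=2):
    """X, Y: equal-size image batches of shape (n, c, 32, 32)."""
    c = X.shape[1]
    dists = []
    for _ in range(n_projections):
        kernels, c_in = [], c
        for _ in range(5):                     # 32 -> 16 -> 8 -> 4 -> 2 -> 1
            k = torch.randn(1, c_in, 2, 2)
            kernels.append(k / k.norm())
            c_in = 1                           # later layers see one channel
        x_proj = conv_slicer(X, kernels).squeeze(1).sort().values
        y_proj = conv_slicer(Y, kernels).squeeze(1).sort().values
        dists.append((x_proj - y_proj).abs().pow(p).mean())
    return torch.stack(dists).mean().pow(1.0 / p)
```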
Seeking informative projection directions is an important task when utilizing the sliced Wasserstein distance in applications. However, finding these directions usually requires an iterative optimization procedure over the space of projection directions, which is computationally expensive. Moreover, the computational issue is even more severe in deep learning applications, where the distance between two mini-batch probability measures is computed repeatedly. This nested loop has been one of the main challenges preventing sliced Wasserstein distances with good projections from being used in practice. To address this challenge, we propose to utilize the learning-to-optimize technique, i.e., amortized optimization, to predict the informative direction for any given pair of mini-batch probability measures. To the best of our knowledge, this is the first work that bridges amortized optimization and sliced Wasserstein generative models. In particular, we derive linear amortized models, generalized linear amortized models, and non-linear amortized models, which correspond to three types of novel mini-batch losses, named amortized sliced Wasserstein. We demonstrate the favorable performance of the proposed sliced losses for deep generative models on standard benchmark datasets.
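The following PyTorch sketch illustrates the amortization idea: a small network maps the two mini-batches to a single projection direction in one forward pass, replacing the per-pair iterative search of max-sliced-style objectives. The mean-pooling summary and the MLP architecture are illustrative assumptions, not the paper's exact (generalized) linear amortized models.

```python
# A sketch of amortized projection prediction for sliced Wasserstein
# (architecture and pooling are assumptions, see the note above).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AmortizedDirection(nn.Module):
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.ReLU(), nn.Linear(hidden, dim)
        )

    def forward(self, X, Y):
        # Pool each mini-batch to a fixed-size summary, then predict one direction.
        z = torch.cat([X.mean(0), Y.mean(0)], dim=-1)
        return F.normalize(self.net(z), dim=-1)    # unit-norm projection direction

def amortized_sliced_wasserstein(X, Y, amortizer, p=2):
    """X, Y: equal-size mini-batches of shape (n, dim)."""
    theta = amortizer(X, Y)
    x_proj = (X @ theta).sort().values
    y_proj = (Y @ theta).sort().values
    return (x_proj - y_proj).abs().pow(p).mean().pow(1.0 / p)

# In training, the amortizer ascends this distance (seeking informative
# directions) while the generator descends it, avoiding the nested loop.
```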
Here, we demonstrate how machine learning enables the prediction of comonomer reactivity ratios from the molecular structure of the monomers. We combined multi-task learning, multiple inputs, and a Graph Attention Network to build a model capable of predicting reactivity ratios from the monomers' chemical structures.
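A rough sketch of such a setup, assuming PyTorch Geometric as the graph library, might look as follows: two monomer graphs are encoded with shared GAT layers and two regression heads jointly predict the pair of reactivity ratios (r1, r2). Layer sizes, pooling, and head structure are illustrative guesses, not the paper's reported architecture.

```python
# An illustrative multi-input, multi-task GAT regressor (assumptions noted above).
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv, global_mean_pool

class ReactivityRatioGAT(nn.Module):
    def __init__(self, node_dim, hidden=64):
        super().__init__()
        self.gat1 = GATConv(node_dim, hidden, heads=4, concat=False)
        self.gat2 = GATConv(hidden, hidden, heads=4, concat=False)
        self.head_r1 = nn.Linear(2 * hidden, 1)   # one head per reactivity ratio
        self.head_r2 = nn.Linear(2 * hidden, 1)

    def encode(self, x, edge_index, batch):
        h = self.gat1(x, edge_index).relu()
        h = self.gat2(h, edge_index).relu()
        return global_mean_pool(h, batch)          # one vector per monomer graph

    def forward(self, g1, g2):
        # Multi-input: embed both monomers, then predict both ratios jointly.
        z = torch.cat([self.encode(*g1), self.encode(*g2)], dim=-1)
        return self.head_r1(z), self.head_r2(z)
```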
Modern deep neural networks have achieved superhuman performance in tasks ranging from image classification to game play. Surprisingly, these varied, complex systems with massive numbers of parameters exhibit the same remarkable structural properties in their last-layer features and classifiers across canonical datasets. This phenomenon, known as "Neural Collapse," was discovered empirically by Papyan et al. \cite{Papyan20}. Recent papers have shown theoretically that the global solutions of the network training problem under a simplified "unconstrained feature model" exhibit this phenomenon. We take a step further and prove that Neural Collapse occurs for deep linear networks under the popular mean squared error (MSE) and cross-entropy (CE) losses. Furthermore, we extend our analysis to imbalanced data for the MSE loss and present the first geometric analysis of Neural Collapse in this setting.
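For reference, the conditions usually meant by "Neural Collapse" can be written as follows; this is a standard paraphrase for balanced classes in the spirit of Papyan et al.'s notation, not this paper's exact formulation. Here \mu_k is the class-k last-layer feature mean, \mu_G the global feature mean, and w_k the k-th classifier row.

```latex
\begin{align*}
&\text{(NC1, variability collapse)} && h_{k,i} \to \mu_k
    \quad \text{for every sample } i \text{ of class } k,\\
&\text{(NC2, simplex ETF)} &&
    \frac{\langle \mu_j - \mu_G,\, \mu_k - \mu_G\rangle}
         {\lVert \mu_j - \mu_G\rVert\, \lVert \mu_k - \mu_G\rVert}
    = \begin{cases} 1, & j = k,\\[2pt] -\tfrac{1}{K-1}, & j \neq k, \end{cases}\\
&\text{(NC3, self-duality)} && w_k \propto \mu_k - \mu_G .
\end{align*}
```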
Machine reading comprehension has become one of the most advanced and popular research topics in natural language processing in recent years. Classifying the answerability of questions is a significant sub-task in machine reading comprehension, yet few studies have addressed it. Retro-Reader is one of the studies that has solved this problem effectively. However, the encoders of most traditional machine reading comprehension models in general, and of Retro-Reader in particular, have not fully exploited the contextual semantic information of the context. Inspired by SemBERT, we use semantic role labels from the SRL task to add semantics to pre-trained language models such as mBERT, XLM-R, and PhoBERT. This experiment was conducted to compare the influence of semantics on answerability classification for Vietnamese machine reading comprehension. Additionally, we hope this experiment will enhance the encoder of the Retro-Reader model's Sketchy Reading Module. The improved Retro-Reader encoder with semantics was first applied to the Vietnamese machine reading comprehension task and obtained positive results.
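A minimal sketch of the SemBERT-style fusion described here, using the Hugging Face transformers library: token representations from a pre-trained encoder (XLM-R shown as one example) are concatenated with embeddings of their SRL tags before the answerability classifier. The tag vocabulary size, embedding width, and classification head are illustrative assumptions.

```python
# A sketch of a semantics-aware encoder for answerability classification
# (sizes and head are assumptions, see the note above).
import torch
import torch.nn as nn
from transformers import AutoModel

class SemanticAwareEncoder(nn.Module):
    def __init__(self, model_name="xlm-roberta-base", n_srl_tags=30, srl_dim=32, n_classes=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.srl_embed = nn.Embedding(n_srl_tags, srl_dim)
        hidden = self.encoder.config.hidden_size
        self.classifier = nn.Linear(hidden + srl_dim, n_classes)

    def forward(self, input_ids, attention_mask, srl_tags):
        # Contextual token vectors from the pre-trained language model.
        h = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        # Append one SRL-tag embedding per token, then classify from the first token.
        h = torch.cat([h, self.srl_embed(srl_tags)], dim=-1)
        return self.classifier(h[:, 0])        # answerability logits
```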
Recognizing textual entailment (RTE) is a significant problem with a reasonably active research community, and the approaches proposed for it are quite diverse, spanning many different directions. For Vietnamese, the RTE problem is relatively new, but it plays a vital role in natural language understanding systems. Currently, methods based on contextual word representation learning models have produced outstanding results. However, Vietnamese is a semantically rich language. Therefore, in this paper, we present an experiment combining semantic word representations from the SRL task with contextual representations from BERT-style models for the RTE problem. The experimental results support conclusions about the influence and role of semantic representation in Vietnamese natural language understanding. They show that the semantic-aware contextual representation model performs about 1% better than the model without semantic representation. In addition, the effects across data domains in Vietnamese are also higher than those in English. This result also demonstrates the positive influence of SRL on the RTE problem in Vietnamese.
To the best of our knowledge, this paper makes the first attempt to answer whether word segmentation is necessary for Vietnamese sentiment classification. To do this, we presented five pre-trained monolingual S4-based language models for Vietnamese: one model without word segmentation, and four models using the RDRsegmenter, uitnlp, pyvi, or underthesea toolkits in the data pre-processing phase. Based on comprehensive experimental results on two corpora, the VLSP2016-SA corpus of technical article reviews from news and social media and the UIT-VSFC corpus of educational surveys, we offer two suggestions. First, with traditional classifiers such as Naive Bayes or Support Vector Machines, word segmentation may not be necessary for a Vietnamese sentiment classification corpus drawn from the social domain. Second, word segmentation is necessary for Vietnamese sentiment classification when it is applied before the BPE method and the result is fed into a deep learning model; in this setting, RDRsegmenter is the most stable word segmentation toolkit, ahead of uitnlp, pyvi, and underthesea.
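To make the traditional-classifier setting concrete, here is a minimal sketch comparing unsegmented and segmented input for a TF-IDF plus linear SVM pipeline, using pyvi as one example segmenter. The texts and labels are placeholders for illustration only.

```python
# A sketch of Vietnamese sentiment classification with and without word
# segmentation (placeholder data; pyvi is one of the toolkits named above).
from pyvi import ViTokenizer
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

texts = ["giảng viên dạy rất hay", "môn học quá khó hiểu"]   # placeholder reviews
labels = [1, 0]                                               # placeholder sentiment

def segment(docs):
    # pyvi joins the syllables of each multi-syllable word with underscores.
    return [ViTokenizer.tokenize(d) for d in docs]

for name, corpus in [("no segmentation", texts), ("pyvi segmentation", segment(texts))]:
    clf = make_pipeline(TfidfVectorizer(), LinearSVC())
    clf.fit(corpus, labels)
    print(name, clf.score(corpus, labels))
```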
Diabetic Retinopathy (DR) is a leading cause of vision loss worldwide, and early DR detection is necessary to prevent vision loss and support appropriate treatment. In this work, we leverage interactive machine learning and introduce a joint learning framework, termed DRG-Net, to learn both disease grading and multi-lesion segmentation effectively. Our DRG-Net consists of two modules: (i) DRG-AI-System, which classifies DR grading, localizes lesion areas, and provides visual explanations; and (ii) DRG-Expert-Interaction, which receives feedback from expert users and improves the DRG-AI-System. To deal with sparse data, we utilize transfer learning mechanisms to extract invariant feature representations using the Wasserstein distance and adversarial-learning-based entropy minimization. Besides, we propose a novel attention strategy at both low- and high-level features to automatically select the most significant lesion information and provide explainable properties. In terms of human interaction, we further develop DRG-Net as a tool that enables expert users to correct the system's predictions, which may then be used to update the system as a whole. Moreover, thanks to the attention mechanism and the loss functions constraining lesion features and classification features, our approach remains robust given a certain level of noise in user feedback. We have benchmarked DRG-Net on the two largest DR datasets, i.e., IDRID and FGADR, and compared it to various state-of-the-art deep learning networks. In addition to outperforming other SOTA approaches, DRG-Net is effectively updated using user feedback, even in a weakly supervised manner.
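As a rough illustration of the attention idea mentioned above, the following generic attention-pooling module learns per-location weights over a lesion feature map and uses the weighted summary for grading, while the weight map itself can serve as a visual explanation. This is a standard attention-pooling pattern, not DRG-Net's actual architecture.

```python
# A generic attention-pooling sketch for lesion-aware grading
# (illustrative only; not the paper's architecture).
import torch
import torch.nn as nn

class LesionAttentionPool(nn.Module):
    def __init__(self, channels, n_grades=5):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)   # per-location score
        self.classifier = nn.Linear(channels, n_grades)

    def forward(self, feat):                                  # feat: (n, c, h, w)
        a = self.score(feat).flatten(2).softmax(dim=-1)       # (n, 1, h*w) weights
        v = feat.flatten(2)                                   # (n, c, h*w)
        pooled = (v * a).sum(dim=-1)                          # attention-weighted pooling
        attn_map = a.view(feat.shape[0], *feat.shape[2:])     # (n, h, w) explanation
        return self.classifier(pooled), attn_map
```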
Research has shown that climate change creates warmer temperatures and drier conditions, leading to longer wildfire seasons and increased wildfire risks in the United States. These factors have in turn led to increases in the frequency, extent, and severity of wildfires in recent years. Given the danger posed by wildland fires to people, property, wildlife, and the environment, there is an urgent need to provide tools for effective wildfire management. Early detection of wildfires is essential to minimizing potentially catastrophic destruction. In this paper, we present our work on integrating multiple data sources in SmokeyNet, a deep learning model that uses spatio-temporal information to detect smoke from wildland fires. Camera image data is integrated with weather sensor measurements and processed by SmokeyNet to create a multimodal wildland fire smoke detection system. We present results comparing performance, in terms of both accuracy and time-to-detection, for multimodal data versus a single data source. With a time-to-detection of only a few minutes, SmokeyNet can serve as an automated early notification system, providing a useful tool in the fight against destructive wildfires.
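The multimodal integration described here can be sketched as a simple late-fusion classifier: image features from a CNN backbone are concatenated with weather-sensor measurements before the smoke / no-smoke decision. The ResNet-18 backbone, feature sizes, and head are illustrative assumptions; SmokeyNet itself also uses spatio-temporal components not shown here.

```python
# A late-fusion sketch of camera + weather-sensor smoke detection
# (backbone and sizes are assumptions, see the note above).
import torch
import torch.nn as nn
import torchvision.models as models

class MultimodalSmokeClassifier(nn.Module):
    def __init__(self, n_weather_features=5):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()                  # expose 512-d image features
        self.backbone = backbone
        self.head = nn.Sequential(
            nn.Linear(512 + n_weather_features, 128), nn.ReLU(), nn.Linear(128, 2)
        )

    def forward(self, image, weather):
        # image: (n, 3, H, W); weather: (n, n_weather_features)
        z = torch.cat([self.backbone(image), weather], dim=-1)
        return self.head(z)                           # smoke vs. no-smoke logits
```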