Information Extraction (IE) aims to extract structured information from heterogeneous sources. IE from natural language texts include sub-tasks such as Named Entity Recognition (NER), Relation Extraction (RE), and Event Extraction (EE). Most IE systems require comprehensive understandings of sentence structure, implied semantics, and domain knowledge to perform well; thus, IE tasks always need adequate external resources and annotations. However, it takes time and effort to obtain more human annotations. Low-Resource Information Extraction (LRIE) strives to use unsupervised data, reducing the required resources and human annotation. In practice, existing systems either utilize self-training schemes to generate pseudo labels that will cause the gradual drift problem, or leverage consistency regularization methods which inevitably possess confirmation bias. To alleviate confirmation bias due to the lack of feedback loops in existing LRIE learning paradigms, we develop a Gradient Imitation Reinforcement Learning (GIRL) method to encourage pseudo-labeled data to imitate the gradient descent direction on labeled data, which can force pseudo-labeled data to achieve better optimization capabilities similar to labeled data. Based on how well the pseudo-labeled data imitates the instructive gradient descent direction obtained from labeled data, we design a reward to quantify the imitation process and bootstrap the optimization capability of pseudo-labeled data through trial and error. In addition to learning paradigms, GIRL is not limited to specific sub-tasks, and we leverage GIRL to solve all IE sub-tasks (named entity recognition, relation extraction, and event extraction) in low-resource settings (semi-supervised IE and few-shot IE).
translated by 谷歌翻译
权力是一个不可避免的尚未识别的协作元素。权力动力学影响科学合作的各个方面。团队电力动力学可以通过团队功率级和团队电力层次结构来衡量。团队功率水平概念化为拥有资源,专业知识或团队决策权的平均水平。团队权力层次结构代表了团队中资源财产的垂直差异。在科学科学中,很少有研究从团队权力动力学的角度看过科学合作。本研究探讨了团队权力动力学如何影响团队的影响,以填补研究差距。在这项研究中,一个出版物的所有共同作者被视为一个团队。一支队伍的团队电力水平和团队电力层次由本团队共同作者的职业年龄的平均值和基尼指数来衡量。团队影响由这支球队撰写的文件的引用量化。通过分析来自科学(例如计算机科学,物理学),社会科学(例如,社会学,图书馆和信息科学)和艺术和人文学科(例如,艺术)的770万队,我们发现平坦的团队结构与更高相关团队影响。当团队功率水平增加时,带有低团队电力层次的团队比高队电力层次结构的队伍更多。这些调查结果已经在所有五个学科中重复,除了艺术之外的所有五个学科,以及来自计算机科学的各种类型的团队,包括来自工业或学术界的团队,不同的性别团队的团队,具有地理对比的团队,以及具有独特统一的团队。
translated by 谷歌翻译
Visual localization plays an important role for intelligent robots and autonomous driving, especially when the accuracy of GNSS is unreliable. Recently, camera localization in LiDAR maps has attracted more and more attention for its low cost and potential robustness to illumination and weather changes. However, the commonly used pinhole camera has a narrow Field-of-View, thus leading to limited information compared with the omni-directional LiDAR data. To overcome this limitation, we focus on correlating the information of 360 equirectangular images to point clouds, proposing an end-to-end learnable network to conduct cross-modal visual localization by establishing similarity in high-dimensional feature space. Inspired by the attention mechanism, we optimize the network to capture the salient feature for comparing images and point clouds. We construct several sequences containing 360 equirectangular images and corresponding point clouds based on the KITTI-360 dataset and conduct extensive experiments. The results demonstrate the effectiveness of our approach.
translated by 谷歌翻译
Jacobian and Hessian regularization aim to reduce the magnitude of the first and second-order partial derivatives with respect to neural network inputs, and they are predominantly used to ensure the adversarial robustness of image classifiers. In this work, we generalize previous efforts by extending the target matrix from zero to any matrix that admits efficient matrix-vector products. The proposed paradigm allows us to construct novel regularization terms that enforce symmetry or diagonality on square Jacobian and Hessian matrices. On the other hand, the major challenge for Jacobian and Hessian regularization has been high computational complexity. We introduce Lanczos-based spectral norm minimization to tackle this difficulty. This technique uses a parallelized implementation of the Lanczos algorithm and is capable of effective and stable regularization of large Jacobian and Hessian matrices. Theoretical justifications and empirical evidence are provided for the proposed paradigm and technique. We carry out exploratory experiments to validate the effectiveness of our novel regularization terms. We also conduct comparative experiments to evaluate Lanczos-based spectral norm minimization against prior methods. Results show that the proposed methodologies are advantageous for a wide range of tasks.
translated by 谷歌翻译
实体链接(EL)是将实体提及在文本中及其相应实体中出现在知识库中的过程。通常基于Wikipedia估算实体的EL特征(例如,先前的概率,相关性评分和实体嵌入)。但是,对于刚刚在新闻中发现的新兴实体(EES)而言,它们可能仍未包含在Wikipedia中。结果,它无法获得Wikipedia和EL模型的EES所需的EL功能,将始终无法将歧义提及与这些EES正确链接,因为它没有其EL功能。为了解决这个问题,在本文中,我们专注于以一般方式为新兴实体学习EL功能的新任务。我们提出了一种名为Stamo的新颖方法,可以自动学习EES的高质量EL功能,该功能仅需要从网络中收集的每个EE的少数标记文档,因为它可以进一步利用隐藏在未标记的数据中的知识。 Stamo主要基于自我训练,这使其与任何EL功能或EL模型都灵活地集成在一起,但也使其很容易遭受由错误标签的数据引起的错误加强问题。我们认为自我训练是相对于EES的EL特征,而不是一些试图将错误标签的数据抛弃的常见自我训练策略,而是提出了内部插槽和斜率优化的多重优化过程,以减轻误差加强问题隐含。我们构建了涉及选定的EE的两个EL数据集,以评估EES获得的EL特征的质量,实验结果表明,我们的方法显着优于其他学习EL特征的基线方法。
translated by 谷歌翻译
传统的多播路由方法在构建多播树时存在一些问题,例如对网络状态信息的访问有限,对网络的动态和复杂变化的适应性不佳以及不灵活的数据转发。为了解决这些缺陷,软件定义网络(SDN)中的最佳多播路由问题是根据多目标优化问题量身定制的,以及基于深Q网络(DQN)深度强化学习(DQN)的智能多播路由算法DRL-M4MR( DRL)方法旨在构建SDN中的多播树。首先,通过组合SDN的全局视图和控制,将多播树状态矩阵,链路带宽矩阵,链路延迟矩阵和链路延迟损耗矩阵设计为DRL代理的状态空间。其次,代理的动作空间是网络中的所有链接,而动作选择策略旨在将链接添加到四种情况下的当前多播树。第三,单步和最终奖励功能表格旨在指导智能以做出决定以构建最佳多播树。实验结果表明,与现有算法相比,DRL-M4MR的多播树结构可以在训练后获得更好的带宽,延迟和数据包损耗率,并且可以在动态网络环境中做出更智能的多播路由决策。
translated by 谷歌翻译
Masked image modeling (MIM) performs strongly in pre-training large vision Transformers (ViTs). However, small models that are critical for real-world applications cannot or only marginally benefit from this pre-training approach. In this paper, we explore distillation techniques to transfer the success of large MIM-based pre-trained models to smaller ones. We systematically study different options in the distillation framework, including distilling targets, losses, input, network regularization, sequential distillation, etc, revealing that: 1) Distilling token relations is more effective than CLS token- and feature-based distillation; 2) An intermediate layer of the teacher network as target perform better than that using the last layer when the depth of the student mismatches that of the teacher; 3) Weak regularization is preferred; etc. With these findings, we achieve significant fine-tuning accuracy improvements over the scratch MIM pre-training on ImageNet-1K classification, using all the ViT-Tiny, ViT-Small, and ViT-base models, with +4.2%/+2.4%/+1.4% gains, respectively. Our TinyMIM model of base size achieves 52.2 mIoU in AE20K semantic segmentation, which is +4.1 higher than the MAE baseline. Our TinyMIM model of tiny size achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget. This strong performance suggests an alternative way for developing small vision Transformer models, that is, by exploring better training methods rather than introducing inductive biases into architectures as in most previous works. Code is available at https://github.com/OliverRensu/TinyMIM.
translated by 谷歌翻译
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes the image and point clouds tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple while its performance is impressive. CMT obtains 73.0% NDS on nuScenes benchmark. Moreover, CMT has a strong robustness even if the LiDAR is missing. Code will be released at https://github.com/junjie18/CMT.
translated by 谷歌翻译
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
translated by 谷歌翻译
Blind image quality assessment (BIQA) remains challenging due to the diversity of distortion and image content variation, which complicate the distortion patterns crossing different scales and aggravate the difficulty of the regression problem for BIQA. However, existing BIQA methods often fail to consider multi-scale distortion patterns and image content, and little research has been done on learning strategies to make the regression model produce better performance. In this paper, we propose a simple yet effective Progressive Multi-Task Image Quality Assessment (PMT-IQA) model, which contains a multi-scale feature extraction module (MS) and a progressive multi-task learning module (PMT), to help the model learn complex distortion patterns and better optimize the regression issue to align with the law of human learning process from easy to hard. To verify the effectiveness of the proposed PMT-IQA model, we conduct experiments on four widely used public datasets, and the experimental results indicate that the performance of PMT-IQA is superior to the comparison approaches, and both MS and PMT modules improve the model's performance.
translated by 谷歌翻译