Smart Paper Notes

SnapshotNet: Self-supervised Feature Learning for Point Cloud Data Segmentation Using Minimal Labeled Data

Xingye Li, Ling Zhang, Zhigang Zhu

Category: Computer Vision

2022-01-13

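Manually annotating point cloud datasets of complex scenes is costly and error-prone. To reduce the reliance on labeled data, a new model called SnapshotNet is proposed as a self-supervised feature learning approach that works directly on the unlabeled point cloud data of a complex 3D scene. The SnapshotNet pipeline includes three stages. In the snapshot capturing stage, snapshots, defined as local collections of points, are sampled from the point cloud scene. A snapshot can be a view of a local 3D scan captured directly from the real scene, or a virtual view taken from a large 3D point cloud dataset. Snapshots can also be sampled at different sampling rates or fields of view (FOVs) to capture scale information from the scene. In the feature learning stage, a new pretext task called multi-FOV contrasting is proposed to recognize whether two snapshots come from the same object or not, either within the same FOV or across different FOVs. Snapshots go through two self-supervised learning steps: a contrastive learning step with both part and scale contrasting, followed by a snapshot clustering step that extracts higher-level semantic features. A weakly supervised segmentation stage is then implemented by first training a standard SVM classifier on the learned features with a small fraction of labeled snapshots. The trained SVM is used to predict labels for input snapshots, and a voting procedure converts the predicted labels into point-wise label assignments for semantic segmentation of the entire scene. Experiments are conducted on the Semantic3D dataset, and the results show that the proposed method is capable of learning effective features from snapshots of complex scene data without any labels. Moreover, the proposed method shows advantages over the state-of-the-art (SOA) method on weakly supervised point cloud semantic segmentation.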
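The voting procedure that turns snapshot-level predictions into point-wise labels can be illustrated with a minimal Python sketch. The abstract does not give the details, so this assumes plain majority voting: each snapshot is a list of point indices, every point in a snapshot receives one vote for that snapshot's predicted label, and points covered by no snapshot keep a placeholder label. The function name `vote_point_labels` and the `unlabeled` parameter are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict


def vote_point_labels(snapshots, snapshot_labels, num_points, unlabeled=-1):
    """Assign each point the label predicted most often among the snapshots containing it.

    snapshots       -- list of lists of point indices (one list per snapshot)
    snapshot_labels -- predicted label for each snapshot (e.g. from an SVM)
    num_points      -- total number of points in the scene
    unlabeled       -- placeholder for points not covered by any snapshot (assumption)
    """
    votes = defaultdict(Counter)
    for point_ids, label in zip(snapshots, snapshot_labels):
        for pid in point_ids:
            votes[pid][label] += 1  # one vote per snapshot that contains the point

    labels = [unlabeled] * num_points
    for pid, counter in votes.items():
        # most_common(1) returns the label with the highest vote count
        labels[pid] = counter.most_common(1)[0][0]
    return labels


# Three overlapping snapshots over a four-point scene; point 3 is never covered.
point_labels = vote_point_labels([[0, 1], [1, 2], [1, 2]], [0, 1, 1], num_points=4)
print(point_labels)
```

Overlapping snapshots are what make voting useful here: a point seen in several snapshots can outvote a single wrong snapshot-level prediction.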

Translated by Google Translate

Federated Multi-Agent Deep Reinforcement Learning Approach via Physics-Informed Reward for Multi-Microgrid Energy Management

Yuanzheng Li, Shangyang He, Yang Li, Yang Shi, Zhigang Zeng

Category: Machine Learning

2022-12-29

The utilization of large-scale distributed renewable energy promotes the development of the multi-microgrid (MMG), which creates the need for an effective energy management method that minimizes economic costs and maintains energy self-sufficiency. Multi-agent deep reinforcement learning (MADRL) has been widely used for the energy management problem because of its real-time scheduling ability. However, its training requires massive energy operation data from microgrids (MGs), and gathering these data from different MGs would threaten their privacy and data security. Therefore, this paper tackles this practical yet challenging issue by proposing a federated multi-agent deep reinforcement learning (F-MADRL) algorithm with a physics-informed reward. In this algorithm, the federated learning (FL) mechanism is introduced to train the F-MADRL algorithm, thus ensuring the privacy and security of the data. In addition, a decentralized MMG model is built, and the energy of each participating MG is managed by an agent that aims to minimize economic costs and maintain energy self-sufficiency according to the physics-informed reward. First, MGs individually perform self-training on local energy operation data to train their local agent models. Then, these local models are periodically uploaded to a server, and their parameters are aggregated to build a global agent, which is broadcast to the MGs and replaces their local agents. In this way, the experience of each MG agent can be shared while the energy operation data is not explicitly transmitted, thus protecting privacy and ensuring data security. Finally, experiments are conducted on the Oak Ridge National Laboratory Distributed Energy Control Communication lab microgrid (ORNL-MG) test system, and comparisons are carried out to verify the effectiveness of introducing the FL mechanism and the superior performance of the proposed F-MADRL.
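The upload, aggregate, and broadcast cycle described in this abstract can be sketched in Python as follows. The abstract only says that the parameters are "aggregated", so this assumes an unweighted, FedAvg-style element-wise mean. The names `aggregate` and `broadcast` and the dictionary-of-arrays model format are hypothetical, chosen for illustration only.

```python
import numpy as np


def aggregate(local_models):
    """Build global parameters as the element-wise mean of the local agents' parameters.

    local_models -- list of dicts mapping parameter names to NumPy arrays.
    Unweighted averaging is an assumption; the abstract does not specify the rule.
    """
    names = local_models[0].keys()
    return {name: np.mean([m[name] for m in local_models], axis=0) for name in names}


def broadcast(global_model, num_agents):
    """Give every microgrid agent its own copy of the global parameters."""
    return [{name: value.copy() for name, value in global_model.items()}
            for _ in range(num_agents)]


# Three microgrid agents, each with locally trained parameters.
local_agents = [
    {"w": np.array([1.0, 2.0]), "b": np.array([0.0])},
    {"w": np.array([3.0, 4.0]), "b": np.array([3.0])},
    {"w": np.array([5.0, 6.0]), "b": np.array([6.0])},
]
global_agent = aggregate(local_agents)
local_agents = broadcast(global_agent, num_agents=3)
print(global_agent["w"], global_agent["b"])
```

Only model parameters cross the network in this sketch; the local energy operation data never leaves the microgrid that produced it, which is the privacy argument made in the abstract.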

InferEM: Inferring the Speaker's Intention for Empathetic Dialogue Generation

Guoqing Lv, Xiaoping Wang, Jiang Li, Zhigang Zeng

Category: Natural Language Processing

2022-12-13

Current approaches to empathetic response generation typically encode the entire dialogue history directly and feed the output into a decoder to generate friendly feedback. These methods focus on modelling contextual information but neglect to capture the direct intention of the speaker. We argue that the last utterance in the dialogue empirically conveys the intention of the speaker. Consequently, we propose a novel model named InferEM for empathetic response generation. We encode the last utterance separately and fuse it with the entire dialogue through a multi-head-attention-based intention fusion module to capture the speaker's intention. In addition, we use the previous utterances to predict the last utterance, which simulates the human tendency to guess in advance what the interlocutor may say. To balance the optimization rates of utterance prediction and response generation, a multi-task learning strategy is designed for InferEM. Experimental results demonstrate the plausibility and validity of InferEM in improving empathetic expression.

WIDER & CLOSER: Mixture of Short-channel Distillers for Zero-shot Cross-lingual Named Entity Recognition

Jun-Yu Ma, Beiduo Chen, Jia-Chen Gu, Zhen-Hua Ling, Wu Guo, Quan Liu, Zhigang Chen, Cong Liu

Category: Natural Language Processing

2022-12-07

Zero-shot cross-lingual named entity recognition (NER) aims to transfer knowledge from annotated, resource-rich data in source languages to unlabeled, resource-lean data in target languages. Existing mainstream methods based on the teacher-student distillation framework ignore the rich and complementary information in the intermediate layers of pre-trained language models, and domain-invariant information is easily lost during transfer. In this study, a mixture of short-channel distillers (MSD) method is proposed to fully exploit the rich hierarchical information in the teacher model and to transfer knowledge to the student model sufficiently and efficiently. Concretely, a multi-channel distillation framework is designed for sufficient information transfer by aggregating multiple distillers as a mixture. In addition, an unsupervised method adopting parallel domain adaptation is proposed to shorten the channels between the teacher and student models and thereby preserve domain-invariant features. Experiments on four datasets across nine languages demonstrate that the proposed method achieves new state-of-the-art performance on zero-shot cross-lingual NER and shows strong generalization and compatibility across languages and fields.

Long-Document Cross-Lingual Summarization

Shaohui Zheng, Zhixu Li, Jiaan Wang, Jianfeng Qu, An Liu, Lei Zhao, Zhigang Chen

Category: Natural Language Processing | Artificial Intelligence

2022-12-01

Cross-Lingual Summarization (CLS) aims to generate summaries in one language for given documents in another language. CLS has attracted wide research attention due to its practical significance in the multilingual world. Though great contributions have been made, existing CLS works typically focus on short documents, such as news articles, short dialogues and guides. Unlike these short texts, long documents such as academic articles and business reports usually discuss complicated subjects and consist of thousands of words, making them non-trivial to process and summarize. To promote CLS research on long documents, we construct Perseus, the first long-document CLS dataset, which collects about 94K Chinese scientific documents paired with English summaries. The average length of documents in Perseus is more than two thousand tokens. As a preliminary study on long-document CLS, we build and evaluate various CLS baselines, including pipeline and end-to-end methods. Experimental results on Perseus show the superiority of the end-to-end baseline, which outperforms strong pipeline models equipped with sophisticated machine translation systems. Furthermore, to provide a deeper understanding, we manually analyze the model outputs and discuss specific challenges faced by current approaches. We hope that our work can serve as a benchmark for long-document CLS and benefit future studies.

Quasi Non-Negative Quaternion Matrix Factorization with Application to Color Face Recognition

Yifen Ke, Changfeng Ma, Zhigang Jia, Yajun Xie, Riwei Liao

Category: Computer Vision

2022-11-30

To address the non-negativity dropout problem of quaternion models, a novel quasi non-negative quaternion matrix factorization (QNQMF) model is presented for color image processing. To implement QNQMF, a quaternion projected gradient algorithm and a quaternion alternating direction method of multipliers are proposed by formulating QNQMF as non-convex constrained quaternion optimization problems. Some properties of the proposed algorithms are studied. Numerical experiments on color image reconstruction show that the algorithms perform better when encoded on the quaternion than when encoded on the red, green and blue channels. Furthermore, we apply the proposed algorithms to color face recognition. Numerical results indicate that, for the same data, the face recognition accuracy of the quaternion model is better than that obtained on the red, green and blue channels of the color image or on the single channel of gray-level images when large variations in facial expression and shooting angle are present.

Overview of CTC 2021: Chinese Text Correction for Native Speakers

Honghong Zhao, Baoxin Wang, Dayong Wu, Wanxiang Che, Zhigang Chen, Shijin Wang

Category: Natural Language Processing

2022-08-11

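In this paper, we present an overview of CTC 2021, a Chinese text correction task for native speakers. We describe in detail the task definition and the data used for training and evaluation. We also summarize the approaches investigated by the participants of this task. We hope that the datasets collected and annotated for this task can facilitate and expedite future development in this research area. Therefore, the pseudo training data, the gold-standard validation data, and the entire leaderboard are publicly available online at https://destwang.github.io/ctc2021-explorer/.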

Graph Regularized Nonnegative Latent Factor Analysis Model for Temporal Link Prediction in Cryptocurrency Transaction Networks

Zhou Yue, Liu ZhiGang, Yuan Ye

Category: Machine Learning

2022-08-03

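With the development of blockchain technology, cryptocurrencies based on it are becoming increasingly popular. This has given rise to a huge cryptocurrency transaction network that has attracted widespread attention. Link prediction, which learns the structure of a network, helps in understanding the network's mechanisms, and it has therefore been widely studied in cryptocurrency networks as well. However, the dynamics of cryptocurrency transaction networks have been neglected in past research. We use a graph regularization method to link past transaction records with future transactions. Based on this, we propose a single latent factor-dependent, non-negative, multiplicative and graph regularized-incorporated update (SLF-NMGRU) algorithm, and further propose a graph regularized nonnegative latent factor analysis (GRNLFA) model. Finally, experiments on a real cryptocurrency transaction network show that the proposed method improves both accuracy and computational efficiency.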

UFO: Unified Feature Optimization

Teng Xi, Yifan Sun, Deli Yu, Bi Li, Nan Peng, Gang Zhang, Xinyu Zhang, Zhigang Wang, Jinwen Chen, Jian Wang

Category: Computer Vision

2022-07-21

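This paper proposes a novel Unified Feature Optimization (UFO) paradigm for training and deploying deep models in real-world, large-scale scenarios that require a collection of multiple AI functions. UFO aims to benefit each single task through large-scale pretraining on all tasks. Compared with well-known foundation models, UFO has two different points of emphasis, namely a relatively small model size and no adaptation cost: 1) UFO squeezes a wide range of tasks into a moderate-sized unified model in a multi-task learning manner and further trims the model size when transferring to downstream tasks. 2) UFO does not emphasize transfer to novel tasks. Instead, it aims to make the trimmed model dedicated to one or more already-seen tasks. With these two characteristics, UFO provides great convenience for flexible deployment while maintaining the benefits of large-scale pretraining. A key merit of UFO is that the trimming process not only reduces the model size and inference cost but also improves the accuracy on certain tasks. Specifically, UFO considers multi-task training, which has a two-fold impact on the unified model: some closely related tasks benefit each other, while some tasks conflict with each other. UFO manages to reduce the conflicts and preserve the mutual benefits through a novel Network Architecture Search (NAS) method. Experiments on a wide range of deep representation learning tasks (i.e., face recognition, person re-identification, vehicle re-identification and product retrieval) show that the model trimmed from UFO achieves higher accuracy than its single-task-trained counterpart while having a smaller model size, validating the concept of UFO. In addition, UFO supported the release of a 17-billion-parameter computer vision (CV) foundation model, which is the largest CV model in the industry.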

TENET: Transformer Encoding Network for Effective Temporal Flow on Motion Prediction

Yuting Wang, Hangning Zhou, Zhigang Zhang, Chen Feng, Huadong Lin, Chaofei Gao, Yizhi Tang, Zhenting Zhao, Shiyu Zhang, Jie Guo

Category: Computer Vision | Artificial Intelligence

2022-06-30

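This technical report presents an effective method for motion prediction in autonomous driving. We develop a Transformer-based method for input encoding and trajectory prediction. In addition, we propose the Temporal Flow Header to enhance the trajectory encoding. Finally, an efficient K-means ensemble method is used. With our Transformer network and ensemble method, we won first place in the Argoverse 2 Motion Forecasting Challenge with a state-of-the-art brier-minFDE score of 1.90.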
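For readers unfamiliar with the brier-minFDE score reported above, the following Python sketch shows one common way the metric is computed for a single scenario. The definition is an assumption, since the abstract does not state it: the final displacement error (FDE) of the forecast whose endpoint is closest to the ground truth, plus a Brier-style penalty (1 - p)^2, where p is the probability assigned to that forecast. The function name `brier_min_fde` and the array shapes are hypothetical, and lower scores are better.

```python
import numpy as np


def brier_min_fde(forecasts, probs, ground_truth):
    """Compute brier-minFDE for one scenario under the assumed definition.

    forecasts    -- array of shape (K, T, 2): K candidate trajectories of T (x, y) steps
    probs        -- array of shape (K,): probability assigned to each candidate
    ground_truth -- array of shape (T, 2): the observed future trajectory
    """
    forecasts = np.asarray(forecasts, dtype=float)
    probs = np.asarray(probs, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)

    # Final displacement error: Euclidean distance between endpoints.
    fde = np.linalg.norm(forecasts[:, -1] - ground_truth[-1], axis=1)
    best = int(np.argmin(fde))

    # Penalize low confidence in the best forecast.
    return float(fde[best] + (1.0 - probs[best]) ** 2)


truth = [[0.0, 0.0], [1.0, 0.0]]
candidates = [
    [[0.0, 0.0], [1.0, 0.0]],  # endpoint matches the ground truth exactly
    [[0.0, 0.0], [4.0, 4.0]],  # endpoint is 5 units away
]
score = brier_min_fde(candidates, [0.5, 0.5], truth)
print(score)
```

Under this definition a model is rewarded both for producing at least one accurate endpoint and for assigning that forecast a high probability.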
