Smart Paper Notes

A Flexible Diffusion Model

Weitao Du, Tao Yang, He Zhang, Yuanqi Du

Categories: Machine Learning | Artificial Intelligence | Computer Vision

2022-06-17

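Diffusion (score-based) generative models have been widely used to model many types of complex data, including images, audio, and point clouds. Recently, a deep connection between forward-backward stochastic differential equations (SDEs) and diffusion-based models has been revealed, and several new SDE variants (e.g., sub-VP, critically damped Langevin) have been proposed. Despite the empirical success of hand-crafted, fixed forward SDEs, a large number of suitable forward SDEs remain unexplored. In this work, we propose a general framework for parameterizing diffusion models, especially the spatial part of the forward SDE. An abstract formalism is introduced with theoretical guarantees, and its connection to previous diffusion models is leveraged. We demonstrate the theoretical advantage of our method from an optimization perspective. Numerical experiments on synthetic datasets, MNIST, and CIFAR10 are also presented to validate the effectiveness of our framework.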
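
For orientation, the forward SDEs named in the abstract above are usually written in the following standard forms from the score-based generative modeling literature. This is background notation, not the parameterization proposed in the paper; $\beta(t)$ denotes a noise schedule and $w$ a standard Wiener process, both assumed here.

```latex
\begin{aligned}
\text{general forward SDE:}\quad & \mathrm{d}x = f(x,t)\,\mathrm{d}t + g(t)\,\mathrm{d}w \\
\text{VP:}\quad & \mathrm{d}x = -\tfrac{1}{2}\beta(t)\,x\,\mathrm{d}t + \sqrt{\beta(t)}\,\mathrm{d}w \\
\text{sub-VP:}\quad & \mathrm{d}x = -\tfrac{1}{2}\beta(t)\,x\,\mathrm{d}t
  + \sqrt{\beta(t)\bigl(1 - e^{-2\int_0^t \beta(s)\,\mathrm{d}s}\bigr)}\,\mathrm{d}w
\end{aligned}
```

In these fixed forms the drift $f$ and diffusion $g$ are chosen by hand; the abstract describes making the spatial part of the forward SDE learnable instead.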

translated by Google Translate

SE(3) Equivariant Graph Neural Networks with Complete Local Frames

Weitao Du, He Zhang, Yuanqi Du, Qi Meng, Wei Chen, Bin Shao, Tie-Yan Liu

Categories: Artificial Intelligence | Machine Learning

2021-10-26

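Group equivariance (e.g., SE(3) equivariance) is a fundamental physical symmetry in science, from classical and quantum physics to computational biology. It enables robust and accurate prediction under arbitrary changes of reference frame. In light of this, great effort has gone into encoding this symmetry into deep neural networks, which has been shown to improve generalization performance and data efficiency on downstream tasks. Constructing an equivariant neural network usually incurs high computational cost to ensure expressiveness. How to better trade off expressiveness against computational efficiency therefore plays a central role in the design of equivariant deep learning models. In this paper, we propose a framework for constructing SE(3) equivariant graph neural networks that can efficiently approximate geometric quantities. Inspired by differential geometry and physics, we introduce equivariant local complete frames to graph neural networks, so that tensor information of a given order can be projected onto the frames. The local frame is constructed to form an orthonormal basis, which avoids direction degeneration and ensures completeness. Since the frames are built only from cross product operations, our method is computationally efficient. We evaluate our method on two tasks: Newtonian mechanics modeling and equilibrium molecular conformation generation. Extensive experimental results show that our model achieves the best or competitive performance on both types of datasets.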
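
The frame idea in the abstract above can be illustrated with a minimal sketch. It assumes the frame is built from two 3D position vectors using only normalization and cross products, and that coordinates are already centered so that only rotations matter. The helper names (`local_frame`, `project`) are hypothetical, and this is not necessarily the authors' exact construction.

```python
import math

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def normalize(u):
    n = math.sqrt(dot(u, u))
    if n == 0.0:
        # Happens when x_i == x_j or x_i is parallel to x_j.
        raise ValueError("degenerate input: zero-length vector")
    return [a / n for a in u]

def local_frame(x_i, x_j):
    """Orthonormal frame (a, b, c) from two position vectors (assumed construction)."""
    a = normalize(sub(x_i, x_j))    # along the edge between the two points
    b = normalize(cross(x_i, x_j))  # perpendicular to both points, hence to a
    c = cross(a, b)                 # completes the right-handed basis
    return a, b, c

def project(frame, v):
    """Coordinates of v in the frame; these scalars do not change under rotation."""
    return [dot(e, v) for e in frame]
```

Rotating both input points rotates the frame with them, so the projected coordinates of any co-rotated vector stay the same. That is what makes it possible to feed them to an ordinary (non-equivariant) network.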


Towards Frame Rate Agnostic Multi-Object Tracking

Weitao Feng, Lei Bai, Yongqiang Yao, Fengwei Yu, Wanli Ouyang

Categories: Computer Vision

2022-09-23

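Multi-Object Tracking (MOT) is one of the most fundamental computer vision tasks and supports a variety of video analysis applications. Despite recent promising progress, current MOT research is still limited to a fixed sampling frame rate of the input stream. In fact, we find empirically that the accuracy of all recent state-of-the-art trackers drops dramatically when the input frame rate changes. Toward a more intelligent tracking solution, we shift our attention to the problem of Frame Rate Agnostic MOT (FraMOT). In this paper, we propose a Frame Rate Agnostic MOT framework with a Periodic training Scheme (FAPS) to tackle the FraMOT problem for the first time. Specifically, we propose a Frame Rate Agnostic Association Module (FAAM) that infers and encodes frame rate information to aid identity matching across multi-frame-rate inputs, improving the learned model's ability to handle complex motion-appearance relations in FraMOT. In addition, the association gap between training and inference is enlarged in FraMOT, because post-processing steps that are not included in training have a larger effect in lower frame rate scenarios. To address this, we propose a Periodic Training Scheme (PTS) that reflects all post-processing steps in training via tracking pattern matching and fusion. Besides the proposed approaches, we make a first attempt to establish an evaluation method for this new task in two different modes, known frame rate and unknown frame rate, aiming to handle more complex situations. Quantitative experiments on challenging MOT datasets (FraMOT version) clearly show that the proposed approaches handle different frame rates better and thus improve robustness in complicated scenarios.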


Semi-supervised Deep Multi-view Stereo

Hongbin Xu, Zhipeng Zhou, Weitao Cheng, Baigui Sun, Hao Li, Wenxiong Kang

Categories: Computer Vision | Artificial Intelligence

2022-07-24

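Significant progress has been made in learning-based Multi-View Stereo (MVS) under both supervised and unsupervised settings. To combine their respective merits in accuracy and completeness while reducing the demand for expensive labeled data, this paper explores a novel semi-supervised setting for the learning-based MVS problem, in which only a small part of the MVS data comes with dense depth ground truth. However, because of the large variation across scenarios and the flexible settings of views, the semi-supervised MVS problem (Semi-MVS) may break the basic assumption of classic semi-supervised learning, namely that unlabeled and labeled data share the same label space and data distribution. To handle these issues, we propose a novel semi-supervised MVS framework, SE-MVS. For the simple case in which the basic assumption holds in the MVS data, consistency regularization encourages the model's predictions to be consistent between the original sample and a randomly augmented sample through a constraint on the KL divergence. For the more troublesome case in which the basic assumption is violated in the MVS data, we propose a novel style consistency loss to alleviate the negative effect of the distribution gap. The visual style of an unlabeled sample is transferred to a labeled sample to shrink the gap, and the model's prediction on the generated sample is further supervised with the label of the original labeled sample. Experimental results on the DTU, BlendedMVS, GTA-SfM, and Tanks & Temples datasets show the superior performance of the proposed method. With the same backbone network settings, our proposed SE-MVS outperforms its fully supervised and unsupervised baselines.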
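
The consistency regularization described above can be sketched for a single pixel as follows. This is a minimal illustration: it assumes the model outputs logits over a discrete set of depth hypotheses, and the function names as well as the direction of the KL term (original relative to augmented) are assumptions, not details confirmed by the abstract.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def consistency_loss(logits_original, logits_augmented):
    """Penalize disagreement between predictions on original and augmented views."""
    p = softmax(logits_original)   # prediction on the original sample
    q = softmax(logits_augmented)  # prediction on the randomly augmented sample
    return kl_divergence(p, q)
```

The loss is zero when both views yield the same distribution and grows as they diverge, so minimizing it on unlabeled data pushes the model toward augmentation-stable depth predictions without needing ground truth.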


Long-Tail Prediction Uncertainty Aware Trajectory Planning for Self-driving Vehicles

Weitao Zhou, Zhong Cao, Nanshan Deng, Xiaoyu Liu, Kun Jiang, Diange Yang

Categories: Artificial Intelligence

2022-07-02

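Typical trajectory planning for autonomous driving relies on predicting the future behavior of surrounding obstacles. In recent years, deep-learning-based prediction models have been widely used because of their impressive performance. However, recent studies have shown that deep learning models trained on a dataset that follows a long-tailed distribution of driving scenarios suffer from large prediction errors in the "tails," which may cause the planner to fail. To this end, this work defines a notion of prediction model uncertainty to quantify the high errors caused by sparse data. It further proposes a trajectory planner that takes this prediction uncertainty into account for safer performance. First, the uncertainty of the prediction model due to insufficient training data is estimated with an ensemble network structure. Then, the trajectory planner is designed to consider the worst case arising from the prediction uncertainty. The results show that the proposed method improves the safety of trajectory planning under prediction uncertainty caused by insufficient data. At the same time, when sufficient data is available, the framework does not produce overly conservative results. This technique helps improve the safety and reliability of autonomous vehicles under real-world long-tailed data distributions.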
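
The two steps in the abstract above (ensemble-based uncertainty, then worst-case planning) can be illustrated with a deliberately simplified sketch. It assumes a one-dimensional setting in which each ensemble member predicts the longitudinal position of a lead obstacle. The function names and the safety factor `k` are hypothetical, and the paper's actual planner is not specified in the abstract.

```python
import statistics

def ensemble_uncertainty(predictions):
    """Mean and spread of the ensemble's predictions for one obstacle at one time step."""
    mean = statistics.fmean(predictions)
    spread = statistics.pstdev(predictions)  # disagreement between ensemble members
    return mean, spread

def worst_case_gap(ego_position, predictions, k=2.0):
    """Gap to the obstacle if it is k standard deviations closer than the mean prediction."""
    mean, spread = ensemble_uncertainty(predictions)
    return (mean - k * spread) - ego_position
```

When the ensemble members agree (as expected in well-covered scenarios), the spread is near zero and the worst-case gap matches the nominal one, so the plan is not made more conservative. When they disagree (as expected in sparsely covered, long-tail scenarios), the gap shrinks and a planner using it would keep more distance.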


Eventor: An Efficient Event-Based Monocular Multi-View Stereo Accelerator on FPGA Platform

Mingjun Li, Jianlei Yang, Yingjie Qi, Meng Dong, Yuhao Yang, Runze Liu, Weitao Pan, Bei Yu, Weisheng Zhao

Categories: Computer Vision

2022-03-29

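Event cameras are bio-inspired vision sensors that asynchronously represent pixel-level brightness changes as event streams. Event-based monocular multi-view stereo (EMVS) is a technique that uses event streams to estimate semi-dense 3D structure along a known trajectory. It is a critical task for event-based monocular SLAM. However, the intensive computational workload it requires makes real-time deployment on embedded platforms challenging. In this paper, Eventor is proposed as a fast and efficient EMVS accelerator that implements the most critical and time-consuming stages, including event back-projection and volumetric ray counting, on an FPGA. Highly parallel and fully pipelined processing elements are specially designed on the FPGA and integrated with an embedded ARM processor as a heterogeneous system to improve throughput and reduce the memory footprint. In addition, the EMVS algorithm is reformulated in a more hardware-friendly manner through rescheduling, approximate computing, and hybrid data quantization. Evaluation results on the DAVIS dataset show that Eventor achieves up to a $24\times$ improvement in energy efficiency compared with an Intel i5 CPU platform.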


Conditional Diffusion Based on Discrete Graph Structures for Molecular Graph Generation

Han Huang, Leilei Sun, Bowen Du, Weifeng Lv

Categories: Machine Learning

2023-01-01

Learning the underlying distribution of molecular graphs and generating high-fidelity samples is a fundamental research problem in drug discovery and material science. However, accurately modeling distribution and rapidly generating novel molecular graphs remain crucial and challenging goals. To accomplish these goals, we propose a novel Conditional Diffusion model based on discrete Graph Structures (CDGS) for molecular graph generation. Specifically, we construct a forward graph diffusion process on both graph structures and inherent features through stochastic differential equations (SDE) and derive discrete graph structures as the condition for reverse generative processes. We present a specialized hybrid graph noise prediction model that extracts the global context and the local node-edge dependency from intermediate graph states. We further utilize ordinary differential equation (ODE) solvers for efficient graph sampling, based on the semi-linear structure of the probability flow ODE. Experiments on diverse datasets validate the effectiveness of our framework. Particularly, the proposed method still generates high-quality molecular graphs in a limited number of steps.


HUSP-SP: Faster Utility Mining on Sequence Data

Chunkai Zhang, Yuting Yang, Zilin Du, Wensheng Gan, Philip S. Yu

Categories: Artificial Intelligence

2022-12-29

High-utility sequential pattern mining (HUSPM) has emerged as an important topic due to its wide application and considerable popularity. However, because of the combinatorial explosion of the search space when the utility threshold is low or the data is large-scale, solving the HUSPM problem can be time-consuming and memory-intensive. Several algorithms have been proposed to address this problem, but they are still costly in terms of running time and memory usage. In this paper, to solve this problem more efficiently, we design a compact structure called sequence projection (seqPro) and propose an efficient algorithm, namely discovering high-utility sequential patterns with the seqPro structure (HUSP-SP). HUSP-SP utilizes the compact seq-array to store the necessary information in a sequence database. The seqPro structure is designed to efficiently calculate candidate patterns' utilities and upper bound values. Furthermore, a new upper bound on utility, namely tighter reduced sequence utility (TRSU), and two search space pruning strategies are utilized to improve the mining performance of HUSP-SP. Experimental results on both synthetic and real-life datasets show that HUSP-SP can significantly outperform the state-of-the-art algorithms in terms of running time, memory usage, search space pruning efficiency, and scalability.


PersonaSAGE: A Multi-Persona Graph Neural Network

Gautam Choudhary, Iftikhar Ahamath Burhanuddin, Eunyee Koh, Fan Du, Ryan A. Rossi

Categories: Machine Learning

2022-12-28

Graph Neural Networks (GNNs) have become increasingly important in recent years due to their state-of-the-art performance on many important downstream applications. Existing GNNs have mostly focused on learning a single node representation, even though a node often exhibits polysemous behavior in different contexts. In this work, we develop a persona-based graph neural network framework called PersonaSAGE that learns multiple persona-based embeddings for each node in the graph. Such disentangled representations are more interpretable and useful than a single embedding. Furthermore, PersonaSAGE learns the appropriate set of persona embeddings for each node in the graph, and every node can have a different number of assigned persona embeddings. The framework is flexible, and its general design makes the learned embeddings widely applicable across domains. We use publicly available benchmark datasets to evaluate our approach against a variety of baselines. The experiments demonstrate the effectiveness of PersonaSAGE for a variety of important tasks, including link prediction, where we achieve an average gain of 15% while remaining competitive for node classification. Finally, we also demonstrate the utility of PersonaSAGE with a case study for personalized recommendation of different entity types in a data management platform.


NEEDED: Introducing Hierarchical Transformer to Eye Diseases Diagnosis

Xu Ye, Meng Xiao, Zhiyuan Ning, Weiwei Dai, Wenjuan Cui, Yi Du, Yuanchun Zhou

Categories: Natural Language Processing

2022-12-27

With the development of natural language processing (NLP) techniques, automatic diagnosis of eye diseases using ophthalmology electronic medical records (OEMR) has become possible. It aims to evaluate the condition of each of a patient's eyes separately, and we formulate it as a particular multi-label classification task in this paper. Although there are a few related studies on other diseases, automatic diagnosis of eye diseases exhibits unique characteristics. First, descriptions of both eyes are mixed up in OEMR documents, with both free text and templated asymptomatic descriptions, resulting in sparse and cluttered information. Second, OEMR documents contain multiple parts of descriptions and are long. Third, it is critical to provide explainability for the disease diagnosis model. To overcome these challenges, we present an effective automatic eye disease diagnosis framework, NEEDED. In this framework, a preprocessing module is integrated to improve the density and quality of information. Then, we design a hierarchical transformer structure for learning the contextualized representations of each sentence in the OEMR document. For the diagnosis part, we propose an attention-based predictor that enables traceable diagnosis by obtaining disease-specific information. Experiments on a real dataset and comparisons with several baseline models show the advantage and explainability of our framework.
