Smart Paper Notes

Transient motion classification through turbid volumes via parallelized single-photon detection and deep contrastive embedding

Shiqi Xu , Wenhui Liu , Xi Yang , Joakim Jönsson , Ruobing Qian , Paul McKee , Kanghyun Kim , Pavan Chandra Konda , Kevin C. Zhou , Lucas Kreiß

Category: Computer Vision

2022-04-04

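Fast, noninvasive probing of spatially varying decorrelating events, such as cerebral blood flow beneath the human skull, is an essential task in various scientific and clinical settings. One of the primary optical techniques used is diffuse correlation spectroscopy (DCS), whose classical implementation uses a single or a few single-photon detectors, resulting in poor spatial localization accuracy and relatively low temporal resolution. Here, we propose a technique termed Classifying Rapid decorrelation Events via Parallelized single photon dEtection (CREPE), a new form of DCS that can probe and classify different decorrelating movements hidden underneath a turbid volume with high sensitivity, using parallelized speckle detection from a $32\times32$-pixel SPAD array. We evaluate our setup by classifying different spatiotemporal decorrelation patterns hidden beneath a 5 mm tissue-like phantom made with rapidly decorrelating dynamic scattering media. Twelve multi-mode fibers are used to collect scattered light from different positions on the surface of the tissue phantom. To validate our setup, we use both a digital micromirror device (DMD) modulated at multi-kilohertz rates and a vessel phantom containing flowing fluid. Together with a deep contrastive learning algorithm that outperforms classic unsupervised learning methods, we demonstrate that our approach can accurately detect and classify different transient decorrelation events (occurring within 0.1-0.4 s) underneath turbid scattering media, without any data labeling. This has the potential to be applied to noninvasive monitoring of deep-tissue motion patterns, for example identifying normal or abnormal cerebral blood flow events at multi-hertz rates within a compact and static detection probe.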
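The quantity at the heart of DCS is the normalized intensity autocorrelation, $g_2(\tau) = \langle I(t)\,I(t+\tau)\rangle / \langle I\rangle^2$; faster motion in the medium makes this curve decay sooner. The following is a minimal sketch of that estimate, not the authors' pipeline: the function names `intensity_autocorrelation` and `average_over_pixels` are hypothetical, and the per-pixel averaging is only an assumption about how a parallelized detector array might reduce noise.

```python
from typing import List, Sequence


def intensity_autocorrelation(intensity: Sequence[float], max_lag: int) -> List[float]:
    """Estimate g2(tau) = <I(t) I(t + tau)> / <I>^2 for tau = 0..max_lag (in samples)."""
    n = len(intensity)
    if max_lag < 0 or max_lag >= n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(intensity)")
    mean = sum(intensity) / n
    if mean == 0:
        raise ValueError("mean intensity is zero; g2 is undefined")
    g2 = []
    for lag in range(max_lag + 1):
        pairs = n - lag  # number of (t, t + lag) pairs available at this lag
        product_mean = sum(intensity[t] * intensity[t + lag] for t in range(pairs)) / pairs
        g2.append(product_mean / (mean * mean))
    return g2


def average_over_pixels(traces: Sequence[Sequence[float]], max_lag: int) -> List[float]:
    """Average the per-pixel g2 curves (hypothetical noise-reduction step)."""
    curves = [intensity_autocorrelation(trace, max_lag) for trace in traces]
    return [sum(curve[lag] for curve in curves) / len(curves) for lag in range(max_lag + 1)]
```

A constant trace gives $g_2 = 1$ at every lag, which is the fully decorrelated baseline; values above 1 at short lags indicate correlated speckle fluctuations.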

translated by Google Translate

Learning Oriented Remote Sensing Object Detection via Naive Geometric Computing

Yanjie Wang , Xu Zou , Zhijun Zhang , Wenhui Xu , Liqun Chen , Sheng Zhong , Luxin Yan , Guodong Wang

Category: Computer Vision

2021-12-01

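Detecting oriented objects and estimating their rotation information is a crucial step in analyzing remote sensing images. Although many methods have been proposed recently, most of them directly learn to predict object orientation under the supervision of only one (e.g., the rotation angle) or a few (e.g., several coordinates) ground-truth values individually. Oriented object detection becomes more accurate when extra constraints on proposal and rotation-information regression are adopted during training. To this end, we propose a mechanism that simultaneously learns the horizontal proposals, oriented proposals, and rotation angles of objects in a consistent manner via naive geometric computing, as one additional steady constraint (see Figure 1). An oriented-center-prior-guided label assignment strategy is proposed to further improve the quality of proposals, yielding better performance. Extensive experiments show that models equipped with our idea outperform the baseline by a large margin and achieve new state-of-the-art results, without any extra computational burden during inference. Our proposed idea is simple, intuitive, and readily implemented. Source code and trained models are included in the supplementary files.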
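One elementary geometric relation that ties a horizontal box to an oriented box and its rotation angle is the axis-aligned box that encloses a rotated rectangle. The sketch below illustrates only that relation; it is not the paper's formulation, and the function name `enclosing_horizontal_box` and the (center, width, height, angle) parameterization are assumptions made for illustration.

```python
import math
from typing import Tuple


def enclosing_horizontal_box(
    cx: float, cy: float, w: float, h: float, theta: float
) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) of the axis-aligned box enclosing
    a w-by-h rectangle centered at (cx, cy) and rotated by theta radians."""
    # Project the rotated rectangle's half-extents onto the x and y axes.
    half_w = (w * abs(math.cos(theta)) + h * abs(math.sin(theta))) / 2
    half_h = (w * abs(math.sin(theta)) + h * abs(math.cos(theta))) / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)
```

With `theta = 0` the result is the rectangle itself, and with `theta = math.pi / 2` the width and height swap, which makes the relation easy to sanity-check.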

Imaging dynamics beneath turbid media via parallelized single-photon detection

Shiqi Xu , Xi Yang , Wenhui Liu , Joakim Jonsson , Ruobing Qian , Pavan Chandra Konda , Kevin C. Zhou , Lucas Kreiss , Qionghai Dai , Haoqian Wang

Category: Computer Vision

2021-07-03

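Noninvasive optical imaging through dynamic scattering media has many important biomedical applications but remains a challenging task. While standard diffuse imaging methods measure optical absorption or fluorescent emission, it is also well established that the temporal correlation of scattered coherent light diffuses through tissue much like optical intensity. To date, however, few works have aimed to experimentally measure and process such temporal correlation data to demonstrate deep-tissue video reconstruction of decorrelation dynamics. In this work, we use a single-photon avalanche diode (SPAD) array camera to simultaneously monitor the temporal dynamics of speckle fluctuations at the single-photon level from 12 different locations on the phantom tissue, delivered via a customized fiber bundle array. We then apply a deep neural network to convert the acquired single-photon measurements into video of scattering dynamics beneath rapidly decorrelating tissue phantoms. We demonstrate the ability to reconstruct images of transient (0.1-0.4 s) dynamic events occurring beneath a decorrelating tissue phantom with millimeter-scale resolution, and highlight how our model can flexibly extend to monitoring flow speed within buried phantom vessels.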

Goal-guided Transformer-enabled Reinforcement Learning for Efficient Autonomous Navigation

Wenhui Huang , Yanxin Zhou , Xiangkun He , Chen Lv

Category: Robotics | Artificial Intelligence | Machine Learning

2023-01-01

Despite some successful applications of goal-driven navigation, existing deep reinforcement learning (DRL)-based approaches notoriously suffer from poor data efficiency. One of the reasons is that the goal information is decoupled from the perception module and directly introduced as a condition of decision-making, so that goal-irrelevant features of the scene representation play an adversarial role during the learning process. In light of this, we present a novel Goal-guided Transformer-enabled reinforcement learning (GTRL) approach that takes the physical goal states as an input to the scene encoder, guiding the scene representation to couple with the goal information and realizing efficient autonomous navigation. More specifically, we propose a novel variant of the Vision Transformer as the backbone of the perception system, namely the Goal-guided Transformer (GoT), and pre-train it with expert priors to boost data efficiency. Subsequently, a reinforcement learning algorithm is instantiated for the decision-making system, taking the goal-oriented scene representation from the GoT as input and generating decision commands. As a result, our approach motivates the scene representation to concentrate mainly on goal-relevant features, which substantially enhances the data efficiency of the DRL learning process and leads to superior navigation performance. Both simulation and real-world experimental results demonstrate the superiority of our approach in terms of data efficiency, performance, robustness, and sim-to-real generalization, compared with other state-of-the-art baselines. Demonstration videos are available at https://youtu.be/93LGlGvaN0c.

Toward Improved Generalization: Meta Transfer of Self-supervised Knowledge on Graphs

Wenhui Cui , Haleh Akrami , Anand A. Joshi , Richard M. Leahy

Category: Machine Learning

2022-12-16

Despite the remarkable success achieved by graph convolutional networks for functional brain activity analysis, the heterogeneity of functional patterns and the scarcity of imaging data still pose challenges in many tasks. Transferring knowledge from a source domain with abundant training data to a target domain is effective for improving representation learning on scarce training data. However, traditional transfer learning methods often fail to generalize the pre-trained knowledge to the target task due to domain discrepancy. Self-supervised learning on graphs can increase the generalizability of graph features since self-supervision concentrates on inherent graph properties that are not limited to a particular supervised task. We propose a novel knowledge transfer strategy by integrating meta-learning with self-supervised learning to deal with the heterogeneity and scarcity of fMRI data. Specifically, we perform a self-supervised task on the source domain and apply meta-learning, which strongly improves the generalizability of the model using the bi-level optimization, to transfer the self-supervised knowledge to the target domain. Through experiments on a neurological disorder classification task, we demonstrate that the proposed strategy significantly improves target task performance by increasing the generalizability and transferability of graph-based knowledge.

BARS: A Benchmark for Airport Runway Segmentation

Wenhui Chen , Zhijiang Zhang , Liang Yu , Yichun Tai

Category: Computer Vision | Artificial Intelligence

2022-10-24

Airport runway segmentation can effectively reduce the accident rate during the landing phase, which has the largest risk of flight accidents. With the rapid development of deep learning, related methods have good performance on segmentation tasks and can be well adapted to complex scenes. However, the lack of large-scale, publicly available datasets in this field makes the development of methods based on deep learning difficult. Therefore, we propose a Benchmark for Airport Runway Segmentation, named BARS. Meanwhile, a semi-automatic annotation pipeline is designed to reduce the workload of annotation. BARS has the largest dataset with the richest categories and the only instance annotation in the field. The dataset, which is collected using the X-Plane simulation platform, contains 10,002 images and 29,347 instances with three categories. We evaluate eight representative instance segmentation methods on BARS and analyze their performance. Based on the characteristic of the airport runway with a regular shape, we propose a plug-and-play smoothing post-processing module (SPPM) and a contour point constraint loss (CPCL) function to smooth segmentation results for mask-based and contour-based methods, respectively. Furthermore, a novel evaluation metric named average smoothness (AS) is developed to measure smoothness. The experiments show that existing instance segmentation methods can achieve prediction results with good performance on BARS. SPPM and CPCL can improve the average accuracy by 0.9% and 1.13%, respectively. And the average smoothness enhancements for SPPM and CPCL are more than 50% and 28%, respectively.

Anisotropic Multi-Scale Graph Convolutional Network for Dense Shape Correspondence

Mohammad Farazi , Wenhui Zhu , Zhangsihao Yang , Yalin Wang

Category: Computer Vision | Machine Learning

2022-10-17

This paper studies 3D dense shape correspondence, a key shape analysis application in computer vision and graphics. We introduce a novel hybrid geometric deep learning-based model that learns geometrically meaningful and discretization-independent features with a U-Net model as the primary node feature extraction module, followed by a successive spectral-based graph convolutional network. To create a diverse set of filters, we use anisotropic wavelet basis filters, being sensitive to both different directions and band-passes. This filter set overcomes the over-smoothing behavior of conventional graph neural networks. To further improve the model's performance, we add a function that perturbs the feature maps in the last layer ahead of fully connected layers, forcing the network to learn more discriminative features overall. The resulting correspondence maps show state-of-the-art performance on the benchmark datasets based on average geodesic errors and superior robustness to discretization in 3D meshes. Our approach provides new insights and practical solutions to the dense shape correspondence research.

Learning to Weight Samples for Dynamic Early-exiting Networks

Yizeng Han , Yifan Pu , Zihang Lai , Chaofei Wang , Shiji Song , Junfen Cao , Wenhui Huang , Chao Deng , Gao Huang

Category: Computer Vision

2022-09-17

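Early exiting is an effective paradigm for improving the inference efficiency of deep networks. By constructing classifiers with varying resource demands (the exits), such networks can output easy samples at early exits, removing the need to execute deeper layers. While existing works mainly focus on the architectural design of multi-exit networks, the training strategies for such models remain largely unexplored. Current state-of-the-art models treat all samples the same during training. However, the early-exiting behavior during testing is ignored, leading to a gap between training and testing. In this paper, we propose to bridge this gap by sample weighting. Intuitively, easy samples, which generally exit early in the network during inference, should contribute more to training the early classifiers. The late classifiers, in contrast, should emphasize the training of hard samples, which mostly exit from deeper layers. Our work proposes a weight prediction network that weights the loss of different training samples at each exit. This weight prediction network and the backbone model are jointly optimized under a meta-learning framework with a novel optimization objective. By bringing the adaptive behavior during inference into the training phase, we show that the proposed weighting mechanism consistently improves the trade-off between classification accuracy and inference efficiency. Code is available at https://github.com/leaplabthu/l2w-den.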
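The early-exiting behavior described above can be sketched with a confidence threshold, which is one common exit criterion and an assumption here rather than something the abstract specifies. The function names `softmax` and `early_exit_predict` are hypothetical, and the logits of all exits are precomputed only to keep the example short; a real multi-exit network would stop computing once a sample exits.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one vector of class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    exit_logits: Sequence[Sequence[float]], threshold: float
) -> Tuple[int, int]:
    """Return (exit_index, predicted_class) for a single sample.

    The sample leaves at the first exit whose top softmax probability
    reaches the threshold; the last exit always produces an answer.
    """
    if not exit_logits:
        raise ValueError("at least one exit is required")
    last = len(exit_logits) - 1
    for index, logits in enumerate(exit_logits):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= threshold or index == last:
            return index, probs.index(confidence)
    raise AssertionError("unreachable: the last exit always returns")
```

Under this rule, an easy sample (confident at the first exit) never reaches the later classifiers, which is the train-test mismatch the sample-weighting scheme is meant to address.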

Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks

Wenhui Wang , Hangbo Bao , Li Dong , Johan Bjorck , Zhiliang Peng , Qiang Liu , Kriti Aggarwal , Owais Khan Mohammed , Saksham Singhal , Subhojit Som

Category: Computer Vision | Natural Language Processing

2022-08-22

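A big convergence of language, vision, and multimodal pretraining is emerging. In this work, we introduce BEiT-3, a general-purpose multimodal foundation model that achieves state-of-the-art transfer performance on both vision and vision-language tasks. Specifically, we advance the big convergence from three aspects: backbone architecture, pretraining task, and model scaling. We introduce Multiway Transformers for general-purpose modeling, where the modular architecture enables both deep fusion and modality-specific encoding. Based on the shared backbone, we perform masked "language" modeling on images (Imglish), texts (English), and image-text pairs ("parallel sentences") in a unified manner. Experimental results show that BEiT-3 obtains state-of-the-art performance on object detection (COCO), semantic segmentation (ADE20K), image classification (ImageNet), visual reasoning (NLVR2), visual question answering (VQAv2), image captioning (COCO), and cross-modal retrieval (Flickr30K, COCO).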

Contrastive Semi-supervised Learning for Domain Adaptive Segmentation Across Similar Anatomical Structures

Ran Gu , Jingyang Zhang , Guotai Wang , Wenhui Lei , Tao Song , Xiaofan Zhang , Kang Li , Shaoting Zhang

Category: Computer Vision

2022-08-18

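Convolutional neural networks (CNNs) have achieved state-of-the-art performance in medical image segmentation, but they require large amounts of manual annotation for training. Semi-supervised learning (SSL) methods are promising for reducing the annotation requirement, but their performance is still limited when the dataset size and the number of annotated images are small. Leveraging existing annotated datasets with similar anatomical structures to assist training has the potential to improve the model's performance. However, this is further challenged by the cross-anatomy domain shift, caused by the different appearance, and even different imaging modalities, relative to the target structure. To solve this problem, we propose Contrastive Semi-supervised learning for Cross Anatomy Domain Adaptation (CS-CADA), which adapts a model to segment similar structures in a target domain and requires only limited annotations in the target domain by leveraging a set of existing annotated images of similar structures in a source domain. We use Domain-Specific Batch Normalization (DSBN) to normalize the feature maps of the two anatomical domains individually, and propose a cross-domain contrastive learning strategy to encourage the extraction of domain-invariant features. These are integrated into a Self-Ensembling Mean-Teacher (SE-MT) framework to exploit unlabeled target-domain images with a prediction consistency constraint. Extensive experiments show that our CS-CADA is able to solve the challenging cross-anatomy domain shift problem, accurately segmenting coronary arteries in X-ray images with the help of retinal vessel images, and cardiac MR images with the help of fundus images, respectively, given only a small number of annotations in the target domain.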
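The idea behind domain-specific normalization can be illustrated on scalar features: each domain is standardized with its own mean and variance instead of statistics pooled across domains. This is a minimal sketch under that simplification, not the CS-CADA implementation; the function name `domain_specific_normalize` is hypothetical, and a real DSBN layer would also keep separate learnable scale/shift parameters and running statistics per domain.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def domain_specific_normalize(
    features: Sequence[float], domains: Sequence[Hashable], eps: float = 1e-5
) -> List[float]:
    """Standardize each feature using the mean and variance of its own domain."""
    if len(features) != len(domains):
        raise ValueError("features and domains must have the same length")
    # Group feature values by domain label.
    groups: Dict[Hashable, List[float]] = defaultdict(list)
    for value, domain in zip(features, domains):
        groups[domain].append(value)
    # Compute per-domain mean and (biased) variance, as batch norm does.
    stats: Dict[Hashable, Tuple[float, float]] = {}
    for domain, values in groups.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        stats[domain] = (mean, var)
    return [
        (value - stats[domain][0]) / math.sqrt(stats[domain][1] + eps)
        for value, domain in zip(features, domains)
    ]
```

Two domains with very different intensity ranges end up on the same scale after this step, which is why separate statistics help when source and target images come from different anatomies or modalities.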
