Smart Paper Notes

NeuralMarker: A Framework for Learning General Marker Correspondence

Zhaoyang Huang , Xiaokun Pan , Weihong Pan , Weikang Bian , Yan Xu , Ka Chun Cheung , Guofeng Zhang , Hongsheng Li

Category: Computer Vision

2022-09-19

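We tackle the problem of estimating correspondences from a general marker, such as a movie poster, to an image that captures that marker. Conventionally, this problem is addressed by fitting a homography model based on sparse feature matching. However, such methods can only handle plane-like markers, and sparse features do not make full use of appearance information. In this paper, we propose a novel framework, NeuralMarker, which trains a neural network to estimate dense marker correspondences under various challenging conditions, such as marker deformation and harsh lighting. In addition, we propose a novel marker correspondence evaluation method based on annotations of real marker-image pairs and create a new benchmark. We show that NeuralMarker significantly outperforms previous methods and enables interesting new applications, including Augmented Reality (AR) and video editing.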
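The conventional baseline named in this abstract maps every marker point through a single 3x3 homography, which is why it is limited to plane-like markers. The following is a minimal sketch of applying such a matrix to one point. It is not NeuralMarker itself, and the name `apply_homography` and the example matrix are illustrative assumptions.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]


def apply_homography(h: Matrix3, point: Tuple[float, float]) -> Tuple[float, float]:
    """Map a 2D marker point into the image with a 3x3 homography.

    The point is lifted to homogeneous coordinates (x, y, 1), multiplied
    by the matrix, and divided by the resulting third coordinate.
    """
    x, y = point
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (xh / w, yh / w)


# Hypothetical example matrix: a pure translation by (5, -2).
translation = [[1.0, 0.0, 5.0], [0.0, 1.0, -2.0], [0.0, 0.0, 1.0]]
mapped = apply_homography(translation, (1.0, 1.0))
```

Because one matrix governs all points, a bent or wrinkled marker cannot be described this way, which is the gap that dense correspondence estimation is meant to close.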

translated by Google Translate

Learning Degradation Representations for Image Deblurring

Dasong Li , Yi Zhang , Ka Chun Cheung , Xiaogang Wang , Hongwei Qin , Hongsheng Li

Category: Computer Vision

2022-08-10

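In various learning-based image restoration tasks, such as image denoising and image super-resolution, degradation representations are widely used to model the degradation process and to handle complicated degradation patterns. However, they are less explored in learning-based image deblurring, because blur kernel estimation does not perform well in challenging real-world cases. We argue that modeling degradation representations is particularly necessary for image deblurring, since blur patterns typically show much larger variations than noise patterns or high-frequency textures. In this paper, we propose a framework to learn spatially adaptive degradation representations of blurry images. A novel joint image reblurring and deblurring learning process is presented to improve the expressiveness of the degradation representations. To make the learned degradation representations effective in both reblurring and deblurring, we propose a Multi-Scale Degradation Injection Network (MSDI-Net) that integrates them into the neural networks. With this integration, MSDI-Net can handle varied and complicated blur patterns adaptively. Experiments on the GoPro and RealBlur datasets demonstrate that our deblurring framework with learned degradation representations outperforms state-of-the-art methods with appealing improvements. The code is released at https://github.com/dasongli1/learning_degradation.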

MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection

Xuesong Chen , Shaoshuai Shi , Benjin Zhu , Ka Chun Cheung , Hang Xu , Hongsheng Li

Category: Computer Vision

2022-05-12

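Accurate and reliable 3D detection is vital for many applications, including autonomous vehicles and service robots. In this paper, we present a flexible and high-performance 3D detection framework, named MPPNet, for 3D temporal object detection with point cloud sequences. We propose a novel three-hierarchy framework that uses proxy points for multi-frame feature encoding and interaction to achieve better detection. The three hierarchies perform per-frame feature encoding, short-clip feature fusion, and whole-sequence feature aggregation, respectively. To process long point cloud sequences with reasonable computational resources, intra-group feature mixing and inter-group feature attention are proposed to form the second and third feature encoding hierarchies, which are applied recurrently to aggregate multi-frame trajectory features. The proxy points not only act as consistent object representations for each frame, but also serve as couriers that facilitate feature interaction between frames. Experiments on the large Waymo Open Dataset show that our approach outperforms state-of-the-art methods by large margins when applied to both short (e.g., 4-frame) and long (e.g., 16-frame) point cloud sequences. Code is available at https://github.com/open-mmlab/openpcdet.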

FlowFormer: A Transformer Architecture for Optical Flow

Zhaoyang Huang , Xiaoyu Shi , Chao Zhang , Qiang Wang , Ka Chun Cheung , Hongwei Qin , Jifeng Dai , Hongsheng Li

Category: Computer Vision

2022-03-30

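We introduce the optical Flow transFormer, dubbed FlowFormer, a transformer-based neural network architecture for learning optical flow. FlowFormer tokenizes the 4D cost volume built from an image pair, encodes the cost tokens into a cost memory with alternate-group transformer (AGT) layers in a novel latent space, and decodes the cost memory via a recurrent transformer decoder with dynamic positional cost queries. On the Sintel benchmark, FlowFormer achieves 1.144 and 2.183 average end-point-error (AEPE) on the clean and final passes, a 17.6% and 11.6% error reduction from the best published results (1.388 and 2.47). FlowFormer also achieves strong generalization performance. Without being trained on Sintel, FlowFormer achieves 0.95 AEPE on the clean pass of the Sintel training set, outperforming the best published result (1.29) by 26.9%.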
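The metric in this abstract, average end-point-error (AEPE), is the mean Euclidean distance between predicted and ground-truth flow vectors. The sketch below shows that computation together with the relative error reduction used to compare against a baseline. The function names and the toy flow vectors are illustrative assumptions, and only the benchmark figures quoted in the abstract are taken from the source.

```python
import math
from typing import Sequence, Tuple

Flow = Tuple[float, float]  # (u, v) displacement of one pixel


def average_end_point_error(pred: Sequence[Flow], gt: Sequence[Flow]) -> float:
    """Mean Euclidean distance between predicted and ground-truth flow vectors."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    total = sum(math.hypot(pu - gu, pv - gv) for (pu, pv), (gu, gv) in zip(pred, gt))
    return total / len(pred)


def relative_reduction(baseline: float, new: float) -> float:
    """Fraction by which `new` lowers the error relative to `baseline`."""
    return (baseline - new) / baseline


# Toy example (assumed values): one pixel is off by (3, 4), the other is exact.
toy_aepe = average_end_point_error([(3.0, 4.0), (0.0, 0.0)], [(0.0, 0.0), (0.0, 0.0)])

# Sintel clean and final pass figures quoted in the abstract.
clean_gain = relative_reduction(1.388, 1.144)
final_gain = relative_reduction(2.47, 2.183)
```

With the quoted figures, the two reductions round to 17.6% and 11.6%, matching the abstract.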

Self-distillation with Batch Knowledge Ensembling Improves ImageNet Classification

Yixiao Ge , Xiao Zhang , Ching Lam Choi , Ka Chun Cheung , Peipei Zhao , Feng Zhu , Xiaogang Wang , Rui Zhao , Hongsheng Li

Category: Computer Vision

2021-04-27

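Recent studies of knowledge distillation have found that ensembling the "dark knowledge" from multiple teachers or students helps create better soft targets for training, but at the cost of significantly more computation and/or parameters. In this work, we present BAtch Knowledge Ensembling (BAKE), which produces refined soft targets for anchor images by propagating and ensembling the knowledge of the other samples in the same mini-batch. Specifically, for each sample of interest, the propagation of knowledge is weighted according to the inter-sample affinities, which are estimated on the fly with the current network. The propagated knowledge can then be ensembled to form a better soft target for distillation. In this way, our BAKE framework achieves online knowledge ensembling across multiple samples with only a single network. It requires minimal computational and memory overhead compared with existing knowledge ensembling methods. Extensive experiments demonstrate that the lightweight yet effective BAKE consistently boosts the classification performance of various architectures on multiple datasets, e.g., a significant +0.7% gain for Swin-T on ImageNet with only +1.5% computational overhead and zero additional parameters. BAKE not only improves the vanilla baselines, but also surpasses the single-network state of the art on all the benchmarks.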
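The propagate-and-ensemble idea in this abstract can be illustrated with a simplified single propagation step: each anchor's soft target mixes its own prediction with an affinity-weighted average of the predictions of the other samples in the batch. This is a sketch under assumptions, not the authors' exact formulation. The dot-product affinity, the mixing weight `omega`, the `temperature`, and all function names are hypothetical choices.

```python
import math
from typing import List

Vector = List[float]


def softmax(values: Vector) -> Vector:
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_targets(features: List[Vector], probs: List[Vector],
                          omega: float = 0.5, temperature: float = 1.0) -> List[Vector]:
    """One propagation step over a batch of at least two samples.

    features: per-sample embeddings used to estimate affinities (assumed).
    probs:    per-sample class probabilities from the same network.
    """
    n = len(features)
    if n < 2 or len(probs) != n:
        raise ValueError("need at least two samples with matching probs")
    num_classes = len(probs[0])
    targets = []
    for i in range(n):
        others = [j for j in range(n) if j != i]  # exclude the anchor itself
        sims = [sum(a * b for a, b in zip(features[i], features[j])) / temperature
                for j in others]
        weights = softmax(sims)  # affinities over the other samples sum to 1
        propagated = [sum(w * probs[j][c] for w, j in zip(weights, others))
                      for c in range(num_classes)]
        targets.append([(1 - omega) * probs[i][c] + omega * propagated[c]
                        for c in range(num_classes)])
    return targets


# Toy batch of two samples (assumed values).
soft = ensemble_soft_targets([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]])
```

Each refined target stays a valid probability distribution because it is a convex combination of distributions, so it can be used directly as a distillation target.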

ICME 2022 Few-shot LOGO detection top 9 solution

Ka Ho Tong , Ka Wai Cheung , Xiaochuan Yu

Category: Computer Vision | Artificial Intelligence

2022-06-23

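The ICME-2022 few-shot logo detection competition was held in May 2022. Participants were required to develop a single model that detects logos while handling tiny logo instances, similar brands, and adversarial images at the same time, with limited annotations. Our team ranked 16th and 11th in the first and second rounds of the competition, respectively, and 9th in the final ranking. This technical report summarizes the main techniques we used in the competition, along with potential improvements.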

A dynamic programming algorithm for informative measurements and near-optimal path-planning

Peter N. Loxley , Ka Wai Cheung

Category: Machine Learning | Artificial Intelligence | Robotics

2021-09-24

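An informative measurement is the most efficient way to gain information about an unknown state. We give a first-principles derivation of a general-purpose dynamic programming algorithm that returns a sequence of informative measurements by sequentially maximizing the entropy of possible measurement outcomes. The algorithm can be used by an autonomous agent or robot to decide where best to measure next, planning a path that corresponds to an optimal sequence of informative measurements. It applies to states and controls that are continuous or discrete, and to agent dynamics that are stochastic or deterministic, including Markov decision processes. Recent results from approximate dynamic programming and reinforcement learning, including online approximations such as rollout and Monte Carlo tree search, allow an agent or robot to solve the measurement task in real time. The resulting near-optimal solutions include non-myopic paths and measurement sequences that can generally outperform, sometimes substantially, commonly used greedy heuristics such as maximizing the entropy of each measurement outcome. This is demonstrated for a global search problem, where online planning with an extended local search is found to reduce the number of measurements in the search.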
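The greedy heuristic that this abstract uses as its point of comparison, choosing the measurement whose outcome has maximum entropy, can be sketched as follows. This is the myopic baseline only, not the dynamic programming algorithm of the paper. It assumes a discrete belief over states and deterministic measurements, and the names `outcome_entropy` and `most_informative` are hypothetical.

```python
import math
from collections import defaultdict
from typing import Callable, Dict, Hashable


def outcome_entropy(belief: Dict[Hashable, float],
                    measure: Callable[[Hashable], Hashable]) -> float:
    """Shannon entropy (in bits) of a measurement's outcome distribution.

    belief maps each candidate state to its probability; measure maps a
    state to the outcome that would be observed (assumed deterministic).
    """
    outcome_prob: Dict[Hashable, float] = defaultdict(float)
    for state, p in belief.items():
        outcome_prob[measure(state)] += p
    return -sum(p * math.log2(p) for p in outcome_prob.values() if p > 0)


def most_informative(belief: Dict[Hashable, float],
                     measurements: Dict[str, Callable[[Hashable], Hashable]]) -> str:
    """Greedy choice: the measurement with the highest outcome entropy."""
    return max(measurements, key=lambda name: outcome_entropy(belief, measurements[name]))


# Toy search problem (assumed): a target hides in one of four cells.
belief = {cell: 0.25 for cell in range(4)}
measurements = {
    "is_cell_zero": lambda cell: cell == 0,  # splits the belief 1 vs 3
    "lower_half": lambda cell: cell < 2,     # splits the belief 2 vs 2
}
choice = most_informative(belief, measurements)
```

In this toy case the even split carries one full bit per measurement, so the greedy rule picks `lower_half`. A non-myopic planner of the kind the abstract describes would additionally account for how the choice affects later measurements.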

Function Approximation for Solving Stackelberg Equilibrium in Large Perfect Information Games

Chun Kai Ling , J. Zico Kolter , Fei Fang

Category: Artificial Intelligence

2022-12-29

Function approximation (FA) has been a critical component in solving large zero-sum games. Yet little attention has been paid to FA in solving general-sum extensive-form games, even though they are widely regarded as computationally more challenging than their fully competitive or cooperative counterparts. A key challenge is that for many equilibria in general-sum games, no simple analogue exists to the state value function used in Markov Decision Processes and zero-sum games. In this paper, we propose learning the Enforceable Payoff Frontier (EPF) -- a generalization of the state value function for general-sum games. We approximate the optimal Stackelberg extensive-form correlated equilibrium by representing EPFs with neural networks and training them with appropriate backup operations and loss functions. This is the first method that applies FA to the Stackelberg setting, allowing us to scale to much larger games while still enjoying performance guarantees based on FA error. Additionally, our proposed method guarantees incentive compatibility and is easy to evaluate without depending on self-play or approximate best-response oracles.

Safe Subgame Resolving for Extensive Form Correlated Equilibrium

Chun Kai Ling , Fei Fang

Category: Artificial Intelligence

2022-12-29

Correlated Equilibrium is a solution concept that is more general than Nash Equilibrium (NE) and can lead to outcomes with better social welfare. However, its natural extension to the sequential setting, the Extensive Form Correlated Equilibrium (EFCE), requires a quadratic amount of space to solve, even in restricted settings without randomness in nature. To alleviate these concerns, we apply subgame resolving, a technique that has been extremely successful in finding NE in zero-sum games, to solving general-sum EFCEs. Subgame resolving refines a correlation plan in an online manner: instead of solving for the full game upfront, it only solves for strategies in subgames that are reached in actual play, resulting in significant computational gains. In this paper, we (i) lay out the foundations to quantify the quality of a refined strategy, in terms of the social welfare and exploitability of correlation plans, (ii) show that EFCEs possess a sufficient amount of independence between subgames to perform resolving efficiently, and (iii) provide two algorithms for resolving, one using linear programming and the other based on regret minimization. Both methods guarantee safety, i.e., they will never be counterproductive. This is the first time an online method has been applied to the correlated, general-sum setting.

WL-Align: Weisfeiler-Lehman Relabeling for Aligning Users across Networks via Regularized Representation Learning

Li Liu , Penggang Chen , Xin Li , William K. Cheung , Youmin Zhang , Qun Liu , Guoyin Wang

Category: Artificial Intelligence | Machine Learning

2022-12-29

Aligning users across networks using graph representation learning has been found effective where the alignment is accomplished in a low-dimensional embedding space. Yet achieving highly precise alignment is still challenging, especially when nodes with long-range connectivity to the labeled anchors are encountered. To alleviate this limitation, we designed WL-Align, which adopts a regularized representation learning framework to learn distinctive node representations. It extends the Weisfeiler-Lehman Isomorphism Test and learns the alignment in alternating phases of "across-network Weisfeiler-Lehman relabeling" and "proximity-preserving representation learning". The across-network Weisfeiler-Lehman relabeling is achieved by iterating anchor-based label propagation and similarity-based hashing to exploit the known anchors' connectivity to different nodes in an efficient and robust manner. The representation learning module preserves the second-order proximity within individual networks and is regularized by the across-network Weisfeiler-Lehman hash labels. Extensive experiments on real-world and synthetic datasets demonstrate that WL-Align outperforms state-of-the-art methods, achieving significant performance improvements in the "exact matching" scenario. Data and code of WL-Align are available at https://github.com/ChenPengGang/WLAlignCode.
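For readers unfamiliar with the relabeling that WL-Align builds on, the sketch below shows the classic single-graph Weisfeiler-Lehman refinement step: each node's new label is derived from its own label plus the sorted multiset of its neighbours' labels. This is the standard textbook procedure, not the across-network, anchor-based variant of the paper, and the function name `wl_refine` and the toy graph are illustrative assumptions.

```python
from typing import Dict, Hashable, List


def wl_refine(adjacency: Dict[Hashable, List[Hashable]],
              labels: Dict[Hashable, int],
              rounds: int = 1) -> Dict[Hashable, int]:
    """Run `rounds` iterations of Weisfeiler-Lehman label refinement.

    adjacency maps each node to its list of neighbours; labels holds the
    initial integer label of every node.
    """
    current = dict(labels)
    for _ in range(rounds):
        # Signature = own label plus the sorted multiset of neighbour labels.
        signatures = {
            node: (current[node], tuple(sorted(current[nb] for nb in adjacency[node])))
            for node in adjacency
        }
        # Compress each distinct signature to a small integer, in sorted
        # order so the result is deterministic.
        palette = {sig: idx for idx, sig in enumerate(sorted(set(signatures.values())))}
        current = {node: palette[signatures[node]] for node in adjacency}
    return current


# Toy path graph a - b - c with identical starting labels (assumed example).
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
refined = wl_refine(graph, {"a": 0, "b": 0, "c": 0})
```

After one round the two end nodes share a label while the middle node gets a different one, which is the structural signal that label-propagation schemes of this family exploit.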
