智能论文笔记

Generalizable Black-Box Adversarial Attack with Meta Learning

Fei Yin , Yong Zhang , Baoyuan Wu , Yan Feng , Jingyi Zhang , Yanbo Fan , Yujiu Yang

分类：机器学习 | 计算机视觉

2023-01-01

In the scenario of black-box adversarial attack, the target model's parameters are unknown, and the attacker aims to find a successful adversarial perturbation based on query feedback under a query budget. Due to the limited feedback information, existing query-based black-box attack methods often require many queries for attacking each benign example. To reduce query cost, we propose to utilize the feedback information across historical attacks, dubbed example-level adversarial transferability. Specifically, by treating the attack on each benign example as one task, we develop a meta-learning framework by training a meta-generator to produce perturbations conditioned on benign examples. When attacking a new benign example, the meta generator can be quickly fine-tuned based on the feedback information of the new task as well as a few historical attacks to produce effective perturbations. Moreover, since the meta-train procedure consumes many queries to learn a generalizable generator, we utilize model-level adversarial transferability to train the meta-generator on a white-box surrogate model, then transfer it to help the attack against the target model. The proposed framework with the two types of adversarial transferability can be naturally combined with any off-the-shelf query-based attack methods to boost their performance, which is verified by extensive experiments.

translated by 谷歌翻译

TCFimt: Temporal Counterfactual Forecasting from Individual Multiple Treatment Perspective

Pengfei Xi , Guifeng Wang , Zhipeng Hu , Yu Xiong , Mingming Gong , Wei Huang , Runze Wu , Yu Ding , Tangjie Lv , Changjie Fan

分类：机器学习 | (统计)机器学习

2022-12-17

Determining causal effects of temporal multi-intervention assists decision-making. Restricted by time-varying bias, selection bias, and interactions of multiple interventions, the disentanglement and estimation of multiple treatment effects from individual temporal data is still rare. To tackle these challenges, we propose a comprehensive framework of temporal counterfactual forecasting from an individual multiple treatment perspective (TCFimt). TCFimt constructs adversarial tasks in a seq2seq framework to alleviate selection and time-varying bias and designs a contrastive learning-based block to decouple a mixed treatment effect into separated main treatment effects and causal interactions which further improves estimation accuracy. Through implementing experiments on two real-world datasets from distinct fields, the proposed method shows satisfactory performance in predicting future outcomes with specific treatments and in choosing optimal treatment type and timing than state-of-the-art methods.

translated by 谷歌翻译

Multiple Object Tracking Challenge Technical Report for Team MT_IoT

Feng Yan , Zhiheng Li , Weixin Luo , Zequn jie , Fan Liang , Xiaolin Wei , Lin Ma

分类：计算机视觉

2022-12-07

This is a brief technical report of our proposed method for Multiple-Object Tracking (MOT) Challenge in Complex Environments. In this paper, we treat the MOT task as a two-stage task including human detection and trajectory matching. Specifically, we designed an improved human detector and associated most of detection to guarantee the integrity of the motion trajectory. We also propose a location-wise matching matrix to obtain more accurate trace matching. Without any model merging, our method achieves 66.672 HOTA and 93.971 MOTA on the DanceTrack challenge dataset.

translated by 谷歌翻译

StegaNeRF: Embedding Invisible Information within Neural Radiance Fields

Chenxin Li , Brandon Y. Feng , Zhiwen Fan , Panwang Pan , Zhangyang Wang

分类：计算机视觉

2022-12-03

Recent advances in neural rendering imply a future of widespread visual data distributions through sharing NeRF model weights. However, while common visual data (images and videos) have standard approaches to embed ownership or copyright information explicitly or subtly, the problem remains unexplored for the emerging NeRF format. We present StegaNeRF, a method for steganographic information embedding in NeRF renderings. We design an optimization framework allowing accurate hidden information extractions from images rendered by NeRF, while preserving its original visual quality. We perform experimental evaluations of our method under several potential deployment scenarios, and we further discuss the insights discovered through our analysis. StegaNeRF signifies an initial exploration into the novel problem of instilling customizable, imperceptible, and recoverable information to NeRF renderings, with minimal impact to rendered images. Project page: https://xggnet.github.io/StegaNeRF/.

translated by 谷歌翻译

Conservative-Progressive Collaborative Learning for Semi-supervised Semantic Segmentation

Siqi Fan , Fenghua Zhu , Zunlei Feng , Yisheng Lv , Mingli Song , Fei-Yue Wang

分类：计算机视觉

2022-11-30

Pseudo supervision is regarded as the core idea in semi-supervised learning for semantic segmentation, and there is always a tradeoff between utilizing only the high-quality pseudo labels and leveraging all the pseudo labels. Addressing that, we propose a novel learning approach, called Conservative-Progressive Collaborative Learning (CPCL), among which two predictive networks are trained in parallel, and the pseudo supervision is implemented based on both the agreement and disagreement of the two predictions. One network seeks common ground via intersection supervision and is supervised by the high-quality labels to ensure a more reliable supervision, while the other network reserves differences via union supervision and is supervised by all the pseudo labels to keep exploring with curiosity. Thus, the collaboration of conservative evolution and progressive exploration can be achieved. To reduce the influences of the suspicious pseudo labels, the loss is dynamic re-weighted according to the prediction confidence. Extensive experiments demonstrate that CPCL achieves state-of-the-art performance for semi-supervised semantic segmentation.

translated by 谷歌翻译

Dimensionality-Varying Diffusion Process

Han Zhang , Ruili Feng , Zhantao Yang , Lianghua Huang , Yu Liu , Yifei Zhang , Yujun Shen , Deli Zhao , Jingren Zhou , Fan Cheng

分类：机器学习 | 计算机视觉

2022-11-29

Diffusion models, which learn to reverse a signal destruction process to generate new data, typically require the signal at each step to have the same dimension. We argue that, considering the spatial redundancy in image signals, there is no need to maintain a high dimensionality in the evolution process, especially in the early generation phase. To this end, we make a theoretical generalization of the forward diffusion process via signal decomposition. Concretely, we manage to decompose an image into multiple orthogonal components and control the attenuation of each component when perturbing the image. That way, along with the noise strength increasing, we are able to diminish those inconsequential components and thus use a lower-dimensional signal to represent the source, barely losing information. Such a reformulation allows to vary dimensions in both training and inference of diffusion models. Extensive experiments on a range of datasets suggest that our approach substantially reduces the computational cost and achieves on-par or even better synthesis performance compared to baseline methods. We also show that our strategy facilitates high-resolution image synthesis and improves FID of diffusion model trained on FFHQ at $1024\times1024$ resolution from 52.40 to 10.46. Code and models will be made publicly available.

translated by 谷歌翻译

Cross-Field Transformer for Diabetic Retinopathy Grading on Two-field Fundus Images

Junlin Hou , Jilan Xu , Fan Xiao , Rui-Wei Zhao , Yuejie Zhang , Haidong Zou , Lina Lu , Wenwen Xue , Rui Feng

分类：计算机视觉

2022-11-26

Automatic diabetic retinopathy (DR) grading based on fundus photography has been widely explored to benefit the routine screening and early treatment. Existing researches generally focus on single-field fundus images, which have limited field of view for precise eye examinations. In clinical applications, ophthalmologists adopt two-field fundus photography as the dominating tool, where the information from each field (i.e.,macula-centric and optic disc-centric) is highly correlated and complementary, and benefits comprehensive decisions. However, automatic DR grading based on two-field fundus photography remains a challenging task due to the lack of publicly available datasets and effective fusion strategies. In this work, we first construct a new benchmark dataset (DRTiD) for DR grading, consisting of 3,100 two-field fundus images. To the best of our knowledge, it is the largest public DR dataset with diverse and high-quality two-field images. Then, we propose a novel DR grading approach, namely Cross-Field Transformer (CrossFiT), to capture the correspondence between two fields as well as the long-range spatial correlations within each field. Considering the inherent two-field geometric constraints, we particularly define aligned position embeddings to preserve relative consistent position in fundus. Besides, we perform masked cross-field attention during interaction to flter the noisy relations between fields. Extensive experiments on our DRTiD dataset and a public DeepDRiD dataset demonstrate the effectiveness of our CrossFiT network. The new dataset and the source code of CrossFiT will be publicly available at https://github.com/FDU-VTS/DRTiD.

translated by 谷歌翻译

Efficient and Accurate Quantized Image Super-Resolution on Mobile NPUs, Mobile AI & AIM 2022 challenge: Report

Andrey Ignatov , Radu Timofte , Maurizio Denna , Abdel Younes , Ganzorig Gankhuyag , Jingang Huh , Myeong Kyun Kim , Kihwan Yoon , Hyeon-Cheol Moon , Seungho Lee

分类：计算机视觉

2022-11-07

Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and propose the participants to design an efficient quantized image super-resolution solution that can demonstrate a real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to do a high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating an up to 60 FPS rate when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.

translated by 谷歌翻译

Learned Smartphone ISP on Mobile GPUs with Deep Learning, Mobile AI & AIM 2022 Challenge: Report

Andrey Ignatov , Radu Timofte , Shuai Liu , Chaoyu Feng , Furui Bai , Xiaotao Wang , Lei Lei , Ziyao Yi , Yan Xiang , Zibin Liu

分类：计算机视觉

2022-11-07

The role of mobile cameras increased dramatically over the past few years, leading to more and more research in automatic image quality enhancement and RAW photo processing. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based image signal processing (ISP) pipeline replacing the standard mobile ISPs that can run on modern smartphone GPUs using TensorFlow Lite. The participants were provided with a large-scale Fujifilm UltraISP dataset consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format FujiFilm GFX100 camera. The runtime of the resulting models was evaluated on the Snapdragon's 8 Gen 1 GPU that provides excellent acceleration results for the majority of common deep learning ops. The proposed solutions are compatible with all recent mobile GPUs, being able to process Full HD photos in less than 20-50 milliseconds while achieving high fidelity results. A detailed description of all models developed in this challenge is provided in this paper.

translated by 谷歌翻译

Exploring Example Influence in Continual Learning

Qing Sun , Fan Lyu , Fanhua Shang , Wei Feng , Liang Wan

分类：机器学习

2022-09-25

持续学习（CL）依次学习像人类这样的新任务，其目标是实现更好的稳定性（S，记住过去的任务）和可塑性（P，适应新任务）。由于过去的培训数据不可用，因此探索培训示例中S和P的影响差异很有价值，这可能会改善对更好的SP的学习模式。受影响函数的启发（如果），我们首先研究了示例通过添加扰动来示例体重和计算影响推导的影响。为了避免在神经网络中Hessian逆的存储和计算负担，我们提出了一种简单而有效的METASP算法，以模拟IF计算中的两个关键步骤，并获得S-和P-Aware示例的影响。此外，我们建议通过解决双目标优化问题来融合两种示例影响，并获得对SP Pareto最优性的融合影响。融合影响可用于控制模型的更新并优化排练的存储。经验结果表明，我们的算法在任务和类别基准CL数据集上都显着优于最先进的方法。

translated by 谷歌翻译