Smart Paper Notes

Purifier: Defending Data Inference Attacks via Transforming Confidence Scores

Ziqi Yang , Lijin Wang , Da Yang , Jie Wan , Ziming Zhao , Ee-Chien Chang , Fan Zhang , Kui Ren

Category: Machine Learning

2022-12-01

Neural networks are susceptible to data inference attacks such as the membership inference attack, the adversarial model inversion attack and the attribute inference attack, in which the attacker infers useful information such as the membership, the reconstruction or the sensitive attributes of a data sample from the confidence scores predicted by the target classifier. In this paper, we propose a method, namely PURIFIER, to defend against membership inference attacks. It transforms the confidence score vectors predicted by the target classifier and makes the purified confidence scores indistinguishable between members and non-members in individual shape, statistical distribution and prediction label. The experimental results show that PURIFIER defends against membership inference attacks with high effectiveness and efficiency, outperforming previous defense methods, while incurring negligible utility loss. Our further experiments show that PURIFIER is also effective in defending against adversarial model inversion attacks and attribute inference attacks. For example, in our experiments the inversion error increases more than fourfold on the Facescrub530 classifier, and the attribute inference accuracy drops significantly when PURIFIER is deployed.
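To make the threat concrete, here is a minimal sketch of a confidence-threshold membership test, a common baseline attack and not the attack or defense evaluated in the paper. The function name `infer_membership`, the threshold value and the example score vectors are all assumptions chosen for illustration.

```python
def infer_membership(confidence_vectors, threshold=0.9):
    """Guess 'member' when the top confidence score reaches the threshold.

    Hypothetical baseline attack: models are often more confident on
    training samples, so a high top score hints at membership.
    """
    return [max(vec) >= threshold for vec in confidence_vectors]


# Illustrative (made-up) score vectors: the first looks like a training
# sample, the second like an unseen sample.
scores = [
    [0.98, 0.01, 0.01],
    [0.60, 0.30, 0.10],
]
guesses = infer_membership(scores)
print(guesses)
```

If a defense makes the score vectors of members and non-members look alike, a rule of this kind no longer separates the two groups.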

Translated by Google Translate

A Representation Modeling Based Language GAN with Completely Random Initialization

Da Ren , Qing Li

Category: Natural Language Processing

2022-08-04

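Text generative models trained via Maximum Likelihood Estimation (MLE) suffer from the notorious exposure bias problem, and Generative Adversarial Networks (GANs) have been shown to have the potential to tackle it. Existing language GANs adopt estimators such as REINFORCE or continuous relaxations to model word distributions. The inherent limitations of such estimators lead current models to rely on pre-training techniques (MLE pre-training or pre-trained embeddings). Representation modeling methods, which are free from those limitations, are however seldom explored because of their poor performance in previous attempts. Our analyses reveal that invalid sampling methods and unhealthy gradients are the main contributors to this unsatisfactory performance. In this work, we present two techniques to tackle these problems: dropout sampling and fully normalized LSTM. Based on these two techniques, we propose InitialGAN, whose parameters are all randomly initialized. In addition, we introduce a new evaluation metric, Least Coverage Rate, to better evaluate the quality of generated samples. The experimental results demonstrate that InitialGAN outperforms both MLE and the other compared models. To the best of our knowledge, this is the first time a language GAN has outperformed MLE without using any pre-training techniques.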

DA$^2$ Dataset: Toward Dexterity-Aware Dual-Arm Grasping

Guangyao Zhai , Yu Zheng , Ziwei Xu , Xin Kong , Yong Liu , Benjamin Busam , Yi Ren , Nassir Navab , Zhengyou Zhang

Category: Robotics | Computer Vision

2022-07-31

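In this paper, we introduce DA$^2$, the first large-scale dual-arm dexterity-aware dataset for generating optimal bimanual grasping pairs for arbitrary large objects. The dataset contains about 9M pairs of parallel-jaw grasps, generated from more than 6000 objects, each labeled with various grasp dexterity measures. In addition, we propose an end-to-end dual-arm grasp evaluation model trained on rendered scenes from this dataset. We use the evaluation model as our baseline to show the value of this novel and nontrivial dataset through both online analysis and real robot experiments. All data and related code will be open-sourced at https://sites.google.com/view/da2dataset.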

TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models

Sucheng Ren , Fangyun Wei , Zheng Zhang , Han Hu

Category: Computer Vision

2023-01-03

Masked image modeling (MIM) performs strongly in pre-training large vision Transformers (ViTs). However, small models that are critical for real-world applications benefit only marginally, or not at all, from this pre-training approach. In this paper, we explore distillation techniques to transfer the success of large MIM-based pre-trained models to smaller ones. We systematically study different options in the distillation framework, including distilling targets, losses, input, network regularization, sequential distillation, etc., revealing that: 1) distilling token relations is more effective than CLS token- and feature-based distillation; 2) using an intermediate layer of the teacher network as the target performs better than using the last layer when the depth of the student mismatches that of the teacher; 3) weak regularization is preferred; etc. With these findings, we achieve significant fine-tuning accuracy improvements over MIM pre-training from scratch on ImageNet-1K classification, using the ViT-Tiny, ViT-Small, and ViT-Base models, with +4.2%/+2.4%/+1.4% gains, respectively. Our TinyMIM model of base size achieves 52.2 mIoU in ADE20K semantic segmentation, which is +4.1 higher than the MAE baseline. Our TinyMIM model of tiny size achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget. This strong performance suggests an alternative way of developing small vision Transformer models, that is, by exploring better training methods rather than introducing inductive biases into architectures as in most previous works. Code is available at https://github.com/OliverRensu/TinyMIM.

A Global Optimization Algorithm for K-Center Clustering of One Billion Samples

Jiayang Ren , Ningning You , Kaixun Hua , Chaojie Ji , Yankai Cao

Category: Machine Learning

2022-12-30

This paper presents a practical global optimization algorithm for the K-center clustering problem, which aims to select K samples as the cluster centers so as to minimize the maximum within-cluster distance. The algorithm is based on a reduced-space branch and bound scheme and guarantees convergence to the global optimum in a finite number of steps by branching only on the regions of centers. To improve efficiency, we have designed a two-stage decomposable lower bound, the solution of which can be derived in closed form. In addition, we propose several acceleration techniques to narrow down the region of centers, including bounds tightening, sample reduction, and parallelization. Extensive studies on synthetic and real-world datasets demonstrate that our algorithm can solve K-center problems to global optimality within 4 hours for ten million samples in the serial mode and one billion samples in the parallel mode. Moreover, compared with state-of-the-art heuristic methods, the global optimum obtained by our algorithm reduces the objective function by 25.8% on average across all the synthetic and real-world datasets.
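The objective described above can be written down in a few lines. The sketch below evaluates the K-center cost (the maximum distance from any sample to its nearest center) and pairs it with the classic greedy farthest-first heuristic as a point of comparison. The heuristic is a stand-in for the kind of method the paper compares against, not the paper's branch and bound algorithm; the function names and the toy points are assumptions.

```python
import math


def kcenter_objective(points, centers):
    """Maximum over all samples of the distance to the nearest center."""
    return max(min(math.dist(p, c) for c in centers) for p in points)


def farthest_first(points, k):
    """Greedy farthest-first heuristic; centers are chosen among the samples.

    This is a well-known approximation heuristic, shown only for
    illustration. It does not guarantee a global optimum.
    """
    centers = [points[0]]  # arbitrary starting sample
    while len(centers) < k:
        # Pick the sample that is farthest from its nearest current center.
        farthest = max(
            points, key=lambda p: min(math.dist(p, c) for c in centers)
        )
        centers.append(farthest)
    return centers


# Toy data: two well-separated groups on a line.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
centers = farthest_first(points, 2)
cost = kcenter_objective(points, centers)
print(centers, cost)
```

A global method searches over all choices of K samples for the one minimizing `kcenter_objective`, which is what makes the problem hard at the scale the abstract reports.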

Exploring Vision Transformers as Diffusion Learners

He Cao , Jianan Wang , Tianhe Ren , Xianbiao Qi , Yihao Chen , Yuan Yao , Lei Zhang

Category: Computer Vision

2022-12-28

Score-based diffusion models have captured widespread attention and fueled fast progress in recent vision generative tasks. In this paper, we focus on the diffusion model backbone, which has been much neglected before. We systematically explore vision Transformers as diffusion learners for various generative tasks. With our improvements, the performance of the vanilla ViT-based backbone (IU-ViT) is boosted to be on par with traditional U-Net-based methods. We further provide a hypothesis on the implication of disentangling the generative backbone as an encoder-decoder structure and show proof-of-concept experiments verifying the effectiveness of a stronger encoder for generative tasks with ASymmetriC ENcoder Decoder (ASCEND). Our improvements achieve competitive results on CIFAR-10, CelebA, LSUN, CUB Bird and large-resolution text-to-image tasks. To the best of our knowledge, we are the first to successfully train a single diffusion model on a text-to-image task beyond 64x64 resolution. We hope this will motivate people to rethink the modeling choices and the training pipelines for diffusion-based generative models.

Unpaired Overwater Image Defogging Using Prior Map Guided CycleGAN

Yaozong Mo , Chaofeng Li , Wenqi Ren , Shaopeng Shang , Wenwu Wang , Xiao-jun Wu

Category: Computer Vision | Artificial Intelligence

2022-12-23

Deep learning-based methods have achieved strong performance for image defogging. However, existing methods are mainly developed for land scenes and perform poorly when dealing with overwater foggy images, since overwater scenes typically contain large expanses of sky and water. In this work, we propose a Prior map Guided CycleGAN (PG-CycleGAN) for defogging images of overwater scenes. To promote the recovery of objects on the water, two loss functions are exploited for the network, in which a prior map is designed to invert the dark channel and min-max normalization is used to suppress the sky and emphasize objects. However, because the training set is unpaired, the network may learn an under-constrained domain mapping from foggy to fog-free images, leading to artifacts and loss of detail. Thus, we propose an intuitive Upscaling Inception Module (UIM) and a Long-range Residual Coarse-to-fine framework (LRC) to mitigate this issue. Extensive qualitative and quantitative comparisons demonstrate that the proposed method outperforms state-of-the-art supervised, semi-supervised, and unsupervised defogging approaches.
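The prior map idea (invert the dark channel, then min-max normalize) can be sketched as follows. This is a simplified reading of the abstract, not the paper's implementation: it assumes RGB values in [0, 1], takes the dark channel per pixel (patch size 1) rather than over a local window, and all function names and the tiny test image are hypothetical.

```python
def dark_channel(image):
    """Per-pixel minimum over the RGB channels (patch size 1 for brevity)."""
    return [[min(pixel) for pixel in row] for row in image]


def min_max_normalize(grid):
    """Rescale all values linearly so the smallest is 0 and the largest is 1."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant map: avoid division by zero
        return [[0.0 for _ in row] for row in grid]
    return [[(v - lo) / (hi - lo) for v in row] for row in grid]


def prior_map(image):
    """Invert the dark channel, then min-max normalize the result."""
    inverted = [[1.0 - v for v in row] for row in dark_channel(image)]
    return min_max_normalize(inverted)


# Tiny 2x2 image: bright sky pixels on top, a darker object and water below.
image = [
    [(0.9, 0.9, 1.0), (0.8, 0.9, 1.0)],
    [(0.2, 0.5, 0.4), (0.6, 0.7, 0.8)],
]
pm = prior_map(image)
print(pm)
```

Bright, hazy regions such as sky have a high dark channel and therefore a low prior value, while darker objects receive values near 1, which matches the stated goal of suppressing the sky and emphasizing objects.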

Variational Reasoning over Incomplete Knowledge Graphs for Conversational Recommendation

Xiaoyu Zhang , Xin Xin , Dongdong Li , Wenxuan Liu , Pengjie Ren , Zhumin Chen , Jun Ma , Zhaochun Ren

Category: Artificial Intelligence

2022-12-22

Conversational recommender systems (CRSs) often utilize external knowledge graphs (KGs) to introduce rich semantic information and recommend relevant items through natural language dialogues. However, the original KGs employed in existing CRSs are often incomplete and sparse, which limits the reasoning capability in recommendation. Moreover, only a few existing studies exploit the dialogue context to dynamically refine knowledge from KGs for better recommendation. To address these issues, we propose the Variational Reasoning over Incomplete KGs Conversational Recommender (VRICR). Our key idea is to incorporate the large dialogue corpus that naturally accompanies CRSs to enhance the incomplete KGs, and to perform dynamic knowledge reasoning conditioned on the dialogue context. Specifically, we denote the dialogue-specific subgraphs of KGs as latent variables with categorical priors for adaptive knowledge graph refactoring. We propose a variational Bayesian method to approximate posterior distributions over dialogue-specific subgraphs, which not only leverages the dialogue corpus to reconstruct missing entity relations but also dynamically selects knowledge based on the dialogue context. Finally, we infuse the dialogue-specific subgraphs to decode the recommendations and responses. We conduct experiments on two benchmark CRS datasets. Experimental results confirm the effectiveness of our proposed method.

DDColor: Towards Photo-Realistic and Semantic-Aware Image Colorization via Dual Decoders

Xiaoyang Kang , Tao Yang , Wenqi Ouyang , Peiran Ren , Lingzhi Li , Xuansong Xie

Category: Computer Vision

2022-12-22

Automatic image colorization is a particularly challenging problem. Because the problem is highly ill-posed and subject to multi-modal uncertainty, directly training a deep neural network usually leads to incorrect semantic colors and low color richness. Existing transformer-based methods can deliver better results but depend heavily on hand-crafted dataset-level empirical distribution priors. In this work, we propose DDColor, a new end-to-end method with dual decoders, for image colorization. More specifically, we design a multi-scale image decoder and a transformer-based color decoder. The former restores the spatial resolution of the image, while the latter establishes the correlation between semantic representations and color queries via cross-attention. The two decoders cooperate to learn semantic-aware color embeddings by leveraging the multi-scale visual features. With the help of these two decoders, our method succeeds in producing semantically consistent and visually plausible colorization results without any additional priors. In addition, a simple but effective colorfulness loss is introduced to further improve the color richness of generated results. Our extensive experiments demonstrate that the proposed DDColor achieves significantly superior performance to existing state-of-the-art works both quantitatively and qualitatively. Code will be made publicly available.

Trajectory Generation and Tracking Control for Aggressive Tail-Sitter Flights

Guozheng Lu , Yixi Cai , Nan Chen , Fanze Kong , Yunfan Ren , Fu Zhang

Category: Robotics

2022-12-22

We address the theoretical and practical problems related to the trajectory generation and tracking control of tail-sitter UAVs. Theoretically, we focus on the differential flatness property with full exploitation of actual UAV aerodynamic models, which lays a foundation for generating dynamically feasible trajectories and achieving high-performance tracking control. We have found that a tail-sitter is differentially flat with accurate aerodynamic models within the entire flight envelope, by specifying a coordinated flight condition and choosing the vehicle position as the flat output. This fundamental property allows us to fully exploit the high-fidelity aerodynamic models in trajectory planning and tracking control to achieve accurate tail-sitter flights. In particular, an optimization-based trajectory planner for tail-sitters is proposed to design high-quality, smooth trajectories with consideration of kinodynamic constraints, singularity-free constraints and actuator saturation. The planned trajectory of the flat output is transformed to a state trajectory in real time with consideration of wind in the environment. To track the state trajectory, a global, singularity-free, and minimally-parameterized on-manifold MPC is developed, which fully leverages the accurate aerodynamic model to achieve high-accuracy trajectory tracking within the whole flight envelope. The effectiveness of the proposed framework is demonstrated through extensive real-world experiments in both indoor and outdoor field tests, including agile SE(3) flight through consecutive narrow windows requiring specific attitudes at speeds up to 10 m/s, typical tail-sitter maneuvers (transition, level flight and loiter) at speeds up to 20 m/s, and extremely aggressive aerobatic maneuvers (Wingover, Loop, Vertical Eight and Cuban Eight) with acceleration up to 2.5 g.
