智能论文笔记

HiFi++: a Unified Framework for Bandwidth Extension and Speech Enhancement

Pavel Andreev , Aibek Alanov , Oleg Ivanov , Dmitry Vetrov

分类：机器学习

2022-03-24

生成的对抗网络最近在神经声音中表现出了出色的表现，表现优于最佳自动回归和基于流动的模型。在本文中，我们表明这种成功可以扩展到有条件音频的其他任务。特别是，在HIFI Vocoders的基础上，我们为带宽扩展和语音增强的新型HIFI ++一般框架提出了新颖的一般框架。我们表明，通过改进的生成器体系结构和简化的多歧视培训，HIFI ++在这些任务中的最先进的情况下表现更好或与之相提并论，同时花费大量的计算资源。通过一系列广泛的实验，我们的方法的有效性得到了验证。

translated by 谷歌翻译

FMM-Net: neural network architecture based on the Fast Multipole Method

Daria Sushnikova , Pavel Kharyuk , Ivan Oseledets

分类：人工智能 | 机器学习

2022-12-25

In this paper, we propose a new neural network architecture based on the H2 matrix. Even though networks with H2-inspired architecture already exist, and our approach is designed to reduce memory costs and improve performance by taking into account the sparsity template of the H2 matrix. In numerical comparison with alternative neural networks, including the known H2-based ones, our architecture showed itself as beneficial in terms of performance, memory, and scalability.

translated by 谷歌翻译

Accelerating Barnes-Hut t-SNE Algorithm by Efficient Parallelization on Multi-Core CPUs

Narendra Chaudhary , Alexander Pivovar , Pavel Yakovlev , Andrey Gorshkov , Sanchit Misra

分类：机器学习 | 人工智能

2022-12-22

t-SNE remains one of the most popular embedding techniques for visualizing high-dimensional data. Most standard packages of t-SNE, such as scikit-learn, use the Barnes-Hut t-SNE (BH t-SNE) algorithm for large datasets. However, existing CPU implementations of this algorithm are inefficient. In this work, we accelerate the BH t-SNE on CPUs via cache optimizations, SIMD, parallelizing sequential steps, and improving parallelization of multithreaded steps. Our implementation (Acc-t-SNE) is up to 261x and 4x faster than scikit-learn and the state-of-the-art BH t-SNE implementation from daal4py, respectively, on a 32-core Intel(R) Icelake cloud instance.

translated by 谷歌翻译

Image quality prediction using synthetic and natural codebooks: comparative results

Maxim Koroteev , Kirill Aistov , Valeriy Berezovskiy , Pavel Frolov

分类：计算机视觉

2022-12-20

We investigate a model for image/video quality assessment based on building a set of codevectors representing in a sense some basic properties of images, similar to well-known CORNIA model. We analyze the codebook building method and propose some modifications for it. Also the algorithm is investigated from the point of inference time reduction. Both natural and synthetic images are used for building codebooks and some analysis of synthetic images used for codebooks is provided. It is demonstrated the results on quality assessment may be improves with the use if synthetic images for codebook construction. We also demonstrate regimes of the algorithm in which real time execution on CPU is possible for sufficiently high correlations with mean opinion score (MOS). Various pooling strategies are considered as well as the problem of metric sensitivity to bitrate.

translated by 谷歌翻译

Assign Experiment Variants at Scale in Online Controlled Experiments

Qike Li , Samir Jamkhande , Pavel Kochetkov , Pai Liu

分类：机器学习

2022-12-17

Online controlled experiments (A/B tests) have become the gold standard for learning the impact of new product features in technology companies. Randomization enables the inference of causality from an A/B test. The randomized assignment maps end users to experiment buckets and balances user characteristics between the groups. Therefore, experiments can attribute any outcome differences between the experiment groups to the product feature under experiment. Technology companies run A/B tests at scale -- hundreds if not thousands of A/B tests concurrently, each with millions of users. The large scale poses unique challenges to randomization. First, the randomized assignment must be fast since the experiment service receives hundreds of thousands of queries per second. Second, the variant assignments must be independent between experiments. Third, the assignment must be consistent when users revisit or an experiment enrolls more users. We present a novel assignment algorithm and statistical tests to validate the randomized assignments. Our results demonstrate that not only is this algorithm computationally fast but also satisfies the statistical requirements -- unbiased and independent.

translated by 谷歌翻译

FlexiViT: One Model for All Patch Sizes

Lucas Beyer , Pavel Izmailov , Alexander Kolesnikov , Mathilde Caron , Simon Kornblith , Xiaohua Zhai , Matthias Minderer , Michael Tschannen , Ibrahim Alabdulmohsin , Filip Pavetic

分类：计算机视觉 | 人工智能 | 机器学习

2022-12-15

Vision Transformers convert images to sequences by slicing them into patches. The size of these patches controls a speed/accuracy tradeoff, with smaller patches leading to higher accuracy at greater computational cost, but changing the patch size typically requires retraining the model. In this paper, we demonstrate that simply randomizing the patch size at training time leads to a single set of weights that performs well across a wide range of patch sizes, making it possible to tailor the model to different compute budgets at deployment time. We extensively evaluate the resulting model, which we call FlexiViT, on a wide range of tasks, including classification, image-text retrieval, open-world detection, panoptic segmentation, and semantic segmentation, concluding that it usually matches, and sometimes outperforms, standard ViT models trained at a single patch size in an otherwise identical setup. Hence, FlexiViT training is a simple drop-in improvement for ViT that makes it easy to add compute-adaptive capabilities to most models relying on a ViT backbone architecture. Code and pre-trained models are available at https://github.com/google-research/big_vision

translated by 谷歌翻译

Heuristically Guided Compilation for Multi-Agent Path Finding

Pavel Surynek

分类：人工智能

2022-12-13

Multi-agent path finding (MAPF) is a task of finding non-conflicting paths connecting agents' specified initial and goal positions in a shared environment. We focus on compilation-based solvers in which the MAPF problem is expressed in a different well established formalism such as mixed-integer linear programming (MILP), Boolean satisfiability (SAT), or constraint programming (CP). As the target solvers for these formalisms act as black-boxes it is challenging to integrate MAPF specific heuristics in the MAPF compilation-based solvers. We show in this work how the build a MAPF encoding for the target SAT solver in which domain specific heuristic knowledge is reflected. The heuristic knowledge is transferred to the SAT solver by selecting candidate paths for each agent and by constructing the encoding only for these candidate paths instead of constructing the encoding for all possible paths for an agent. The conducted experiments show that heuristically guided compilation outperforms the vanilla variants of the SAT-based MAPF solver.

translated by 谷歌翻译

Breaking the "Object" in Video Object Segmentation

Pavel Tokmakov , Jie Li , Adrien Gaidon

分类：计算机视觉

2022-12-12

The appearance of an object can be fleeting when it transforms. As eggs are broken or paper is torn, their color, shape and texture can change dramatically, preserving virtually nothing of the original except for the identity itself. Yet, this important phenomenon is largely absent from existing video object segmentation (VOS) benchmarks. In this work, we close the gap by collecting a new dataset for Video Object Segmentation under Transformations (VOST). It consists of more than 700 high-resolution videos, captured in diverse environments, which are 20 seconds long on average and densely labeled with instance masks. A careful, multi-step approach is adopted to ensure that these videos focus on complex object transformations, capturing their full temporal extent. We then extensively evaluate state-of-the-art VOS methods and make a number of important discoveries. In particular, we show that existing methods struggle when applied to this novel task and that their main limitation lies in over-reliance on static appearance cues. This motivates us to propose a few modifications for the top-performing baseline that improve its capabilities by better modeling spatio-temporal information. But more broadly, the hope is to stimulate discussion on learning more robust video object representations.

translated by 谷歌翻译

Unsupervised Anomaly Detection in Time-series: An Extensive Evaluation and Analysis of State-of-the-art Methods

Nesryne Mejri , Laura Lopez-Fuentes , Kankana Roy , Pavel Chernakov , Enjie Ghorbel , Djamila Aouada

分类：机器学习

2022-12-06

Unsupervised anomaly detection in time-series has been extensively investigated in the literature. Notwithstanding the relevance of this topic in numerous application fields, a complete and extensive evaluation of recent state-of-the-art techniques is still missing. Few efforts have been made to compare existing unsupervised time-series anomaly detection methods rigorously. However, only standard performance metrics, namely precision, recall, and F1-score are usually considered. Essential aspects for assessing their practical relevance are therefore neglected. This paper proposes an original and in-depth evaluation study of recent unsupervised anomaly detection techniques in time-series. Instead of relying solely on standard performance metrics, additional yet informative metrics and protocols are taken into account. In particular, (1) more elaborate performance metrics specifically tailored for time-series are used; (2) the model size and the model stability are studied; (3) an analysis of the tested approaches with respect to the anomaly type is provided; and (4) a clear and unique protocol is followed for all experiments. Overall, this extensive analysis aims to assess the maturity of state-of-the-art time-series anomaly detection, give insights regarding their applicability under real-world setups and provide to the community a more complete evaluation protocol.

translated by 谷歌翻译

An exponentially-growing family of universal quantum circuits

Mohammad Kordzanganeh , Pavel Sekatski , Leonid Fedichkin , Alexey Melnikov

分类：机器学习

2022-12-01

Quantum machine learning has become an area of growing interest but has certain theoretical and hardware-specific limitations. Notably, the problem of vanishing gradients, or barren plateaus, renders the training impossible for circuits with high qubit counts, imposing a limit on the number of qubits that data scientists can use for solving problems. Independently, angle-embedded supervised quantum neural networks were shown to produce truncated Fourier series with a degree directly dependent on two factors: the depth of the encoding, and the number of parallel qubits the encoding is applied to. The degree of the Fourier series limits the model expressivity. This work introduces two new architectures whose Fourier degrees grow exponentially: the sequential and parallel exponential quantum machine learning architectures. This is done by efficiently using the available Hilbert space when encoding, increasing the expressivity of the quantum encoding. Therefore, the exponential growth allows staying at the low-qubit limit to create highly expressive circuits avoiding barren plateaus. Practically, parallel exponential architecture was shown to outperform the existing linear architectures by reducing their final mean square error value by up to 44.7% in a one-dimensional test problem. Furthermore, the feasibility of this technique was also shown on a trapped ion quantum processing unit.

translated by 谷歌翻译