Point clouds of 3D objects exhibit an inherent compositional nature, where simple parts can be assembled into progressively more complex shapes to form whole objects. Explicitly capturing this part-whole hierarchy is a long-standing objective for building effective models, but its tree-like nature has made the task elusive. In this paper, we propose to embed the features of a point cloud classifier into hyperbolic space, and to explicitly regularize the space to account for the part-whole structure. Hyperbolic space is the only space that can successfully embed the tree-like nature of the hierarchy. This leads to substantial improvements in the performance of state-of-the-art supervised models for point cloud classification.
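As a concrete illustration of the geometry involved, here is a minimal PyTorch sketch, not the authors' code, of mapping Euclidean classifier features onto the Poincaré ball and measuring the hyperbolic distance that a part-whole regularizer could act on. The fixed unit curvature and feature sizes are illustrative assumptions.

```python
# A minimal sketch (assumptions: curvature 1, toy feature sizes) of embedding
# classifier features into the Poincare ball via the exponential map at the
# origin, plus the geodesic distance a hierarchy regularizer could use.
import torch

def exp_map_origin(v: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Project Euclidean feature vectors onto the Poincare ball (exp map at 0)."""
    norm = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.tanh(norm) * v / norm  # tanh keeps the result inside the unit ball

def poincare_distance(x, y, eps=1e-7):
    """Geodesic distance on the Poincare ball with curvature 1."""
    sq = (x - y).pow(2).sum(-1)
    denom = (1 - x.pow(2).sum(-1)).clamp_min(eps) * (1 - y.pow(2).sum(-1)).clamp_min(eps)
    return torch.acosh(1 + 2 * sq / denom)

# Toy usage: 'part' features could be regularized to sit closer to the origin
# (more generic) than 'whole' features in such a space.
feats = torch.randn(4, 128)      # backbone features, e.g. from a point cloud encoder
ball = exp_map_origin(feats)     # now inside the unit ball
print(poincare_distance(ball[0], ball[1]))
```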
In this paper, we explore the recent topic of point cloud completion guided by an auxiliary image. We show how it is possible to effectively combine the information from the two modalities in a localized latent space, thus avoiding the need for the complex point cloud reconstruction methods from single views used by the state of the art. We also investigate a novel weakly-supervised setting in which the auxiliary image provides a supervisory signal to the training process by measuring fidelity in image space through a differentiable renderer applied to the completed point cloud. Experiments show significant improvements over state-of-the-art supervised methods for both unimodal and multimodal completion. We also demonstrate the effectiveness of the weakly-supervised approach, which outperforms a number of supervised methods and is competitive with the latest supervised models that exploit only point cloud information.
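The weakly-supervised signal can be pictured with a toy differentiable projection. The sketch below is a stand-in under stated assumptions, not the paper's renderer: it splats a completed point cloud into a soft silhouette and backpropagates an image-space loss to the point coordinates.

```python
# A hedged sketch of the weak supervision: a very simple differentiable
# orthographic splatter in place of a full renderer, compared against an
# auxiliary silhouette. Resolutions and sigma are illustrative assumptions.
import torch

def soft_silhouette(points: torch.Tensor, res: int = 64, sigma: float = 0.05):
    """Differentiable 'rendering' of (N, 3) points in [-1, 1]^3: each point
    contributes a Gaussian splat to a res x res soft occupancy image."""
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, res), torch.linspace(-1, 1, res), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1).reshape(-1, 2)            # (res*res, 2)
    d2 = (grid[:, None, :] - points[None, :, :2]).pow(2).sum(-1)   # (res*res, N)
    splats = torch.exp(-d2 / (2 * sigma ** 2))
    return (1 - torch.prod(1 - splats, dim=1)).reshape(res, res)   # soft union

completed = torch.randn(256, 3).clamp(-1, 1).requires_grad_()  # predicted cloud
target_silhouette = torch.zeros(64, 64)                        # from the auxiliary image
loss = (soft_silhouette(completed) - target_silhouette).pow(2).mean()
loss.backward()  # gradients flow back to the point coordinates
```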
Inverse problems consist of reconstructing a signal from an incomplete set of measurements, and their performance depends strongly on the quality of the prior knowledge encoded via regularization. While traditional approaches focus on obtaining a unique solution, an emerging trend considers exploring multiple feasible solutions. In this paper, we propose a method to generate multiple reconstructions that fit both the measurements and a data-driven prior learned by a generative adversarial network. In particular, we show that, starting from an initial solution, it is possible to find directions in the latent space of the generative model that are null to the forward operator, thus keeping consistency with the measurements while inducing significant perceptual change. Our exploration approach allows generating multiple solutions to an inverse problem an order of magnitude faster than existing methods. We show results on image super-resolution and inpainting problems.
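A minimal sketch of the exploration idea follows, under illustrative assumptions: G is a stand-in generator, A a known forward operator (4x average pooling here), and a pixel-space term is used as a proxy for perceptual change while deviation from the measurements is penalized.

```python
# A hedged sketch, not the authors' method: starting from a latent z0 that is
# consistent with measurements y, optimize a perturbation dz that stays in the
# (approximate) null space of A while changing the reconstruction.
import torch
import torch.nn.functional as F

G = lambda z: torch.tanh(z.view(1, 1, 32, 32))   # stand-in generator (assumption)
A = lambda x: F.avg_pool2d(x, 4)                  # forward operator: 4x downsampling

z0 = torch.randn(1024)
y = A(G(z0)).detach()                             # measurements of the initial solution

dz = torch.zeros_like(z0, requires_grad=True)
opt = torch.optim.Adam([dz], lr=1e-2)
for _ in range(200):
    x = G(z0 + dz)
    meas_err = (A(x) - y).pow(2).mean()           # stay consistent with y
    change = (x - G(z0)).pow(2).mean()            # pixel proxy for perceptual change
    loss = 100.0 * meas_err - change              # maximize change, keep consistency
    opt.zero_grad()
    loss.backward()
    opt.step()
```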
Semi-supervised learning techniques are gaining popularity due to their ability to build models that are effective even when only scarce labeled data are available. In this paper, we propose a framework and specific tasks for the self-supervised pre-training of \textit{multichannel} models, such as the fusion of multispectral and synthetic aperture radar images. We show that the proposed self-supervised approach is very effective at learning features that correlate with the land cover classification labels. This is achieved by an explicit design of the pre-training tasks which promotes bridging the gap between the sensing modalities and exploiting the spectral characteristics of the input. In a semi-supervised setting where limited labels are available, using the proposed self-supervised pre-training, followed by supervised fine-tuning for land cover classification with SAR and multispectral data, outperforms purely supervised learning, initialization from ImageNet pre-training, and other recent self-supervised approaches.
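One plausible instance of such a cross-modal pretext task, an assumption for illustration since the abstract does not detail the exact tasks, is to contrastively align embeddings of co-registered SAR and multispectral patches so that each encoder learns features that bridge the modality gap.

```python
# A hedged sketch: InfoNCE-style alignment between SAR and multispectral
# encoders. The encoders and channel counts are illustrative assumptions
# (e.g. 2 Sentinel-1 and 13 Sentinel-2 bands), not the paper's design.
import torch
import torch.nn as nn
import torch.nn.functional as F

def encoder(in_ch):
    return nn.Sequential(nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 64))

enc_sar, enc_ms = encoder(2), encoder(13)
sar = torch.randn(8, 2, 64, 64)                    # batch of co-registered patches
ms = torch.randn(8, 13, 64, 64)

z_sar = F.normalize(enc_sar(sar), dim=1)
z_ms = F.normalize(enc_ms(ms), dim=1)
logits = z_sar @ z_ms.t() / 0.07                   # cosine similarities / temperature
labels = torch.arange(8)                           # matching patches are positives
loss = F.cross_entropy(logits, labels)
loss.backward()
```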
Numerous works use word embedding-based metrics to quantify societal biases and stereotypes in texts. Recent studies have found that word embeddings can capture semantic similarity but may be affected by word frequency. In this work we study the effect of frequency when measuring female vs. male gender bias with word embedding-based bias quantification methods. We find that Skip-gram with negative sampling and GloVe tend to detect male bias in high frequency words, while GloVe tends to return female bias in low frequency words. We show these behaviors still exist when words are randomly shuffled. This proves that the frequency-based effect observed in unshuffled corpora stems from properties of the metric rather than from word associations. The effect is spurious and problematic since bias metrics should depend exclusively on word co-occurrences and not individual word frequencies. Finally, we compare these results with the ones obtained with an alternative metric based on Pointwise Mutual Information. We find that this metric does not show a clear dependence on frequency, even though it is slightly skewed towards male bias across all frequencies.
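For concreteness, below is a small sketch of one common PMI-based formulation of such a bias score, bias(w) = PMI(w, female) - PMI(w, male); the exact metric and context-word sets used in the paper are assumptions here. Computed from raw co-occurrence counts, the score depends only on co-occurrences rather than on a word's frequency per se.

```python
# A minimal sketch, assuming a standard PMI definition and "she"/"he" as
# context words. Zero co-occurrence counts would need smoothing in practice.
import math
from collections import Counter

def pmi(count_wc, count_w, count_c, total):
    """Pointwise mutual information from corpus counts."""
    return math.log((count_wc * total) / (count_w * count_c))

def pmi_gender_bias(w, cooc: Counter, freq: Counter, total: int,
                    female=("she",), male=("he",)):
    f = sum(pmi(cooc[(w, c)], freq[w], freq[c], total) for c in female)
    m = sum(pmi(cooc[(w, c)], freq[w], freq[c], total) for c in male)
    return f - m  # positive -> skewed towards the female context words

# Toy counts; in practice these come from windowed co-occurrence in a corpus.
freq = Counter({"nurse": 50, "she": 1000, "he": 1200})
cooc = Counter({("nurse", "she"): 30, ("nurse", "he"): 10})
print(pmi_gender_bias("nurse", cooc, freq, total=100_000))
```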
This report summarizes the work carried out by the authors during the Twelfth Montreal Industrial Problem Solving Workshop, held at Université de Montréal in August 2022. The team tackled a problem submitted by CBC/Radio-Canada on the theme of Automatic Text Simplification (ATS).
Feature acquisition algorithms address the problem of acquiring informative features while balancing the costs of acquisition to improve the learning performance of ML models. Previous approaches have focused on calculating the expected utility values of features to determine the acquisition sequences. Other approaches formulated the problem as a Markov Decision Process (MDP) and applied reinforcement learning based algorithms. In comparison to previous approaches, we focus on 1) formulating the feature acquisition problem as an MDP and applying Monte Carlo Tree Search, 2) calculating the intermediary rewards for each acquisition step based on model improvements and acquisition costs, and 3) simultaneously optimizing model improvement and acquisition costs with multi-objective Monte Carlo Tree Search. With the Proximal Policy Optimization and Deep Q-Network algorithms as benchmarks, we show the effectiveness of our proposed approach in an experimental study.
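The MDP framing can be sketched compactly: states are sets of acquired features, actions acquire one more, and the per-step reward is model improvement minus acquisition cost. The sketch below is a generic single-objective UCT search, not the authors' multi-objective variant, and the model-improvement term is a stub.

```python
# A hedged sketch of MCTS over feature-acquisition sequences. FEATURES, COST,
# and model_gain are illustrative assumptions; replace model_gain with a real
# validation-performance estimate in practice.
import math, random

FEATURES = [0, 1, 2, 3]
COST = {0: 0.1, 1: 0.3, 2: 0.2, 3: 0.5}

def model_gain(acquired: frozenset) -> float:
    """Stub with diminishing returns in the number of acquired features."""
    return 0.4 * len(acquired) - 0.05 * len(acquired) ** 2

def reward(state, action):
    return model_gain(state | {action}) - model_gain(state) - COST[action]

class Node:
    def __init__(self, state):
        self.state, self.children, self.n, self.q = state, {}, 0, 0.0

def uct_search(root_state, iters=500, c=1.4):
    root = Node(frozenset(root_state))
    for _ in range(iters):
        node, path, total = root, [root], 0.0
        while True:  # selection: descend while the node is fully expanded
            untried = [a for a in FEATURES
                       if a not in node.state and a not in node.children]
            if untried or not node.children:
                break
            a = max(node.children, key=lambda a: node.children[a].q
                    + c * math.sqrt(math.log(node.n + 1) / (node.children[a].n + 1)))
            total += reward(node.state, a)
            node = node.children[a]
            path.append(node)
        if untried:  # expansion: try one new acquisition
            a = random.choice(untried)
            node.children[a] = child = Node(node.state | {a})
            total += reward(node.state, a)
            node = child
            path.append(node)
        remaining = [f for f in FEATURES if f not in node.state]
        random.shuffle(remaining)  # rollout: acquire the rest in random order
        state = node.state
        for a in remaining:
            total += reward(state, a)
            state = state | {a}
        for n in path:  # backpropagate the episode return
            n.n += 1
            n.q += (total - n.q) / n.n
    return max(root.children, key=lambda a: root.children[a].n)

print("first feature to acquire:", uct_search(set()))
```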
360-degree panoramic videos have gained considerable attention in recent years due to the rapid development of head-mounted displays (HMDs) and panoramic cameras. One major problem in streaming panoramic videos is that panoramic videos are much larger in size compared to traditional ones. Moreover, the user devices are often in a wireless environment, with limited battery, computation power, and bandwidth. To reduce resource consumption, researchers have proposed ways to predict the users' viewports so that only part of the entire video needs to be transmitted from the server. However, the robustness of such prediction approaches has been overlooked in the literature: it is usually assumed that only a few models, pre-trained on past users' experiences, are applied for prediction to all users. We observe that those pre-trained models can perform poorly for some users because they might have drastically different behaviors from the majority, and the pre-trained models cannot capture the features in unseen videos. In this work, we propose a novel meta learning based viewport prediction paradigm to alleviate the worst prediction performance and ensure the robustness of viewport prediction. This paradigm uses two machine learning models, where the first model predicts the viewing direction, and the second model predicts the minimum video prefetch size that can include the actual viewport. We first train two meta models so that they are sensitive to new training data, and then quickly adapt them to users while they are watching the videos. Evaluation results reveal that the meta models can adapt quickly to each user, and can significantly increase the prediction accuracy, especially for the worst-performing predictions.
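The per-user adaptation step can be sketched as a few gradient steps of fine-tuning from a meta-trained initialization. The abstract does not name the meta-learning algorithm, and the model shape and input horizon below are assumptions for illustration.

```python
# A schematic sketch of fast per-user adaptation of a meta-trained viewport
# predictor: the last 10 (x, y, z) viewing directions predict the next one.
import copy
import torch
import torch.nn as nn

meta_model = nn.Sequential(nn.Linear(10 * 3, 64), nn.ReLU(), nn.Linear(64, 3))

def adapt_to_user(meta_model, history_x, history_y, steps=5, lr=1e-2):
    """Clone the meta model and take a few gradient steps on this user's data."""
    model = copy.deepcopy(meta_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        loss = (model(history_x) - history_y).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

# While the user watches: adapt on collected windows, then predict.
x = torch.randn(32, 30)   # 32 windows of 10 viewing directions from this user
y = torch.randn(32, 3)
user_model = adapt_to_user(meta_model, x, y)
next_dir = user_model(torch.randn(1, 30))
```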
This paper presents a corpus annotated for the task of direct-speech extraction in Croatian. The paper focuses on the annotation of the quotation, co-reference resolution, and sentiment annotation in SETimes news corpus in Croatian and on the analysis of its language-specific differences compared to English. From this, a list of the phenomena that require special attention when performing these annotations is derived. The generated corpus with quotation features annotations can be used for multiple tasks in the field of Natural Language Processing.
With the ever-growing popularity of the field of NLP, the demand for datasets in low-resourced languages follows suit. Following a previously established framework, in this paper we present the UNER dataset, a multilingual and hierarchical parallel corpus annotated for named entities. We describe in detail the procedure developed to create this type of dataset in any language available on Wikipedia with DBpedia information. The three-step procedure extracts entities from Wikipedia articles, links them to DBpedia, and maps the DBpedia sets of classes to the UNER labels. This is followed by a post-processing procedure that significantly increases the number of identified entities in the final results. The paper concludes with a statistical and qualitative analysis of the resulting dataset.
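The third step of the pipeline can be illustrated with a small sketch of class mapping; the mapping table and label names below are illustrative assumptions, not the actual UNER hierarchy.

```python
# A hedged sketch: map the DBpedia classes attached to a linked entity onto a
# coarse label set. DBPEDIA_TO_UNER and the labels are hypothetical examples.
DBPEDIA_TO_UNER = {
    "dbo:Person": "PER",
    "dbo:Organisation": "ORG",
    "dbo:Place": "LOC",
}

def map_entity(dbpedia_classes: list[str], default: str = "OTH") -> str:
    """Return the first matching label for an entity's DBpedia class set."""
    for cls in dbpedia_classes:
        if cls in DBPEDIA_TO_UNER:
            return DBPEDIA_TO_UNER[cls]
    return default

# An entity linked from a Wikipedia article may carry several classes:
print(map_entity(["dbo:Agent", "dbo:Person", "dbo:Politician"]))  # -> "PER"
```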