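In underwater images, the appearance of objects is degraded by selective attenuation, which reduces contrast and causes a color cast. This degradation depends on the water environment and increases with the distance between the object and the camera. Despite the growing number of works on underwater image enhancement and restoration, the lack of a commonly accepted evaluation measure is hindering progress, because it is difficult to compare methods. In this paper, we review commonly used color accuracy measures, such as color reproduction error and CIEDE2000, as well as no-reference image quality measures, such as UIQM, UCIQE and CCF, which have not yet been systematically validated. We show that none of the no-reference quality measures satisfactorily rates the quality of enhanced underwater images, and we discuss their main shortcomings. Images and results are available at https://puiqe.eecs.qmul.ac.uk.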
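Color accuracy measures of this kind compare a reference color with its reproduction in the enhanced image. The abstract does not define the color reproduction error, so the following is only a minimal sketch that assumes an angular formulation: the angle between a reference RGB vector and the reproduced one. The function name `angular_error_deg` is hypothetical.

```python
import math


def angular_error_deg(reference_rgb, reproduced_rgb):
    """Angle in degrees between two RGB vectors.

    Assumption: the error is measured as an angle, so it ignores overall
    brightness and only reflects a shift in color direction.
    """
    dot = sum(r * e for r, e in zip(reference_rgb, reproduced_rgb))
    norm_ref = math.sqrt(sum(r * r for r in reference_rgb))
    norm_rep = math.sqrt(sum(e * e for e in reproduced_rgb))
    if norm_ref == 0 or norm_rep == 0:
        raise ValueError("angular error is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_angle = max(-1.0, min(1.0, dot / (norm_ref * norm_rep)))
    return math.degrees(math.acos(cos_angle))
```

Under this assumption, a uniformly darker reproduction of the same color scores close to zero, while a color cast produces a larger angle.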
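Vehicle re-identification (Re-ID) aims to retrieve images with the same vehicle ID across different cameras. Current part-level feature learning methods typically detect vehicle parts via uniform division, external tools, or attention modeling. However, such part features often require expensive additional annotations and lead to sub-optimal performance when part mask predictions are unreliable. In this paper, we propose a weakly supervised Part-Attention Network (PANet) and a Part-Mentored Network (PMNet) for vehicle Re-ID. First, PANet localizes vehicle parts via part-relevant channel recalibration and cluster-based mask generation, without any vehicle part supervision. Second, PMNet leverages teacher-guided learning to distill vehicle part-specific features from PANet and performs multi-scale global-part feature extraction. During inference, PMNet can adaptively extract discriminative part features without part localization by PANet, which avoids unstable part mask predictions. We treat the Re-ID problem as a multi-task problem and adopt homoscedastic uncertainty to learn the optimal weighting of the ID losses. Experiments on two public benchmarks show that our approach outperforms recent methods that require no extra annotations, with an average gain of 3.0% in CMC@5 and more than 1.4% in mAP on VeRi776. Moreover, our method extends to the occluded vehicle Re-ID task and exhibits good generalization ability.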
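The loss weighting via homoscedastic uncertainty can be illustrated with a short sketch. The abstract does not give the exact formula, so this assumes the commonly used form in which each task loss is scaled by exp(-s) and regularized by s, where s is a learnable log-variance. The function name and the plain-float arguments are hypothetical; in training, the log-variances would be optimized jointly with the network weights.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine task losses as sum_i exp(-s_i) * L_i + s_i.

    Assumption: s_i is the learnable log-variance of task i. A large s_i
    down-weights a noisy task, while the additive s_i term stops the
    model from inflating every variance to ignore all losses.
    """
    if len(task_losses) != len(log_variances):
        raise ValueError("one log-variance is required per task loss")
    total = 0.0
    for loss, s in zip(task_losses, log_variances):
        total += math.exp(-s) * loss + s
    return total
```

With all log-variances at zero the result reduces to the plain sum of the losses, which is a natural initialization.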
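Crises such as natural disasters, global pandemics, and social unrest continuously threaten our world and emotionally affect millions of people worldwide in distinct ways. Understanding the emotions that people express during large-scale crises helps inform policy makers and first responders about the emotional state of the population and helps provide emotional support to those who need it. We present CovidEmo, ~3K English tweets labeled with emotions and temporally distributed across 18 months. Our analyses reveal the emotional toll caused by COVID-19, as well as changes in the social narrative and the associated emotions over time. Motivated by the time-sensitive nature of crises and the cost of large-scale annotation efforts, we examine how well large pre-trained language models generalize across domains and timelines in the task of perceived emotion prediction in the context of COVID-19. Our analyses suggest that cross-domain information transfer occurs, yet significant gaps remain. We propose semi-supervised learning as a way to bridge this gap, obtaining significantly better performance by using unlabeled data from the target domain.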
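Wireless communication in the terahertz band (0.1-10 THz) is envisioned as one of the key enabling technologies for future sixth-generation (6G) wireless communication systems, scaling up beyond massive multiple-input multiple-output (massive MIMO) technology. However, the very high propagation attenuation and molecular absorption at THz frequencies often limit the signal transmission distance and coverage range. Building on the recent breakthrough in reconfigurable intelligent surfaces (RISs) for realizing smart radio propagation environments, we propose a novel hybrid beamforming scheme for multi-hop RIS-assisted communication networks to improve the coverage range at THz-band frequencies. In particular, multiple passive and controllable RISs are deployed to assist the transmissions between the base station (BS) and multiple single-antenna users. We investigate the joint design of the digital beamforming matrix at the BS and the analog beamforming matrices at the RISs, leveraging recent advances in deep reinforcement learning (DRL) to combat the propagation loss. To improve the convergence of the proposed DRL-based algorithm, two algorithms are then designed to initialize the digital beamforming and the analog beamforming matrices using the alternating optimization technique. Simulation results show that our proposed scheme improves the coverage range of THz communications by 50% compared with the benchmarks. Furthermore, our proposed DRL-based method is shown to be a state-of-the-art method for solving the NP-hard beamforming problem, especially when the signals in RIS-assisted THz communication networks experience multiple hops.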
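The two loss mechanisms named above, propagation (spreading) attenuation and molecular absorption, can be made concrete with a small calculation. This is a minimal sketch that assumes the commonly used line-of-sight model in which free-space spreading loss is multiplied by an exponential absorption term. The function name is hypothetical, and the absorption coefficient is a free parameter here, not a value taken from the paper.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # meters per second


def thz_path_loss_db(freq_hz, distance_m, absorption_coeff_per_m):
    """Line-of-sight path loss in dB: spreading loss times molecular absorption.

    Assumption: loss = (4*pi*f*d/c)^2 * exp(k*d), where k is the
    frequency-dependent absorption coefficient of the medium (1/m).
    """
    if freq_hz <= 0 or distance_m <= 0:
        raise ValueError("frequency and distance must be positive")
    spreading = (4 * math.pi * freq_hz * distance_m / SPEED_OF_LIGHT) ** 2
    absorption = math.exp(absorption_coeff_per_m * distance_m)
    return 10 * math.log10(spreading * absorption)
```

In this model, doubling the distance adds about 6 dB of spreading loss, and the absorption term adds a further loss that grows linearly in dB with distance, which is why coverage shrinks so quickly at these frequencies.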
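Reconfigurable intelligent surfaces (RISs) have recently been considered an energy-efficient solution for future wireless networks owing to their fast and low-power configuration, which offers increased potential for enabling massive connectivity and low-latency communications. Accurate and low-overhead channel estimation in RIS-based systems is one of the most critical challenges, because of the usually large number of RIS unit elements and their distinctive hardware constraints. In this paper, we focus on the uplink of a RIS-empowered multi-user multiple-input single-output (MISO) communication system and propose a channel estimation framework based on parallel factor decomposition to unfold the resulting cascaded channel model. We present two iterative estimation algorithms for the channels between the base station and the RIS, as well as the channels between the RIS and the users. One is based on alternating least squares (ALS), while the other uses vector approximate message passing to iteratively reconstruct the two unknown channels from the estimated vectors. To theoretically assess the performance of the ALS-based algorithm, we derive its estimation Cramér-Rao bound (CRB). We also discuss the achievable sum-rate computation with estimated channels and different precoding schemes at the base station. Our extensive simulation results show that our algorithms outperform the benchmark schemes and that the ALS technique attains the CRB. They also demonstrate that the sum rate obtained with the estimated channels always reaches that of perfect channels under various settings, which verifies the effectiveness and robustness of the proposed estimation algorithms.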
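The alternating least squares idea behind parallel factor (PARAFAC) decomposition can be sketched generically. The code below is not the paper's estimator: it assumes a real-valued three-way tensor and three unconstrained factor matrices, whereas the channel estimation setting involves complex-valued channels and known pilot and RIS phase configurations. All function and variable names are hypothetical.

```python
import numpy as np


def khatri_rao(U, V):
    """Column-wise Kronecker product; row index is u * V.shape[0] + v."""
    rank = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, rank)


def parafac_als(X, rank, n_iter=100, seed=0):
    """Fit X[i, j, k] ~ sum_r A[i, r] * B[j, r] * C[k, r] by ALS.

    Each step fixes two factors and solves a linear least-squares
    problem for the third, so the fit error never increases.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    # Mode unfoldings whose column order matches khatri_rao above.
    X1 = X.reshape(I, J * K)
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K)
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J)
    errors = []
    for _ in range(n_iter):
        A = np.linalg.lstsq(khatri_rao(B, C), X1.T, rcond=None)[0].T
        B = np.linalg.lstsq(khatri_rao(A, C), X2.T, rcond=None)[0].T
        C = np.linalg.lstsq(khatri_rao(A, B), X3.T, rcond=None)[0].T
        X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
        errors.append(np.linalg.norm(X - X_hat) / np.linalg.norm(X))
    return A, B, C, errors
```

The returned factors are unique only up to column permutation and scaling, which is the usual PARAFAC ambiguity; a practical estimator needs extra knowledge, such as known pilots, to resolve it.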
Weakly-supervised object localization aims to indicate the category as well as the scope of an object in an image given only image-level labels. Most existing works are based on Class Activation Mapping (CAM) and endeavor to enlarge the discriminative area inside the activation map to perceive the whole object, yet they ignore the co-occurrence confounder of the object and its context (e.g., fish and water), which makes it hard for the model to distinguish object boundaries. Besides, the use of CAM also brings a dilemma: classification and localization always suffer from a performance gap and cannot reach their highest accuracy simultaneously. In this paper, we propose a causal knowledge distillation method, dubbed KD-CI-CAM, to address these two under-explored issues in one go. More specifically, we tackle the co-occurrence context confounder problem via causal intervention (CI), which explores the causalities among image features, contexts, and categories to eliminate the biased object-context entanglement in the class activation maps. Based on the de-biased object feature, we additionally propose a multi-teacher causal distillation framework to balance the absorption of classification knowledge and localization knowledge during model training. Extensive experiments on several benchmarks demonstrate the effectiveness of KD-CI-CAM in learning clear object boundaries from confounding contexts and in addressing the dilemma between classification and localization performance.
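For readers unfamiliar with the CAM baseline that this work builds on, the following is a minimal sketch of a plain class activation map and a thresholded bounding box. It does not implement KD-CI-CAM, causal intervention, or distillation. The function names, the clipping of negative activations, and the 0.5 threshold are assumptions made for illustration.

```python
import numpy as np


def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps for one class, scaled to [0, 1].

    feature_maps: array of shape (C, H, W) from the last conv layer.
    class_weights: array of shape (C,), the classifier weights of the
    target class. Negative activations are clipped (an assumption).
    """
    cam = np.tensordot(class_weights, feature_maps, axes=([0], [0]))
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def bounding_box(cam, threshold=0.5):
    """Tightest (x_min, y_min, x_max, y_max) box around cells >= threshold."""
    ys, xs = np.nonzero(cam >= threshold)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```

Because the map is driven by whatever features help classification, a context that co-occurs with the object can also light up, which is the entanglement the abstract describes.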
Deep learning has been widely used for protein engineering. However, it is limited by the lack of sufficient experimental data to train an accurate model for predicting the functional fitness of high-order mutants. Here, we develop SESNet, a supervised deep-learning model that predicts the fitness of protein mutants by leveraging both sequence and structure information and by exploiting an attention mechanism. Our model integrates the local evolutionary context from homologous sequences, the global evolutionary context encoding rich semantics from the universal protein sequence space, and the structure information accounting for the microenvironment around each residue in a protein. We show that SESNet outperforms state-of-the-art models for predicting the sequence-function relationship on 26 deep mutational scanning datasets. More importantly, we propose a data augmentation strategy that leverages data from unsupervised models to pre-train our model. After that, our model can achieve strikingly high accuracy in predicting the fitness of protein mutants, especially for higher-order variants (> 4 mutation sites), when fine-tuned using only a small number of experimental mutation data points (<50). The proposed strategy is of great practical value, as the required experimental effort, i.e., producing a few tens of experimental mutation data points on a given protein, is generally affordable for an ordinary biochemical group and can be applied to almost any protein.
Future work sentences (FWS) are the particular sentences in academic papers that contain the authors' description of their proposed follow-up research directions. This paper presents methods to automatically extract FWS from academic papers and classify them according to the different future directions embodied in the papers' content. FWS recognition methods will enable subsequent researchers to locate future work sentences more accurately and quickly, and will reduce the time and cost of acquiring the corpus. Relatively little work exists on the automatic identification of future work sentences, and existing research cannot accurately identify FWS in academic papers, and thus cannot support data mining on a large scale. Furthermore, the content of future work has many aspects, and subdividing this content is conducive to the analysis of specific development directions. In this paper, Natural Language Processing (NLP) is used as a case study, and FWS are extracted from academic papers and classified into different types. We manually build an annotated corpus with six different types of FWS. Then, automatic recognition and classification of FWS are implemented using machine learning models, and the performance of these models is compared based on the evaluation metrics. The results show that the Bernoulli Bayesian model has the best performance in the automatic recognition task, with the macro F1 reaching 90.73%, and the SciBERT model has the best performance in the automatic classification task, with the weighted average F1 reaching 72.63%. Finally, we extract keywords from FWS to gain a deeper understanding of the key content described in FWS, and we also demonstrate that the content of FWS is reflected in subsequent research work by measuring the similarity between future work sentences and the abstracts.
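The two reported metrics, macro F1 and weighted average F1, differ only in how per-class F1 scores are averaged. The sketch below shows the standard definitions in plain Python; the function names and the toy labels in the usage note are hypothetical and unrelated to the paper's corpus.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """F1 score of each label, computed as 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's number of true samples."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)
```

For example, with true labels `["a", "a", "a", "b"]` and predictions `["a", "a", "b", "b"]`, the per-class scores are 0.8 and 2/3, so the macro average is lower than the weighted average because the rarer class scores worse.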
Reinforcement learning (RL) problems can be challenging without well-shaped rewards. Prior work on provably efficient RL methods generally proposes to address this issue with dedicated exploration strategies. However, another way to tackle this challenge is to reformulate it as a multi-task RL problem, where the task space contains not only the challenging task of interest but also easier tasks that implicitly function as a curriculum. Such a reformulation opens up the possibility of running existing multi-task RL methods as a more efficient alternative to solving a single challenging task from scratch. In this work, we provide a theoretical framework that reformulates a single-task RL problem as a multi-task RL problem defined by a curriculum. Under mild regularity conditions on the curriculum, we show that sequentially solving each task in the multi-task RL problem is more computationally efficient than solving the original single-task problem, without any explicit exploration bonuses or other exploration strategies. We also show that our theoretical insights can be translated into an effective practical learning algorithm that can accelerate curriculum learning on simulated robotic tasks.
Blind watermarking provides powerful evidence for copyright protection, image authentication, and tampering identification. However, it remains a challenge to design a watermarking model with high imperceptibility and robustness against strong noise attacks. To resolve this issue, we present a framework Combining the Invertible and Non-invertible (CIN) mechanisms. CIN is composed of an invertible part, which achieves high imperceptibility, and a non-invertible part, which strengthens the robustness against strong noise attacks. For the invertible part, we develop a diffusion and extraction module (DEM) and a fusion and split module (FSM) to embed and extract watermarks symmetrically in an invertible way. For the non-invertible part, we introduce a non-invertible attention-based module (NIAM) and a noise-specific selection module (NSM) to handle asymmetric extraction under a strong noise attack. Extensive experiments demonstrate that our framework significantly outperforms the current state-of-the-art methods in both imperceptibility and robustness. Our framework achieves an average of 99.99% accuracy and 67.66 dB PSNR under noise-free conditions, and 96.64% accuracy and 39.28 dB PSNR under combined strong noise attacks. The code will be available at https://github.com/rmpku/CIN.
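The two figures of merit quoted above can be computed as follows. This is a minimal sketch assuming 8-bit pixel values passed as flat sequences and a watermark represented as a sequence of bits; the function names are hypothetical, and the abstract does not state how its accuracy is computed, so bit-wise accuracy is an assumption.

```python
import math


def psnr_db(original, watermarked, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Higher PSNR means the watermarked image is closer to the original,
    i.e. the watermark is less perceptible.
    """
    if len(original) != len(watermarked) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, watermarked)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


def bit_accuracy(embedded_bits, extracted_bits):
    """Fraction of watermark bits recovered correctly (assumed accuracy metric)."""
    if len(embedded_bits) != len(extracted_bits) or not embedded_bits:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(embedded_bits, extracted_bits))
    return matches / len(embedded_bits)
```

The two metrics pull in opposite directions: embedding the watermark more strongly tends to raise extraction accuracy under noise but lowers PSNR.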