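Existing object detection algorithms applied to complex fire scenarios suffer from poor detection accuracy, slow speed, and difficult deployment. To address this, this paper proposes Light-YOLOv5, a lightweight fire detection algorithm that balances speed and accuracy. First, the last layer of the backbone network is replaced with a SepViT block to strengthen the backbone's connection to global information. Second, a lightweight BiFPN neck network is designed to lighten the model while improving feature extraction. Third, the Global Attention Mechanism (GAM) is fused into the network so that the model focuses more on global dimensional features. Finally, we use the Mish activation function and the SIoU loss to speed up convergence and improve accuracy at the same time. Experimental results show that, compared with the original algorithm, Light-YOLOv5 improves mAP by 3.3%, reduces the number of parameters by 27.1%, reduces computation by 19.1%, and reaches 91.1 FPS. Even compared with the latest YOLOv7-tiny, the mAP of Light-YOLOv5 is 6.8% higher, which demonstrates the effectiveness of the algorithm.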
Translated by Google Translate
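For readers unfamiliar with the Mish activation named in the abstract above, the following is a minimal sketch of its standard definition, mish(x) = x * tanh(softplus(x)). It is not taken from the Light-YOLOv5 code; the function names `softplus` and `mish` are chosen here for illustration only.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    for value in (-2.0, 0.0, 1.0, 5.0):
        print(value, mish(value))
```

Mish is smooth and close to the identity for large positive inputs, while slightly negative outputs are allowed for negative inputs.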
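Domain adaptation (DA) aims to transfer knowledge learned from a labeled source domain to an unlabeled or sparsely labeled but related target domain. Ideally, the source and target distributions should be aligned to each other equally to achieve unbiased knowledge transfer. However, because of the significant imbalance between the amounts of annotated data in the source and target domains, usually only the target distribution is aligned to the source domain, so unnecessary source-specific knowledge is adapted to the target domain, i.e., biased domain adaptation. To address this problem, in this work we model the uncertainty of the discriminator in adversarial-based DA methods to optimize unbiased transfer. We theoretically analyze the effectiveness of the proposed unbiased transferability learning method in DA. Furthermore, to alleviate the impact of imbalanced annotated data, we use the estimated uncertainty for pseudo-label selection of unlabeled samples in the target domain, which helps achieve better alignment of the marginal and conditional distributions between domains. Extensive experimental results on a variety of DA benchmark datasets show that the proposed method can be readily incorporated into various adversarial-based DA methods, achieving state-of-the-art performance.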
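Lightweight super-resolution (SR) models have received considerable attention for their usability on mobile devices. Many efforts employ network quantization to compress SR models. However, these methods suffer from severe performance degradation when SR models are quantized to ultra-low precision (e.g., 2-bit and 3-bit) with a low-cost layer-wise quantizer. In this paper, we identify that the performance drop comes from the contradiction between the layer-wise symmetric quantizer and the highly asymmetric activation distribution in SR models. This discrepancy leads to either wasted quantization levels or loss of detail in the reconstructed images. Therefore, we propose a novel activation quantizer, referred to as Dynamic Dual Trainable Bounds (DDTB), to accommodate the asymmetry of the activations. Specifically, DDTB innovates in: 1) a layer-wise quantizer with trainable upper and lower bounds to handle the highly asymmetric activations; 2) a dynamic gate controller that adaptively adjusts the upper and lower bounds at runtime to cope with activation ranges that vary drastically across samples. To reduce the extra overhead, the dynamic gate controller is quantized to 2-bit and applied to only part of the SR network according to the introduced dynamic intensity. Extensive experiments show that our DDTB achieves significant performance improvements at ultra-low precision. For example, when quantizing EDSR to 2-bit and scaling the output images up to x4, our DDTB achieves a 0.70 dB PSNR increase on the Urban100 benchmark. Code is at https://github.com/zysxmu/ddtb.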
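To make the idea of separate upper and lower bounds concrete, here is a minimal sketch of a generic asymmetric uniform quantizer. It is an illustration under stated assumptions, not the DDTB implementation: the name `quantize_asymmetric` and the bound values are hypothetical, the bounds are fixed numbers rather than trained parameters, and the dynamic gate controller is not modeled.

```python
def quantize_asymmetric(x: float, lower: float, upper: float, bits: int) -> float:
    """Clip x to [lower, upper] and snap it to one of 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1                  # number of steps between the bounds
    step = (upper - lower) / levels         # width of one quantization step
    clipped = min(max(x, lower), upper)     # clip to the (assumed) bounds
    index = round((clipped - lower) / step)  # integer code in [0, levels]
    return lower + index * step             # de-quantized value


if __name__ == "__main__":
    # Hypothetical asymmetric activation range: mostly positive values.
    lower, upper, bits = -0.5, 2.5, 2
    for value in (-3.0, -0.4, 0.4, 1.7, 5.0):
        print(value, quantize_asymmetric(value, lower, upper, bits))
```

With a symmetric range such as [-2.5, 2.5], half of the four 2-bit levels would cover negative values that rarely occur in this hypothetical range, which is the kind of waste the abstract describes.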
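This letter presents Meeting-Merging-Mission, a complete framework for multi-robot exploration under communication restrictions. Considering that real-world communication is limited in both bandwidth and range, we propose a lightweight environment representation method and an efficient cooperative exploration strategy. To cope with low bandwidth, each robot uses specific polytopes to maintain free space and super frontier information (SFI) as the source for exploration decision-making. To reduce repeated exploration, we develop a mission-based protocol that drives robots to share collected information in stable rendezvous. We also design a complete path planning scheme for both centralized and decentralized cases. To validate that our framework is practical and generic, we present an extensive benchmark and deploy our system on multi-UGV and multi-UAV platforms.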
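While post-training quantization is popular mostly because it avoids access to the original complete training dataset, its poor performance also stems from this limitation. To alleviate this limitation, in this paper we leverage the synthetic data introduced by zero-shot quantization together with the calibration dataset, and propose a fine-grained data distribution alignment (FDDA) method to boost the performance of post-training quantization. The method is based on two important properties of batch normalization statistics (BNS) that we observed in the deep layers of the trained network, i.e., inter-class separation and intra-class incohesion. To preserve this fine-grained distribution information: 1) we calculate the per-class BNS of the calibration dataset as the BNS center of each class and propose a BNS-centralized loss to force the synthetic data distributions of different classes to be close to their own centers; 2) we add Gaussian noise to the centers to imitate the incohesion and propose a BNS-distorted loss to force the synthetic data distribution of the same class to be close to the distorted centers. By introducing these two fine-grained losses, our method achieves state-of-the-art performance on ImageNet, especially when the first and last layers are also quantized to low-bit. Our project is available at https://github.com/zysxmu/fdda.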
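The two losses described above can be sketched roughly as follows. This is an illustrative sketch only: the abstract does not give the exact loss form, so the squared-distance penalty, the helper names (`bns`, `bns_distance`, `distort_center`), and the toy data are all assumptions made here, not the FDDA implementation.

```python
import random
from statistics import fmean, pvariance


def bns(features):
    """Per-channel mean and variance over a batch of feature vectors."""
    channels = list(zip(*features))
    return [fmean(c) for c in channels], [pvariance(c) for c in channels]


def bns_distance(stats, center):
    """Assumed loss form: squared distance between two (mean, variance) pairs."""
    (means, variances), (c_means, c_vars) = stats, center
    mean_term = sum((a - b) ** 2 for a, b in zip(means, c_means))
    var_term = sum((a - b) ** 2 for a, b in zip(variances, c_vars))
    return mean_term + var_term


def distort_center(center, sigma, rng):
    """Add Gaussian noise to a class center to imitate intra-class incohesion.

    Note: noise on the variance can make it negative; a real implementation
    would need to handle that.
    """
    means, variances = center
    return ([m + rng.gauss(0.0, sigma) for m in means],
            [v + rng.gauss(0.0, sigma) for v in variances])


if __name__ == "__main__":
    rng = random.Random(0)
    calibration_class_0 = [[1.0, 2.0], [3.0, 4.0]]  # toy features of one class
    synthetic_class_0 = [[2.0, 3.0], [2.0, 3.0]]    # toy synthetic features
    center = bns(calibration_class_0)
    print("centralized:", bns_distance(bns(synthetic_class_0), center))
    print("distorted:", bns_distance(bns(synthetic_class_0),
                                     distort_center(center, 0.1, rng)))
```

Computing one center per class keeps the classes separated, while the noisy centers keep samples of the same class from collapsing onto a single point.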
A recent study has shown a phenomenon called neural collapse in that the within-class means of features and the classifier weight vectors converge to the vertices of a simplex equiangular tight frame at the terminal phase of training for classification. In this paper, we explore the corresponding structures of the last-layer feature centers and classifiers in semantic segmentation. Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers. However, such a symmetric structure is beneficial to discrimination for the minor classes. To preserve these advantages, we introduce a regularizer on feature centers to encourage the network to learn features closer to the appealing structure in imbalanced semantic segmentation. Experimental results show that our method can bring significant improvements on both 2D and 3D semantic segmentation benchmarks. Moreover, our method ranks 1st and sets a new record (+6.8% mIoU) on the ScanNet200 test leaderboard. Code will be available at https://github.com/dvlab-research/Imbalanced-Learning.
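The simplex equiangular tight frame mentioned above has a simple closed form, which the sketch below constructs and checks numerically. It uses the standard construction sqrt(K/(K-1)) * (I - (1/K) * 1 1^T); the function names are illustrative and nothing here comes from the paper's code.

```python
import math


def simplex_etf(k: int):
    """Return k unit vectors in R^k whose pairwise cosine is -1/(k-1)."""
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    vectors = simplex_etf(4)
    print("norm^2:", dot(vectors[0], vectors[0]))
    print("cosine:", dot(vectors[0], vectors[1]), "expected:", -1 / 3)
```

Every pair of vectors has the same angle, and that angle is the largest one that K vectors can share, which is the "equiangular and maximally separated" structure the abstract refers to.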
Weakly-supervised object localization aims to indicate the category as well as the scope of an object in an image given only image-level labels. Most existing works are based on Class Activation Mapping (CAM) and endeavor to enlarge the discriminative area inside the activation map to perceive the whole object, yet they ignore the co-occurrence confounder of the object and context (e.g., fish and water), which makes it hard for the model to distinguish object boundaries. Besides, the use of CAM also brings a dilemma: classification and localization always suffer from a performance gap and cannot reach their highest accuracy simultaneously. In this paper, we propose a causal knowledge distillation method, dubbed KD-CI-CAM, to address these two under-explored issues in one go. More specifically, we tackle the co-occurrence context confounder problem via causal intervention (CI), which explores the causalities among image features, contexts, and categories to eliminate the biased object-context entanglement in the class activation maps. Based on the de-biased object feature, we additionally propose a multi-teacher causal distillation framework to balance the absorption of classification knowledge and localization knowledge during model training. Extensive experiments on several benchmarks demonstrate the effectiveness of KD-CI-CAM in learning clear object boundaries from confounding contexts and addressing the dilemma between classification and localization performance.
In the scenario of black-box adversarial attack, the target model's parameters are unknown, and the attacker aims to find a successful adversarial perturbation based on query feedback under a query budget. Due to the limited feedback information, existing query-based black-box attack methods often require many queries for attacking each benign example. To reduce query cost, we propose to utilize the feedback information across historical attacks, dubbed example-level adversarial transferability. Specifically, by treating the attack on each benign example as one task, we develop a meta-learning framework by training a meta-generator to produce perturbations conditioned on benign examples. When attacking a new benign example, the meta-generator can be quickly fine-tuned based on the feedback information of the new task as well as a few historical attacks to produce effective perturbations. Moreover, since the meta-training procedure consumes many queries to learn a generalizable generator, we utilize model-level adversarial transferability to train the meta-generator on a white-box surrogate model, then transfer it to help the attack against the target model. The proposed framework with the two types of adversarial transferability can be naturally combined with any off-the-shelf query-based attack method to boost its performance, which is verified by extensive experiments.
Although deep learning models have made remarkable progress in processing various types of data such as images, text and speech, they are known to be susceptible to adversarial perturbations: perturbations specifically designed and added to the input to make the target model produce erroneous output. Most of the existing studies on generating adversarial perturbations attempt to perturb the entire input indiscriminately. In this paper, we propose ExploreADV, a general and flexible adversarial attack system that is capable of modeling regional and imperceptible attacks, allowing users to explore various kinds of adversarial examples as needed. We adapt and combine two existing boundary attack methods, DeepFool and the Brendel & Bethge Attack, and propose a mask-constrained adversarial attack system, which generates minimal adversarial perturbations under pixel-level constraints, namely "mask-constraints". We study different ways of generating such mask-constraints considering the variance and importance of the input features, and show that our adversarial attack system offers users good flexibility to focus on sub-regions of inputs, explore imperceptible perturbations and understand the vulnerability of pixels/regions to adversarial attacks. We demonstrate that our system is effective through extensive experiments and a user study.
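The pixel-level mask constraint described above can be illustrated with a small sketch that restricts a given perturbation to the masked pixels. This shows only the constraint, not the attack itself (the DeepFool and Brendel & Bethge steps are not modeled); the function name, the binary mask convention, and the [0, 1] pixel range are assumptions made for illustration.

```python
def apply_masked_perturbation(image, delta, mask, low=0.0, high=1.0):
    """Add delta only where mask is 1, then clip pixels to [low, high].

    image, delta and mask are equally shaped 2-D lists; mask holds 0 or 1.
    """
    return [
        [min(max(p + m * d, low), high) for p, d, m in zip(row_p, row_d, row_m)]
        for row_p, row_d, row_m in zip(image, delta, mask)
    ]


if __name__ == "__main__":
    image = [[0.2, 0.9], [0.5, 0.5]]
    delta = [[0.3, 0.3], [0.3, -0.7]]
    mask = [[1, 1], [0, 1]]  # the bottom-left pixel must stay untouched
    print(apply_masked_perturbation(image, delta, mask))
```

Pixels outside the mask keep their original values, so the search for a minimal perturbation is confined to the user-selected region.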
Differentiable Architecture Search (DARTS) has attracted considerable attention as a gradient-based Neural Architecture Search (NAS) method. Since the introduction of DARTS, there has been little work done on adapting the action space based on state-of-the-art architecture design principles for CNNs. In this work, we aim to address this gap by incrementally augmenting the DARTS search space with micro-design changes inspired by ConvNeXt and studying the trade-off between accuracy, evaluation layer count, and computational cost. To this end, we introduce the Pseudo-Inverted Bottleneck conv block, intended to reduce the computational footprint of the inverted bottleneck block proposed in ConvNeXt. Our proposed architecture is much less sensitive to evaluation layer count and significantly outperforms a DARTS network of similar size at layer counts as small as 2. Furthermore, with fewer layers, not only does it achieve higher accuracy with lower GMACs and parameter count, but GradCAM comparisons also show that our network is able to better detect distinctive features of target objects compared to DARTS.