Building classification from satellite imagery is becoming increasingly important for applications such as damage assessment, resource allocation, and population estimation. In this work, we focus on building damage assessment (BDA) and building type classification (BTC) of residential versus non-residential buildings. We propose to rely solely on RGB satellite images and follow a two-stage deep-learning-based approach, in which building footprints are first extracted with a semantic segmentation model and the cropped images are then classified. Due to the lack of a suitable dataset for residential/non-residential building classification, we introduce a new dataset of high-resolution satellite images. We conduct extensive experiments to select the best hyperparameters, model architecture, and training paradigm, and we propose a new transfer-learning-based approach that outperforms classical methods. Finally, we validate the proposed approach on the two applications, showing superior accuracy and F1-score metrics.
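As a rough illustration of such a two-stage pipeline, the sketch below first predicts a footprint mask, splits it into individual buildings via connected components, and then classifies each crop. The names `seg_model` and `cls_model`, the thresholding, and the crop handling are assumptions for illustration, not the paper's exact procedure.

```python
# Minimal two-stage inference sketch (hypothetical models and parameters).
import torch
from scipy import ndimage


def classify_buildings(rgb_image, seg_model, cls_model, threshold=0.5, crop_size=224):
    """rgb_image: float tensor of shape (3, H, W) in [0, 1]."""
    with torch.no_grad():
        # Stage 1: footprint probability map, assumed output shape (1, 1, H, W).
        prob = torch.sigmoid(seg_model(rgb_image.unsqueeze(0)))[0, 0]
    mask = (prob > threshold).cpu().numpy()

    # Split the binary mask into individual footprints via connected components.
    labeled, _ = ndimage.label(mask)
    results = []
    for box in ndimage.find_objects(labeled):
        if box is None:
            continue
        ys, xs = box
        crop = rgb_image[:, ys, xs].unsqueeze(0)
        # Resize the crop to the classifier's expected input size.
        crop = torch.nn.functional.interpolate(
            crop, size=(crop_size, crop_size), mode="bilinear", align_corners=False)
        with torch.no_grad():
            # Stage 2: e.g. residential vs. non-residential, or a damage grade.
            logits = cls_model(crop)
        results.append(logits.argmax(dim=1).item())
    return results
```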
Semantic segmentation of buildings in satellite images using encoder-decoder convolutional neural networks already achieves relatively high pixel-wise metric scores. In this paper, we aim to exploit the power of fully convolutional neural networks for an instance segmentation task by adding extra output classes and applying a watershed post-processing technique, in order to obtain better object-wise metric results. We also show that CutMix data augmentation and the One-Cycle learning-rate policy are strong regularization methods that yield a better fit to the training data and improved performance. Furthermore, mixed-precision training provides the flexibility to experiment with larger networks and batch sizes while maintaining stability and convergence during training. We compare and report the effect of these additional changes throughout our pipeline, and finally provide a set of tuned hyperparameters that is shown to perform better.
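For concreteness, here is a hedged sketch of two of the training ingredients mentioned above, the One-Cycle learning-rate policy and mixed-precision training, in standard PyTorch. The model, data loader, loss, and hyperparameter values are placeholders; CutMix augmentation and the watershed post-processing are not shown.

```python
# Training-loop sketch: One-Cycle LR schedule + mixed-precision training.
import torch


def train(model, loader, epochs=50, max_lr=1e-3, device="cuda"):
    model.to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=max_lr)
    scheduler = torch.optim.lr_scheduler.OneCycleLR(
        optimizer, max_lr=max_lr, epochs=epochs, steps_per_epoch=len(loader))
    scaler = torch.cuda.amp.GradScaler()      # loss scaling for fp16 stability
    criterion = torch.nn.CrossEntropyLoss()   # per-pixel class labels assumed

    for _ in range(epochs):
        for images, targets in loader:
            images, targets = images.to(device), targets.to(device)
            optimizer.zero_grad()
            with torch.cuda.amp.autocast():   # forward pass in mixed precision
                loss = criterion(model(images), targets)
            scaler.scale(loss).backward()
            scaler.step(optimizer)
            scaler.update()
            scheduler.step()                  # One-Cycle steps once per batch
    return model
```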
Building segmentation is a fundamental task in earth observation and aerial image analysis. Most existing deep-learning-based algorithms in the literature can only be applied to imagery at a fixed or narrow range of spatial resolutions. In practical scenarios, users deal with a wide spectrum of image resolutions, so a given aerial image often needs to be re-sampled to match the spatial resolution of the dataset used to train the deep learning model. This, however, severely degrades the quality of the output segmentation masks. To address this issue, we propose Sci-Net, a scale-invariant neural network that can segment buildings in aerial images across different spatial resolutions. Specifically, we modify the U-Net architecture and fuse it with dense Atrous Spatial Pyramid Pooling (ASPP) to extract fine-grained multi-scale representations. We compare the proposed model against several state-of-the-art models on the Open Cities AI dataset and show that Sci-Net provides a steady improvement margin across all resolutions available in the dataset.
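A minimal ASPP block of the kind such an architecture might fuse into U-Net is sketched below; the dilation rates and layer layout are illustrative assumptions, not necessarily the authors' exact configuration.

```python
# Illustrative ASPP block: parallel atrous convolutions at several dilation
# rates capture multi-scale context from the same feature map.
import torch
import torch.nn as nn


class ASPP(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True))
            for r in rates])
        # 1x1 convolution merges the concatenated multi-scale branches.
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))
```

In a U-Net-style network, such a block is typically inserted at the bottleneck or alongside encoder features, so that context at several effective receptive fields is available regardless of the input's ground sampling distance.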
Simulation-based falsification is a practical testing method to increase confidence that the system will meet safety requirements. Because full-fidelity simulations can be computationally demanding, we investigate the use of simulators with different levels of fidelity. As a first step, we express the overall safety specification in terms of environmental parameters and structure this safety specification as an optimization problem. We propose a multi-fidelity falsification framework using Bayesian optimization, which is able to determine at which level of fidelity we should conduct a safety evaluation in addition to finding possible instances from the environment that cause the system to fail. This method allows us to automatically switch between inexpensive, inaccurate information from a low-fidelity simulator and expensive, accurate information from a high-fidelity simulator in a cost-effective way. Our experiments on various environments in simulation demonstrate that multi-fidelity Bayesian optimization has falsification performance comparable to single-fidelity Bayesian optimization but with much lower cost.
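A heavily simplified sketch of a cost-aware multi-fidelity search loop is given below, assuming a single Gaussian-process surrogate over (environment parameters, fidelity index) and a cost-weighted lower-confidence-bound rule for choosing the next simulation. The simulator interface, cost model, and acquisition rule are our assumptions, not the paper's method.

```python
# Toy multi-fidelity falsification loop (illustrative assumptions throughout).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor


def falsify(simulators, costs, param_dim, budget=100.0, n_candidates=512, rng=None):
    """simulators: callables (low -> high fidelity) mapping parameters to a
    safety margin (negative = failure); costs: per-call cost of each simulator."""
    if rng is None:
        rng = np.random.default_rng(0)
    X, y, spent = [], [], 0.0

    # Seed with a few cheap low-fidelity evaluations.
    for _ in range(5):
        p = rng.uniform(-1, 1, param_dim)
        X.append(np.append(p, 0))
        y.append(simulators[0](p))
        spent += costs[0]

    gp = GaussianProcessRegressor(normalize_y=True)
    while spent < budget:
        gp.fit(np.array(X), np.array(y))
        cand_p = rng.uniform(-1, 1, (n_candidates, param_dim))
        cand_f = rng.integers(0, len(simulators), n_candidates)
        cand = np.column_stack([cand_p, cand_f])
        mu, sigma = gp.predict(cand, return_std=True)
        # Cost-aware acquisition: most likely failure per unit simulation cost.
        score = (mu - 2.0 * sigma) + np.log([costs[f] for f in cand_f])
        best = int(np.argmin(score))
        p, f = cand_p[best], int(cand_f[best])
        margin = simulators[f](p)
        X.append(np.append(p, f))
        y.append(margin)
        spent += costs[f]
        if f == len(simulators) - 1 and margin < 0:
            return p  # counterexample confirmed at the highest fidelity
    return None
```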
Fine-tuning a Pre-trained Language Model (PLM) on a specific downstream task has been a well-known paradigm in Natural Language Processing. However, with the ever-growing size of PLMs, training the entire model on several downstream tasks becomes very expensive and resource-hungry. Recently, different Parameter Efficient Tuning (PET) techniques have been proposed to improve the efficiency of fine-tuning PLMs. One popular category of PET methods is the low-rank adaptation methods, which insert learnable truncated SVD modules into the original model either sequentially or in parallel. However, low-rank decomposition suffers from limited representation power. In this work, we address this problem using the Kronecker product instead of the low-rank representation. We introduce KronA, a Kronecker product-based adapter module for efficient fine-tuning of Transformer-based PLMs. We apply the proposed methods for fine-tuning T5 on the GLUE benchmark to show that incorporating the Kronecker-based modules can outperform state-of-the-art PET methods.
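To illustrate the idea of a Kronecker-product adapter (in the spirit of KronA, but with our own illustrative shapes and initialization), the sketch below augments a frozen linear layer with a trainable update Delta W = A ⊗ B:

```python
# Kronecker-product adapter sketch: the frozen weight of a linear layer is
# augmented by kron(A, B), where A and B are small trainable factors.
import torch
import torch.nn as nn


class KroneckerAdapterLinear(nn.Module):
    def __init__(self, base_linear: nn.Linear, a_out: int, a_in: int, scale: float = 1.0):
        super().__init__()
        out_f, in_f = base_linear.out_features, base_linear.in_features
        assert out_f % a_out == 0 and in_f % a_in == 0
        self.base = base_linear
        for p in self.base.parameters():    # keep the pretrained weights frozen
            p.requires_grad = False
        # Zero-initializing A makes the adapted layer start identical to the base.
        self.A = nn.Parameter(torch.zeros(a_out, a_in))
        self.B = nn.Parameter(torch.randn(out_f // a_out, in_f // a_in) * 0.01)
        self.scale = scale

    def forward(self, x):
        delta = torch.kron(self.A, self.B)  # shape (out_features, in_features)
        return self.base(x) + self.scale * nn.functional.linear(x, delta)
```

Only A and B are trained, so the number of tunable parameters depends on the chosen factor shapes rather than on the full weight matrix.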
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, while only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
AI has the potential to improve approaches to talent management by implementing advanced automation, thereby enabling dynamic prescriptions. This study aims to identify new requirements for developing AI-oriented artifacts that address talent management issues. The designed artifact, focused on enriching the interaction between professional assessment and planning attributes, is an intelligent employment automation solution for career guidance that relies primarily on a talent intelligence module and the individual's growth needs. A design science methodology was adopted for the experimental study with structured machine learning techniques, which forms the main element of the proposed integrated AI solution framework informed by Technology-Organization-Environment (TOE) theory.
Bayesian inference for high-dimensional inverse problems is challenged by the computational cost of the forward operator and the selection of a proper prior distribution. Amortized variational inference addresses these challenges, where a neural network is trained to approximate the posterior distribution over existing pairs of models and data. When fed previously unseen data and normally distributed latent samples as input, the pretrained deep neural network (a conditional normalizing flow in our case) yields posterior samples at virtually no cost. However, the accuracy of this approach depends on the availability of high-fidelity training data, which seldom exists in geophysical inverse problems due to the Earth's heterogeneous structure. In addition, accurate amortized variational inference requires the observed data to be drawn from the training data distribution. We therefore propose to increase the resilience of amortized variational inference by applying a physics-based correction to the latent distribution of the conditional normalizing flow. To accomplish this, instead of a standard Gaussian latent distribution, we parameterize the latent distribution by a Gaussian distribution with unknown mean and diagonal covariance. These unknown quantities are then estimated by minimizing the Kullback-Leibler divergence between the corrected posterior distribution and the true posterior distribution. While generic and applicable to other inverse problems, we show through a seismic imaging example that our correction step improves the robustness of amortized variational inference with respect to changes in the number of source experiments, in the noise variance, and in shifts of the prior distribution. This approach provides seismic images with limited artifacts and assesses their uncertainty at roughly the cost of five reverse-time migrations.
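Written schematically in our own notation (a sketch of the stated correction, not necessarily the authors' exact objective): let f_theta(z; y) denote the pretrained conditional normalizing flow that maps a latent sample z to a posterior sample given data y. The correction replaces the standard-normal latent by a Gaussian with unknown mean and diagonal covariance and fits those parameters by minimizing the KL divergence between the resulting pushforward distribution and the true posterior, which for a known likelihood and prior expands to

\[
\min_{\mu,\sigma}\ \mathrm{KL}\!\left(q_{\mu,\sigma}\,\middle\|\,p(\cdot\mid y)\right)
= \min_{\mu,\sigma}\
\mathbb{E}_{z\sim\mathcal{N}(\mu,\,\mathrm{diag}(\sigma^{2}))}
\Big[-\log p\!\left(y\mid f_{\theta}(z;y)\right)
     -\log p\!\left(f_{\theta}(z;y)\right)
     -\log\big|\det J_{f_{\theta}}(z)\big|\Big]
- \mathcal{H}\!\left[\mathcal{N}(\mu,\mathrm{diag}(\sigma^{2}))\right] + \mathrm{const},
\]

where J_{f_theta} is the Jacobian of the flow with respect to z and the constant absorbs the evidence term log p(y).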
Data labels in the security field are frequently noisy, limited, or biased towards a subset of the population. As a result, commonplace evaluation methods such as accuracy, precision, and recall metrics, or the analysis of performance curves computed from labeled datasets, do not provide sufficient confidence in the real-world performance of a machine learning (ML) model. This slows down the adoption of machine learning in the field. In today's industry, we rely on domain expertise and lengthy manual evaluations to build this confidence before shipping a new model for security applications. In this paper, we introduce Firenze, a novel framework for the comparative evaluation of ML models' performance using domain expertise, encoded into scalable functions called markers. We show that markers, computed and combined over samples in so-called regions of interest, can provide a robust estimate of real-world performance. Crucially, we use statistical hypothesis testing to ensure that observed differences, and therefore the conclusions drawn from our framework, are more prominent than what would be observable from noise alone. Using simulations and two real-world datasets for malware and domain-name reputation detection, we illustrate the effectiveness, limitations, and insights of our approach. Taken together, we propose Firenze as a resource for fast, interpretable, and collaborative model development and evaluation by mixed teams of researchers, domain experts, and business owners.
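As a toy illustration of combining a marker, a region of interest, and a statistical test (the specific marker and test below are our assumptions, not necessarily Firenze's design):

```python
# Compare two models' scores on a region of interest and only declare a
# difference if it survives a paired bootstrap test on the marker difference.
import numpy as np


def compare_on_region(scores_a, scores_b, region_mask, alpha=0.05, n_boot=2000, rng=None):
    """scores_*: per-sample scores from two models; region_mask: boolean array
    selecting the region of interest (e.g., samples vetted by domain experts)."""
    if rng is None:
        rng = np.random.default_rng(0)
    a = np.asarray(scores_a)[region_mask]
    b = np.asarray(scores_b)[region_mask]

    # Marker: mean score on the region of interest (higher assumed better here).
    marker_a, marker_b = a.mean(), b.mean()

    # Bootstrap confidence interval for the marker difference.
    n = len(a)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        diffs.append(a[idx].mean() - b[idx].mean())
    lo, hi = np.percentile(diffs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    significant = (lo > 0) or (hi < 0)  # interval excludes zero => beyond noise
    return {"marker_a": marker_a, "marker_b": marker_b, "significant": significant}
```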
Language models demonstrate both quantitative improvements and new qualitative capabilities as they increase in scale. Despite their potentially transformative impact, these new capabilities remain poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to billions of parameters. In addition, a team of human expert raters performed all tasks to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, albeit with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; and social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.