Semantic segmentation from aerial views is a vital task for autonomous drones, which require precise and accurate segmentation to traverse safely and efficiently. Segmenting images from aerial views is especially challenging, as they include diverse viewpoints, extreme scale variation, and high scene complexity. To address this problem, we propose an end-to-end multi-class semantic segmentation diffusion model. We introduce recursive denoising, which allows predicted error to propagate through the denoising process. In addition, we combine this with a hierarchical multi-scale approach complementary to the diffusion process. Our method achieves state-of-the-art results on UAVid and on the Vaihingen building segmentation benchmark.
Machine learning models are typically evaluated by computing similarity with reference annotations and trained by maximizing that similarity. Especially in the biomedical domain, annotations are subjective and suffer from low inter- and intra-rater reliability. Since annotations only reflect the annotating entity's interpretation of the real world, this can lead to sub-optimal predictions even though the model achieves high similarity scores. Here, the theoretical concept of Peak Ground Truth (PGT) is introduced. PGT marks the point beyond which an increase in similarity with the reference annotation stops translating into better Real World Model Performance (RWMP). Additionally, a quantitative technique to approximate PGT by computing inter- and intra-rater reliability is proposed. Finally, three categories of PGT-aware strategies for evaluating and improving model performance are reviewed.
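As a toy illustration of the reliability-based approximation described above, a minimal sketch could estimate a PGT-like ceiling from the agreement between two raters. Binary segmentation masks and Dice similarity are illustrative assumptions; the abstract does not fix the task or the similarity measure:

```python
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity between two binary masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

# Hypothetical annotations of the same image by two raters.
rater_1 = np.array([[0, 1, 1, 0],
                    [0, 1, 1, 0]], dtype=bool)
rater_2 = np.array([[0, 1, 1, 1],
                    [0, 0, 1, 0]], dtype=bool)

# The inter-rater Dice bounds how far model-vs-reference similarity
# remains meaningful: scores above this level may be fitting one
# rater's interpretation rather than improving real-world performance.
pgt_estimate = dice(rater_1, rater_2)
```

In practice one would average such pairwise scores over many images and rater pairs (and repeat ratings for intra-rater reliability).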
Accurate PhotoVoltaic (PV) power generation forecasting is vital for the efficient operation of Smart Grids. The automated design of such accurate forecasting models for individual PV plants includes two challenges: First, information about the PV mounting configuration (i.e. inclination and azimuth angles) is often missing. Second, for new PV plants, the amount of historical data available to train a forecasting model is limited (cold-start problem). We address these two challenges by proposing a new method for day-ahead PV power generation forecasts called AutoPV. AutoPV is a weighted ensemble of forecasting models that represent different PV mounting configurations. This representation is achieved by pre-training each forecasting model on a separate PV plant and by scaling the model's output with the peak power rating of the corresponding PV plant. To tackle the cold-start problem, we initially weight each forecasting model in the ensemble equally. To tackle the problem of missing information about the PV mounting configuration, we use new data that become available during operation to adapt the ensemble weights to minimize the forecasting error. AutoPV is advantageous because the unknown PV mounting configuration is implicitly reflected in the ensemble weights, and only the PV plant's peak power rating is required to re-scale the ensemble's output. AutoPV also makes it possible to represent PV plants with panels distributed across several roofs with varying alignments, as these mounting configurations can be reflected proportionally in the weighting. Additionally, the required computing memory is decoupled from the number of plants when scaling AutoPV to hundreds of PV plants, which is beneficial in Smart Grids with limited computing capabilities. For a real-world data set with 11 PV plants, the accuracy of AutoPV is comparable to a model trained on two years of data and outperforms an incrementally trained model.
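The ensemble mechanics described above can be sketched as follows. The abstract does not specify the weight-adaptation procedure, so the projected gradient descent, the learning rate, and all names are illustrative assumptions; only the equal cold-start weights and the peak-power rescaling come from the text:

```python
import numpy as np

def autopv_forecast(base_forecasts: np.ndarray,
                    weights: np.ndarray,
                    peak_power_kw: float) -> np.ndarray:
    """Weighted ensemble of normalized per-configuration forecasts
    (shape: n_models x horizon), rescaled by the target plant's
    peak power rating."""
    return peak_power_kw * base_forecasts.T @ weights

def adapt_weights(base_forecasts: np.ndarray,
                  observed: np.ndarray,
                  peak_power_kw: float,
                  lr: float = 0.01,
                  steps: int = 500) -> np.ndarray:
    """Adapt ensemble weights on data observed during operation by
    (hypothetical) projected gradient descent on the squared
    forecasting error."""
    n_models = base_forecasts.shape[0]
    w = np.full(n_models, 1.0 / n_models)  # cold start: equal weights
    for _ in range(steps):
        residual = autopv_forecast(base_forecasts, w, peak_power_kw) - observed
        grad = peak_power_kw * base_forecasts @ residual
        w = np.clip(w - lr * grad, 0.0, None)
        w /= w.sum()  # keep weights on the simplex
    return w
```

The unknown mounting configuration then shows up only in `w`; representing panels on several differently aligned roofs corresponds to several weights being simultaneously non-zero.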
Quantifying the perceptual similarity of two images is a long-standing problem in low-level computer vision. The natural image domain commonly relies on supervised learning, e.g., a pre-trained VGG, to obtain a latent representation. However, due to domain shift, pre-trained models from the natural image domain might not apply to other image domains, such as medical imaging. Notably, in medical imaging, evaluating perceptual similarity is exclusively performed by specialists trained extensively in diverse medical fields. Thus, medical imaging remains devoid of task-specific, objective perceptual measures. This work answers the question: Is it necessary to rely on supervised learning to obtain an effective representation that could measure perceptual similarity, or is self-supervision sufficient? To understand whether recent contrastive self-supervised representation (CSR) may come to the rescue, we start with natural images and systematically evaluate CSR as a metric across numerous contemporary architectures and tasks, and compare it with existing methods. We find that in the natural image domain, CSR behaves on par with the supervised one on several perceptual tests as a metric, and in the medical domain, CSR better quantifies perceptual similarity with respect to the experts' ratings. We also demonstrate that CSR can significantly improve image quality in two image synthesis tasks. Finally, our extensive results suggest that perceptuality is an emergent property of CSR, which can be adapted to many image domains without requiring annotations.
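A minimal sketch of using a learned representation as a perceptual metric. A fixed random projection stands in for a real contrastive self-supervised encoder (e.g., a SimCLR-trained network); the projection, dimensions, and choice of cosine distance are all illustrative assumptions:

```python
import numpy as np

def perceptual_distance(x: np.ndarray, y: np.ndarray, encode) -> float:
    """Perceptual distance as cosine distance between the latent codes
    produced by a (self-supervised) encoder."""
    fx, fy = encode(x), encode(y)
    cos = fx @ fy / (np.linalg.norm(fx) * np.linalg.norm(fy))
    return 1.0 - float(cos)

# Stand-in encoder: a fixed random projection of a flattened 64-pixel
# image to a 16-dim code. A trained CSR encoder would replace this.
rng = np.random.default_rng(0)
proj = rng.standard_normal((16, 64))
encode = lambda img: proj @ img.ravel()
```

The paper's question then becomes whether codes from a self-supervised `encode` rank image pairs as humans (or medical experts) do, without any annotation-driven training.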
Angluin's L* algorithm learns the minimal (complete) deterministic finite automaton (DFA) of a regular language using membership and equivalence queries. Its probably approximately correct (PAC) version replaces the equivalence queries with sufficiently many random membership queries so that the answers hold with high confidence. It can therefore be applied to any kind of (also non-regular) device and can be viewed as an algorithm that synthesizes an automaton abstracting the behavior of the device based on observations. Here, we are interested in Angluin's PAC learning algorithm applied to devices obtained from a DFA by introducing some noise. More precisely, we study whether Angluin's algorithm reduces the noise and produces a DFA closer to the original one than the noisy device is. We propose several ways to introduce noise: (1) the noisy device inverts the classification of a word w.r.t. the DFA with a small probability, (2) the noisy device modifies, with a small probability, the letters of a word before asking for its classification w.r.t. the DFA, and (3) the noisy device combines the classification of a word w.r.t. the DFA with its classification w.r.t. a counter automaton. Our experiments were performed on several hundred DFAs. Bluntly stated, our main contributions show that: (1) Angluin's algorithm behaves well whenever the noisy device is obtained by a random process, (2) but poorly with structured noise, and (3) randomness almost surely yields devices whose languages are no longer regular.
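Noise model (1) above can be sketched as follows. The dictionary-based DFA encoding and the function names are illustrative assumptions; only the behavior (invert the DFA's answer with small probability) comes from the abstract:

```python
import random

def run_dfa(dfa, word):
    """Return True iff the DFA accepts the word."""
    state = dfa["start"]
    for letter in word:
        state = dfa["delta"][(state, letter)]
    return state in dfa["accept"]

def noisy_output_device(dfa, p, rng=random):
    """Noise model (1): a device that answers membership queries like
    the DFA, but inverts each classification with probability p."""
    def classify(word):
        answer = run_dfa(dfa, word)
        return (not answer) if rng.random() < p else answer
    return classify
```

A PAC run of L* would then pose its membership queries to `classify` instead of the clean DFA, and the question studied is how close the learned automaton is to the original `dfa`.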
The availability of property data is one of the major bottlenecks in chemical process development, often requiring time-consuming and expensive experiments or limiting the design space to a small number of known molecules. This bottleneck has been the motivation behind the continuing development of predictive property models. For the property prediction of novel molecules, group contribution methods have been groundbreaking. More recently, machine learning has joined the more established property prediction models. However, even with recent successes, integrating physical constraints into machine learning models remains challenging. Physical constraints are vital to many thermodynamic properties, such as the Gibbs-Duhem relation, and introduce an additional layer of complexity into the prediction. Here, we introduce SPT-NRTL, a machine learning model that predicts thermodynamically consistent activity coefficients and provides NRTL parameters for easy use in process simulations. The results show that SPT-NRTL achieves higher accuracy than UNIFAC in the prediction of activity coefficients across all functional groups and is able to predict many vapor-liquid equilibria with near-experimental accuracy, as illustrated for exemplary mixtures such as those containing n-hexane. To ease the application of SPT-NRTL, NRTL parameters for 100 000 000 mixtures were calculated with SPT-NRTL and are provided online.
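For reference, the thermodynamic consistency constraint mentioned above is the Gibbs-Duhem relation, which for the activity coefficients $\gamma_i$ of a mixture with mole fractions $x_i$ requires

```latex
\sum_i x_i \, \mathrm{d}\ln\gamma_i = 0 \qquad (\text{const. } T,\, p)
```

A model that predicts each $\gamma_i$ independently can easily violate this relation; building it into the model is what makes the predicted activity coefficients thermodynamically consistent.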
Continual learning (CL, sometimes also termed incremental learning) is a flavor of machine learning in which the usual assumption of a fixed data distribution is relaxed or omitted. When naively applying, e.g., DNNs in CL problems, changes in the data distribution result in the so-called catastrophic forgetting (CF) effect: an abrupt loss of previously acquired knowledge. Although many significant contributions to enabling CL have been made in recent years, most works address supervised (classification) problems. This article reviews literature that studies CL in other settings, such as learning with reduced supervision, fully unsupervised learning, and reinforcement learning. Besides proposing a simple schema for classifying CL approaches w.r.t. their level of autonomy and supervision, we discuss the specific challenges associated with each setting and the potential contributions to the field of CL as a whole.
Accurate prediction of crop yield before harvest is crucial for crop logistics, market planning, and food distribution around the world. Yield prediction requires monitoring phenological and climatic characteristics over extended periods to model the complex relations involved in crop development. Remote sensing satellite images, provided by the various satellites circumnavigating the world, are a cheap and reliable way to obtain data for yield prediction. Currently, the field of yield prediction is dominated by deep learning approaches. While the accuracies reached with these approaches are promising, the amounts of data required and their "black-box" nature can limit the application of deep learning methods. The limitations can be overcome by proposing a pipeline that processes remote sensing images into feature-based representations, which allow the employment of Extreme Gradient Boosting (XGBoost) for yield prediction. A comparative evaluation on soybean yield prediction in the United States shows promising prediction accuracies compared to state-of-the-art yield prediction systems based on deep learning. Feature importances identify the near-infrared spectrum as an important feature in our models. The reported results hint at the capabilities of XGBoost for yield prediction and encourage future experiments with XGBoost for yield prediction on other crops around the world.
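The feature-based representation at the heart of such a pipeline could look roughly as follows. Per-band, per-time-step reflectance histograms are a common choice for yield prediction, but the abstract does not specify the exact features, so the binning scheme and names below are illustrative assumptions:

```python
import numpy as np

def histogram_features(image_series: np.ndarray,
                       n_bins: int = 32,
                       value_range=(0.0, 1.0)) -> np.ndarray:
    """Reduce a satellite image time series of shape
    (time, height, width, bands) to a compact tabular feature vector:
    per-time-step, per-band reflectance histograms, concatenated.
    Such features can be fed to gradient boosting (e.g. XGBoost)
    instead of feeding raw imagery to a deep network."""
    t, _, _, b = image_series.shape
    feats = []
    for step in range(t):
        for band in range(b):
            hist, _ = np.histogram(image_series[step, ..., band],
                                   bins=n_bins, range=value_range,
                                   density=True)
            feats.append(hist)
    return np.concatenate(feats)
```

One such vector per region and season, paired with the recorded yield, forms the training table for the boosted-tree regressor; feature importances over these columns then point back to specific bands, such as the near-infrared spectrum.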
Autoencoders, which consist of an encoder and a decoder, are widely used in machine learning to reduce the dimensionality of high-dimensional data. The encoder embeds the input data manifold into a lower-dimensional latent space, while the decoder represents the inverse map, providing a parametrization of the data manifold by the manifold in latent space. Good regularity and structure of the embedded manifold may substantially simplify further data-processing tasks such as cluster analysis or data interpolation. We propose and analyze a novel regularization for learning the encoder component of an autoencoder: a loss functional that prefers isometric, extrinsically flat embeddings and allows the encoder to be trained on its own. To perform the training, it is assumed that for nearby points on the input manifold, their local Riemannian distance and their local Riemannian average can be evaluated. The loss functional is computed via Monte Carlo integration, with different sampling strategies for pairs of points on the input manifold. Our main theorem identifies a geometric loss functional of the embedding map as the $\Gamma$-limit of the sampling-dependent loss functionals. Numerical tests, using image data that encode different explicitly given data manifolds, show that smooth manifold embeddings into latent space are obtained. Since extrinsic flatness is promoted, these embeddings are regular enough that linear interpolation in latent space can serve as a possible postprocessing.
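The sampling-based loss can be sketched as follows for the isometry term alone. The extrinsic-flatness term and the use of local Riemannian averages are omitted, and all names and the exact penalty form are illustrative assumptions:

```python
import numpy as np

def sampled_isometry_loss(encoder, pairs, riem_dist):
    """Monte Carlo estimate of an isometry-promoting loss: for sampled
    pairs of nearby points on the input manifold, penalize the squared
    mismatch between the Euclidean distance of their latent codes and
    their local Riemannian distance on the manifold."""
    total = 0.0
    for x, y in pairs:
        d_latent = np.linalg.norm(encoder(x) - encoder(y))
        total += (d_latent - riem_dist(x, y)) ** 2
    return total / len(pairs)
```

Training the encoder alone then amounts to minimizing this quantity over sampled pairs; the paper's $\Gamma$-convergence result concerns the limit of such sampling-dependent losses as a geometric functional of the embedding map.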
Imaging markers of cerebral small vessel disease provide valuable information on brain health, but their manual assessment is time-consuming and hampered by substantial intra- and inter-rater variability. Automated rating may benefit biomedical research as well as clinical assessment, but the diagnostic reliability of existing algorithms is unknown. Here, we present the \textit{VAscular Lesions DetectiOn and Segmentation} (\textit{Where is VALDO?}) challenge, which was run as a satellite event at the international conference on Medical Image Computing and Computer Aided Intervention (MICCAI) 2021. This challenge aimed to promote the development of methods for the automated detection and segmentation of small and sparse imaging markers of cerebral small vessel disease, namely enlarged perivascular spaces (EPVS) (Task 1), cerebral microbleeds (Task 2), and lacunes of presumed vascular origin (Task 3), while utilizing weak and noisy labels. Overall, 12 teams participated in the challenge, proposing solutions for one or more tasks (4 for Task 1 - EPVS, 9 for Task 2 - Microbleeds, and 6 for Task 3 - Lacunes). Multi-cohort data were used for both training and evaluation. Results showed a large variability in performance both across teams and across tasks, with promising results notably for Task 1 - EPVS and Task 2 - Microbleeds, and no practically useful results yet for Task 3 - Lacunes. The challenge also highlighted performance inconsistencies across cases that may deter use at the individual level, while still proving useful at a population level.