The French National Institute of Geographical and Forest Information (IGN) has the mission to document and measure land cover on French territory and provides reference geographical datasets, including high-resolution aerial images and topographic maps. The monitoring of land cover plays a crucial role in land management and planning initiatives, which can have significant socio-economic and environmental impact. Together with remote sensing technologies, artificial intelligence (AI) promises to become a powerful tool for determining land cover and its evolution. IGN is currently exploring the potential of AI in the production of high-resolution land-cover maps. Notably, deep learning methods are employed to obtain a semantic segmentation of aerial images. However, territories as large as France imply heterogeneous contexts: variations in landscapes and image acquisition make it challenging to provide uniform, reliable and accurate results across all of France. The FLAIR-one dataset presented is part of the dataset currently used at IGN to establish the French national reference land-cover map "Occupation du sol à grande échelle" (OCS-GE).
translated by 谷歌翻译
In this paper, we propose a general approach to define a many-valued preferential interpretation of gradual argumentation semantics. The approach allows for conditional reasoning over arguments and boolean combinations of arguments, with respect to a class of gradual semantics, through the verification of graded (strict or defeasible) implications over a preferential interpretation. As a proof of concept, in the finitely-valued case, an Answer Set Programming approach is proposed for conditional reasoning in a many-valued argumentation semantics of weighted argumentation graphs. The paper also develops and discusses a probabilistic semantics for gradual argumentation, which builds on the many-valued conditional semantics.
Existing analyses of neural network training often operate under the unrealistic assumption of an extremely small learning rate. This lies in stark contrast to practical wisdom and empirical studies, such as the work of J. Cohen et al. (ICLR 2021), which exhibit startling new phenomena (the "edge of stability" or "unstable convergence") and potential benefits for generalization in the large learning rate regime. Despite a flurry of recent works on this topic, however, the latter effect is still poorly understood. In this paper, we take a step towards understanding genuinely non-convex training dynamics with large learning rates by performing a detailed analysis of gradient descent for simplified models of two-layer neural networks. For these models, we provably establish the edge of stability phenomenon and discover a sharp phase transition for the step size below which the neural network fails to learn "threshold-like" neurons (i.e., neurons with a non-zero first-layer bias). This elucidates one possible mechanism by which the edge of stability can in fact lead to better generalization, as threshold neurons are basic building blocks with useful inductive bias for many tasks.
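As background for the "edge of stability" terminology, the classical stability threshold of gradient descent can be illustrated on a one-dimensional quadratic. This is a minimal sketch and not the simplified two-layer model analyzed in the paper: for f(x) = 0.5 * sharpness * x^2, each step multiplies x by (1 - lr * sharpness), so the iterates shrink only when lr < 2 / sharpness. The function name and the numeric values are illustrative assumptions.

```python
def gradient_descent_quadratic(sharpness, lr, x0=1.0, steps=50):
    """Run gradient descent on f(x) = 0.5 * sharpness * x**2 and return all iterates."""
    xs = [x0]
    for _ in range(steps):
        grad = sharpness * xs[-1]      # f'(x) = sharpness * x
        xs.append(xs[-1] - lr * grad)  # x <- (1 - lr * sharpness) * x
    return xs


sharpness = 4.0  # curvature of the toy objective; threshold is 2 / sharpness = 0.5
stable = gradient_descent_quadratic(sharpness, lr=0.4)    # below the threshold
unstable = gradient_descent_quadratic(sharpness, lr=0.6)  # above the threshold

print(abs(stable[-1]), abs(unstable[-1]))
```

With the step size below 2 / sharpness the iterates contract toward zero, while above it they oscillate with growing magnitude.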
We introduce the XPER (eXplainable PERformance) methodology to measure the specific contribution of the input features to the predictive or economic performance of a model. Our methodology offers several advantages. First, it is both model-agnostic and performance metric-agnostic. Second, XPER is theoretically founded as it is based on Shapley values. Third, the interpretation of the benchmark, which is inherent in any Shapley value decomposition, is meaningful in our context. Fourth, XPER is not plagued by model specification error, as it does not require re-estimating the model. Fifth, it can be implemented either at the model level or at the individual level. In an application based on auto loans, we find that performance can be explained by a surprisingly small number of features. XPER decompositions are rather stable across metrics, yet some feature contributions switch sign across metrics. Our analysis also shows that explaining model forecasts and model performance are two distinct tasks.
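Since the abstract states that XPER is based on Shapley values, a generic exact Shapley decomposition of a performance number may help fix ideas. This is a sketch under stated assumptions: `toy_performance` is a hypothetical value function that returns a made-up performance score for each coalition of features, and it does not reproduce how XPER defines its benchmark or evaluates coalitions.

```python
from itertools import combinations
from math import factorial


def shapley_values(n_features, value):
    """Exact Shapley values for a set function value(frozenset) -> float."""
    phi = [0.0] * n_features
    for j in range(n_features):
        others = [k for k in range(n_features) if k != j]
        for size in range(len(others) + 1):
            # Standard Shapley weight for coalitions of this size.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[j] += weight * (value(s | {j}) - value(s))
    return phi


def toy_performance(coalition):
    """Hypothetical performance (e.g. an AUC-like score) using only `coalition`."""
    score = 0.5                      # benchmark: performance with no features
    score += 0.20 if 0 in coalition else 0.0
    score += 0.10 if 1 in coalition else 0.0
    score += 0.04 if {0, 1} <= coalition else 0.0  # interaction of features 0 and 1
    return score                     # feature 2 never contributes


phi = shapley_values(3, toy_performance)
print(phi)
```

The contributions sum to the gap between the full-model performance and the benchmark, which is the efficiency property that makes such a decomposition interpretable.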
We introduce a parametric view of non-local two-step denoisers, for which BM3D is a major representative, where quadratic risk minimization is leveraged for unsupervised optimization. Within this paradigm, we propose to extend the underlying mathematical parametric formulation by iteration. This generalization can be expected to further improve the denoising performance, which is somewhat curbed by the impracticality of repeating the second stage for all two-step denoisers. The resulting formulation involves estimating an even larger number of parameters in an unsupervised manner, which is all the more challenging. Focusing on the parameterized form of NL-Ridge, the simplest but also most efficient non-local two-step denoiser, we propose a progressive scheme to approximate the parameters minimizing the risk. In the end, the denoised images are made up of iterative linear combinations of patches. Experiments on artificially noisy images as well as on real-world noisy images demonstrate that our method compares favorably with the very best unsupervised denoisers such as WNNM, outperforming recent deep-learning-based approaches while being much faster.
The detection of state-sponsored trolls acting in information operations is an unsolved and critical challenge for the research community, with repercussions that go beyond the online realm. In this paper, we propose a novel AI-based solution for the detection of state-sponsored troll accounts, which consists of two steps. The first step aims at classifying trajectories of accounts' online activities as belonging to either a state-sponsored troll or an organic user account. In the second step, we exploit the classified trajectories to compute a metric, namely the "troll score", which allows us to quantify the extent to which an account behaves like a state-sponsored troll. As a case study, we consider the troll accounts involved in the Russian interference campaign during the 2016 US Presidential election, identified as Russian trolls by the US Congress. Experimental results show that our approach identifies accounts' trajectories with an AUC close to 99% and, accordingly, classifies Russian trolls and organic users with an AUC of 97%. Finally, we evaluate whether the proposed solution can be generalized to different contexts (e.g., discussions about Covid-19) and generic misbehaving users, showing promising results that will be further expanded in our future endeavors.
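The second step described above can be sketched as follows. The abstract does not give the formula for the troll score, so this sketch assumes, purely for illustration, that the score is the share of an account's trajectories that the first-step classifier labelled as troll-like; the function names and the 0.5 threshold are hypothetical.

```python
def troll_score(trajectory_labels):
    """Share of an account's trajectories labelled troll-like (1) rather than organic (0).

    Assumption: the score is a simple fraction; the paper's definition may differ.
    """
    if not trajectory_labels:
        raise ValueError("account has no classified trajectories")
    return sum(trajectory_labels) / len(trajectory_labels)


def classify_account(trajectory_labels, threshold=0.5):
    """Label an account by thresholding its troll score (threshold is hypothetical)."""
    return "troll" if troll_score(trajectory_labels) >= threshold else "organic"


print(troll_score([1, 1, 0, 1]), classify_account([0, 0, 1, 0]))
```

Aggregating per-trajectory decisions in this way turns a trajectory-level classifier into an account-level one, which is the general idea the abstract describes.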
Reflection high-energy electron diffraction (RHEED) is a powerful tool in molecular beam epitaxy (MBE), but RHEED images are often difficult to interpret, requiring experienced operators. We present an approach for automated surveillance of GaAs substrate deoxidation in MBE reactors using deep-learning-based RHEED image-sequence classification. Our approach consists of an unsupervised auto-encoder (AE) for feature extraction, combined with a supervised convolutional classifier network. We demonstrate that our lightweight network model can accurately identify the exact deoxidation moment. Furthermore, we show that the approach is very robust and allows accurate deoxidation detection over several months without requiring re-training. The main advantage of the approach is that it can be applied to raw RHEED images without requiring further information such as the rotation angle, temperature, etc.
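In recent years, a growing body of work has addressed how to learn machine learning models under fairness constraints, often expressed with respect to some sensitive attributes. In this work, we consider the setting in which an adversary has black-box access to a target model, and we show that the adversary can exploit information about the model's fairness to enhance the reconstruction of the sensitive attributes of the training data. More precisely, we propose a generic reconstruction correction method, which takes as input an initial guess made by the adversary and corrects it to comply with some user-defined constraints (such as the fairness information) while minimizing the changes to the adversary's guess. The proposed method is agnostic to the type of target model, the fairness-aware learning method and the auxiliary knowledge of the adversary. To assess the applicability of our approach, we conducted a thorough experimental evaluation on two state-of-the-art fair learning methods, using four different fairness metrics with a wide range of tolerances and three datasets of diverse sizes and sensitive attributes. The experimental results demonstrate the effectiveness of the proposed approach in improving the reconstruction of the sensitive attributes of the training set.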
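The idea of correcting an initial guess so that it satisfies a fairness constraint with as few changes as possible can be sketched on a toy example. This is an illustrative assumption, not the paper's formulation: the constraint is taken to be a statistical parity gap within a given tolerance, the sensitive attribute is binary, and the search is a brute force over flips that is only practical for very small inputs. All names are hypothetical.

```python
from itertools import combinations


def parity_gap(predictions, groups):
    """Absolute difference in positive-prediction rate between groups 0 and 1."""
    rates = []
    for g in (0, 1):
        members = [p for p, a in zip(predictions, groups) if a == g]
        if not members:
            return float("inf")  # an empty group makes the gap undefined
        rates.append(sum(members) / len(members))
    return abs(rates[0] - rates[1])


def correct_guess(guess, predictions, tolerance):
    """Return a corrected guess with the fewest flips whose parity gap is <= tolerance."""
    n = len(guess)
    for n_flips in range(n + 1):          # try 0 flips, then 1, then 2, ...
        for flipped in combinations(range(n), n_flips):
            candidate = list(guess)
            for i in flipped:
                candidate[i] = 1 - candidate[i]
            if parity_gap(predictions, candidate) <= tolerance:
                return candidate
    return None                           # no assignment satisfies the constraint


predictions = [1, 1, 1, 0, 0, 0]  # black-box model outputs on the training points
guess = [1, 1, 1, 0, 0, 0]        # adversary's initial guess of the sensitive attribute
corrected = correct_guess(guess, predictions, tolerance=0.5)
print(corrected)
```

Because the search enumerates candidates by increasing number of flips, the first feasible candidate is guaranteed to be one with the minimum number of changes to the initial guess.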
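To mitigate the effects of undesired biases in models, several approaches propose to pre-process the input dataset to reduce the risk of discrimination by preventing the inference of sensitive attributes. Unfortunately, most of these pre-processing methods lead to the generation of a new distribution that is very different from the original one, and thus often to unrealistic data. As a side effect, this new data distribution implies that existing models need to be re-trained to make accurate predictions. To address this issue, we propose a novel pre-processing method based on the transformation of the distribution of protected groups onto a chosen target one, with additional privacy constraints whose objective is to prevent the inference of sensitive attributes. More precisely, we leverage recent work on the Wasserstein GAN and AttGAN frameworks to achieve the optimal transport of data points, coupled with a discriminator enforcing the protection against attribute inference. Our proposed approach preserves the interpretability of the data and can be used without exactly defining the sensitive groups. In addition, our approach can be specialized to model existing state-of-the-art approaches, thus offering a unifying view of these methods. Finally, several experiments on real and synthetic datasets demonstrate that our approach is able to hide the sensitive attributes while limiting the distortion of the data and improving fairness on subsequent data analysis tasks.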
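Recent works have shown the feasibility of GAN-based morphing attacks that reach success rates similar to those of landmark-based methods. This new type of "deep" morph may require the development of new, adequate detectors to protect face recognition systems. We explore simple deep morph detection baselines based on spectral features and LBP histogram features, as well as on CNN models, in both the intra-dataset and cross-dataset settings. We observe that simple LBP-based systems are already quite accurate in the intra-dataset setting but struggle with generalization, a phenomenon that is partially mitigated by fusing several of those systems at the score level. We conclude that a ResNet that is effective for GAN image detection is the most effective overall, reaching perfect accuracy. We note, however, that LBP-based systems remain of interest: in addition to their lower computational requirements and increased interpretability with respect to CNNs, LBP+ResNet fusions sometimes show improved performance over ResNet alone, hinting that LBP-based systems can focus on meaningful signals that are not necessarily picked up by the CNN detector.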
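The two ingredients named above, LBP histogram features and score-level fusion, can be sketched in plain Python. This is a minimal illustration under stated assumptions: it uses the basic 3x3 LBP operator with 256 bins and fuses detector scores by simple averaging, whereas the LBP variant, the classifiers and the fusion rule used in the paper are not specified in the abstract. The function names are hypothetical.

```python
# Offsets of the 8 neighbours, clockwise starting from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(image):
    """Normalised 256-bin histogram of basic 3x3 LBP codes.

    `image` is a 2-D list of grey levels; border pixels are skipped.
    """
    rows, cols = len(image), len(image[0])
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                # Set the bit when the neighbour is at least as bright as the centre.
                if image[r + dr][c + dc] >= image[r][c]:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def fuse_scores(scores):
    """Score-level fusion by averaging per-detector scores (assumed to lie in [0, 1])."""
    return sum(scores) / len(scores)


flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
peak = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
print(lbp_histogram(flat)[255], lbp_histogram(peak)[0], fuse_scores([0.2, 0.8]))
```

In practice the histogram would feed a classifier that outputs a morph score, and the fused score would combine several such classifiers, or an LBP-based system and a CNN.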