This paper presents a subsampling-task paradigm for data-driven task-specific experiment design (ED) and a novel method in populationwide supervised feature selection (FS). Optimal ED, the choice of sampling points under constraints of limited acquisition-time, arises in a wide variety of scientific and engineering contexts. However the continuous optimization used in classical approaches depend on a-priori parameter choices and challenging non-convex optimization landscapes. This paper proposes to replace this strategy with a subsampling-task paradigm, analogous to populationwide supervised FS. In particular, we introduce JOFSTO, which performs JOint Feature Selection and Task Optimization. JOFSTO jointly optimizes two coupled networks: one for feature scoring, which provides the ED, the other for execution of a downstream task or process. Unlike most FS problems, e.g. selecting protein expressions for classification, ED problems typically select from highly correlated globally informative candidates rather than seeking a small number of highly informative features among many uninformative features. JOFSTO's construction efficiently identifies potentially correlated, but effective subsets and returns a trained task network. We demonstrate the approach using parameter estimation and mapping problems in clinically-relevant applications in quantitative MRI and in hyperspectral imaging. Results from simulations and empirical data show the subsampling-task paradigm strongly outperforms classical ED, and within our paradigm, JOFSTO outperforms state-of-the-art supervised FS techniques. JOFSTO extends immediately to wider image-based ED problems and other scenarios where the design must be specified globally across large numbers of acquisitions. Code will be released.
translated by 谷歌翻译
我们提出了Prosub:渐进式采样,这是一种基于深度学习的自动化方法,该方法是一个过采样的数据集(例如,多通道的3D图像),信息损失最小。我们以最近的双NETWORK方法为基础,该方法赢得了MICCAI多扩散(MUDI)定量MRI测量测量取样重建挑战,但通过在艰难的决策边界进行下采样,遭受了深度学习训练的不稳定。 Prosub使用递归功能消除(RFE)的范式,并在深度学习训练期间逐步进行亚子样本测量,从而提高优化稳定性。 Prosub还集成了神经体系结构搜索(NAS)范式,从而允许网络体系结构超参数响应亚群采样过程。我们显示,Prosub优于Mudi Miccai挑战的获胜者,在MUDI挑战子任务和对临床应用有用的下游过程的定性改进方面产生了> 18%的MSE。我们还展示了合并NAS并分析Prosub组件的效果的好处。由于我们的方法概括了除MRI测量选择重建之外的其他问题,因此我们的代码是https://github.com/sbb-gh/prosub
translated by 谷歌翻译
大型视力模型的无监督预训练方法已显示出可以提高下游监督任务的性能。为卫星图像开发类似的技术带来了重要的机会,因为未标记的数据很丰富,并且固有的时间和多光谱结构提供了途径,以进一步改善现有的训练策略。在本文中,我们提出了Satmae,这是基于蒙面自动编码器(MAE)的时间或多光谱卫星图像的预训练框架。为了利用时间信息,我们包括一个时间嵌入以及跨时间独立掩盖图像贴片。此外,我们证明将多光谱数据编码为具有不同光谱位置编码的频段组是有益的。我们的方法在基准数据集(最高$ \ uparrow $ 7 \%)上的监督学习绩效方面都对先前最先前的技术产生了强大的改进,以及在下游遥感任务(包括土地)上的转移学习绩效封面分类(最多$ \ uparrow $ 14 \%)和语义细分。
translated by 谷歌翻译
难以理解的AI系统很难信任,尤其是当它们在自动驾驶(例如自动驾驶)等安全环境中运行时。因此,有必要建立透明且可查询的系统以提高信任水平。我们提出了一种基于现有的称为IGP2的现有白盒系统的自动驾驶汽车运动计划和预测的透明,以人为中心的解释生成方法。我们的方法将贝叶斯网络与无上下文生成规则相结合,并可以为自动驾驶汽车的高级驾驶行为提供因果自然语言解释。对模拟方案的初步测试表明,我们的方法捕获了自动驾驶汽车行动背后的原因,并产生了具有不同复杂性的可理解解释。
translated by 谷歌翻译
语言模型既展示了定量的改进,又展示了新的定性功能,随着规模的增加。尽管它们具有潜在的变革性影响,但这些新能力的特征却很差。为了为未来的研究提供信息,为破坏性的新模型能力做准备,并改善社会有害的效果,至关重要的是,我们必须了解目前和近乎未来的能力和语言模型的局限性。为了应对这一挑战,我们介绍了超越模仿游戏基准(Big Bench)。 Big Bench目前由204个任务组成,由132家机构的442位作者贡献。任务主题是多样的,从语言学,儿童发展,数学,常识性推理,生物学,物理学,社会偏见,软件开发等等。 Big-Bench专注于被认为超出当前语言模型的功能的任务。我们评估了OpenAI的GPT型号,Google内部密集变压器体系结构和大型基础上的开关稀疏变压器的行为,跨越了数百万到数十亿个参数。此外,一个人类专家评估者团队执行了所有任务,以提供强大的基准。研究结果包括:模型性能和校准都随规模改善,但绝对的术语(以及与评估者的性能相比);在模型类中的性能非常相似,尽管带有稀疏性。逐渐和预测的任务通常涉及大量知识或记忆成分,而在临界规模上表现出“突破性”行为的任务通常涉及多个步骤或组成部分或脆性指标;社交偏见通常会随着含糊不清的环境而随着规模而增加,但这可以通过提示来改善。
translated by 谷歌翻译
对联合国可持续发展目标的进展(SDGS)因关键环境和社会经济指标缺乏数据而受到阻碍,其中历史上有稀疏时间和空间覆盖率的地面调查。机器学习的最新进展使得可以利用丰富,频繁更新和全球可用的数据,例如卫星或社交媒体,以向SDGS提供洞察力。尽管有希望的早期结果,但到目前为止使用此类SDG测量数据的方法在很大程度上在不同的数据集或使用不一致的评估指标上进行了评估,使得难以理解的性能是改善,并且额外研究将是最丰富的。此外,处理卫星和地面调查数据需要域知识,其中许多机器学习群落缺乏。在本文中,我们介绍了3个SDG的3个基准任务的集合,包括与经济发展,农业,健康,教育,水和卫生,气候行动和陆地生命相关的任务。 15个任务中的11个数据集首次公开发布。我们为Acceptandbench的目标是(1)降低机器学习界的进入的障碍,以促进衡量和实现SDGS; (2)提供标准基准,用于评估各种SDG的任务的机器学习模型; (3)鼓励开发新颖的机器学习方法,改进的模型性能促进了对SDG的进展。
translated by 谷歌翻译
高分辨率卫星图像已证明是可用于广泛的任务,包括衡量全球人口,当地经济生计和生物多样性,其中许多其他任务。不幸的是,高分辨率图像既不经常收集,购买昂贵,难以高效,有效地缩放这些下游任务在两次和空间。我们提出了一种新的条件像素综合模型,它使用丰富,低成本,低分辨率的图像,在位置和时间内产生准确的高分辨率图像。我们表明,我们的模型在钥匙下游任务 - 对象计数上达到了照片 - 现实的样本质量和竞争基线的竞争基线 - 特别是在地面上的条件正在快速变化的地理位置中。
translated by 谷歌翻译
It is well known that conservative mechanical systems exhibit local oscillatory behaviours due to their elastic and gravitational potentials, which completely characterise these periodic motions together with the inertial properties of the system. The classification of these periodic behaviours and their geometric characterisation are in an on-going secular debate, which recently led to the so-called eigenmanifold theory. The eigenmanifold characterises nonlinear oscillations as a generalisation of linear eigenspaces. With the motivation of performing periodic tasks efficiently, we use tools coming from this theory to construct an optimization problem aimed at inducing desired closed-loop oscillations through a state feedback law. We solve the constructed optimization problem via gradient-descent methods involving neural networks. Extensive simulations show the validity of the approach.
translated by 谷歌翻译
Detecting anomalous data within time series is a very relevant task in pattern recognition and machine learning, with many possible applications that range from disease prevention in medicine, e.g., detecting early alterations of the health status before it can clearly be defined as "illness" up to monitoring industrial plants. Regarding this latter application, detecting anomalies in an industrial plant's status firstly prevents serious damages that would require a long interruption of the production process. Secondly, it permits optimal scheduling of maintenance interventions by limiting them to urgent situations. At the same time, they typically follow a fixed prudential schedule according to which components are substituted well before the end of their expected lifetime. This paper describes a case study regarding the monitoring of the status of Laser-guided Vehicles (LGVs) batteries, on which we worked as our contribution to project SUPER (Supercomputing Unified Platform, Emilia Romagna) aimed at establishing and demonstrating a regional High-Performance Computing platform that is going to represent the main Italian supercomputing environment for both computing power and data volume.
translated by 谷歌翻译
Three main points: 1. Data Science (DS) will be increasingly important to heliophysics; 2. Methods of heliophysics science discovery will continually evolve, requiring the use of learning technologies [e.g., machine learning (ML)] that are applied rigorously and that are capable of supporting discovery; and 3. To grow with the pace of data, technology, and workforce changes, heliophysics requires a new approach to the representation of knowledge.
translated by 谷歌翻译