Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
We develop a rigorous mathematical analysis of zero-shot learning with attributes. In this setting, the goal is to label novel classes with no training data, only detectors for attributes and a description of how those attributes are correlated with the target classes, called the class-attribute matrix. We develop the first non-trivial lower bound on the worst-case error of the best map from attributes to classes for this setting, even with perfect attribute detectors. The lower bound characterizes the theoretical intrinsic difficulty of the zero-shot problem based on the available information -- the class-attribute matrix -- and the bound is practically computable from it. Our lower bound is tight, as we show that we can always find a randomized map from attributes to classes whose expected error is upper bounded by the value of the lower bound. We show that our analysis can be predictive of how standard zero-shot methods behave in practice, including which classes will likely be confused with others.
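As a rough illustration of the setting described above, the sketch below maps a perfectly detected attribute vector to a class by looking up the closest row of a class-attribute matrix. The toy matrix, the attribute meanings, and the nearest-row (Hamming distance) rule are all assumptions made for illustration; this is not the paper's randomized map or its lower bound. It only shows why classes whose rows are close are the ones most likely to be confused.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical class-attribute matrix: one binary row per unseen class.
# Columns (illustrative only): striped, four-legged, flies.
CLASS_ATTRIBUTES: Dict[str, List[int]] = {
    "zebra": [1, 1, 0],
    "horse": [0, 1, 0],
    "eagle": [0, 0, 1],
}


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions where two attribute vectors disagree."""
    return sum(x != y for x, y in zip(a, b))


def predict_class(attributes: Sequence[int],
                  matrix: Dict[str, List[int]]) -> str:
    """Return the class whose attribute row is closest to the detected attributes."""
    return min(matrix, key=lambda name: hamming(attributes, matrix[name]))


def pairwise_distances(matrix: Dict[str, List[int]]) -> Dict[Tuple[str, str], int]:
    """Hamming distance between every pair of class rows (names sorted)."""
    names = sorted(matrix)
    return {
        (a, b): hamming(matrix[a], matrix[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


if __name__ == "__main__":
    print(predict_class([1, 1, 0], CLASS_ATTRIBUTES))
    # The pair with the smallest distance is the most confusable one.
    distances = pairwise_distances(CLASS_ATTRIBUTES)
    print(min(distances, key=distances.get))
```

In this toy matrix, "horse" and "zebra" differ in a single attribute, so one detector error is enough to swap them, whereas "eagle" is separated from both by at least two attributes.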
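As post hoc explanation methods are increasingly leveraged to explain complex models in high-stakes settings, it becomes critical to ensure that the quality of the resulting explanations is consistently high across population subgroups, including minority groups. For instance, explanations associated with instances belonging to a particular gender subgroup (e.g., female) should not be less accurate than those associated with other genders. However, there is little to no research assessing whether such group-based disparities exist in the quality of explanations output by state-of-the-art explanation methods. In this work, we address this gap by initiating the study of identifying group-based disparities in explanation quality. To this end, we first outline the key properties that constitute explanation quality and where disparities can be particularly problematic. We then leverage these properties to propose a novel evaluation framework that can quantitatively measure disparities in the quality of explanations output by state-of-the-art methods. Using this framework, we carry out a rigorous empirical analysis to understand whether group-based disparities in explanation quality arise. Our results indicate that such disparities are more likely to occur when the models being explained are complex and highly non-linear. In addition, we observe that certain post hoc explanation methods (e.g., Integrated Gradients, SHAP) are more likely to exhibit these disparities. To the best of our knowledge, this work is the first to highlight and study the problem of group-based disparities in explanation quality. In doing so, our work sheds light on previously unexplored ways in which explanation methods may introduce unfairness in real-world decision making.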
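We introduce compositional soft prompting (CSP), a parameter-efficient learning technique to improve the zero-shot compositionality of large-scale pretrained vision-language models (VLMs). VLMs can represent arbitrary classes as natural language prompts in their flexible text encoders, but they underperform on compositional zero-shot benchmark tasks. To improve VLMs, we propose a novel form of soft prompting. We treat the attributes and objects that are composed to define classes as learnable tokens of vocabulary and tune them on multiple prompt compositions. During inference, we recompose the learned attribute-object vocabulary in new combinations. We show that CSP outperforms the original VLM on benchmark datasets by an average of 10.9 percentage points on AUC. CSP also outperforms CoOp, a soft prompting method that tunes the prefix context, by an average of 5.8 percentage points on AUC. We perform additional experiments to show that CSP improves generalization to attribute-only classification, higher-order attribute-attribute-object compositions, and combinations of pretrained attributes and fine-tuned objects.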
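Machine learning practitioners often have access to a spectrum of data: labeled data for the target task (which is often limited), unlabeled data, and auxiliary data, that is, the many available labeled datasets for other tasks. We describe TAGLETS, a system built to study techniques for automatically exploiting all three types of data and creating high-quality, servable classifiers. The key components of TAGLETS are: (1) auxiliary data organized according to a knowledge graph, (2) modules encapsulating different methods for exploiting auxiliary and unlabeled data, and (3) a distillation stage in which the ensembled modules are combined into a servable model. We compare TAGLETS with state-of-the-art transfer learning and semi-supervised learning methods on four image classification tasks. Our study covers a range of settings, varying the amount of labeled data and the semantic relatedness of the auxiliary data to the target task. We find that the intelligent incorporation of auxiliary and unlabeled data into multiple learning techniques enables TAGLETS to match, and most often surpass, these alternatives. TAGLETS is available as an open-source system at Github.com/batsresearch/taglet.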
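Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that this is a consequence of implicit multitask learning in language model pretraining (Radford et al., 2019). Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping any natural language task into a human-readable prompted form. We convert a large set of supervised datasets, each with multiple prompts with diverse wording. These prompted datasets allow for benchmarking the ability of a model to perform completely unseen tasks. We fine-tune a pretrained encoder-decoder model (Raffel et al., 2020; Lester et al., 2021) on this multitask mixture covering a wide variety of tasks. The model attains strong zero-shot performance on several standard datasets, often outperforming models up to 16 times its size. Further, our approach attains strong performance on a subset of tasks from the BIG-bench benchmark, outperforming models up to 6 times its size. All prompts and trained models are available at https://github.com/bigscience-workshop/promptsource and https://huggingface.co/bigscience/t0pp.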
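Zero-shot learning relies on semantic class representations, such as hand-engineered attributes or learned embeddings, to predict classes without any labeled examples. We propose to learn class representations by embedding nodes from common sense knowledge graphs in a vector space. Common sense knowledge graphs are an untapped source of explicit high-level knowledge that requires little human effort to apply to a range of tasks. To capture the knowledge in the graph, we introduce ZSL-KG, a general-purpose framework with a novel transformer graph convolutional network (TRGCN) for generating class representations. Our proposed TRGCN architecture computes non-linear combinations of node neighbourhoods. Our results show that ZSL-KG improves over existing WordNet-based methods on five out of six zero-shot benchmark datasets in language and vision.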
Labeling training data is increasingly the largest bottleneck in deploying machine learning systems. We present Snorkel, a first-of-its-kind system that enables users to train state-of-the-art models without hand labeling any training data. Instead, users write labeling functions that express arbitrary heuristics, which can have unknown accuracies and correlations. Snorkel denoises their outputs without access to ground truth by incorporating the first end-to-end implementation of our recently proposed machine learning paradigm, data programming. We present a flexible interface layer for writing labeling functions based on our experience over the past year collaborating with companies, agencies, and research labs. In a user study, subject matter experts build models 2.8× faster and increase predictive performance an average 45.5% versus seven hours of hand labeling. We study the modeling tradeoffs in this new setting and propose an optimizer for automating tradeoff decisions that gives up to 1.8× speedup per pipeline execution. In two collaborations, with the U.S. Department of Veterans Affairs and the U.S. Food and Drug Administration, and on four open-source text and image data sets representative of other deployments, Snorkel provides 132% average improvements to predictive performance over prior heuristic approaches and comes within an average 3.60% of the predictive performance of large hand-curated training sets.
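To make the idea of labeling functions concrete, here is a minimal sketch in plain Python. It does not use Snorkel's API, and the spam/ham task, the three heuristics, and all names are hypothetical. The abstract says Snorkel denoises labeling-function outputs with data programming; the sketch swaps in a simple unweighted majority vote as a stand-in, which ignores the unknown accuracies and correlations that the real system models.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Label values are illustrative; ABSTAIN means the heuristic does not apply.
ABSTAIN, HAM, SPAM = -1, 0, 1


def lf_contains_link(text: str) -> int:
    """Heuristic: messages with a URL are likely spam."""
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_mentions_prize(text: str) -> int:
    """Heuristic: messages mentioning a prize are likely spam."""
    return SPAM if "prize" in text.lower() else ABSTAIN


def lf_short_message(text: str) -> int:
    """Heuristic: very short messages are likely legitimate."""
    return HAM if len(text.split()) <= 4 else ABSTAIN


LABELING_FUNCTIONS: List[Callable[[str], int]] = [
    lf_contains_link,
    lf_mentions_prize,
    lf_short_message,
]


def majority_vote(text: str,
                  lfs: Sequence[Callable[[str], int]] = LABELING_FUNCTIONS) -> int:
    """Combine noisy votes; abstain when no function fires or the vote is tied."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # tie between labels
    return ranked[0][0]


if __name__ == "__main__":
    for message in ["Claim your prize at http://example.com now please",
                    "see you soon",
                    "prize inside"]:
        print(message, "->", majority_vote(message))
```

The third message shows why a plain vote is a weak substitute: two heuristics disagree and the vote has no way to know which one is more reliable, which is the gap a learned denoising step is meant to close.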
Predictive simulations of the shock-to-detonation transition (SDT) in heterogeneous energetic materials (EM) are vital to the design and control of their energy release and sensitivity. Due to the complexity of the thermo-mechanics of EM during the SDT, both macro-scale response and sub-grid mesoscale energy localization must be captured accurately. This work proposes an efficient and accurate multiscale framework for SDT simulations of EM. We employ deep learning to model the mesoscale energy localization of shock-initiated EM microstructures upon which prediction results are used to supply reaction progress rate information to the macroscale SDT simulation. The proposed multiscale modeling framework is divided into two stages. First, a physics-aware recurrent convolutional neural network (PARC) is used to model the mesoscale energy localization of shock-initiated heterogeneous EM microstructures. PARC is trained using direct numerical simulations (DNS) of hotspot ignition and growth within microstructures of pressed HMX material subjected to different input shock strengths. After training, PARC is employed to supply hotspot ignition and growth rates for macroscale SDT simulations. We show that PARC can play the role of a surrogate model in a multiscale simulation framework, while drastically reducing the computation cost and providing improved representations of the sub-grid physics. The proposed multiscale modeling approach will provide a new tool for material scientists in designing high-performance and safer energetic materials.
Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene on human disease. For AI models to be used clinically, they need to be made safe, reproducible and robust, and the underlying software framework must be aware of the particularities (e.g. geometry, physiology, physics) of medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provide purpose-specific AI model architectures, transformations and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software-development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by and receiving contributions from research, clinical and industrial teams from around the world, who are pursuing applications spanning nearly every aspect of healthcare.