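Single-cell RNA-seq datasets are growing in size and complexity, enabling the study of changes in cellular composition across a variety of biological and clinical contexts. Scalable dimensionality reduction techniques are needed to disentangle the biological variation in these datasets while accounting for technical and biological confounders. In this work, we extend a popular approach to probabilistic non-linear dimensionality reduction, the Gaussian process latent variable model, so that it scales to massive single-cell datasets while explicitly accounting for technical and biological confounders. The key idea is to use an augmented kernel that preserves the factorisability of the lower bound, allowing for fast stochastic variational inference. We demonstrate its ability to reconstruct the latent signatures of innate immunity recovered in Kumasaka et al. (2021) with 9x lower training time. We further analyze a COVID dataset and demonstrate, across a cohort of 130 individuals, that the framework enables data integration while capturing interpretable signatures of infection. Specifically, we explore COVID severity as a latent dimension to refine patient stratification and capture disease-specific gene expression.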
Remote sensing imagery provides comprehensive views of the Earth, where different sensors collect complementary data at different spatial scales. Large, pretrained models are commonly finetuned with imagery that is heavily augmented to mimic different conditions and scales, and the resulting models are used for various tasks with imagery from a range of spatial scales. Such models overlook scale-specific information in the data. In this paper, we present Scale-MAE, a pretraining method that explicitly learns relationships between data at different, known scales throughout the pretraining process. Scale-MAE pretrains a network by masking an input image at a known input scale, where the area of the Earth covered by the image, not the image resolution, determines the scale of the ViT positional encoding. Scale-MAE encodes the masked image with a standard ViT backbone, and then decodes the masked image through a bandpass filter to reconstruct low/high frequency images at lower/higher scales. We find that tasking the network with reconstructing both low and high frequency images leads to robust multiscale representations for remote sensing imagery. Scale-MAE achieves an average $5.0\%$ non-parametric kNN classification improvement across eight remote sensing datasets compared to the current state of the art, and obtains a $0.9$ mIoU to $3.8$ mIoU improvement on the SpaceNet building segmentation transfer task for a range of evaluation scales.
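As an illustration of a positional encoding whose scale follows ground coverage rather than pixel count, the sketch below rescales positions in a standard sinusoidal encoding. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name, the use of ground sample distance (metres per pixel) as the scale measure, and the `reference_gsd` parameter are all hypothetical choices made for this example.

```python
import math


def scale_aware_positional_encoding(num_positions, dim, gsd, reference_gsd=1.0):
    """Sinusoidal positional encoding with positions scaled by ground coverage.

    Assumption (not from the abstract): scale is expressed as ground sample
    distance `gsd` in metres per pixel, relative to a hypothetical
    `reference_gsd`. A coarser image (larger gsd) advances faster through
    the encoding because each position covers more ground.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    scale = gsd / reference_gsd
    encoding = []
    for pos in range(num_positions):
        scaled_pos = pos * scale  # position measured in reference-scale units
        row = []
        for i in range(0, dim, 2):
            angle = scaled_pos / (10000.0 ** (i / dim))
            row.extend([math.sin(angle), math.cos(angle)])
        encoding.append(row)
    return encoding


# Two images with the same pixel size but different ground coverage.
fine = scale_aware_positional_encoding(4, 8, gsd=1.0)
coarse = scale_aware_positional_encoding(4, 8, gsd=2.0)
```

Under these assumptions, position 1 of the coarse image receives the same encoding as position 2 of the fine image, since both lie the same ground distance from the origin.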
We test the grip strength and shock absorption properties of various granular materials in granular jamming robotic components. The granular materials comprise a range of natural, manufactured, and 3D printed materials encompassing a wide range of shapes, sizes, and Shore hardness. Two main experiments are considered, both representing compelling use cases for granular jamming in soft robotics. The first experiment measures grip strength (retention force measured in Newtons) when we fill a latex balloon with the chosen grain type and use it as a granular jamming gripper to pick up a range of test objects. The second experiment measures shock absorption properties recorded by an Inertial Measurement Unit which is suspended in an envelope of granular material and dropped from a set height. Our results highlight a range of shape, size, and softness effects, including that grain deformability is a key determinant of grip strength and, interestingly, that larger grain sizes in 3D printed grains create better shock absorbing materials.
Fruit harvesting has recently experienced a shift towards soft grippers that possess compliance, adaptability, and delicacy. In this context, pneumatic grippers are popular because they provide high deformability and compliance; however, they typically possess limited grip strength. Jamming provides strong grip capability, but has limited deformability and often requires the object to be pushed onto a surface to attain a grip. This paper describes a hybrid gripper combining pneumatics (for deformation) and jamming (for grip strength). Our gripper utilises a torus (donut) structure with two chambers, controlled by pneumatic and vacuum pressure respectively, to conform around a target object. The gripper displays good adaptability, exploiting pneumatics to mould to the shape of the target object so that jamming can be successfully harnessed to grip. The main contribution of the paper is the design, fabrication, and characterisation of the first hybrid gripper that can use granular jamming in free space, achieving significantly larger retention forces than pure pneumatics. We test our gripper on objects of a range of different sizes and shapes, as well as on picking a broad range of real fruit.
Backdoor attacks have emerged as one of the major security threats to deep learning models, as they can easily control the model's test-time predictions by pre-injecting a backdoor trigger into the model at training time. While backdoor attacks have been extensively studied on images, few works have investigated the threat of backdoor attacks on time series data. To fill this gap, in this paper we present a novel generative approach for time series backdoor attacks against deep-learning-based time series classifiers. Backdoor attacks have two main goals: high stealthiness and high attack success rate. We find that, compared to images, it can be more challenging to achieve the two goals on time series. This is because time series have fewer input dimensions and lower degrees of freedom, making it hard to achieve a high attack success rate without compromising stealthiness. Our generative approach addresses this challenge by generating trigger patterns that are as realistic as real time series patterns while achieving a high attack success rate without causing a significant drop in clean accuracy. We also show that our proposed attack is resistant to potential backdoor defenses. Furthermore, we propose a novel universal generator that can poison any type of time series, allowing universal attacks without the need to fine-tune the generative model for new time series datasets.
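Open-vocabulary models are a promising new paradigm for image classification. Unlike traditional classification models, open-vocabulary models classify among any arbitrary set of categories specified with natural language during inference. This natural language, called "prompts", typically consists of a set of hand-written templates (e.g., "a photo of a {}") which are completed with each of the category names. This work introduces a simple method to generate higher-accuracy prompts, without explicit knowledge of the image domain and with far fewer hand-constructed sentences. To achieve this, we combine open-vocabulary models with large language models (LLMs) to create Customized Prompts via Language models (CuPL, pronounced "couple"). In particular, we leverage the knowledge contained in LLMs to generate many descriptive sentences that are customized for each object category. We find that this straightforward and general approach improves accuracy on a range of zero-shot image classification benchmarks, including a gain of over one percentage point on ImageNet. Finally, this method requires no additional training and remains completely zero-shot. Code is available at https://github.com/sarahpratt/cupl.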
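The prompt construction described above can be sketched as follows. This is a minimal illustration and not the code from the CuPL repository: `build_prompts` is a hypothetical helper, and the example description is a hand-written stand-in for what a language model might return, included only to show the data flow.

```python
def build_prompts(class_names, templates, generated_descriptions=None):
    """Collect the text prompts for each class.

    `templates` are hand-written strings with a `{}` slot for the class name.
    `generated_descriptions` maps a class name to descriptive sentences; in
    the described method these would come from an LLM, but here they are
    supplied by the caller (an assumption made to keep the sketch offline).
    """
    generated_descriptions = generated_descriptions or {}
    prompts = {}
    for name in class_names:
        templated = [template.format(name) for template in templates]
        customized = list(generated_descriptions.get(name, []))
        prompts[name] = templated + customized
    return prompts


prompts = build_prompts(
    class_names=["goldfish", "tree frog"],
    templates=["a photo of a {}."],
    # Hypothetical stand-in for an LLM-generated description.
    generated_descriptions={"goldfish": ["A goldfish is a small orange fish."]},
)
```

In an open-vocabulary classifier, each class's prompts would then be embedded by the text encoder and compared against the image embedding; that step is omitted here because it depends on the specific model.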
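The machine learning and clinical research communities take dramatically different approaches to using real-world data (RWD), including data captured in the electronic health record (EHR). While clinical researchers use RWD cautiously for clinical investigations, ML for healthcare teams consume public datasets with minimal scrutiny to develop new algorithms. This study bridges this gap by developing and validating ML-DQA, a data quality assurance framework grounded in RWD best practices. The ML-DQA framework is applied to five ML projects across two geographies, different medical conditions, and different cohorts. Across the five projects, RWD was gathered on 247,536 patients, and a total of 2,999 quality checks and 24 quality reports were generated. Five generalizable practices emerge: all projects used a similar method to group redundant data element representations; all projects used automated utilities to build diagnosis and medication data elements; all projects used a common library of rule-based transformations; all projects used a unified approach to assign data quality checks to data elements; and all projects used a similar approach to clinical adjudication. An average of 5.8 individuals, including clinicians, data scientists, and trainees, were involved in implementing ML-DQA for each project, and an average of 23.4 data elements per project were transformed or removed in response to ML-DQA. This study demonstrates the important role of ML-DQA in healthcare projects and provides teams with a framework for conducting these essential activities.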
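There is broad agreement that Artificial Intelligence (AI) systems, particularly those using Machine Learning (ML), should be able to "explain" their behavior. Unfortunately, there is little agreement as to what constitutes an "explanation". This has caused a disconnect between the explanations that systems produce in service of explainable Artificial Intelligence (XAI) and the explanations that users and other audiences actually need, which should be defined by the full spectrum of functional roles, audiences, and capabilities for explanation. In this paper, we explore the features of explanations and how to use those features in evaluating their utility. We focus on the requirements for explanations defined by their functional role, the knowledge states of the users who are trying to understand them, and the availability of the information needed to generate them. Further, we discuss the risk of XAI enabling trust in systems without establishing their trustworthiness, and we define a critical next step for the field of XAI: establishing metrics to guide and ground the utility of system-generated explanations.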
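For artificial agents to perform useful tasks in changing environments, they must be able to both detect and adapt to novelty. However, visual novelty detection research often evaluates only on repurposed datasets such as CIFAR-10, which was originally intended for object classification. This practice restricts novelties to well-framed images of distinct object types. We suggest that new benchmarks are needed to represent the challenges of open worlds. Our new NovelCraft dataset contains multimodal episodic data of the images and symbolic world-states seen by an agent completing a pogo-stick assembly task within a video game world. In some episodes, we insert novel objects that can impact gameplay. Novelties can vary in size, position, and occlusion within complex scenes. We benchmark state-of-the-art novelty detection and generalized category discovery models with a focus on comprehensive evaluation. The results suggest an opportunity for future research: models aware of the task-specific costs of different types of mistakes could more effectively detect and adapt to novelty in open worlds.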
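Can a machine learn machine learning? We propose to answer this question using the same criteria we use to answer a similar question: can a human learn machine learning? We automatically answer MIT final exams in Introduction to Machine Learning at a human level. The course is a large undergraduate class with around five hundred students each semester. Recently, program synthesis and few-shot learning have solved university-level problem set questions in mathematics and STEM courses at a human level. In this work, we solve questions from final exams, which differ from problem sets in several ways: the questions are longer, have multiple parts, are more complicated, and span a broader set of topics. We provide a new dataset and benchmark of questions from eight MIT Introduction to Machine Learning final exams between Fall 2017 and Spring 2022, together with code for automatically answering these questions and generating new questions. We perform ablation studies comparing zero-shot learning with few-shot learning and chain-of-thought prompting, using GPT-3 pre-trained on text and fine-tuned on code, across a range of machine learning topics, and find that few-shot learning methods perform best. We make our data and code publicly available for the machine learning community.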