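Audio-visual generalised zero-shot learning for video classification requires understanding the relations between audio and visual information in order to recognise samples from novel, previously unseen classes at test time. The natural semantic and temporal alignment between the audio and visual data in videos can be exploited to learn powerful representations that generalise to unseen classes at test time. We propose a multi-modal Temporal Cross-attention Framework (TCAF) for audio-visual generalised zero-shot learning. Its inputs are temporally aligned audio and visual features obtained from pre-trained networks. Encouraging the framework to focus on cross-modal correspondence across time, rather than on self-attention within each modality, boosts performance significantly. We show that our proposed framework, which ingests temporal features, achieves state-of-the-art performance on the UCF, VGGSound, and ActivityNet benchmarks for (generalised) zero-shot learning. Code for reproducing all results is available at https://github.com/explainableml/tcaf-gzsl.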
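As a rough illustration of cross-modal attention over time, the sketch below lets each time step of one modality attend over all time steps of the other. It is a simplified, assumption-laden example and not the authors' model: there are no learned projections, the keys double as values, and the function names and toy feature values are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product attention where queries come from one modality
    and keys/values come from the other (no learned projections)."""
    d = len(queries[0])
    attended = []
    for q in queries:
        # Similarity of this query time step to every time step of the other modality.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        # Weighted average of the other modality's features.
        attended.append([sum(w * v[j] for w, v in zip(weights, keys_values))
                         for j in range(d)])
    return attended


# Hypothetical toy features: 3 audio time steps and 2 visual time steps, dimension 2.
audio = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
visual = [[0.5, 0.5], [2.0, 0.0]]

audio_attended = cross_attention(audio, visual)   # audio queries attend to visual features
visual_attended = cross_attention(visual, audio)  # visual queries attend to audio features
```

Each output row is a convex combination of the other modality's feature vectors, so information flows only across modalities, never within one.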
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
We test the grip strength and shock absorption properties of various granular materials in granular jamming robotic components. The granular materials comprise a range of natural, manufactured, and 3D printed materials encompassing a wide range of shapes, sizes, and Shore hardness. Two main experiments are considered, both representing compelling use cases for granular jamming in soft robotics. The first experiment measures grip strength (retention force measured in Newtons) when we fill a latex balloon with the chosen grain type and use it as a granular jamming gripper to pick up a range of test objects. The second experiment measures shock absorption properties recorded by an Inertial Measurement Unit which is suspended in an envelope of granular material and dropped from a set height. Our results highlight a range of shape, size and softness effects, including that grain deformability is a key determinant of grip strength, and, interestingly, that larger grain sizes in 3D printed grains create better shock absorbing materials.
For conceptual design, engineers rely on conventional iterative (often manual) techniques. Emerging parametric models facilitate design space exploration based on quantifiable performance metrics, yet remain time-consuming and computationally expensive. Pure optimisation methods, however, ignore qualitative aspects (e.g. aesthetics or construction methods). This paper provides a performance-driven design exploration framework to augment the human designer through a Conditional Variational Autoencoder (CVAE), which serves as forward performance predictor for given design features as well as an inverse design feature predictor conditioned on a set of performance requests. The CVAE is trained on 18'000 synthetically generated instances of a pedestrian bridge in Switzerland. Sensitivity analysis is employed for explainability and informing designers about (i) relations of the model between features and/or performances and (ii) structural improvements under user-defined objectives. A case study proved our framework's potential to serve as a future co-pilot for conceptual design studies of pedestrian bridges and beyond.
We describe the AGReE system, which takes user-submitted passages as input and automatically generates grammar practice exercises that can be completed while reading. Multiple-choice practice items are generated for a variety of different grammar constructs: punctuation, articles, conjunctions, pronouns, prepositions, verbs, and nouns. We also conducted a large-scale human evaluation with around 4,500 multiple-choice practice items. We find that for 95% of items, a majority of the five raters were able to identify the correct answer, and that for 85% of cases, raters agree that there is only one correct answer among the choices. Finally, the error analysis shows that raters made the most mistakes for punctuation and conjunctions.
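Machine learning methods are increasingly used in high-stakes settings such as healthcare, transportation, and finance. In these settings, it is important that a model produces calibrated uncertainty estimates that reflect its own confidence and help avoid failures. In this paper, we survey recent work on uncertainty quantification (UQ) for deep learning, focusing in particular on distribution-free conformal methods because of their mathematical properties and wide applicability. We cover the theoretical guarantees of conformal methods, introduce techniques that improve the calibration and efficiency of UQ in the context of spatiotemporal data, and discuss the role of UQ in safe decision making.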
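To make the distribution-free idea concrete, the following is a minimal sketch of split conformal prediction for regression, one standard conformal method and not necessarily the variant the survey emphasises. It assumes absolute-residual nonconformity scores and exchangeable data; the calibration pairs and the function names are hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    Under exchangeability, intervals built with this threshold cover the
    true value with probability at least 1 - alpha.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]


def prediction_interval(point_prediction, q):
    """Symmetric interval around a point prediction."""
    return (point_prediction - q, point_prediction + q)


# Hypothetical held-out calibration set: (model prediction, true value).
calibration = [(1.0, 1.5), (2.0, 1.0), (3.0, 3.25), (4.0, 6.0),
               (5.0, 4.25), (6.0, 7.5), (7.0, 7.0)]

# Nonconformity score: absolute residual on each calibration point.
scores = [abs(y_true - y_pred) for y_pred, y_true in calibration]

q = conformal_quantile(scores, alpha=0.25)
low, high = prediction_interval(10.0, q)  # interval for a new prediction of 10.0
```

With these seven calibration points and alpha = 0.25, the threshold is the 6th smallest residual, so the interval half-width is 1.5.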
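Through a series of federal initiatives and orders, the U.S. Government has been making a concerted effort to ensure American leadership in AI. These broad strategy documents have influenced organizations such as the United States Department of the Air Force (DAF). The DAF-MIT AI Accelerator is an initiative between the DAF and MIT to bridge the gap between AI researchers and DAF mission requirements. Several projects supported by the DAF-MIT AI Accelerator are developing public challenge problems that address numerous federal AI research priorities. These challenges target those priorities by making large, AI-ready datasets publicly available, incentivizing open-source solutions, and creating a demand signal for dual-use technologies that can stimulate further research. In this article, we describe the public challenges being developed and how their application contributes to scientific advances.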
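Fetoscopic laser photocoagulation is a widely adopted procedure for treating Twin-to-Twin Transfusion Syndrome (TTTS). The procedure involves photocoagulating pathological anastomoses to regulate blood exchange between the twins. It is particularly challenging because of the limited field of view, poor manoeuvrability of the fetoscope, poor visibility, and variability in illumination. These challenges may lead to increased surgery time and incomplete ablation. Computer-assisted intervention (CAI) can provide surgeons with decision support and context awareness by identifying key structures in the scene and expanding the fetoscopic field of view through video mosaicking. Research in this domain has been hampered by the lack of high-quality data for designing, developing, and testing CAI algorithms. Through the Fetoscopic Placental Vessel Segmentation and Registration (FetReg2021) challenge, organized as part of the MICCAI2021 Endoscopic Vision challenge, we released the first large-scale multicentre TTTS dataset for the development of generalized and robust semantic segmentation and video mosaicking algorithms. For this challenge, we released a dataset of 2060 images, pixel-annotated for the vessel, tool, fetus, and background classes, from 18 in-vivo TTTS fetoscopy procedures, together with 18 short video clips. Seven teams participated in the challenge, and their model performance was assessed on an unseen test dataset of 658 pixel-annotated images from 6 fetoscopic procedures and 6 short clips. The challenge provided an opportunity to create generalized solutions for fetoscopic scene understanding and mosaicking. In this paper, we present the findings of the FetReg2021 challenge alongside a detailed literature review of CAI in TTTS fetoscopy. Through this challenge, its analysis, and the release of multi-centre fetoscopic data, we provide a benchmark for future research in this field.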
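Background: Fluorescence angiography has shown very promising results in reducing anastomotic leaks by allowing the surgeon to select optimally perfused tissue. However, subjective interpretation of the fluorescent signal still hinders broad application of the technique, because significant variation exists between surgeons. Our aim is to develop an artificial intelligence algorithm that classifies colonic tissue as "perfused" or "not perfused" based on intraoperative fluorescence angiography data. Methods: A classification model with a ResNet architecture was trained on a dataset of fluorescence angiography videos from a tertiary referral centre. Frames corresponding to fluorescent and non-fluorescent segments of the colon were used to train the classification algorithm. Validation was performed on frames from patients not included in the training set, comprising both data collected with the same equipment and data collected with a different camera. Performance metrics were calculated and used to further analyse the output. A decision boundary was identified based on the tissue classification. Results: A convolutional neural network was successfully trained on 1790 frames from 790 patients and validated on 24 frames from 14 patients. Accuracy was 100% on the training set and 80% on the validation set. Recall and precision were 100% and 100%, respectively, on the training set, and 68.8% and 91.7% on the validation set. Conclusion: Automated classification of intraoperative fluorescence angiography with a high degree of accuracy is possible and allows automated identification of the decision boundary. This will enable surgeons to standardize the technique of fluorescence angiography. A web-based application is available for deploying the algorithm.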
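In ultracold atom experiments, data often comes in the form of images that suffer from information loss inherent in the techniques used to prepare and measure the system. This is particularly problematic when the processes of interest are complicated, such as interactions among excitations in Bose-Einstein condensates (BECs). In this paper, we describe a framework that combines machine learning (ML) models with physics-based traditional analyses to identify and track multiple solitonic excitations in images of BECs. We use an ML-based object detector to locate the solitonic excitations and develop a physics-informed classifier to sort them into physically motivated sub-categories. Lastly, we introduce a quality metric that quantifies the likelihood that a specific feature is a kink soliton. Our trained implementation of this framework, SolDet, is publicly available as an open-source Python package. SolDet is broadly applicable to feature identification in cold atom images when trained on a suitable user-provided dataset.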