In contrast to control-theoretic methods, model-free reinforcement learning (RL) methods still lack stability guarantees, which remains a significant problem. Jointly learning a policy and a Lyapunov function has recently become a promising approach to giving the whole system a stability guarantee. However, the classical Lyapunov constraints introduced in prior work cannot stabilize the system during sampling-based optimization. We therefore propose the Adaptive Stability Certification (ASC), which lets the system reach sampling-based stability. Because the ASC condition can search for the optimal policy heuristically, we design the Adaptive Lyapunov-based Actor-Critic (ALAC) algorithm based on it. Our algorithm also avoids a problem found in current approaches, where a variety of constraints are coupled into the optimization objective. When evaluated on ten robotic tasks, our method achieves lower accumulated cost and fewer stability constraint violations than previous studies.
translated by 谷歌翻译
ASR can be improved by multi-task learning (MTL) with domain enhancing or domain adversarial training, two opposite objectives that aim to increase or decrease domain variance towards domain-aware or domain-agnostic ASR, respectively. In this work, we study how best to apply these two opposite objectives with speaker labels to improve conformer-based ASR. We also propose a novel adaptive gradient reversal layer for stable and effective adversarial training without tuning effort. Detailed analysis and experimental verification are conducted to show the optimal positions in the ASR neural network (NN) at which to apply speaker enhancing and adversarial training. We also explore their combination for further improvement, achieving the same performance as i-vectors plus adversarial training. Our best speaker-based MTL achieves a 7% relative improvement on the Switchboard Hub5'00 set. We also investigate the effect of such speaker-based MTL with respect to a cleaner dataset and a weaker ASR NN.
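Proteins are essential components of human life, and their structures are important for function and mechanism analysis. Recent work has shown the potential of AI-driven methods for protein structure prediction. However, the development of new models is restricted by the lack of datasets and benchmark training procedures. To the best of our knowledge, existing open-source datasets fall far short of the needs of modern protein sequence-structure research. To address this problem, we present the first million-level protein structure prediction dataset with high coverage and diversity, named PSP. The dataset consists of 570K true structure sequences (10TB) and 745K complementary distillation sequences (15TB). We additionally provide a benchmark training procedure for a SOTA protein structure prediction model on this dataset. We validate the utility of the dataset for training by participating in the CAMEO contest, in which our model won first place. We hope that our PSP dataset, together with the training benchmark, can enable a broader community of AI and biology researchers to pursue AI-driven protein-related research.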
Existing techniques for training language models can be misaligned with the truth: if we train models with imitation learning, they may reproduce errors that humans make; if we train them to generate text that humans rate highly, they may output errors that human evaluators can't detect. We propose circumventing this issue by directly finding latent knowledge inside the internal activations of a language model in a purely unsupervised way. Specifically, we introduce a method for accurately answering yes-no questions given only unlabeled model activations. It works by finding a direction in activation space that satisfies logical consistency properties, such as that a statement and its negation have opposite truth values. We show that despite using no supervision and no model outputs, our method can recover diverse knowledge represented in large language models: across 6 models and 10 question-answering datasets, it outperforms zero-shot accuracy by 4% on average. We also find that it cuts prompt sensitivity in half and continues to maintain high accuracy even when models are prompted to generate incorrect answers. Our results provide an initial step toward discovering what language models know, distinct from what they say, even when we don't have access to explicit ground truth labels.
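The consistency property mentioned in this abstract (a statement and its negation should receive opposite truth values) can be illustrated with a small sketch. The function names, the linear-probe-plus-sigmoid form, and the toy activations below are assumptions made for illustration and are not taken from the abstract. Note also that consistency alone admits a degenerate solution (probability 0.5 everywhere), so a practical objective would need an additional term that the abstract does not describe.

```python
import math


def sigmoid(x: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v) -> float:
    """Dot product of two equal-length sequences of floats."""
    return sum(a * b for a, b in zip(u, v))


def consistency_loss(direction, bias, acts_pos, acts_neg) -> float:
    """Mean squared violation of p(statement) = 1 - p(negation).

    direction, bias: hypothetical linear probe in activation space.
    acts_pos, acts_neg: paired activation vectors for each statement
    and its negation (toy stand-ins for real model activations).
    """
    total = 0.0
    for a_pos, a_neg in zip(acts_pos, acts_neg):
        p_pos = sigmoid(dot(direction, a_pos) + bias)
        p_neg = sigmoid(dot(direction, a_neg) + bias)
        total += (p_pos - (1.0 - p_neg)) ** 2
    return total / len(acts_pos)


# Toy pair whose activations are mirror images along the probe direction.
consistent = consistency_loss([1.0, 0.0], 0.0, [[2.0, 0.0]], [[-2.0, 0.0]])
# Toy pair where statement and negation look identical to the probe.
inconsistent = consistency_loss([1.0, 0.0], 0.0, [[2.0, 0.0]], [[2.0, 0.0]])
```

In this toy setup the mirrored pair gives a loss of essentially zero, while the identical pair is penalized, which is the behavior a search over directions would exploit.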
Legal Judgment Prediction (LJP), which aims to predict a judgment from fact descriptions, serves as legal assistance that eases the heavy workload of the limited number of legal practitioners. Most existing methods apply various large-scale pre-trained language models (PLMs) fine-tuned on LJP tasks to obtain consistent improvements. However, we find that the state-of-the-art (SOTA) model makes judgment predictions based on wrong (or non-causal) information, which not only weakens the model's generalization capability but also leads to severe social problems such as discrimination. Here, we analyze the causal mechanism that misleads the LJP model into learning spurious correlations, and then propose a framework that guides the model to learn the underlying causal knowledge in legal texts. Specifically, we first perform open information extraction (OIE) to refine the text so that it has a high proportion of causal information, and generate a new set of data from it. Then, we design a model that learns the weights of the refined data and the raw data for LJP model training. Extensive experimental results show that our model is more generalizable and robust than the baselines and achieves new SOTA performance on two commonly used legal-specific datasets.
Named entity recognition is a traditional task in natural language processing. In particular, nested entity recognition receives extensive attention because nesting scenarios are widespread. The latest research adapts the well-established paradigm of set prediction from object detection to cope with entity nesting. However, these approaches are limited by manually created query vectors, which fail to adapt to the rich semantic information in the context. This paper presents an end-to-end entity detection approach with a proposer and a regressor to tackle this issue. First, the proposer uses a feature pyramid network to generate high-quality entity proposals. Then, the regressor refines the proposals to generate the final prediction. The model adopts an encoder-only architecture and thus gains rich query semantics, high-precision entity localization, and ease of model training. Moreover, we introduce novel spatially modulated attention and progressive refinement for further improvement. Extensive experiments demonstrate that our model achieves advanced performance in flat and nested NER, with a new state-of-the-art F1 score of 80.74 on the GENIA dataset and 72.38 on the WeiboNER dataset.
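Flow analysis carried out using phase contrast cardiac magnetic resonance imaging (PC-CMR) enables the quantification of important parameters that are used in the assessment of cardiovascular function. An essential part of this analysis is the identification of the correct CMR views and quality control (QC) to detect artefacts that could affect the flow quantification. We propose a novel deep learning based framework for the fully automated analysis of flow from full CMR scans. It first carries out the view selection and QC steps using two sequential convolutional neural networks, followed by automatic aorta and pulmonary artery segmentation to enable the quantification of key flow parameters. Accuracy values of 0.958 and 0.914 were obtained for view classification and QC, respectively. For segmentation, Dice scores were above 0.969, and the Bland-Altman plots indicated high agreement between manual and automatic peak flow values. In addition, we tested the pipeline on an external validation dataset, with results indicating that the pipeline is robust. This work was carried out using multi-vendor clinical data consisting of 986 cases, indicating the potential for use of this pipeline in a clinical setting.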
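For readers unfamiliar with the segmentation metric quoted above, the Dice score between a predicted and a reference binary mask is 2|A∩B| / (|A| + |B|). The following is a minimal sketch on flattened toy masks; the function name, the toy data, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details from the abstract.

```python
def dice_score(pred, target) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # Count pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground size of the two masks.
    size_sum = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if size_sum == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * intersection / size_sum


# Toy masks: the prediction marks two pixels, the reference marks one of them.
example = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

Here the overlap is one pixel and the two masks contain three foreground pixels in total, so the score is 2/3; identical masks score 1.0.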
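This study aims to achieve two goals. The first is to curate a large and information-rich dataset that contains crucial and succinct summaries of the players' actions and positions and of the back-and-forth travel patterns of the volleyball in professional and NCAA Div-I indoor volleyball games. While several prior studies have aimed to create similar datasets for other sports (e.g., badminton and soccer), no such dataset has yet been created for indoor volleyball. The second goal is to introduce a volleyball descriptive language that fully describes the rally processes in games, and to apply the language to our dataset. Based on the curated dataset and our descriptive sports language, we introduce three tasks for automated volleyball action and tactic analysis: (1) volleyball rally prediction, which aims to predict the outcome of a rally and to help players and coaches improve decision-making in practice; (2) setting type and hitting type prediction, to help coaches and players prepare more effectively for games; and (3) volleyball tactics and attacking zone statistics, to provide advanced volleyball statistics and help coaches better understand the game and the opponent's tactics. We conducted case studies to show how the experimental results can provide insights to the volleyball analysis community. Furthermore, experimental evaluation based on real-world data establishes a baseline for future studies and applications of our dataset and language. This study bridges the gap between the indoor volleyball field and computer science.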
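Energy-function-based safety certificates can provide provable safety guarantees for the safe control tasks of complex robotic systems. However, all recent studies on learning-based energy function synthesis consider only feasibility, which may cause over-conservativeness and result in less efficient controllers. In this work, we propose a magnitude regularization technique that improves the efficiency of safe controllers by reducing the conservativeness inside the energy function while keeping the provable safety guarantees. Specifically, we quantify conservativeness by the magnitude of the energy function, and we reduce it by adding a magnitude regularization term to the synthesis loss. We propose the SAFEMR algorithm, which uses reinforcement learning (RL) for the synthesis, to unify the learning processes of the safe controller and the energy function. Experimental results show that the proposed method does reduce the conservativeness of the energy functions and outperforms the baselines in terms of controller efficiency while guaranteeing safety.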
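The idea of adding a magnitude regularization term to the synthesis loss can be sketched as below. Everything specific here is an assumption made for illustration: the abstract does not give the form of the feasibility loss, the norm used for the magnitude, or the weighting, so the hinge penalty, the mean absolute value, and `reg_weight` are hypothetical stand-ins.

```python
def mean(values) -> float:
    """Arithmetic mean of a non-empty sequence of floats."""
    values = list(values)
    return sum(values) / len(values)


def synthesis_loss(phi_values, violations, reg_weight: float = 0.1) -> float:
    """Feasibility loss plus a magnitude regularizer (illustrative only).

    phi_values: energy function values at sampled states (hypothetical).
    violations: per-sample constraint residuals, positive when the
        certificate condition is violated (hypothetical).
    reg_weight: assumed trade-off between feasibility and magnitude.
    """
    # Hinge penalty: only violated samples contribute (assumed form).
    feasibility = mean(max(0.0, v) for v in violations)
    # Magnitude of the energy function as a proxy for conservativeness.
    magnitude = mean(abs(p) for p in phi_values)
    return feasibility + reg_weight * magnitude


# Toy batch: one violated sample, two energy values of different sign.
loss = synthesis_loss([2.0, -4.0], [0.5, -1.0], reg_weight=0.1)
```

With `reg_weight` set to zero the sketch reduces to a feasibility-only objective, the setting the abstract describes as prone to over-conservativeness.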
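Woven composites are produced by interlacing warp and weft fibers in a pattern or weave style. By changing the pattern or material, the mechanical properties of woven composites can be changed significantly. However, the role of woven composite architecture (pattern, material) in the mechanical properties is not well understood. In this paper, we explore the relationship between woven composite architectures (weave pattern, weave material sequence) and the corresponding modulus through our proposed physics-constrained neural network (PCNN). In addition, we apply statistical learning methods to optimize the woven composite architecture to improve mechanical responses. Our results show that the PCNN can effectively predict the woven architecture for a desired modulus with much higher accuracy than several baseline models. The PCNN can be combined with feature-based optimization to determine the optimal woven composite architecture at the initial design stage. Beyond relating woven composite architecture to its mechanical responses, our research also provides an in-depth understanding of how architectural features govern mechanical responses. We anticipate that our proposed framework will primarily facilitate the analysis and optimization of woven composites and serve as a starting point for introducing physics-knowledge-guided neural networks into complex structural analysis.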