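Recent advances in large pre-trained language models have shown promising results in generating natural language and have improved performance in many natural language generation (NLG) applications, such as machine translation and text summarization. However, when the generation task is more open-ended and the content is under-specified, existing techniques struggle to generate long-range coherent and creative content. Moreover, these models exhibit, and even amplify, social biases learned from their training corpora. This happens because generation models are trained to capture surface patterns (i.e., word sequences) rather than the underlying semantics and discourse structure, or background knowledge such as social norms. In this paper, I introduce recent work on controllable text generation that aims to enhance the creativity and fairness of language generation models. We explore hierarchical generation and constrained decoding, with applications to creative language generation, including stories, poetry, and figurative language, and to bias mitigation for generation models.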
We consider the problem of automatically generating stories in multiple languages. Compared to prior work in monolingual story generation, crosslingual story generation allows for more universal research on story planning. We propose to use Prompting Large Language Models with Plans to study which plan is optimal for story generation. We consider 4 types of plans and systematically analyse how the outputs differ for different planning strategies. The study demonstrates that formulating the plans as question-answer pairs leads to more coherent generated stories while the plan gives more control to the story creators.
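Recent advances in deep neural language models, combined with the capacity of large-scale datasets, have accelerated the development of natural language generation systems that produce fluent and coherent text (with varying degrees of success) across a multitude of tasks and application contexts. However, controlling the output of these models to suit the intended user remains an open challenge. Such control is crucial not only for customizing the content and style of the generated language, but also for deploying these systems safely and reliably in the real world. We present an extensive survey of the emerging topic of constrained neural language generation, in which we formally define and categorize natural language generation problems by distinguishing between conditions and constraints (the latter being testable conditions on the output text rather than on the input), present constrained text generation tasks, and review existing methods and evaluation metrics for constrained text generation. Our aim is to highlight recent progress and trends in this emerging field, pointing out the most promising directions and limitations, in order to advance the state of the art in constrained neural language generation research.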
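The distinction between conditions and constraints can be made concrete with a small sketch: a constraint is something that can be checked on the output text after it has been produced. The helper names below (`contains_keyword`, `max_words`, `filter_candidates`) are hypothetical and are not taken from the survey; filtering finished candidates is only one simple way to enforce constraints, shown here for illustration.

```python
from typing import Callable, Iterable, List

# Assumption: a constraint is modelled as a predicate over the *output* text.
Constraint = Callable[[str], bool]


def contains_keyword(keyword: str) -> Constraint:
    """Lexical constraint: the output must mention the keyword."""
    return lambda text: keyword.lower() in text.lower()


def max_words(limit: int) -> Constraint:
    """Length constraint: the output must have at most `limit` words."""
    return lambda text: len(text.split()) <= limit


def satisfies_all(text: str, constraints: Iterable[Constraint]) -> bool:
    """A text is acceptable only if every constraint holds on it."""
    return all(check(text) for check in constraints)


def filter_candidates(candidates: Iterable[str],
                      constraints: Iterable[Constraint]) -> List[str]:
    """Keep the candidates that pass every constraint, in their original order."""
    constraints = list(constraints)
    return [c for c in candidates if satisfies_all(c, constraints)]


candidates = [
    "The dragon slept.",
    "A knight rode out at dawn to find the dragon.",
    "Rain fell on the quiet village for many long and uneventful days without end.",
]
constraints = [contains_keyword("dragon"), max_words(10)]
accepted = filter_candidates(candidates, constraints)
print(accepted)
```

A condition, by contrast, would be part of the input (for example a prompt or a topic label) and cannot be verified by inspecting the output alone.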
Controllable Text Generation (CTG) is an emerging area in the field of natural language generation (NLG). It is regarded as crucial for the development of advanced text generation technologies that are more natural and better meet the specific constraints of practical applications. In recent years, methods using large-scale pre-trained language models (PLMs), in particular the widely used transformer-based PLMs, have become a new paradigm of NLG, allowing generation of more diverse and fluent text. However, due to the lower level of interpretability of deep neural networks, the controllability of these methods needs to be guaranteed. To this end, controllable text generation using transformer-based PLMs has become a rapidly growing yet challenging new research hotspot. A diverse range of approaches has emerged in the last 3-4 years, targeting different CTG tasks which may require different types of controlled constraints. In this paper, we present a systematic critical review of the common tasks, main approaches and evaluation methods in this area. Finally, we discuss the challenges that the field is facing, and put forward various promising future directions. To the best of our knowledge, this is the first survey paper to summarize CTG techniques from the perspective of PLMs. We hope it can help researchers in related fields to quickly track the academic frontier, providing them with a landscape of the area and a roadmap for future research.
Storytelling and narrative are fundamental to human experience, intertwined with our social and cultural engagement. As such, researchers have long attempted to create systems that can generate stories automatically. In recent years, powered by deep learning and massive data resources, automatic story generation has shown significant advances. However, considerable challenges, like the need for global coherence in generated stories, still hamper generative models from reaching the same storytelling ability as human narrators. To tackle these challenges, many studies seek to inject structured knowledge into the generation process, which is referred to as structured knowledge-enhanced story generation. Incorporating external knowledge can enhance the logical coherence among story events, achieve better knowledge grounding, and alleviate over-generalization and repetition problems in stories. This survey provides an up-to-date and comprehensive review of this research field: (i) we present a systematic taxonomy of how existing methods integrate structured knowledge into story generation; (ii) we summarize the story corpora, structured knowledge datasets, and evaluation metrics involved; (iii) we give multidimensional insights into the challenges of knowledge-enhanced story generation and cast light on promising directions for future study.
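Large pre-trained language models are capable of generating varied and fluent text. Starting from a prompt, these models produce a narrative that can develop unpredictably. Existing methods of controllable text generation, which steer the narrative in a user-specified direction, require the creation of a training corpus and an additional, time-consuming training procedure. This paper proposes and investigates Collocation2Text, a plug-and-play method for automatic controllable text generation in Russian that does not require fine-tuning. The method is based on two interacting models: the autoregressive language model ruGPT-3 and the autoencoding language model ruRoBERTa. The idea of the method is to shift the output distribution of the autoregressive model according to the output distribution of the autoencoding model, in order to ensure a coherent transition of the narrative toward the guide phrase, which can contain single words or collocations. The autoencoding model, which can take into account both the left and the right context of a token, "tells" the autoregressive model which tokens are the most or least logical at the current generation step, increasing or decreasing the probabilities of the corresponding tokens. Experiments on generating news articles with the method showed its effectiveness for automatically generating fluent texts that contain coherent transitions between user-specified phrases.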
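The idea of shifting one model's next-token distribution by another's can be illustrated with a toy sketch. The abstract does not give the exact combination rule, so the weighted log-linear mix below is an assumption chosen for illustration, and the function names, the `strength` parameter and the three-word vocabulary are all hypothetical.

```python
import math
from typing import Dict


def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    """Turn per-token logits into a probability distribution."""
    peak = max(logits.values())
    exps = {tok: math.exp(val - peak) for tok, val in logits.items()}
    total = sum(exps.values())
    return {tok: val / total for tok, val in exps.items()}


def shift_distribution(ar_logits: Dict[str, float],
                       ae_probs: Dict[str, float],
                       strength: float = 1.0,
                       eps: float = 1e-9) -> Dict[str, float]:
    """Assumed rule: add strength * log p_ae(token) to each autoregressive logit.

    Tokens the autoencoding model finds plausible gain probability and
    implausible ones lose it; strength=0 leaves the distribution unchanged.
    """
    shifted = {
        tok: logit + strength * math.log(ae_probs.get(tok, 0.0) + eps)
        for tok, logit in ar_logits.items()
    }
    return softmax(shifted)


# Toy numbers standing in for the outputs of the two models at one step.
ar_logits = {"the": 2.0, "storm": 0.5, "banana": 0.4}
ae_probs = {"the": 0.2, "storm": 0.7, "banana": 0.1}

base = softmax(ar_logits)
shifted = shift_distribution(ar_logits, ae_probs)
print(base)
print(shifted)
```

In this toy case the probability of "storm" rises and that of "banana" falls after the shift, which mirrors the increase-or-decrease behaviour described in the abstract.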
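Pre-trained language models (PLMs) fail to generate long-form narrative text because they do not consider global structure. As a result, the generated texts are often incoherent, repetitive, or lacking in content. Recent work in story generation has reintroduced explicit content planning in the form of prompts, keywords, or semantic frames. Trained on large parallel corpora, these models can generate more logical event sequences and thus more contentful stories. However, these intermediate representations are often not in natural language and cannot be used by PLMs without fine-tuning. We propose generating story plots with off-the-shelf PLMs while keeping the benefits of content planning, in order to produce cohesive and contentful stories. Our proposed method, ScratchPlot, first prompts a PLM to compose a content plan. We then generate the story's body and ending conditioned on the content plan. Furthermore, we use additional PLMs to rank the generated (story, ending) pairs. We benchmark our method against various baselines and achieve superior results in both human and automatic evaluation.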
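The plan, write, and rank stages can be sketched as a small pipeline. This is only an outline of the control flow under stated assumptions: the prompt wording, the function `plan_write_rank`, and the stand-in `toy_generate` and `toy_score` callables are hypothetical and do not reproduce the prompts or ranking models used by ScratchPlot.

```python
from typing import Callable, Tuple

Generator = Callable[[str], str]      # prompt -> generated text
Scorer = Callable[[str, str], float]  # (story body, ending) -> score


def plan_write_rank(premise: str,
                    generate: Generator,
                    score: Scorer,
                    num_endings: int = 3) -> Tuple[str, str, str]:
    """Compose a plan, write the body, then keep the best-scoring ending."""
    plan = generate(f"Write a short content plan for a story about: {premise}")
    body = generate(f"Plan: {plan}\nWrite the story body:")
    endings = [
        generate(f"Plan: {plan}\nStory: {body}\nEnding option {i + 1}:")
        for i in range(num_endings)
    ]
    best = max(endings, key=lambda ending: score(body, ending))
    return plan, body, best


def toy_generate(prompt: str) -> str:
    """Deterministic stand-in for a language model (assumption, for testing)."""
    if prompt.startswith("Write a short content plan"):
        return "location: lighthouse; character: keeper; theme: forgiveness"
    if "Ending option" in prompt:
        option = prompt.rsplit("Ending option ", 1)[1].rstrip(":")
        return {
            "1": "The sea was calm.",
            "2": "The keeper forgave his brother at the lighthouse.",
            "3": "Nothing changed.",
        }[option]
    return "The keeper tended the lighthouse and thought of his brother."


def toy_score(body: str, ending: str) -> float:
    """Stand-in ranker: count the words shared by the body and the ending."""
    body_words = set(body.lower().replace(".", "").split())
    ending_words = set(ending.lower().replace(".", "").split())
    return float(len(body_words & ending_words))


plan, body, ending = plan_write_rank("a lighthouse keeper", toy_generate, toy_score)
print(plan)
print(body)
print(ending)
```

Passing the generator and the scorer in as callables keeps the pipeline independent of any particular model, so the stubs can be swapped for real PLM calls.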
Natural Language Generation (NLG) has improved exponentially in recent years thanks to the development of sequence-to-sequence deep learning technologies such as Transformer-based language models. This advancement has led to more fluent and coherent NLG, leading to improved development in downstream tasks such as abstractive summarization, dialogue generation and data-to-text generation. However, it is also apparent that deep learning based generation is prone to hallucinate unintended text, which degrades the system performance and fails to meet user expectations in many real-world scenarios. To address this issue, many studies have been presented in measuring and mitigating hallucinated texts, but these have never been reviewed in a comprehensive manner before. In this survey, we thus provide a broad overview of the research progress and challenges in the hallucination problem in NLG. The survey is organized into two parts: (1) a general overview of metrics, mitigation methods, and future directions; and (2) an overview of task-specific research progress on hallucinations in the following downstream tasks, namely abstractive summarization, dialogue generation, generative question answering, data-to-text generation, machine translation, and visual-language generation. This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated texts in NLG.
Automated plot generation is the challenge of generating a sequence of events that will be perceived by readers as the plot of a coherent story. Traditional symbolic planners plan a story from a goal state and guarantee logical causal plot coherence but rely on a library of hand-crafted actions with their preconditions and effects. This closed world setting limits the length and diversity of what symbolic planners can generate. On the other hand, pre-trained neural language models can generate stories with great diversity, while being generally incapable of ending a story in a specified manner and can have trouble maintaining coherence. In this paper, we present an approach to story plot generation that unifies causal planning with neural language models. We propose to use commonsense knowledge extracted from large language models to recursively expand a story plot in a backward chaining fashion. Specifically, our system infers the preconditions for events in the story and then events that will cause those conditions to become true. We performed automatic evaluation to measure narrative coherence as indicated by the ability to answer questions about whether different events in the story are causally related to other events. Results indicate that our proposed method produces more coherent plotlines than several strong baselines.
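The backward-chaining expansion described above can be sketched as follows. In the paper the preconditions and causing events are inferred with a large language model; here two hand-written lookup tables stand in for those inferences, and every name (`backward_chain`, `PRECONDITIONS`, `CAUSES`, `max_depth`) is a hypothetical choice made for illustration.

```python
from typing import Callable, Iterable, List, Optional


def backward_chain(goal_event: str,
                   preconditions_of: Callable[[str], Iterable[str]],
                   cause_of: Callable[[str], Optional[str]],
                   max_depth: int = 3) -> List[str]:
    """Expand a plot backward from its goal event.

    For each event, look up its preconditions, find an event that makes each
    precondition true, and place that event before the one it enables.
    """
    plot = [goal_event]
    frontier = [(goal_event, 0)]
    seen = {goal_event}
    while frontier:
        event, depth = frontier.pop()
        if depth >= max_depth:
            continue
        for condition in preconditions_of(event):
            cause = cause_of(condition)
            if cause is None or cause in seen:
                continue
            seen.add(cause)
            # The causing event must happen before the event it enables.
            plot.insert(plot.index(event), cause)
            frontier.append((cause, depth + 1))
    return plot


# Toy knowledge standing in for language-model inferences (assumption).
PRECONDITIONS = {
    "Ana opens the vault": ["Ana has the key", "Ana is inside the bank"],
    "Ana steals the key": ["Ana knows where the key is"],
}
CAUSES = {
    "Ana has the key": "Ana steals the key",
    "Ana is inside the bank": "Ana enters the bank",
    "Ana knows where the key is": "Ana bribes the guard",
}

plot = backward_chain(
    "Ana opens the vault",
    preconditions_of=lambda event: PRECONDITIONS.get(event, []),
    cause_of=CAUSES.get,
)
print(plot)
```

The depth limit bounds the recursion, since a language model could otherwise keep proposing earlier and earlier causes.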
Recent pre-trained language models have shown promising capabilities in generating fluent and realistic natural language text. However, generating multi-sentence text with global content planning has been a long-standing research question. Current approaches to controlled text generation can hardly address this issue, as they usually condition on single known control attributes. In this study, we propose a low-cost yet effective framework which explicitly models the global content plan of the generated text. Specifically, it optimizes the joint distribution of the natural language sequence and the global content plan in a plug-and-play manner. We conduct extensive experiments on the well-established Recipe1M+ benchmark. Both automatic and human evaluations verify that our model achieves state-of-the-art performance on the task of recipe generation.
Recent advances in deep learning research, such as transformers, have bolstered the ability for automated agents to generate creative texts similar to those that a human would write. By default, transformer decoders can only generate new text with respect to previously generated text. The output distribution of candidate tokens at any position is conditioned on previously selected tokens using a self-attention mechanism to emulate the property of autoregression. This is inherently limiting for tasks such as controllable story generation where it may be necessary to condition on future plot events when writing a story. In this work, we propose Future Sight, a method for finetuning a pretrained generative transformer on the task of future conditioning. Transformer decoders are typically pretrained on the task of completing a context, one token at a time, by means of self-attention. Future Sight additionally enables a decoder to attend to an encoded future plot event. This motivates the decoder to expand on the context in a way that logically concludes with the provided future. During inference, the future plot event can be written by a human author to steer the narrative being generated in a certain direction. We evaluate the efficacy of our approach on a story generation task with human evaluators.
The rapid development and application of natural language generation (NLG) techniques have revolutionized the field of automatic text production. However, these techniques are still limited in their ability to produce human-like text that is truly reasonable and informative. In this paper, we explore the importance of NLG being guided by knowledge, in order to convey human-like reasoning through language generation. We propose ten goals for intelligent NLG systems to pursue, and briefly review the achievements of NLG techniques guided by knowledge and reasoning. We conclude by envisioning future directions and challenges in the pursuit of these goals.
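Automated storytelling has long captured the attention of researchers because of the ubiquity of narratives in everyday life. However, when generating narratives with neural language models, it is challenging to maintain coherence and to stay on topic toward a specific ending. In this paper, we introduce Story generation with Reader Models (StoRM), a framework in which a reader model is used to reason about how the story should progress. A reader model infers what a human reader believes about the concepts, entities, and relations of the fictional story world. We show how an explicit reader model represented as a knowledge graph affords story coherence and provides controllability in the form of achieving a given story world state goal. Experiments show that our model produces significantly more coherent and on-topic stories, outperforming baselines in dimensions including plot plausibility and staying on topic. Our system also outperforms outline-guided story generation baselines in composing given concepts without a specified ordering.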
We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum. Our dataset enables hierarchical story generation, where the model first generates a premise, and then transforms it into a passage of text. We gain further improvements with a novel form of model fusion that improves the relevance of the story to the prompt, and adding a new gated multi-scale self-attention mechanism to model long-range context. Experiments show large improvements over strong baselines on both automated and human evaluations. Human judges prefer stories generated by our approach to those from a strong non-hierarchical model by a factor of two to one.
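Conversational agents have become an integral part of everyday life for the general population in simple task-enabling situations. However, these systems have yet to have any social impact on diverse and minority populations, for example by helping people with neurological disorders such as ALS, or people with speech, language, and social communication disorders. Language model technology can play a huge role in helping these users with their daily communication and social interactions. To support this population, we build a dialogue system that users can control with cues or keywords. We build models that can suggest relevant cues in the dialogue response context, which are used to control response generation and can speed up communication. We also introduce a keyword loss to constrain the model output. We show both qualitatively and quantitatively that our models can effectively induce the keyword into the model response without degrading the quality of the response. In the context of the use of such systems by people with degenerative disorders, we present a human evaluation of our cue or keyword predictor and of the controllable dialogue system, and show that our models perform significantly better than models without control. Our study shows that keyword control on end-to-end response generation models is powerful and can enable and empower users with degenerative disorders to carry out their day-to-day communication.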
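Prediction over event sequences is critical for many real-world applications in information retrieval and natural language processing. Within event sequence prediction, Future Event Generation (FEG) is a challenging task because it requires not only fluent text generation but also commonsense reasoning to maintain the logical coherence of the entire event story. In this paper, we propose a novel explainable FEG framework, COEP. It highlights and integrates two types of event knowledge: sequential knowledge of direct event-event relations, and inferential knowledge that reflects the intermediate character psychology between events (such as intents, causes, and reactions), which intrinsically pushes the story forward. To alleviate the knowledge forgetting problem, we design two modules, IM and GM, one for each type of knowledge, which are combined via prompt tuning. First, IM focuses on understanding inferential knowledge in order to generate commonsense explanations and provide a soft prompt vector for GM. We also design a contrastive discriminator to improve generalization. Second, GM generates future events by modeling direct sequential knowledge under the guidance of IM. Automatic and human evaluations show that our approach can generate more coherent, specific, and logical future events.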
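Despite advances in generating fluent text, existing pre-trained models tend to attach incoherent event sequences to the entities involved when generating narratives such as stories and news. We conjecture that these issues result from representing entities as static embeddings of surface words, while neglecting to model their ever-changing states, i.e., the information they carry, as the text unfolds. We therefore extend the Transformer model to dynamically perform entity state updates and sentence realization for narrative generation. We propose a contrastive framework to learn state representations in a discrete space, and insert additional attention layers into the decoder to better exploit these states. Experiments on two narrative datasets show that, guided by meaningful entity states, our model can generate more coherent and diverse narratives.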
Language models (LMs) often generate incoherent outputs: they refer to events and entity states that are incompatible with the state of the world described in their inputs. We introduce SituationSupervision, a family of approaches for improving coherence in LMs by training them to construct and condition on explicit representations of entities and their states. SituationSupervision has two components: an auxiliary situation modeling task that trains models to predict state representations in context, and a latent state inference procedure that imputes these states from partially annotated training data. SituationSupervision can be applied to both fine-tuning (by supervising LMs to encode state variables in their hidden representations) and prompting (by inducing LMs to interleave textual descriptions of entity states with output text). In both cases, SituationSupervision requires only a small number of state annotations to produce major coherence improvements (between 4% and 11%), showing that standard LMs can be sample-efficiently trained to model not just language but the situations it describes.
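Long documents such as academic articles and business reports have been the standard format for detailing important issues and complicated subjects that require extra attention. An automatic summarization system that can effectively condense long documents into short, concise texts that capture the most important information would therefore be valuable in aiding readers' comprehension. Recently, with the advent of neural architectures, significant research effort has gone into advancing automatic text summarization systems, and numerous studies have examined the challenges of extending these systems to the long-document domain. In this survey, we provide a comprehensive overview of research on long document summarization, together with a systematic evaluation of the three principal components of its research setting: benchmark datasets, summarization models, and evaluation metrics. For each component, we organize the literature within the context of long document summarization and conduct an empirical analysis to broaden the perspective on current research progress. The empirical analysis includes a study of the intrinsic characteristics of benchmark datasets, a multi-dimensional analysis of summarization models, and a review of summarization evaluation metrics. Based on the overall findings, we conclude by proposing possible directions for future exploration in this rapidly growing field.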
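The advent of large pre-trained generative language models has provided a common framework for AI story generation: sampling from the model to create sequences that continue the story. However, sampling alone is insufficient for story generation. In particular, it is hard to direct a language model to create stories that reach a specific goal event. We present two automated techniques, grounded in deep reinforcement learning and reward shaping, for controlling the plot of computer-generated stories. The first uses proximal policy optimization to fine-tune an existing transformer-based language model so that it generates text continuations that are also goal-seeking. The second extracts a knowledge graph from the unfolding story, which is used by a policy network with graph attention to select a candidate continuation generated by a language model. We report automated metrics on how often stories achieve a given goal event, as well as human participant rankings of coherence and overall story quality compared to baselines and ablations.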