Automatic text summarization aims to produce a concise and fluent summary that preserves the key information and overall meaning of the source text. Document-level summarization has improved significantly in recent years through models based on the Transformer architecture. However, their memory and time complexities are quadratic in the sequence length, which makes them very expensive to use on the long sequences that document-level summarization requires. Our work addresses document-level summarization by studying how efficient Transformer techniques can improve the automatic summarization of very long texts. In particular, we use the arXiv dataset, which consists of scientific papers and their corresponding abstracts, to establish the baselines for this work. We then propose a novel retrieval-enhanced approach based on this architecture, which reduces the cost of generating a summary of the entire document by processing smaller chunks. The results fall below the baselines but suggest more efficient memory consumption and improved truthfulness.
Current abstractive summarization systems present important weaknesses which prevent their deployment in real-world applications, such as the omission of relevant information and the generation of factual inconsistencies (also known as hallucinations). At the same time, automatic evaluation metrics such as CTC scores have been recently proposed that exhibit a higher correlation with human judgments than traditional lexical-overlap metrics such as ROUGE. In this work, we intend to close the loop by leveraging the recent advances in summarization metrics to create quality-aware abstractive summarizers. Namely, we propose an energy-based model that learns to re-rank summaries according to one or a combination of these metrics. We experiment using several metrics to train our energy-based re-ranker and show that it consistently improves the scores achieved by the predicted summaries. Nonetheless, human evaluation results show that the re-ranking approach should be used with care for highly abstractive summaries, as the available metrics are not yet sufficiently reliable for this purpose.
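We exhibit examples of high-dimensional unimodal posterior distributions arising in nonlinear regression models with Gaussian process priors, in relation to the regions where the posterior measure concentrates. The counterexamples hold for general MCMC schemes based on gradient or random-walk steps, and the theory is illustrated for Metropolis-Hastings adjusted methods such as pCN and MALA.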
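To promote the use of machine learning in 5G, the International Telecommunication Union (ITU) proposed in 2021 the second edition of the ITU AI/ML in 5G Challenge, which drew more than 1600 participants from 82 countries. This work details the second-place solution overall, which is also the winning solution of the Graph Neural Networking Challenge 2021. We tackle the problem of generalization when applying a model to a 5G network that may have longer paths and larger link capacities than those observed during training. To achieve this, we propose to first extract robust features related to queueing theory (QT), and then fine-tune the analytical baseline prediction using a modification of the RouteNet graph neural network (GNN) model. The proposed solution generalizes much better than simply using RouteNet, and reduces the analytical baseline's mean absolute percentage error from 10.42 to 1.45 (1.27 with an ensemble). This suggests that making small changes to an approximate model that is known to be robust can be an effective way to improve accuracy without compromising generalization.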
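For reference, the error metric quoted above (mean absolute percentage error) can be sketched as follows. This is the standard textbook definition, not code from the challenge; the function name `mape` and the sample values are illustrative assumptions.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes every true value is nonzero (illustrative helper,
    not taken from the challenge code).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical per-path delays and predictions, for illustration only.
example_error = mape([100.0, 200.0], [110.0, 190.0])
```

Here the two relative errors are 10% and 5%, so `example_error` is about 7.5.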
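Context: Machine learning (ML) may enable effective automated test generation. Objective: We characterize emerging research, examining testing practices, researcher goals, ML techniques applied, evaluation, and challenges. Methods: We perform a systematic literature review on a sample of 97 publications. Results: ML generates input for system, GUI, unit, performance, and combinatorial testing, or improves the performance of existing generation methods. ML is also used to generate test verdicts, property-based oracles, and expected-output oracles. Supervised learning (often based on neural networks) and reinforcement learning (often based on Q-learning) are common, and some publications also employ unsupervised or semi-supervised learning. (Semi-/un-)supervised approaches are evaluated using both traditional testing metrics and ML-related metrics (e.g., accuracy), while reinforcement learning is often evaluated using testing metrics tied to the reward function. Conclusion: Work to date shows great promise, but there are open challenges regarding training data, retraining, scalability, evaluation complexity, the ML algorithms employed and how they are applied, benchmarks, and replicability. Our findings can serve as a roadmap and inspiration for researchers in this field.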
It would be useful for machines to use computers as humans do so that they can aid us in everyday tasks. This is a setting in which there is also the potential to leverage large-scale expert demonstrations and human judgements of interactive behaviour, which are two ingredients that have driven much recent success in AI. Here we investigate the setting of computer control using keyboard and mouse, with goals specified via natural language. Instead of focusing on hand-designed curricula and specialized action spaces, we focus on developing a scalable method centered on reinforcement learning combined with behavioural priors informed by actual human-computer interactions. We achieve state-of-the-art and human-level mean performance across all tasks within the MiniWob++ benchmark, a challenging suite of computer control problems, and find strong evidence of cross-task transfer. These results demonstrate the usefulness of a unified human-agent interface when training machines to use computers. Altogether our results suggest a formula for achieving competency beyond MiniWob++ and towards controlling computers, in general, as a human would.
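Semi-supervised learning has received attention from researchers because it exploits the structure of unlabeled data to achieve competitive classification results with far fewer labels than supervised approaches. The Local and Global Consistency (LGC) algorithm is one of the best-known graph-based semi-supervised learning (GSSL) classifiers. Notably, its solution can be written as a linear combination of the known labels. The coefficients of this linear combination depend on a parameter $\alpha$, which determines the decay over time when labeled vertices are reached in a random walk. In this work, we discuss how removing the self-influence of a labeled instance can be beneficial, and how it relates to the leave-one-out error. Moreover, we propose to minimize this leave-one-out loss with automatic differentiation. Within this framework, we propose methods to estimate label reliability and the diffusion rate. Optimizing the diffusion rate is done more efficiently with a spectral representation. The results show that the label reliability approach is competitive with robust L1-norm methods, and that removing the diagonal entries reduces the risk of overfitting and leads to a suitable criterion for parameter selection.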
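The "linear combination of the known labels" can be illustrated with a minimal sketch of standard LGC label propagation, in which the propagation matrix $P = (1-\alpha)(I - \alpha S)^{-1}$ with $S = D^{-1/2} W D^{-1/2}$ is approximated by fixed-point iteration. The function name `lgc_scores`, the `remove_self` flag, and the toy path graph are illustrative assumptions; this is not the authors' implementation, and it omits their automatic-differentiation and spectral steps.

```python
def lgc_scores(W, Y, alpha=0.9, iters=300, remove_self=False):
    """Class scores F = P @ Y for LGC on a graph with affinity matrix W.

    P approximates (1 - alpha) * (I - alpha * S)^{-1} via the iteration
    P <- alpha * S @ P + (1 - alpha) * I. Assumes no isolated vertices.
    With remove_self=True the diagonal of P is zeroed, so a labeled
    vertex cannot vote for itself (a leave-one-out style score).
    """
    n = len(W)
    deg = [sum(row) for row in W]
    # Symmetrically normalized affinity matrix S = D^{-1/2} W D^{-1/2}.
    S = [[W[i][j] / (deg[i] * deg[j]) ** 0.5 for j in range(n)] for i in range(n)]
    eye = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    P = [row[:] for row in eye]
    for _ in range(iters):
        P = [[alpha * sum(S[i][k] * P[k][j] for k in range(n))
              + (1 - alpha) * eye[i][j] for j in range(n)] for i in range(n)]
    if remove_self:
        for i in range(n):
            P[i][i] = 0.0
    num_classes = len(Y[0])
    return [[sum(P[i][j] * Y[j][c] for j in range(n)) for c in range(num_classes)]
            for i in range(n)]


# Toy example (hypothetical): a path graph 0-1-2-3-4-5 with vertex 0
# labeled as class 0 and vertex 5 labeled as class 1.
n = 6
W = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]
Y = [[0.0, 0.0] for _ in range(n)]
Y[0][0] = 1.0
Y[5][1] = 1.0

scores = lgc_scores(W, Y)
preds = [row.index(max(row)) for row in scores]
loo_scores = lgc_scores(W, Y, remove_self=True)
```

In this toy graph each unlabeled vertex takes the class of the nearer labeled endpoint. With `remove_self=True`, vertex 0 gets no support for its own class, which is the kind of signal a leave-one-out criterion can use.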
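We study the computational cost of recovering a unit-norm sparse principal component $x \in \mathbb{R}^n$ planted in a random matrix, in either the Wigner or Wishart spiked model (observing either $W + \lambda x x^\top$ with $W$ drawn from the Gaussian orthogonal ensemble, or $N$ independent samples from $\mathcal{N}(0, I_n + \beta x x^\top)$, respectively). Prior work has shown that when the signal-to-noise ratio ($\lambda$ or $\beta \sqrt{N/n}$, respectively) is a small constant and the fraction of nonzero entries in the planted vector is $\|x\|_0 / n = \rho$, it is possible to recover $x$ in polynomial time if $\rho \lesssim 1/\sqrt{n}$. While it is possible to recover $x$ in exponential time under the weaker condition $\rho \ll 1$, it is believed that polynomial-time recovery is impossible unless $\rho \lesssim 1/\sqrt{n}$. We investigate the precise amount of time required for recovery in the "possible but hard" regime $1/\sqrt{n} \ll \rho \ll 1$ by exploring the power of subexponential-time algorithms, i.e., algorithms running in time $\exp(n^\delta)$ for some constant $\delta \in (0,1)$. For any $1/\sqrt{n} \ll \rho \ll 1$, we give a recovery algorithm with runtime roughly $\exp(\rho^2 n)$, demonstrating a smooth tradeoff between sparsity and runtime. Our family of algorithms interpolates smoothly between two existing algorithms: the polynomial-time diagonal thresholding algorithm and the $\exp(\rho n)$-time exhaustive search algorithm. Furthermore, by analyzing the low-degree likelihood ratio, we give rigorous evidence suggesting that the tradeoff achieved by our algorithms is optimal.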
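As a point of reference for the polynomial-time end of this tradeoff, the support-selection step of diagonal thresholding in the Wishart model can be sketched as below. It samples $y = z + \sqrt{\beta}\, g\, x$ with $z \sim \mathcal{N}(0, I_n)$ and $g \sim \mathcal{N}(0, 1)$, so that $y \sim \mathcal{N}(0, I_n + \beta x x^\top)$, and keeps the coordinates with the largest empirical second moments. The function names, the assumption that the sparsity $k$ is known, and the easy parameter choices are all illustrative; this is not the subexponential-time algorithm of the abstract.

```python
import random


def sample_spiked_wishart(x, beta, num_samples, rng):
    """Draw samples from N(0, I_n + beta * x x^T) as y = z + sqrt(beta) * g * x."""
    n = len(x)
    samples = []
    for _ in range(num_samples):
        g = rng.gauss(0.0, 1.0)  # shared scalar along the spike direction
        samples.append([rng.gauss(0.0, 1.0) + (beta ** 0.5) * g * x[i]
                        for i in range(n)])
    return samples


def diagonal_thresholding_support(samples, k):
    """Return the k coordinates with the largest empirical second moment.

    These are the diagonal entries of the sample covariance matrix;
    coordinates in the support of x have inflated variance 1 + beta * x_i^2.
    """
    n = len(samples[0])
    num_samples = len(samples)
    diag = [sum(y[i] ** 2 for y in samples) / num_samples for i in range(n)]
    top = sorted(range(n), key=lambda i: diag[i], reverse=True)[:k]
    return sorted(top)


# Hypothetical easy instance: n = 30, support {0, 1, 2}, strong signal.
rng = random.Random(0)
n, k, beta, num_samples = 30, 3, 5.0, 500
x = [1.0 / k ** 0.5 if i < k else 0.0 for i in range(n)]  # unit-norm, k-sparse
samples = sample_spiked_wishart(x, beta, num_samples, rng)
support = diagonal_thresholding_support(samples, k)
```

A full estimator would then typically take the leading eigenvector of the sample covariance restricted to `support`; that step is omitted here to keep the sketch short.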