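In an information-seeking conversation, a user converses with an agent to ask a series of questions that are often under- or over-specified. An ideal agent would first recognize that it is in such a situation by searching its underlying knowledge source, and then interact appropriately with the user to resolve it. However, most existing studies either fail to incorporate such agent-side initiatives or do so artificially. In this work, we present INSCIT (pronounced Insight), a dataset for information-seeking conversations with mixed-initiative interactions. It contains 4.7K user-agent turns from 805 human-human conversations in which the agent searches Wikipedia and either asks for clarification or provides relevant information to address user queries. We define two subtasks, namely evidence passage identification and response generation, as well as a new human evaluation protocol to assess model performance. We report results for two strong baselines based on state-of-the-art models for conversational knowledge identification and open-domain question answering. Both models significantly underperform humans and fail to generate coherent and informative responses, suggesting ample room for improvement in future studies.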
translated by 谷歌翻译
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
While 3D GANs have recently demonstrated the high-quality synthesis of multi-view consistent images and 3D shapes, they are mainly restricted to photo-realistic human portraits. This paper aims to extend 3D GANs to a different, but meaningful visual form: artistic portrait drawings. However, extending existing 3D GANs to drawings is challenging due to the inevitable geometric ambiguity present in drawings. To tackle this, we present Dr.3D, a novel adaptation approach that adapts an existing 3D GAN to artistic drawings. Dr.3D is equipped with three novel components to handle the geometric ambiguity: a deformation-aware 3D synthesis network, an alternating adaptation of pose estimation and image synthesis, and geometric priors. Experiments show that our approach can successfully adapt 3D GANs to drawings and enable multi-view consistent semantic editing of drawings.
This paper is a technical overview of DeepMind and Google's recent work on reinforcement learning for controlling commercial cooling systems. Building on expertise that began with cooling Google's data centers more efficiently, we recently conducted live experiments on two real-world facilities in partnership with Trane Technologies, a building management system provider. These live experiments had a variety of challenges in areas such as evaluation, learning from offline data, and constraint satisfaction. Our paper describes these challenges in the hope that awareness of them will benefit future applied RL work. We also describe the way we adapted our RL system to deal with these challenges, resulting in energy savings of approximately 9% and 13% respectively at the two live experiment sites.
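We present a new collection of news articles originating from fake and real news media sources for the analysis and prediction of news virality. Unlike existing fake news datasets, which contain either claims or the headline and body of news articles, each article in this collection is accompanied by a Facebook engagement count, which we consider an indicator of the article's virality. In addition, we provide the article description and the thumbnail image with which the article was shared on Facebook. These images were automatically annotated with object tags and color attributes. Using cloud-based vision analysis tools, the thumbnail images were also analyzed for faces, and the detected faces were annotated with facial attributes. We empirically investigate the use of this collection on an example task of article virality prediction.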
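Image restoration is an important and challenging task in computer vision. Reverting a filtered image to its original is helpful in various computer vision tasks. We employ a nonlinear activation free network (NAFNet) for a fast and lightweight model and add a color attention module that extracts useful color information for better accuracy. We propose an accurate, fast, lightweight network with multi-scale and color attention for Instagram filter removal (CAIR). Experimental results show that the proposed CAIR outperforms existing Instagram filter removal networks while being faster and lighter: it is about $11\times$ faster and $2.4\times$ lighter, while exceeding them by 3.69 dB PSNR on the IFFI dataset. Qualitative results show that CAIR successfully removes Instagram filters with high quality and restores color information. The source code and pretrained weights are available at \url{https://github.com/hnv-lab/cair}.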
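For readers unfamiliar with the PSNR figure quoted above, the following is a minimal sketch of the standard peak signal-to-noise ratio computation. It is not taken from the CAIR code base; the function name `psnr`, the flat-list image representation, and the default peak value of 255 (8-bit images) are assumptions made for illustration.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Return the peak signal-to-noise ratio in dB.

    `reference` and `restored` are equally sized, non-empty flat
    sequences of pixel values (an illustrative assumption; real
    pipelines usually operate on image arrays).
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error between the two images.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    # PSNR = 10 * log10(MAX^2 / MSE)
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because PSNR is logarithmic, a gain of a few dB corresponds to a substantial reduction in mean squared error: a 3 dB improvement roughly halves the MSE.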
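Diffusion models have demonstrated impressive image generation performance and have been used for various computer vision tasks. Unfortunately, image generation with diffusion models is very time-consuming, since it requires thousands of sampling steps. To address this problem, we propose a novel pyramidal diffusion model that generates high-resolution images starting from much coarser-resolution images, using a single score function trained with a positional embedding. This enables time-efficient sampling for image generation and also solves the low batch size problem when training with limited resources. Furthermore, we show that the proposed approach can be used efficiently for multi-scale super-resolution problems with a single score function.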
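The task of Visual Question Answering (VQA) is known to be plagued by VQA models exploiting biases within the dataset to make their final predictions. Many ensemble-based debiasing methods have been proposed in which an additional model is purposefully trained to be biased in order to aid in training a robust target model. However, these methods compute the bias for a model from the label statistics of the training data or directly from single-modal branches. In contrast, in this work, in order to better learn the bias that a target VQA model suffers from, we propose a generative method, called GenB, that trains the bias model \emph{directly from the target model}. In particular, GenB employs a generative network to learn the bias through a combination of an adversarial objective and knowledge distillation. We then debias our target model with GenB as the bias model and show, through extensive experiments, the effects of our method on various VQA bias datasets, including VQA-CP2, VQA-CP1, GQA-OOD, and VQA-CE.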
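Neural schedulers based on deep reinforcement learning (DRL) have shown considerable potential for solving real-world resource allocation problems, as they have demonstrated significant performance gains in the domain of cluster computing. In this paper, we investigate the feasibility of neural schedulers for System-on-Chip (SoC) resource allocation through extensive experiments and comparison with non-neural, heuristic schedulers. The key finding is three-fold. First, neural schedulers designed for the cluster computing domain do not work well for SoC because of i) the heterogeneity of SoC computing resources and ii) the variable action set caused by randomness in incoming jobs. Second, our novel neural scheduler technique, Eclectic Interaction Matching (EIM), overcomes these challenges and thus significantly improves on existing neural schedulers. Specifically, we rationalize the underlying reasons behind the performance gain of the EIM-based neural scheduler. Third, we find that the ratio of the average processing element (PE) switching delay to the average PE computation time also significantly affects the performance of neural SoC schedulers, even with EIM. Consequently, future neural SoC scheduler designs must consider this metric, as well as its implementation overhead, to be of practical use.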
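Quantum neural networks are promising for a wide range of applications in the Noisy Intermediate-Scale Quantum era. As such, there is an increasing demand for automatic quantum neural architecture search. We tackle this challenge by designing a metric on quantum circuits for Bayesian optimization with Gaussian processes. To this end, we propose a new quantum gate distance that characterizes a gate's action on every quantum state, and we provide a theoretical perspective on its geometric properties. Our approach significantly outperforms the benchmark on three empirical quantum machine learning problems: training a quantum generative adversarial network, solving combinatorial optimization in the MaxCut problem, and simulating the quantum Fourier transform. Our method can be extended to characterize the behavior of various quantum machine learning models.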