Imputation of missing data is a task that plays a vital role in a number of engineering and science applications. Often such missing data arise in experimental observations from limitations of sensors or from post-processing transformation errors. Other times they arise from numerical and algorithmic constraints in computer simulations. One such instance, and the application focus of this paper, is the numerical simulation of storm surge. The simulation data correspond to time-series surge predictions over a number of save points within the geographic domain of interest, creating a spatio-temporal imputation problem where the surge points are heavily correlated spatially and temporally, and the regions of missing values are structurally distributed at random. Recently, machine learning techniques such as neural network methods have been developed and employed for missing data imputation tasks. Generative Adversarial Nets (GANs) and GAN-based techniques have particularly attracted attention as unsupervised machine learning methods. In this study, the performance of Generative Adversarial Imputation Nets (GAIN) is improved by applying convolutional neural networks instead of fully connected layers to better capture the correlation of the data and promote learning from adjacent surge points. Another adjustment to the method, needed specifically for the studied data, is to consider the coordinates of the points as additional features to provide more information to the model through the convolutional layers. We name the proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN). Its performance, incorporating the improvements and adaptations required for the storm surge data, is assessed and compared to the original GAIN and a few other techniques. The results show that Conv-GAIN achieves better performance than the alternative methods on the studied data.
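As a concrete (and unofficial) illustration of the idea, a minimal PyTorch sketch of a Conv-GAIN-style model is given below: convolutional layers replace GAIN's fully connected ones, and the point coordinates enter as two extra input channels. Layer sizes, the [0, 1] data scaling, and the hint rate are assumptions, not the authors' implementation.

```python
# Illustrative sketch only, not the authors' code: GAIN with convolutional
# layers in place of fully connected ones, and point coordinates appended
# as two extra input channels (the Conv-GAIN idea described above).
import torch
import torch.nn as nn

def conv_block(c_in, c_out, act):
    return nn.Sequential(nn.Conv2d(c_in, c_out, kernel_size=3, padding=1), act)

class Generator(nn.Module):
    """Input channels: surge value, observation mask, x-coordinate, y-coordinate."""
    def __init__(self, h=32):
        super().__init__()
        self.net = nn.Sequential(conv_block(4, h, nn.ReLU()),
                                 conv_block(h, h, nn.ReLU()),
                                 conv_block(h, 1, nn.Sigmoid()))  # data scaled to [0, 1]

    def forward(self, x, m, coords):
        z = torch.rand_like(x)  # noise fills the missing entries, as in GAIN
        return self.net(torch.cat([m * x + (1 - m) * z, m, coords], dim=1))

class Discriminator(nn.Module):
    """Predicts, per point, the probability that the value was observed."""
    def __init__(self, h=32):
        super().__init__()
        self.net = nn.Sequential(conv_block(4, h, nn.ReLU()),
                                 conv_block(h, h, nn.ReLU()),
                                 conv_block(h, 1, nn.Sigmoid()))

    def forward(self, x_hat, hint, coords):
        return self.net(torch.cat([x_hat, hint, coords], dim=1))

def gain_losses(G, D, x, m, coords, alpha=10.0, hint_rate=0.9):
    """GAIN-style losses on a batch of (B, 1, time, points) surge grids."""
    x_g = G(x, m, coords)
    x_hat = m * x + (1 - m) * x_g                 # keep observed, impute missing
    b = (torch.rand_like(m) < hint_rate).float()
    hint = b * m + 0.5 * (1 - b)                  # standard GAIN hint vector
    d_prob = D(x_hat.detach(), hint, coords)
    d_loss = -torch.mean(m * torch.log(d_prob + 1e-8)
                         + (1 - m) * torch.log(1 - d_prob + 1e-8))
    g_prob = D(x_hat, hint, coords)
    g_loss = (-torch.mean((1 - m) * torch.log(g_prob + 1e-8))
              + alpha * torch.mean(m * (x - x_g) ** 2))
    return d_loss, g_loss
```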
Modern speech recognition systems exhibit rapid performance degradation under domain shift. This issue is especially prevalent in data-scarce settings, such as low-resource languages, where the diversity of training data is limited. In this work we propose M2DS2, a simple and sample-efficient finetuning strategy for large pretrained speech models, based on mixed source and target domain self-supervision. We find that including source domain self-supervision stabilizes training and avoids mode collapse of the latent representations. For evaluation, we collect HParl, a $120$ hour speech corpus for Greek, consisting of plenary sessions in the Greek Parliament. We merge HParl with two popular Greek corpora to create GREC-MD, a test-bed for multi-domain evaluation of Greek ASR systems. In our experiments we find that, while other Unsupervised Domain Adaptation baselines fail in this resource-constrained environment, M2DS2 yields significant improvements for cross-domain adaptation, even when only a few hours of in-domain audio are available. When we relax the problem to a weakly supervised setting, we find that independent adaptation of the audio using M2DS2 and of the language using simple LM augmentation techniques is particularly effective, yielding word error rates comparable to the fully supervised baselines.
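To make the mixed self-supervision concrete, here is a minimal, heavily hedged sketch of what one M2DS2-style finetuning step could look like. The `ctc_loss` and `reconstruct` members are hypothetical placeholders (standing in for a CTC head and a wav2vec 2.0-style self-supervised objective), and the loss weights are assumptions.

```python
import torch
import torch.nn.functional as F

def self_supervised_loss(encoder, audio, mask_prob=0.15):
    """Mask random latent frames and reconstruct them (stand-in objective)."""
    feats = encoder(audio)                                 # (B, T, D) latents
    mask = torch.rand(feats.shape[:2], device=feats.device) < mask_prob
    corrupted = feats.masked_fill(mask.unsqueeze(-1), 0.0)
    recon = encoder.reconstruct(corrupted)                 # hypothetical head
    return F.mse_loss(recon[mask], feats.detach()[mask])

def m2ds2_step(model, src_audio, src_targets, tgt_audio,
               lam_src=1.0, lam_tgt=1.0):
    sup = model.ctc_loss(src_audio, src_targets)           # supervised (source)
    ssl_tgt = self_supervised_loss(model.encoder, tgt_audio)  # adapt to target
    # Keeping source-domain self-supervision in the mix is what, per the
    # abstract, stabilizes training and avoids collapse of the latents.
    ssl_src = self_supervised_loss(model.encoder, src_audio)
    return sup + lam_tgt * ssl_tgt + lam_src * ssl_src
```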
The evolution of wireless communications into 6G and beyond is expected to rely on new machine learning (ML)-based capabilities. These can enable proactive decisions and actions from wireless-network components to sustain quality-of-service (QoS) and user experience. Moreover, new use cases in the area of vehicular and industrial communications will emerge. Specifically in the area of vehicle communication, vehicle-to-everything (V2X) schemes will benefit strongly from such advances. With this in mind, we have conducted a detailed measurement campaign with the purpose of enabling a plethora of diverse ML-based studies. The resulting datasets offer GPS-located wireless measurements across diverse urban environments for both cellular (with two different operators) and sidelink radio access technologies, thus enabling a variety of studies towards V2X. The datasets are labeled and sampled with a high time resolution. Furthermore, we make the data publicly available with all the necessary information to support the on-boarding of new researchers. We provide an initial analysis of the data showing some of the challenges that ML needs to overcome and the features that ML can leverage, as well as some hints at potential research studies.
Modern Deep Learning (DL) models have grown to sizes requiring massive clusters of specialized, high-end nodes to train. Designing such clusters to maximize both performance and utilization, to amortize their steep cost, is a challenging task requiring a careful balance of compute, memory, and network resources. Moreover, each model exposes a plethora of tuning knobs that drastically affect performance, with optimal values often depending on the underlying cluster's characteristics, which necessitates a complex cluster-workload co-design process. To facilitate the design space exploration of such massive DL training clusters, we introduce COMET, a holistic cluster design methodology and workflow to jointly study the impact of parallelization strategies and key cluster resource provisioning on the performance of distributed DL training. We develop a step-by-step process to establish a reusable and flexible methodology, and demonstrate its application with a case study of training a Transformer-1T model on a cluster of variable compute, memory, and network resources. Our case study demonstrates COMET's utility in identifying promising architectural optimization directions and guiding system designers in configuring key model and cluster parameters.
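COMET itself is a methodology rather than a single script, but the kind of joint sweep it organizes can be illustrated with a toy example: enumerate parallelization strategies (data/tensor/pipeline) and cluster resource options, score each feasible pair with an analytical cost model, and rank the results. All numbers and the cost model below are made up for illustration.

```python
from itertools import product

def step_time(dp, tp, pp, gpus, net_gbps, mem_gb,
              params_b=1000.0, flops_per_gpu=300.0):
    """Toy cost model: compute time plus a naive communication penalty.
    Returns None for infeasible partitionings (wrong GPU count or OOM)."""
    if dp * tp * pp != gpus or params_b * 2 / (tp * pp) > mem_gb:
        return None
    compute = params_b * 6 / (gpus * flops_per_gpu)   # arbitrary units
    comm = (params_b * 2 / dp) / net_gbps             # all-reduce-ish term
    return compute + comm

configs = [(dp, tp, pp, gpus, net, mem)
           for dp, tp, pp in product([1, 2, 4, 8, 16], repeat=3)
           for gpus in (64, 128)
           for net in (100, 400)       # interconnect bandwidth, Gb/s
           for mem in (80, 192)]       # per-GPU memory, GB

feasible = []
for c in configs:
    t = step_time(*c)
    if t is not None:
        feasible.append((t, c))

for t, c in sorted(feasible)[:5]:     # five best (dp, tp, pp, gpus, net, mem)
    print(f"step_time={t:.3f}  config={c}")
```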
X-ray imaging technology has been used for decades in clinical tasks to reveal the internal condition of different organs, and in recent years it has become more common in other areas such as industry, security, and geography. Recent developments in computer vision and machine learning have also made it easier to process X-ray images automatically, and several machine learning-based object (anomaly) detection, classification, and segmentation methods have recently been employed in X-ray image analysis. Due to the high potential of deep learning in related image processing applications, it has been used in most of these studies. This survey reviews the recent research on using computer vision and machine learning for X-ray analysis in industrial production and security applications and covers the applications, techniques, evaluation metrics, datasets, and performance comparison of those techniques on publicly available datasets. We also highlight some drawbacks in the published research and give recommendations for future research in computer vision-based X-ray analysis.
Cameras in modern devices such as smartphones, satellites, and medical equipment are capable of capturing very high-resolution images and videos. Such high-resolution data often need to be processed by deep learning models for cancer detection, automated road navigation, weather prediction, surveillance, optimizing agricultural processes, and many other applications. Using high-resolution images and videos as direct inputs for deep learning models creates many challenges due to their large number of parameters, computation cost, inference latency, and GPU memory consumption. Simple approaches such as resizing the images to a lower resolution are common in the literature; however, they typically degrade accuracy significantly. Several works in the literature propose better alternatives to address the challenges of high-resolution data and improve accuracy and speed while complying with hardware limitations and time restrictions. This survey describes such efficient high-resolution deep learning methods, summarizes real-world applications of high-resolution deep learning, and provides comprehensive information about the available high-resolution datasets.
Task-oriented dialogue systems often employ a Dialogue State Tracker (DST) to successfully complete conversations. Recent state-of-the-art DST implementations rely on schemata of diverse services to improve model robustness and handle zero-shot generalization to new domains [1]; however, such methods [2, 3] typically require multiple large transformer models and long input sequences to perform well. We propose a single multi-task BERT-based model that jointly solves the three DST tasks of intent prediction, requested-slot prediction, and slot filling. Moreover, we propose an efficient and parsimonious encoding of the dialogue history and service schemata that is shown to further improve performance. Evaluation on the SGD dataset shows that our approach outperforms the baseline SGP-DST and performs well compared to the state-of-the-art, while being significantly more computationally efficient. Extensive ablation studies are performed to examine the factors contributing to the success of our model.
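A minimal sketch of such a single multi-task model is shown below (not the paper's exact architecture): one shared BERT encoder feeds three lightweight heads for intent prediction, requested-slot prediction, and slot filling. The head shapes and the use of the [CLS] vector are assumptions.

```python
import torch.nn as nn
from transformers import BertModel

class MultiTaskDST(nn.Module):
    def __init__(self, n_intents, n_slots, bert_name="bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        h = self.bert.config.hidden_size
        self.intent_head = nn.Linear(h, n_intents)      # utterance-level
        self.requested_head = nn.Linear(h, n_slots)     # multi-label, utterance-level
        self.span_head = nn.Linear(h, 2 * n_slots)      # start/end logits per slot

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]               # [CLS] summary vector
        spans = self.span_head(out.last_hidden_state)   # (B, T, 2 * n_slots)
        start, end = spans.chunk(2, dim=-1)             # per-token span logits
        return {
            "intent_logits": self.intent_head(cls),
            "requested_logits": self.requested_head(cls),
            "slot_start_logits": start,
            "slot_end_logits": end,
        }
```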
Evaluation in machine learning is usually informed by past choices, for example which datasets or metrics to use. This standardization enables comparisons on an equal footing using leaderboards, but evaluation choices become sub-optimal as better alternatives arise. This problem is especially pertinent in natural language generation, which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims. To make following best model evaluation practices easier, we introduce GEMv2. The new version of the Generation, Evaluation, and Metrics benchmark provides modular infrastructure for dataset, model, and metric developers to benefit from each other's work. GEMv2 supports 40 documented datasets in 51 languages. Models for all datasets can be evaluated online, and our interactive data card creation and rendering tools make it easier to add new datasets to the living benchmark.
The field of automatic biomedical image analysis crucially depends on reliable and meaningful performance metrics for algorithm validation. Current metric usage, however, is often ill-informed and does not reflect the underlying domain interest. Here, we present a comprehensive framework that guides researchers towards choosing performance metrics in a problem-aware manner. Specifically, we focus on biomedical image analysis problems that can be interpreted as classification tasks at image, object, or pixel level. The framework first compiles domain interest-, target structure-, dataset-, and algorithm output-related properties of a given problem into a problem fingerprint, while also mapping it to the appropriate problem category, namely image-level classification, semantic segmentation, instance segmentation, or object detection. It then guides users through the process of selecting and applying a set of appropriate validation metrics while making them aware of potential pitfalls related to individual choices. In this paper, we describe the current status of the Metrics Reloaded recommendation framework, with the goal of obtaining constructive feedback from the image analysis community. The current version was developed within an international consortium of more than 60 image analysis experts and will be made openly available as a user-friendly toolkit after community-driven optimization.
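The two stages described above, compiling a problem fingerprint and mapping it to a category with associated metrics, can be illustrated with a toy rule-based sketch. This is not the Metrics Reloaded toolkit; the property names and metric lists are simplified assumptions.

```python
from dataclasses import dataclass

@dataclass
class Fingerprint:
    granularity: str          # "image", "object", or "pixel"
    distinguishes_instances: bool
    class_imbalance: bool

def problem_category(fp: Fingerprint) -> str:
    if fp.granularity == "image":
        return "image-level classification"
    if fp.granularity == "pixel" and not fp.distinguishes_instances:
        return "semantic segmentation"
    if fp.granularity == "pixel":
        return "instance segmentation"
    return "object detection"

def candidate_metrics(fp: Fingerprint) -> list:
    base = {
        "image-level classification": ["balanced accuracy", "AUROC"],
        "semantic segmentation": ["Dice", "NSD"],
        "instance segmentation": ["panoptic quality", "per-instance Dice"],
        "object detection": ["AP@IoU", "FROC"],
    }[problem_category(fp)]
    # Pitfall awareness, e.g. plain accuracy is misleading under imbalance.
    if fp.class_imbalance and "balanced accuracy" not in base:
        base = base + ["class-weighted aggregate"]
    return base

print(candidate_metrics(Fingerprint("pixel", False, True)))
```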
Long-standing data sparsity and cold-start constitute thorny and perplexing problems for recommender systems. Cross-domain recommendation, as a domain adaptation framework, has been utilized to effectively address these challenging problems by exploiting information from multiple domains. In this study, an item-level relevance cross-domain recommendation task is explored, where two related domains, that is, the source and the target domain, contain common items without sharing sensitive information about users' behavior, thus avoiding the leakage of user privacy. In light of this scenario, two novel autoencoder-based deep learning methods are proposed for cross-domain recommendation. The first method aims to simultaneously learn a pair of autoencoders to reveal the intrinsic representations of the items in the source and target domains, along with a coupled mapping function to model the non-linear relationships between these representations, thus transferring beneficial information from the source to the target domain. The second method is derived from a new joint regularized optimization problem, which employs two autoencoders to generate user and item latent factors in a deep and non-linear manner, while also learning a data-driven function to map the item latent factors across domains. Extensive numerical experiments on two publicly available benchmark datasets demonstrate the superior performance of our proposed methods compared to several state-of-the-art cross-domain recommendation frameworks.
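A minimal sketch of the first method is given below (not the paper's exact models): two item autoencoders, one per domain, are trained jointly with a coupled mapping network that ties together the latent codes of the items common to both domains. Layer sizes and the loss weighting are assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F

class ItemAutoencoder(nn.Module):
    def __init__(self, n_inputs, latent=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_inputs, 256), nn.ReLU(),
                                 nn.Linear(256, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 256), nn.ReLU(),
                                 nn.Linear(256, n_inputs))

    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

class CoupledMapper(nn.Module):
    """Non-linear map from source-domain item codes to target-domain codes."""
    def __init__(self, latent=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent, latent), nn.Tanh(),
                                 nn.Linear(latent, latent))

    def forward(self, z_src):
        return self.net(z_src)

def joint_loss(ae_s, ae_t, mapper, x_src, x_tgt, gamma=1.0):
    """x_src / x_tgt: interaction vectors of the SAME common items in each domain."""
    rec_s, z_s = ae_s(x_src)
    rec_t, z_t = ae_t(x_tgt)
    recon = F.mse_loss(rec_s, x_src) + F.mse_loss(rec_t, x_tgt)
    coupling = F.mse_loss(mapper(z_s), z_t)   # align codes of common items
    return recon + gamma * coupling
```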