The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practices and bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development, and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based; of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
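Patch-based training, the most common strategy reported for handling oversized samples, starts by tiling each large image into fixed-size crops. The following is a minimal, hypothetical sketch of that tiling step on a plain 2-D array; real pipelines operate on tensors and add padding, overlap handling, and augmentation.

```python
def extract_patches(image, patch_size, stride):
    """Split a large 2-D image (list of lists) into patches.

    Illustrative sketch of the patch extraction behind patch-based
    training; parameter names and the list-of-lists representation
    are assumptions for clarity, not any specific challenge pipeline.
    """
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h - patch_size + 1, stride):
        for left in range(0, w - patch_size + 1, stride):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches

# A 6x6 "image" split into non-overlapping 3x3 patches yields 4 patches.
image = [[r * 6 + c for c in range(6)] for r in range(6)]
patches = extract_patches(image, patch_size=3, stride=3)
```

A smaller stride than `patch_size` would produce overlapping patches, which is common when patch predictions are later stitched back together.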
Diffusion-based generative models have achieved remarkable success in image generation. Their guidance formulation allows an external model to control the generation process in a plug-and-play manner for various tasks without fine-tuning the diffusion model. However, the direct use of publicly available off-the-shelf models for guidance fails due to their poor performance on noisy inputs. The existing practice is therefore to fine-tune the guidance models on labeled data corrupted with noise. In this paper, we argue that this practice has limitations in two aspects: (1) handling inputs with widely varying noise levels is too difficult for a single model; (2) collecting labeled datasets hinders scaling up to various tasks. To tackle these limitations, we propose a novel strategy that leverages multiple experts, where each expert is specialized in a particular noise range and guides the reverse process at its corresponding timesteps. However, as managing multiple networks and utilizing labeled data is infeasible, we present a practical guidance framework termed Practical Plug-And-Play (PPAP), which leverages parameter-efficient fine-tuning and data-free knowledge transfer. We conduct exhaustive ImageNet class-conditional generation experiments to show that our method can successfully guide diffusion with few trainable parameters and no labeled data. Finally, we show that image classifiers, depth estimators, and semantic segmentation models can guide publicly available GLIDE through our framework in a plug-and-play manner.
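The multi-expert idea assigns each guidance expert to a contiguous range of diffusion timesteps. A minimal sketch of one such routing rule, assuming equal-width ranges (the paper's actual partitioning scheme may differ):

```python
def select_expert(t, num_timesteps, num_experts):
    """Pick which noise-range expert guides the reverse process at step t.

    Hypothetical illustration: timesteps [0, num_timesteps) are split
    into equal contiguous ranges, one expert per range.
    """
    width = num_timesteps // num_experts
    return min(t // width, num_experts - 1)

# With 1000 timesteps and 4 experts, early (low-noise) steps route to
# expert 0 and the noisiest final steps route to expert 3.
```

At sampling time, the reverse process would query `select_expert(t, ...)` at each step and apply only that expert's gradient as guidance.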
Question Answering (QA) is a task that entails reasoning over natural language contexts, and many relevant works augment language models (LMs) with graph neural networks (GNNs) to encode the Knowledge Graph (KG) information. However, most existing GNN-based modules for QA do not take advantage of rich relational information of KGs and depend on limited information interaction between the LM and the KG. To address these issues, we propose Question Answering Transformer (QAT), which is designed to jointly reason over language and graphs with respect to entity relations in a unified manner. Specifically, QAT constructs Meta-Path tokens, which learn relation-centric embeddings based on diverse structural and semantic relations. Then, our Relation-Aware Self-Attention module comprehensively integrates different modalities via the Cross-Modal Relative Position Bias, which guides information exchange between relevant entities of different modalities. We validate the effectiveness of QAT on commonsense question answering datasets like CommonsenseQA and OpenBookQA, and on a medical question answering dataset, MedQA-USMLE. On all the datasets, our method achieves state-of-the-art performance. Our code is available at http://github.com/mlvlab/QAT.
Occupancy mapping has been widely used to represent the surrounding environment of autonomous robots for tasks such as navigation and manipulation. Although occupancy mapping in 2-D environments has been well studied, few methods are suitable for 3-D dynamic occupancy mapping, which is essential for aerial robots. This paper presents a novel 3-D dynamic occupancy mapping algorithm called DSK3DOM. We first establish a Bayesian method to sequentially update the occupancy map from a stream of measurements based on random finite set theory. We then approximate it with particles in the Dempster-Shafer domain to enable real-time computation. Moreover, the algorithm combines kernel-based inference with Dirichlet basic belief assignment to achieve dense mapping from sparse measurements. The efficacy of the proposed algorithm is demonstrated through simulations and real-world experiments.
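For context, the classical Bayesian occupancy update that such mappers generalize maintains each cell's occupancy belief in log-odds form and adds the inverse sensor model's log-odds per measurement. A single-cell sketch of that textbook update, with illustrative `p_hit`/`p_miss` values (this is the standard 2-D formulation, not DSK3DOM's random-finite-set machinery):

```python
import math

def logodds(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))

def prob(l):
    """Convert log-odds back to a probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(l))

def update_cell(l_prior, hit, p_hit=0.7, p_miss=0.4):
    """Sequential Bayesian occupancy update for one grid cell.

    `hit` is True when the sensor ray ended in the cell; the chosen
    inverse sensor model probabilities are illustrative assumptions.
    """
    return l_prior + logodds(p_hit if hit else p_miss)

l = 0.0  # prior log-odds 0, i.e. occupancy probability 0.5
for measurement in [True, True, False, True]:
    l = update_cell(l, measurement)
```

After three hits and one miss, the cell's occupancy probability has risen well above the 0.5 prior, reflecting accumulated evidence.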
IceCube, a cubic-kilometer array of optical sensors deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole, detects atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV. The classification and reconstruction of events from the in-ice detectors play a central role in IceCube data analysis. Reconstructing and classifying events is challenging due to the detector geometry, the inhomogeneous scattering and absorption of light in the ice, and, below 100 GeV, the relatively small number of signal photons produced per event. To address this challenge, IceCube events can be represented as point-cloud graphs, with graph neural networks (GNNs) serving as the classification and reconstruction method. The GNN is able to distinguish neutrino events from cosmic-ray backgrounds, classify different neutrino event types, and reconstruct the deposited energy, direction, and interaction vertex. Based on simulation, we provide a comparison in the 1-100 GeV energy range to the current state-of-the-art maximum-likelihood techniques used in IceCube analyses, including the effects of known systematic uncertainties. For neutrino event classification, the GNN improves the signal efficiency by 18% at a fixed false-positive rate (FPR), compared to current IceCube methods. Alternatively, the GNN reduces the FPR by more than a factor of 8 (to below half a percent) at a fixed signal efficiency. For the reconstruction of energy, direction, and interaction vertex, the resolution improves by an average of 13%-20% compared to current maximum-likelihood techniques. When running on a GPU, the GNN is able to process IceCube events at a rate close to the median IceCube trigger rate of 2.7 kHz, which opens the possibility of using low-energy neutrinos in online searches for transient events.
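Representing an event as a point-cloud graph typically begins by connecting each sensor hit to its nearest neighbors in space. A pure-Python sketch of that k-nearest-neighbor edge construction, assuming 3-D hit coordinates (real pipelines use optimized libraries and attach per-hit features such as time and charge to the nodes):

```python
def knn_edges(points, k):
    """Build a directed k-nearest-neighbour edge list over sensor hits.

    Illustrative first step of turning an event's hits into a graph
    for a GNN; brute-force distances, fine for small point clouds.
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    edges = []
    for i, p in enumerate(points):
        neighbours = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist2(p, points[j]))[:k]
        edges.extend((i, j) for j in neighbours)
    return edges

# Three clustered hits plus one distant hit, connected with k = 2.
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (5, 5, 5)]
edges = knn_edges(points, k=2)
```

The resulting edge list is what message-passing GNN layers iterate over when aggregating information between neighboring hits.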
Given the massive volume of cross-border flows, effective and efficient control of trade is increasingly important for protecting people and society from illicit trade while facilitating legitimate transactions. However, the limited accessibility of transaction-level trade datasets hinders progress in open research, and many customs administrations have not benefited from recent advances in data-based risk management. In this paper, we introduce an import declaration dataset to facilitate collaboration between domain experts in customs administrations and data science researchers. The dataset contains 54,000 artificially generated transactions with 22 key attributes, synthesized with CTGAN while maintaining correlated features. The synthetic data has several advantages. First, releasing the dataset is free from the restrictions that prohibit disclosure of the original import data. Second, the fabrication step minimizes the identification risk that may exist in trade statistics. Finally, the released data follow a distribution similar to that of the source data, so they can be used in various downstream tasks. Along with the data and their generation process, we release baseline code for the fraud detection task, as we empirically show that more advanced algorithms can better detect fraud.
This letter proposes traffic management for multiple autonomous mobile robots (AMRs) based on layered costmaps. The AMRs communicate via the Data Distribution Service (DDS), sharing topics within the same DDS domain, and the cost of each layer is manipulated through a topic. A traffic management server in the domain sends topics to, and receives topics from, the AMRs. Using layered costmaps, we propose and implement the new concepts of a prohibition filter, a lane filter, a fleet layer, and an area filter. The prohibition filter lets a user define areas that AMRs are forbidden to enter. The lane filter sets one-way directions based on an angle image. The fleet layer enables AMRs to share their positions through the traffic management server. The area filter requests and receives, from the traffic management server, an exclusive area that can be occupied by only one AMR. All layers are experimentally validated with real-world AMRs. Each area can be configured with a user-defined image or a text-based parameter file.
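In the layered-costmap design that this letter builds on, each layer maintains its own cost grid and the master costmap is formed per cell, conventionally by taking the maximum cost across layers so that any layer can veto a cell. A simplified sketch of that combination, with a hypothetical prohibition layer marking a lethal keep-out cell:

```python
LETHAL = 254  # conventional lethal-obstacle cost in layered costmaps

def combine_layers(layers):
    """Combine per-layer cost grids into a master costmap.

    Each cell takes the maximum cost over all layers, so a prohibition
    layer can override free space from the static layer. Grids are
    lists of lists of equal shape; this is an illustrative sketch,
    not the letter's DDS-based implementation.
    """
    h, w = len(layers[0]), len(layers[0][0])
    master = [[0] * w for _ in range(h)]
    for layer in layers:
        for r in range(h):
            for c in range(w):
                master[r][c] = max(master[r][c], layer[r][c])
    return master

static_layer = [[0, 50], [0, 0]]        # mild cost near an obstacle
prohibit_layer = [[0, 0], [LETHAL, 0]]  # user-defined keep-out cell
master = combine_layers([static_layer, prohibit_layer])
```

A planner consuming `master` would then treat the lethal cell as untraversable regardless of what the other layers report.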
A common research goal in self-supervised learning is to extract general representations from which arbitrary downstream tasks would benefit. In this work, we investigate music audio representations learned from different contrastive self-supervised learning schemes and empirically evaluate the embedding vectors on various music information retrieval (MIR) tasks that address different levels of music perception. We analyze the results to discuss the proper direction of contrastive learning strategies for different MIR tasks. We show that these representations convey comprehensive information about the auditory characteristics of music in general, although each self-supervised strategy has its own effectiveness on certain aspects of the information.
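The contrastive schemes compared here share the same core objective: pull an anchor embedding toward a positive view and push it away from negatives. A pure-Python sketch of the standard InfoNCE loss for a single anchor, on plain lists (real training batches this over many views produced by a neural encoder; the vectors below are made up for illustration):

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: cross-entropy over cosine
    similarities, with the positive pair as the correct class.

    Numerically stabilized with the usual max-subtraction trick.
    """
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    logits = [cos(anchor, positive) / temperature] + [
        cos(anchor, n) / temperature for n in negatives]
    m = max(logits)
    denom = sum(math.exp(l - m) for l in logits)
    return -(logits[0] - m - math.log(denom))

# Anchor nearly aligned with the positive, orthogonal/opposed negatives:
loss = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
```

When the positive is far more similar to the anchor than the negatives, as here, the loss approaches zero; dissimilar positives drive it up.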
In general, deep neural networks (DNNs) are evaluated by their generalization performance, measured on unseen data excluded from the training phase. As DNNs have advanced, generalization performance has converged toward the state of the art, and it has become difficult to evaluate DNNs based on this metric alone. Robustness against adversarial attacks has been used as an additional metric, evaluating DNNs by measuring their vulnerability. However, few studies have analyzed adversarial robustness through the geometry inside DNNs. In this work, we conduct an empirical study to analyze the internal properties of DNNs that affect model robustness under adversarial attack. In particular, we propose the novel concept of the Populated Region Set (PRS), in which training samples are populated more frequently, to represent the internal properties of DNNs in a practical setting. From systematic experiments on the proposed concept, we provide empirical evidence that a low PRS ratio has a strong relationship with the adversarial robustness of DNNs. We also devise a PRS regularizer that leverages the characteristics of PRS to improve adversarial robustness without adversarial training.
In biomedical natural language processing, named entity recognition (NER) and named entity normalization (NEN) are key tasks that enable the automatic extraction of biomedical entities (e.g., diseases and chemicals) from the ever-growing biomedical literature. In this paper, we present BERN2 (Advanced Biomedical Entity Recognition and Normalization), a tool that improves upon a previous neural-network-based NER tool (Kim et al., 2019) by employing a multi-task NER model and neural-network-based NEN models to achieve faster and more accurate inference. We hope our tool can help annotate large-scale biomedical texts for various tasks such as biomedical knowledge graph construction.