Smart Paper Notes

Node-Element Hypergraph Message Passing for Fluid Dynamics Simulations

Rui Gao, Indu Kant Deo, Rajeev K. Jaiman

Category: Machine Learning

2022-12-30

A recent trend in deep learning research features the application of graph neural networks for mesh-based continuum mechanics simulations. Most of these frameworks operate on graphs in which each edge connects two nodes. Inspired by the data connectivity in the finite element method, we connect the nodes by elements rather than edges, effectively forming a hypergraph. We implement a message-passing network on such a node-element hypergraph and explore the capability of the network for the modeling of fluid flow. The network is tested on two common benchmark problems, namely the fluid flow around a circular cylinder and airfoil configurations. The results show that such a message-passing network defined on the node-element hypergraph is able to generate more stable and accurate temporal roll-out predictions compared to the baseline generalized message-passing network defined on a normal graph. Along with adjustments in activation function and training loss, we expect this work to set a new strong baseline for future explorations of mesh-based fluid simulations with graph neural networks.
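
To make the node-element idea concrete, the following is a minimal sketch of one round of message passing on a hypergraph whose hyperedges are mesh elements. It is not the authors' network: the function name `node_element_step`, the use of plain averaging in place of learned update functions, and the toy two-triangle mesh are all assumptions made for illustration.

```python
def node_element_step(node_feats, elements):
    """One simplified round of node -> element -> node message passing.

    node_feats: list of per-node feature vectors (lists of floats).
    elements:   list of elements; each element lists the indices of the
                nodes it connects (one hyperedge per mesh element).
    Averaging stands in for the learned update functions of a real network.
    """
    dim = len(node_feats[0])

    # Element update: each element averages the features of its member nodes.
    elem_feats = [
        [sum(node_feats[n][d] for n in members) / len(members) for d in range(dim)]
        for members in elements
    ]

    # Node update: each node averages the features of the elements it belongs to.
    sums = [[0.0] * dim for _ in node_feats]
    counts = [0] * len(node_feats)
    for e, members in enumerate(elements):
        for n in members:
            counts[n] += 1
            for d in range(dim):
                sums[n][d] += elem_feats[e][d]
    # max(c, 1) guards against nodes that belong to no element.
    new_nodes = [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]
    return new_nodes, elem_feats


# Toy mesh (hypothetical): two triangles sharing the edge between nodes 1 and 2.
elements = [[0, 1, 2], [1, 2, 3]]
node_feats = [[0.0], [3.0], [6.0], [9.0]]
new_nodes, elem_feats = node_element_step(node_feats, elements)
```

In this sketch the shared nodes 1 and 2 receive information from both triangles in a single step, which is the connectivity pattern the abstract borrows from the finite element method.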

translated by Google Translate

Predicting waves in fluids with deep neural network

Indu Kant Deo, Rajeev Jaiman

Category: Machine Learning

2022-01-17

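In this paper, we present a deep learning technique for data-driven prediction of wave propagation in a fluid medium. The technique relies on an attention-based convolutional recurrent autoencoder network (AB-CRAN). To construct a low-dimensional representation of the wave propagation data, we employ a denoising-based convolutional autoencoder. The AB-CRAN architecture, with attention-based long short-term memory cells, forms our deep neural network model for marching the low-dimensional features in time. We assess the proposed AB-CRAN framework against a standard recurrent neural network for low-dimensional learning of wave propagation. To demonstrate the effectiveness of the AB-CRAN model, we consider three benchmark problems: one-dimensional linear convection, the nonlinear viscous Burgers equation, and the two-dimensional Saint-Venant shallow water system. Using spatio-temporal datasets from these benchmark problems, our novel AB-CRAN architecture accurately captures the wave amplitude and preserves the wave characteristics of the solution over long time horizons. Compared with a standard recurrent neural network with long short-term memory cells, the attention-based sequence-to-sequence network extends the time horizon of prediction. The denoising autoencoder further reduces the mean squared error of prediction and improves generalization in the parameter space.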


Federated Learning Using Three-Operator ADMM

Shashi Kant, José Mairton B. da Silva Jr., Gabor Fodor, Bo Göransson, Mats Bengtsson, Carlo Fischione

Category: Machine Learning

2022-11-08

Federated learning (FL) has emerged as an instance of the distributed machine learning paradigm that avoids transmitting data generated on the users' side. Although data are not transmitted, edge devices have to deal with limited communication bandwidth, data heterogeneity, and straggler effects due to the limited computational resources of users' devices. A prominent approach to overcoming such difficulties is FedADMM, which is based on the classical two-operator consensus alternating direction method of multipliers (ADMM). The common assumption of FL algorithms, including FedADMM, is that they learn a global model using data only on the users' side and not on the edge server. However, in edge learning, the server is expected to be near the base station and to have direct access to rich datasets. In this paper, we argue that leveraging the rich data on the edge server is much more beneficial than utilizing only user datasets. Specifically, we show that merely applying FL with an additional virtual user node representing the data on the edge server is inefficient. We propose FedTOP-ADMM, which generalizes FedADMM and is based on a three-operator ADMM-type technique that exploits a smooth cost function on the edge server to learn a global model in parallel with the edge devices. Our numerical experiments indicate that FedTOP-ADMM achieves a substantial gain of up to 33% in communication efficiency in reaching a desired test accuracy, relative to FedADMM with a virtual user on the edge server.


Improved Techniques for the Conditional Generative Augmentation of Clinical Audio Data

Mane Margaryan, Matthias Seibold, Indu Joshi, Mazda Farshad, Philipp Fürnstahl, Nassir Navab

Category: Machine Learning

2022-11-05

Data augmentation is a valuable tool for the design of deep learning systems to overcome data limitations and stabilize the training process. Especially in the medical domain, where collecting large-scale data sets is challenging and expensive due to limited access to patient data and relevant environments as well as strict regulations, community-curated large-scale public datasets, pretrained models, and advanced data augmentation methods are the main factors for developing reliable systems to improve patient care. However, for the development of medical acoustic sensing systems, an emerging field of research, the community lacks large-scale publicly available data sets and pretrained models. To address the problem of limited data, we propose a conditional generative adversarial neural network-based augmentation method which is able to synthesize mel spectrograms from a learned data distribution of a source data set. In contrast to previously proposed fully convolutional models, the proposed model implements residual Squeeze and Excitation modules in the generator architecture. We show that our method outperforms all classical audio augmentation techniques and previously published generative methods in terms of generated sample quality, and that it improves the Macro F1-Score of a classifier trained on the augmented data set by 2.84%, an enhancement of 1.14% in relation to previous work. By analyzing the correlation of intermediate feature spaces, we show that the residual Squeeze and Excitation modules help the model to reduce redundancy in the latent features. Therefore, the proposed model advances the state of the art in the augmentation of clinical audio data and eases the data bottleneck for the design of clinical acoustic sensing systems.


Computer vision based vehicle tracking as a complementary and scalable approach to RFID tagging

Pranav Kant Gaur, Abhilash Bhardwaj, Pritam Shete, Mohini Laghate, Dinesh M Sarode

Category: Computer Vision

2022-09-13

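Logging of incoming and outgoing vehicles is critical information for root-cause analysis when combating security breach incidents in various sensitive organizations. RFID tagging hampers the scalability of vehicle tracking solutions on both the logistical and the technical front. For instance, requiring every incoming vehicle (departmental or private) to be RFID tagged is a severe constraint, and coupling video analytics with RFID to detect abnormal vehicle movement is non-trivial. We leverage publicly available implementations of computer vision algorithms to develop an interpretable vehicle tracking algorithm using the finite-state machine formalism. The state machine consumes input from cascaded object detection and optical character recognition (OCR) models for its state transitions. We evaluated the proposed method on 75 video clips of 285 vehicles from our system deployment site. We observed that the detection rate is most affected by the speed and the type of vehicle. The highest detection rate is achieved when vehicle movement at the checkpoint is restricted in a manner similar to RFID tagging. We further analyzed 700 vehicle tracking predictions on live data and found that the majority of vehicle number prediction errors are due to illegible text, image blur, text occlusion, and out-of-vocabulary letters. Toward system deployment and performance enhancement, we expect our ongoing system monitoring to provide evidence for establishing a higher vehicle-throughput SOP at the security checkpoint, and to drive the fine-tuning of the deployed computer vision models and the state machine, so as to establish the proposed approach as a promising alternative to RFID tagging.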
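
As an illustration of how a finite-state machine can consume per-frame detector and OCR outputs, here is a minimal sketch. The abstract does not give the actual states or transitions, so the `State` names, the `track` function, the `"UNREAD"` placeholder, and the frame format (a detection flag plus an optional plate string) are all hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()          # no vehicle in view
    VEHICLE_SEEN = auto()  # detector fired, plate not yet read
    PLATE_READ = auto()    # OCR produced a plate string


def track(frames):
    """Turn per-frame model outputs into a log of vehicle passages.

    frames: iterable of (vehicle_detected, plate_text) pairs, where
            plate_text is None when OCR returned nothing for that frame.
    Returns one log entry per vehicle that entered and then left the view.
    A vehicle still in view when the input ends is not logged.
    """
    state, plate, log = State.IDLE, None, []
    for vehicle_detected, plate_text in frames:
        if not vehicle_detected:
            # Vehicle left the view: log it once, then reset.
            if state is not State.IDLE:
                log.append(plate if plate is not None else "UNREAD")
            state, plate = State.IDLE, None
        elif plate_text is not None:
            # OCR succeeded; remember the most recent reading.
            state, plate = State.PLATE_READ, plate_text
        elif state is State.IDLE:
            state = State.VEHICLE_SEEN
    return log


# Hypothetical frame stream: one vehicle with a readable plate, one without.
frames = [
    (False, None),
    (True, None),
    (True, "MH12AB1234"),
    (True, None),
    (False, None),
    (True, None),
    (False, None),
]
passages = track(frames)
```

Because every log entry is produced by an explicit transition, a wrong entry can be traced back to the frame and the model output that caused it, which is the kind of interpretability the abstract attributes to the state-machine formulation.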


EGFR Mutation Prediction of Lung Biopsy Images using Deep Learning

Ravi Kant Gupta, Shivani Nandgaonkar, Nikhil Cherian Kurian, Swapnil Rane, Amit Sethi

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-08-26

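The standard diagnostic procedures for targeted therapies in lung cancer treatment involve histological subtyping and the subsequent detection of key driver mutations, such as EGFR. Even though molecular profiling can uncover the driver mutation, the process is often expensive and time-consuming. Deep learning-based image analysis offers a more economical alternative for discovering driver mutations directly from whole slide images (WSIs). In this work, we used customized deep learning pipelines with weak supervision to identify the morphological correlates of EGFR mutation in hematoxylin and eosin-stained WSIs, in addition to detecting tumor and histologically subtyping it. We demonstrate the effectiveness of our pipeline through rigorous experiments and ablation studies on two lung cancer datasets: TCGA and a private dataset from India. With our pipeline, we evaluated tumor detection by average area under the curve (AUC) and achieved an average AUC of 0.942 for histological subtyping between adenocarcinoma and squamous cell carcinoma on the TCGA dataset. For EGFR detection, we achieved an average AUC of 0.864 on the TCGA dataset and 0.783 on the dataset from India. Our key learning points include the following. First, there is no particular advantage in using feature extractor layers trained on histology if the feature extractor is going to be fine-tuned on the target dataset. Second, selecting patches with high cellularity, presumably capturing tumor regions, is not always helpful, as the signs of a disease class may be present in the tumor-adjacent stroma.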


Synthetic Data in Human Analysis: A Survey

Indu Joshi, Marcel Grimmer, Christian Rathgeb, Christoph Busch, Francois Bremond, Antitza Dantcheva

Category: Computer Vision

2022-08-19

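Deep neural networks have become prevalent in human analysis, boosting the performance of applications such as biometric recognition, action recognition, and person re-identification. However, the performance of such networks scales with the available training data. In human analysis, the demand for large-scale datasets poses a severe challenge, as data collection is tedious, time-consuming, costly, and must comply with data protection laws. Current research investigates the generation of synthetic data as an efficient and privacy-preserving alternative to collecting real data in the field. This survey introduces the basic definitions and methodologies that are essential when generating and employing synthetic data for human analysis. We summarize the current state-of-the-art methods and the main benefits of using synthetic data. We also provide an overview of publicly available synthetic datasets and generation models. Finally, we discuss limitations as well as open research problems in this field. This survey is intended for researchers and practitioners in the field of human analysis.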


Exploration of an End-to-End Automatic Number-plate Recognition neural network for Indian datasets

Sai Sirisha Nadiminti, Pranav Kant Gaur, Abhilash Bhardwaj

Category: Computer Vision

2022-07-14

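Indian vehicle number plates vary widely in size, font, script, and shape. Developing Automatic Number Plate Recognition (ANPR) solutions is therefore challenging and requires a diverse dataset to serve as a collection of examples. However, a comprehensive dataset for the Indian scenario is missing, which hampers progress toward publicly available and reproducible ANPR solutions. Many countries have invested effort in developing comprehensive ANPR datasets, such as the Chinese City Parking Dataset (CCPD) for China and the Application-Oriented License Plate (AOLP) dataset for the US. In this work, we release an expanding dataset, presently consisting of 1.5K images, together with a scalable and reproducible procedure for enhancing this dataset toward the development of ANPR solutions for Indian conditions. We have used this dataset to explore, for the Indian scenario, an End-to-End (E2E) ANPR architecture that was originally proposed for Chinese vehicle number-plate recognition based on the CCPD dataset. While customizing the architecture for our dataset, we came across insights that we discuss in this paper. We report the hindrances to directly reusing the model provided by the CCPD authors, which arise from the extreme diversity of Indian number plates and from distribution differences with respect to the CCPD dataset. An improvement of 42.86% in LP detection was observed after aligning the characteristics of the Indian dataset with those of the Chinese dataset. In this work, we also compare the performance of the E2E number-plate detection model with a YOLOv5 model pre-trained on the COCO dataset and fine-tuned on Indian vehicle images. Given that the number of Indian vehicle images used for fine-tuning the detection module and YOLOv5 was the same, we conclude that it is more efficient to develop an ANPR solution for Indian conditions based on the COCO dataset than on the CCPD dataset.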


LaTeRF: Label and Text Driven Object Radiance Fields

Ashkan Mirzaei, Yash Kant, Jonathan Kelly, Igor Gilitschenski

Category: Computer Vision

2022-07-04

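Obtaining 3D object representations is important for creating photo-realistic simulators and for collecting assets for AR/VR applications. Neural fields have shown their effectiveness in learning a continuous volumetric representation of a scene from 2D images, but acquiring object representations from these models with weak supervision remains an open challenge. In this paper, we introduce LaTeRF, a method for extracting an object of interest from a scene given 2D images of the scene with known camera poses, a natural language description of the object, and a small number of point labels marking object and non-object points in the input images. To faithfully extract the object from the scene, LaTeRF extends the NeRF formulation with an additional "objectness" probability at each 3D point. In addition, we leverage the rich latent space of a pre-trained CLIP model, combined with our differentiable object renderer, to inpaint the occluded parts of the object. We demonstrate high-fidelity object extraction on both synthetic and real datasets and justify our design choices through an extensive ablation study.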


Not All Lotteries Are Made Equal

Surya Kant Sahu, Sai Mitheran, Somya Suhans Mahapatra

Category: Machine Learning

2022-06-16

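The Lottery Ticket Hypothesis (LTH) states that, for a reasonably sized neural network, a sub-network within that network performs no worse than its dense counterpart when trained from the same initialization. This work investigates the relationship between model size and the ease of finding these sparse sub-networks. We show through experiments that, surprisingly, under a limited budget, smaller models benefit more from Ticket Search (TS).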
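
The abstract does not say how Ticket Search is carried out. As a rough illustration of what a "ticket" is, the sketch below assumes the common magnitude-pruning recipe: keep the largest-magnitude trained weights and rewind the survivors to their initial values. The function names `magnitude_mask` and `winning_ticket` and the four-weight example are hypothetical.

```python
def magnitude_mask(weights, keep_fraction):
    """Return a 0/1 mask that keeps the largest-magnitude weights."""
    k = max(1, round(len(weights) * keep_fraction))
    # Indices sorted from largest to smallest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:k])
    return [1 if i in keep else 0 for i in range(len(weights))]


def winning_ticket(init_weights, trained_weights, keep_fraction):
    """Prune by trained magnitude, then rewind survivors to initialization."""
    mask = magnitude_mask(trained_weights, keep_fraction)
    ticket = [w * m for w, m in zip(init_weights, mask)]
    return ticket, mask


# Hypothetical flattened weights before and after training.
init_weights = [0.1, -0.2, 0.3, -0.4]
trained_weights = [0.05, -0.9, 0.01, 0.7]
ticket, mask = winning_ticket(init_weights, trained_weights, keep_fraction=0.5)
```

The resulting sparse sub-network would then be retrained from `ticket` with the mask held fixed, and the cost of repeating this prune-and-retrain cycle is one plausible reading of the "budget" the abstract refers to.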
