Smart Paper Notes

Stag hunt game-based approach for cooperative UAVs

L. V. Nguyen , I. Torres Herrera , T. H. Le , M. D. Phung , R. P. Aguilera , Q. P. Ha

Category: Robotics

2022-08-29

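Unmanned aerial vehicles (UAVs) are employed in many areas, including photography, emergency response, entertainment, defence, agriculture, forestry, mining, and construction. Over the last decade, UAV technology has found applications in numerous phases of construction projects, ranging from site mapping, progress monitoring, and building inspection to damage assessment and material delivery. Although the advantages of UAVs for various construction-related processes have been studied extensively, research on UAV collaboration to improve task capacity and efficiency remains scarce. This paper proposes a new cooperative path planning algorithm for multiple UAVs based on the stag hunt game and particle swarm optimization (PSO). First, a cost function is defined for each UAV, incorporating multiple objectives and constraints. A UAV game framework is then developed to formulate multi-UAV path planning as the problem of finding a payoff-dominant equilibrium. Next, a PSO-based algorithm is proposed to obtain optimal paths for the UAVs. Simulation results for a large construction site inspected by three UAVs indicate the effectiveness of the proposed algorithm in generating feasible and efficient flight paths for the UAV formation during the inspection task.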
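
The PSO step mentioned in the abstract can be illustrated with a minimal, generic particle swarm optimizer. This is only a sketch under stated assumptions: the function names (`pso_minimize`, `toy_cost`), the parameter values, and the toy quadratic cost are hypothetical and stand in for the per-UAV cost function and the stag hunt game formulation, neither of which is specified in the abstract.

```python
import random


def pso_minimize(cost, dim, bounds, n_particles=30, n_iters=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Generic PSO over a box-bounded search space (illustrative only)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    best_pos = [p[:] for p in pos]
    best_cost = [cost(p) for p in pos]
    # Global best is the best of the personal bests.
    g = min(range(n_particles), key=lambda i: best_cost[i])
    g_pos, g_cost = best_pos[g][:], best_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost


def toy_cost(x):
    """Hypothetical stand-in cost: squared distance to the point (1, ..., 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)


best, best_val = pso_minimize(toy_cost, dim=4, bounds=(-10.0, 10.0))
print(best, best_val)
```

In a path planning setting, each particle would presumably encode a candidate path (for example, a sequence of waypoints) and `toy_cost` would be replaced by a cost combining path length, safety, and other constraints.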

Translated by Google Translate

Advancing Brain Metastases Detection in T1-Weighted Contrast-Enhanced 3D MRI using Noisy Student-based Training

Engin Dikici , Xuan V. Nguyen , Matthew Bigelow , John. L. Ryu , Luciano M. Prevedello

Category: Computer Vision

2021-11-10

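Detecting brain metastases (BM) in their early stages could have a positive impact on the outcomes of cancer patients. We previously developed a framework for detecting small BM (diameters of less than 15 mm) in T1-weighted contrast-enhanced 3D magnetic resonance images (T1c) to assist medical experts in this time-sensitive and high-stakes task. The framework uses a dedicated convolutional neural network (CNN) trained on labeled T1c data, where the ground-truth BM segmentations were provided by a radiologist. This study aims to advance the framework with a noisy student-based self-training strategy that makes use of a large corpus of unlabeled T1c data (i.e., data without BM segmentations or detections). Accordingly, the work (1) describes the student and teacher CNN architectures, (2) presents data and model noising mechanisms, and (3) introduces a novel pseudo-labeling strategy that factors in the learned BM detection sensitivity of the framework. Finally, it describes a semi-supervised learning strategy that uses these components. We performed the validation with 217 labeled and 1247 unlabeled T1c exams via 2-fold cross-validation. The framework using only the labeled exams produced 9.23 false positives at 90% BM detection sensitivity, whereas the framework using the introduced learning strategy reduced false detections by about 9% (to 8.44) at the same sensitivity level. Furthermore, while experiments using 75% and 50% of the labeled dataset degraded algorithm performance (12.19 and 13.89 false positives, respectively), the impact was less pronounced with the noisy student-based training strategy (10.79 and 12.37 false positives, respectively).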
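
The relative reductions behind the false-positive figures reported in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the abstract; the helper name `relative_reduction` and the dictionary layout are assumptions made for illustration.

```python
def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


# False positives at 90% sensitivity: (labeled only, with noisy student training).
reported = {
    "100% labeled": (9.23, 8.44),
    "75% labeled": (12.19, 10.79),
    "50% labeled": (13.89, 12.37),
}

reductions = {name: relative_reduction(b, i) for name, (b, i) in reported.items()}
for name, r in reductions.items():
    print(f"{name}: {100 * r:.1f}% fewer false positives")
```

The full-data case works out to roughly 8.6%, consistent with the "about 9%" stated in the abstract.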

Semantically-consistent Landsat 8 image to Sentinel-2 image translation for alpine areas

M. Sokolov , J. L. Storie , C. J. Henry , C. D. Storie , J. Cameron , R. S. Ødegård , V. Zubinaite , S. Stikbakke

Categories: Computer Vision | Machine Learning

2022-12-22

Demand for frequent, cost-free satellite imagery is growing in the research world. Satellite constellations such as Landsat 8 and Sentinel-2 provide a massive amount of valuable data daily. However, the discrepancy between the sensor characteristics of these satellites makes it impractical to train a segmentation model on one dataset and apply it to the other, which is why domain adaptation techniques have recently become an active research area in remote sensing. In this paper, a domain adaptation experiment based on style transfer is conducted using the HRSemI2I model to narrow the sensor discrepancy between Landsat 8 and Sentinel-2. The paper's main contribution is an analysis of the expediency of that approach, comparing segmentation results on domain-adapted images with those obtained without adaptation. The HRSemI2I model, adjusted to work with 6-band imagery, shows a significant intersection-over-union improvement in both mean and per-class metrics. A second contribution is a pair of schemes for generalizing between two label schemes, NALCMS 2015 and CORINE. The first is standardization through higher-level land cover classes, and the second is through harmonization validation in the field.
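
For readers unfamiliar with the intersection-over-union metrics named in this abstract, the following minimal sketch computes per-class and mean IoU from flattened label sequences. The function names and the toy labels are hypothetical; the paper's actual evaluation code and class set are not given here.

```python
def per_class_iou(y_true, y_pred, num_classes):
    """IoU for each class; None when a class appears in neither sequence."""
    ious = []
    for c in range(num_classes):
        # Pixels where both truth and prediction are class c.
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        # Pixels where either truth or prediction is class c.
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        ious.append(inter / union if union else None)
    return ious


def mean_iou(ious):
    """Average over the classes that actually occur."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid)


# Toy example: six pixels, three land cover classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(y_true, y_pred, num_classes=3)
miou = mean_iou(ious)
print(ious, miou)
```

Skipping classes with an empty union is one common convention; other evaluations count them as 0 or 1, which changes the mean.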

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

VISEM-Tracking: Human Spermatozoa Tracking Dataset

Vajira Thambawita , Steven A. Hicks , Andrea M. Storås , Thu Nguyen , Jorunn M. Andersen , Oliwia Witczak , Trine B. Haugen , Hugo L. Hammer , Pål Halvorsen , Michael A. Riegler

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-12-06

Manually analyzing spermatozoa is a tremendous task for biologists because of the many fast-moving spermatozoa, which causes inconsistencies in the quality of the assessments. Therefore, computer-assisted sperm analysis (CASA) has become a popular solution. Despite this, more data is needed to train supervised machine learning approaches in order to improve accuracy and reliability. In this regard, we provide a dataset called VISEM-Tracking with 20 video recordings of 30 seconds of spermatozoa, with manually annotated bounding-box coordinates and a set of sperm characteristics analyzed by domain experts. VISEM-Tracking is an extension of the previously published VISEM dataset. In addition to the annotated data, we provide unlabeled video clips for easy-to-use access and analysis of the data. As part of this paper, we present baseline sperm detection performances using the YOLOv5 deep learning model trained on the VISEM-Tracking dataset. As a result, the dataset can be used to train complex deep-learning models to analyze spermatozoa. The dataset is publicly available at https://zenodo.org/record/7293726.

Impact of Automatic Image Classification and Blind Deconvolution in Improving Text Detection Performance of the CRAFT Algorithm

Clarisa V. Albarillo , Proceso L. Fernandez Jr

Categories: Computer Vision | Machine Learning

2022-11-29

Text detection in natural scenes has been a significant and active research subject in computer vision and document analysis because of its wide range of applications, as evidenced by the emergence of the Robust Reading Competition. One of the algorithms with good text detection performance in that competition is Character Region Awareness for Text Detection (CRAFT). Employing the ICDAR 2013 dataset, this study investigates the impact of automatic image classification and blind deconvolution as image pre-processing steps to further enhance the text detection performance of CRAFT. The proposed technique automatically classifies the scene images into two categories, blurry and non-blurry, by using a Laplacian operator with a threshold of 100. Before the CRAFT algorithm is applied, images categorized as blurry are further pre-processed using blind deconvolution to reduce the blur. The results revealed that the proposed method significantly enhanced the detection performance of CRAFT, as demonstrated by its IoU h-mean of 94.47% compared to the original 91.42% h-mean of CRAFT. This even outperformed the top-ranked SenseTime, whose h-mean is 93.62%.
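
The blurry/non-blurry classification step can be sketched as follows. The abstract states only that a Laplacian operator with a threshold of 100 is used; that the score is the variance of the Laplacian response, the 4-neighbour kernel, and the function names are assumptions made for this illustration.

```python
def laplacian_variance(image):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    `image` is a list of rows of grayscale values (assumed layout).
    """
    h, w = len(image), len(image[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Kernel [[0, 1, 0], [1, -4, 1], [0, 1, 0]].
            lap = (image[y - 1][x] + image[y + 1][x]
                   + image[y][x - 1] + image[y][x + 1]
                   - 4 * image[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_blurry(image, threshold=100.0):
    """Low Laplacian variance means few sharp edges, so treat the image as blurry."""
    return laplacian_variance(image) < threshold


# Toy inputs: a sharp checkerboard and a flat gray image.
sharp = [[255 if (x + y) % 2 else 0 for x in range(8)] for y in range(8)]
flat = [[128 for _ in range(8)] for _ in range(8)]
print(is_blurry(sharp), is_blurry(flat))
```

Under this scheme, only images for which `is_blurry` returns true would be passed to the deblurring step before text detection.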

A Solution for a Fundamental Problem of 3D Inference based on 2D Representations

Thien An L. Nguyen

Category: Computer Vision

2022-11-09

3D inference from monocular vision using neural networks is an important research area of computer vision. The area has a wide range of applications, and many of the proposed solutions have shown remarkable performance. Although much effort has been invested, there are still unanswered questions, some of which are fundamental. In this paper, I discuss a problem that I hope will come to be known as a generalization of the Blind Perspective-n-Point (Blind PnP) problem for object-driven 3D inference based on 2D representations. The vital difference between the fundamental problem and the Blind PnP problem is that 3D inference parameters in the fundamental problem are attached directly to 3D points, and the camera concept is represented through the sharing of the parameters of these points. By providing an explainable and robust gradient-descent solution based on 2D representations for an important special case of the problem, the paper opens up a new approach for using available information-based learning methods to solve problems related to 3D object pose estimation from 2D images.

An Incremental Phase Mapping Approach for X-ray Diffraction Patterns using Binary Peak Representations

Dipendra Jha , K. V. L. V. Narayanachari , Ruifeng Zhang , Justin Liao , Denis T. Keane , Wei-keng Liao , Alok Choudhary , Yip-Wah Chung , Michael Bedzyk , Ankit Agrawal

Categories: Machine Learning | Computer Vision

2022-11-08

Despite the huge advancement in knowledge discovery and data mining techniques, the X-ray diffraction (XRD) analysis process has mostly remained untouched and still involves manual investigation, comparison, and verification. Due to the large volume of XRD samples from high-throughput XRD experiments, it has become impossible for domain scientists to process them manually. Recently, they have started leveraging standard clustering techniques to reduce the number of XRD pattern representations that require manual labeling and verification. Nevertheless, these standard clustering techniques do not handle problem-specific aspects such as peak shifting, adjacent peaks, background noise, and mixed phases, which results in incorrect composition-phase diagrams that complicate further steps. Here, we leverage data mining techniques along with domain expertise to handle these issues. In this paper, we introduce an incremental phase mapping approach based on binary peak representations using a new threshold-based fuzzy dissimilarity measure. The proposed approach first applies an incremental phase computation algorithm to a discrete binary peak representation of XRD samples, followed by hierarchical clustering or manual merging of similar pure phases to obtain the final composition-phase diagram. We evaluate our method on the composition space of two ternary alloy systems: Co-Ni-Ta and Co-Ti-Ta. Our results are verified by domain scientists and closely resemble the manually computed ground-truth composition-phase diagrams. The proposed approach takes us closer to the goal of complete end-to-end automated XRD analysis.

1-D Convolutional Graph Convolutional Networks for Fault Detection in Distributed Energy Systems

Bang L. H. Nguyen , Tuyen Vu , Thai-Thanh Nguyen , Mayank Panwar , Rob Hovsapian

Category: Machine Learning

2022-11-05

This paper presents a 1-D convolutional graph neural network for fault detection in microgrids. The combination of 1-D convolutional neural networks (1D-CNN) and graph convolutional networks (GCN) helps extract both spatial and temporal correlations from the voltage measurements in microgrids. The fault detection scheme includes fault event detection, fault type and phase classification, and fault location. Five neural network models are trained to handle these tasks. Transfer learning and fine-tuning are applied to reduce training effort. The combined recurrent graph convolutional neural network (1D-CGCN) is compared with the traditional ANN structure on the Potsdam 13-bus microgrid dataset. The achievable accuracies are 99.27%, 98.1%, 98.75%, and 95.6% for fault detection, fault type classification, fault phase identification, and fault location, respectively.

A 3D-Shape Similarity-based Contrastive Approach to Molecular Representation Learning

Austin Atsango , Nathaniel L. Diamant , Ziqing Lu , Tommaso Biancalani , Gabriele Scalia , Kangway V. Chuang

Category: Machine Learning

2022-11-03

Molecular shape and geometry dictate key biophysical recognition processes, yet many graph neural networks disregard 3D information for molecular property prediction. Here, we propose a new contrastive-learning procedure for graph neural networks, Molecular Contrastive Learning from Shape Similarity (MolCLaSS), that implicitly learns a three-dimensional representation. Rather than directly encoding or targeting three-dimensional poses, MolCLaSS matches a similarity objective based on Gaussian overlays to learn a meaningful representation of molecular shape. We demonstrate how this framework naturally captures key aspects of three-dimensionality that two-dimensional representations cannot and provides an inductive framework for scaffold hopping.
