Smart Paper Notes

RISCuer: A Reliable Multi-UAV Search and Rescue Testbed

Mohamed Abdelkader, Usman A. Fiaz, Noureddine Toumi, Mohamed A. Mabrok, Jeff S. Shamma

Category: Robotics

2020-06-12

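We present the Robotics Intelligent Systems and Control (RISC) Lab multi-agent testbed for reliable search and rescue and aerial transport in outdoor environments. The system consists of a team of three multirotor unmanned aerial vehicles (UAVs) that can autonomously search for, pick up, and transport randomly distributed objects in an outdoor field. The method involves vision-based object detection and localization, aerial grasping with our novel design, GPS-based UAV navigation, and safe release of objects at the drop zone. Our cooperative strategy ensures safe spatial separation between UAVs, and we prevent conflicts at the drop zone using communication-enabled consensus. All computation is performed onboard each UAV. We describe the complete software and hardware architecture of the system and demonstrate its reliable performance through comprehensive outdoor experiments and by comparing our results with some recent, similar works.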

translated by Google Translate

A Comprehensive Review on Autonomous Navigation

Saeid Nahavandi, Roohallah Alizadehsani, Darius Nahavandi, Shady Mohamed, Navid Mohajer, Mohammad Rokonuzzaman, Ibrahim Hossain

Category: Robotics

2022-12-24

The field of autonomous mobile robots has advanced dramatically over the past decades. Despite important milestones, several challenges remain to be addressed. Aggregating the achievements of the robotics community in survey papers is vital for keeping track of the current state of the art and of the challenges that must be tackled in the future. This paper provides a comprehensive review of autonomous mobile robots, covering sensor types, mobile robot platforms, simulation tools, path planning and following, sensor fusion methods, obstacle avoidance, and SLAM. The motivation for this survey is twofold. First, the field of autonomous navigation evolves quickly, so regular survey papers are crucial for keeping the research community aware of its current status. Second, deep learning methods have revolutionized many fields, including autonomous navigation, so this paper also gives an appropriate treatment of the role of deep learning in autonomous navigation. Future work and research gaps are also discussed.

Hardware Acceleration of Lane Detection Algorithm: A GPU Versus FPGA Comparison

Mohamed Alshemi, Sherif Saif, Mohamed Taher

Category: Computer Vision

2022-12-19

A complete computer vision system can be divided into two main categories: detection and classification. Lane detection belongs to the detection category and has been applied in autonomous driving and smart vehicle systems. The lane detection system is responsible for identifying lane markings in a complex road environment. It also plays a crucial role in the system that warns the driver when the car departs from its lane. The implemented lane detection algorithm is divided into two main steps: edge detection and line detection. In this paper, we compare the performance of state-of-the-art implementations on FPGA and GPU to evaluate the trade-offs in latency, power consumption, and utilization. Our comparison highlights the advantages and disadvantages of the two platforms.
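
The two steps named in the abstract, edge detection followed by line detection, can be illustrated with a minimal pure-Python sketch. The abstract does not say which operators the authors implement, so the Sobel filter, the Hough transform, the 15-degree angular step, the threshold, and the synthetic frame below are all assumptions chosen for illustration, not the paper's FPGA or GPU implementation.

```python
import math

# Sobel kernels (an assumed choice of edge detector; the abstract does not name one).
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_edges(image, threshold):
    """Return (row, col) of interior pixels whose gradient magnitude exceeds threshold."""
    height, width = len(image), len(image[0])
    edges = []
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = gy = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    pixel = image[r + dr][c + dc]
                    gx += SOBEL_X[dr + 1][dc + 1] * pixel
                    gy += SOBEL_Y[dr + 1][dc + 1] * pixel
            if math.hypot(gx, gy) > threshold:
                edges.append((r, c))
    return edges


def hough_strongest_line(edges, angle_step_deg=15):
    """Vote in (rho, theta) space; return ((rho, theta_deg), votes) for the best bin."""
    votes = {}
    for r, c in edges:
        for theta_deg in range(0, 180, angle_step_deg):
            theta = math.radians(theta_deg)
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta), with x = col, y = row.
            rho = round(c * math.cos(theta) + r * math.sin(theta))
            votes[(rho, theta_deg)] = votes.get((rho, theta_deg), 0) + 1
    best = max(votes, key=votes.get)
    return best, votes[best]


# Hypothetical 16x16 frame: dark on the left, bright from column 8 onward,
# standing in for the boundary of a painted lane marking.
frame = [[255 if c >= 8 else 0 for c in range(16)] for _ in range(16)]
edge_pixels = sobel_edges(frame, threshold=500)
(rho, theta_deg), count = hough_strongest_line(edge_pixels)
print(f"strongest line: rho={rho}, theta={theta_deg} deg, votes={count}")
```

A real pipeline would use a finer angular resolution and restrict the search to a region of interest; the coarse step here only keeps the example short.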

Multimodal CNN Networks for Brain Tumor Segmentation in MRI: A BraTS 2022 Challenge Solution

Ramy A. Zeineldin, Mohamed E. Karar, Oliver Burgert, Franziska Mathis-Ullrich

Category: Computer Vision | Machine Learning

2022-12-19

Automatic segmentation is essential for brain tumor diagnosis, disease prognosis, and follow-up therapy of patients with gliomas. Still, accurate detection of gliomas and their sub-regions in multimodal MRI is very challenging due to the variety of scanners and imaging protocols. Over recent years, the BraTS Challenge has provided a large number of multi-institutional MRI scans as a benchmark for glioma segmentation algorithms. This paper describes our contribution to the BraTS 2022 Continuous Evaluation challenge. We propose a new ensemble of multiple deep learning frameworks, namely DeepSeg, nnU-Net, and DeepSCAN, for automatic glioma boundary detection in pre-operative MRI. Our ensemble models took first place in the final evaluation on the BraTS testing dataset, with Dice scores of 0.9294, 0.8788, and 0.8803 and Hausdorff distances of 5.23, 13.54, and 12.05 for the whole tumor, tumor core, and enhancing tumor, respectively. Furthermore, the proposed ensemble method ranked first in the final ranking on another unseen test dataset, the Sub-Saharan Africa dataset, achieving mean Dice scores of 0.9737, 0.9593, and 0.9022 and HD95 of 2.66, 1.72, and 3.32 for the whole tumor, tumor core, and enhancing tumor, respectively. The Docker image for the winning submission is publicly available at https://hub.docker.com/r/razeineldin/camed22.
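
For readers unfamiliar with the Dice score reported above, the following is a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), on binary masks. The function name, the toy masks, and the convention for two empty masks are assumptions for illustration; this is not the BraTS evaluation code.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    flat_pred = [v for row in pred for v in row]
    flat_target = [v for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 2x3 masks: each has three foreground pixels, two of which overlap.
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [0, 1, 1]]
print(round(dice_score(predicted, reference), 4))
```

A score of 1 means perfect overlap and 0 means none, which is why the whole-tumor scores above 0.92 indicate close agreement with the reference segmentation.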

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali

Category: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice and the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
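
Patch-based training, the most common answer to oversized samples in the survey, amounts to cutting each image into smaller tiles that fit in memory. The sketch below shows one simple way to do this for a 2D image; the function name, patch size, and stride are hypothetical, and real pipelines typically also pad borders or sample patches at random.

```python
def extract_patches(image, patch_size, stride):
    """Split a 2D image (nested lists) into square patches.

    Returns a list of ((top, left), patch) pairs. Regions that do not fit a
    full patch at the right or bottom border are skipped in this sketch.
    """
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(((top, left), patch))
    return patches


# Toy 4x6 "image" whose pixel values encode their own position.
image = [[r * 6 + c for c in range(6)] for r in range(4)]
patches = extract_patches(image, patch_size=2, stride=2)
print(len(patches), patches[0])
```

Choosing a stride smaller than the patch size yields overlapping patches, which trades more training samples for more computation.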

Image augmentation with conformal mappings for a convolutional neural network

Oona Rainio, Mohamed M. S. Nasser, Matti Vuorinen, Riku Klén

Category: Computer Vision

2022-12-10

For augmentation of the square-shaped image data of a convolutional neural network (CNN), we introduce a new method in which the original images are mapped onto a disk with a conformal mapping, rotated around the center of this disk, mapped under a Möbius transformation that preserves the disk, and then mapped back onto their original square shape. Unlike the typical transformations used in data augmentation for a CNN, this process does not result in the loss of information caused by removing areas near the edges of the original images. We give the formulas of all the mappings needed, together with detailed instructions on how to write code for transforming the images. The new method is also tested with simulated data and, according to the results, using this method to augment training data of 10 images into 40 images decreases the error in the predictions of a CNN on a test set of 160 images in a statistically significant way (p-value = 0.0360).
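
The disk-preserving step can be illustrated with the standard family of Möbius transformations that map the unit disk onto itself, f(z) = e^{iθ}(z − a) / (1 − ā z) with |a| < 1. The sketch below is only that textbook formula; whether the paper parametrizes its transformation in this form is an assumption, and the conformal square-to-disk mapping itself is not implemented here.

```python
import cmath


def disk_automorphism(z, a, theta=0.0):
    """Möbius map of the unit disk onto itself: e^{i*theta} * (z - a) / (1 - conj(a) * z).

    `a` (with |a| < 1) is the point sent to the center; `theta` is a rotation angle.
    """
    z, a = complex(z), complex(a)
    if abs(a) >= 1:
        raise ValueError("a must lie strictly inside the unit disk")
    return cmath.exp(1j * theta) * (z - a) / (1 - a.conjugate() * z)


a = 0.3 + 0.4j          # hypothetical choice of the point moved to the center
interior = 0.1 + 0.2j   # a sample point inside the disk

# Boundary points stay on the boundary, so no image area is cut away.
for k in range(8):
    boundary = cmath.exp(2j * cmath.pi * k / 8)
    image_point = disk_automorphism(boundary, a, theta=0.5)
    print(f"|z| = {abs(boundary):.3f} -> |f(z)| = {abs(image_point):.3f}")
```

Because the map is a bijection of the closed disk, every pixel of the original image has an image inside the disk, which is the property the abstract relies on to avoid cropping.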

Towards Next Generation of Pedestrian and Connected Vehicle In-the-loop Research: A Digital Twin Simulation Framework

Zijin Wang, Ou Zheng, Liangding Li, Mohamed Abdel-Aty, Carolina Cruz-Neira, Zubayer Islam

Category: Robotics

2022-12-08

Digital Twin is an emerging technology that replicates real-world entities in a digital space. It has attracted increasing attention in the transportation field, and many researchers are exploring its future applications in the development of Intelligent Transportation System (ITS) technologies. Connected vehicles (CVs) and pedestrians are among the major traffic participants in ITS. However, the use of Digital Twin in research involving both CVs and pedestrians remains largely unexplored. In this study, a Digital Twin framework for CV and pedestrian in-the-loop simulation is proposed. The proposed framework consists of the physical world, the digital world, and the data transmission between them. The features of the entities (CV and pedestrian) that need to be digitally twinned are divided into external state and internal state, and the attributes in each state are described. We also demonstrate a sample architecture under the proposed Digital Twin framework, which is based on Carla-Sumo co-simulation and a Cave Automatic Virtual Environment (CAVE). The proposed framework is expected to provide guidance for future Digital Twin research, and the architecture we build can serve as a testbed for further research and development of ITS applications involving CVs and pedestrians.

A domain adaptive deep learning solution for scanpath prediction of paintings

Mohamed Amine Kerkouri, Marouane Tliba, Aladine Chetouani, Alessandro Bruno

Category: Computer Vision

2022-09-22

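Understanding and preserving cultural heritage is an important issue for society, since it represents a fundamental aspect of its identity. Paintings are a significant part of cultural heritage and are a subject of continuous study. However, the way viewers perceive paintings is strictly related to the behaviour of the so-called HVS (Human Vision System). This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings. In more detail, we introduce a new approach to predicting human visual attention, which affects several human cognitive functions, including the fundamental understanding of a scene, and then extend it to painting images. The proposed architecture ingests images and returns scanpaths, sequences of points that have a high likelihood of catching viewers' attention. We use an FCNN (Fully Convolutional Neural Network) in which we exploit differentiable channel-wise selection and Soft-Argmax modules. We also incorporate learnable Gaussian distributions into the network bottleneck to simulate the bias of the visual attention process in natural scene images. Furthermore, to reduce the effect of shifts between different domains (i.e., natural images and paintings), we push the model to learn unsupervised general features from other domains using a gradient reversal classifier. The results obtained by our model outperform existing state-of-the-art results in terms of accuracy and efficiency.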
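
A soft-argmax module turns a heatmap into point coordinates in a differentiable way: instead of picking the single largest cell, it takes the expected position under a softmax of the heatmap. The following pure-Python sketch shows the idea on a small 2D heatmap; the function name and the sharpness parameter `beta` are assumptions for illustration, and the paper's module may differ in detail.

```python
import math


def soft_argmax_2d(heatmap, beta=10.0):
    """Expected (row, col) position under softmax(beta * heatmap).

    Larger `beta` concentrates the weights on the maximum, so the result
    approaches the ordinary (hard) argmax.
    """
    peak = max(max(row) for row in heatmap)
    # Subtract the peak before exponentiating for numerical stability.
    weights = [[math.exp(beta * (v - peak)) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    row_center = sum(r * w for r, row in enumerate(weights) for w in row) / total
    col_center = sum(c * w for row in weights for c, w in enumerate(row)) / total
    return row_center, col_center


# Toy heatmap with a single strong response at row 1, column 2.
heatmap = [[0.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 5.0, 0.0],
           [0.0, 0.0, 0.0, 0.0]]
print(soft_argmax_2d(heatmap))
```

Because the output is a smooth function of the heatmap values, gradients can flow from a loss on the predicted fixation points back into the network, which a hard argmax would not allow.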

A Few Shot Multi-Representation Approach for N-gram Spotting in Historical Manuscripts

Giuseppe De Gregorio, Sanket Biswas, Mohamed Ali Souibgui, Asma Bensalah, Josep Lladós, Alicia Fornés, Angelo Marcelli

Category: Computer Vision

2022-09-21

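Despite recent advances in automatic text recognition, performance remains moderate on historical manuscripts. This is mainly because of the scarcity of labelled data available to train data-hungry Handwritten Text Recognition (HTR) models. Keyword Spotting Systems (KWS) provide a valid alternative to HTR thanks to their lower error rates, but they are usually limited to a closed reference vocabulary. In this paper, we propose a few-shot learning paradigm for spotting sequences of a few characters (n-grams) that requires only a small amount of labelled training data. We show that recognizing important n-grams can reduce the system's dependency on the vocabulary. In this case, an out-of-vocabulary (OOV) word in an input handwritten line image can be a sequence of n-grams that belong to the lexicon. An extensive experimental evaluation of our proposed multi-representation approach was carried out.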

CLIO: a Novel Robotic Solution for Exploration and Rescue Missions in Hostile Mountain Environments

Michele Focchi, Mohamed Bensaadallah, Marco Frego, Angelika Peer, Daniele Fontanelli, Andrea Del Prete, Luigi Palopoli

Category: Robotics

2022-09-20

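Rescue missions in mountain environments are hardly achievable by standard legged robots or by flying robots because of their limited payload capability. We present a novel concept for a rope-aided climbing robot that can negotiate slopes up to vertical and carry heavy payloads. The robot is attached to the mountain by a rope and is equipped with a leg to push against the mountain and initiate jumping maneuvers. Between jumps, a hoist is used to wind and unwind the rope in order to move vertically and to affect the lateral motion. This simple yet effective two-fold actuation allows the system to achieve high safety and energy efficiency. Indeed, the rope prevents the robot from falling while compensating for most of its weight, which drastically reduces the effort required from the leg actuator. We also present an optimal control strategy to generate point-to-point trajectories that overcome an obstacle. Thanks to the use of a custom simplified robot model, we achieve fast computation times (< 1 s). We validated the generated optimal movements in Gazebo simulations with a complete robot model, showing the effectiveness of the proposed approach and confirming the interest of our concept. Finally, we performed a reachability analysis showing that the region of achievable targets is strongly affected by the friction properties of the foot-wall contact.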
