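The representation of data is of paramount importance for machine learning methods. Kernel methods are used to enrich the feature representation, allowing better generalization. Quantum kernels efficiently implement complex transformations that encode classical data in the Hilbert space of a quantum system, in some cases yielding an exponential speedup. However, prior knowledge of the data is needed to choose an appropriate parameterized quantum circuit (PQC) to use as the quantum embedding. We propose an algorithm that automatically selects the best quantum embedding through a combinatorial optimization procedure that modifies the structure of the circuit, changing the generators of the gates, their angles (which depend on the data points), and the qubits on which the various gates act. Since combinatorial optimization is computationally expensive, we introduce a criterion based on the exponential concentration of the kernel matrix coefficients around their mean to immediately discard an arbitrarily large portion of solutions that are believed to perform poorly. In contrast to gradient-based optimization (e.g., trainable quantum kernels), our approach is, by construction, not affected by barren plateaus. We use both artificial and real-world datasets to demonstrate the improved performance of our approach with respect to randomly generated PQCs. We also compare the effect of different optimization algorithms, including greedy local search, simulated annealing, and genetic algorithms, showing that the choice of algorithm largely affects the result.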
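The concentration criterion mentioned above can be illustrated with a minimal sketch. It assumes the kernel (Gram) matrix has already been computed and is given as a nested list; the function names, the use of the population variance of the off-diagonal entries as the concentration measure, and the threshold value are all hypothetical choices made for illustration, not the authors' actual criterion.

```python
from statistics import pvariance


def off_diagonal_entries(gram):
    """Collect the off-diagonal coefficients of a square kernel matrix."""
    n = len(gram)
    return [gram[i][j] for i in range(n) for j in range(n) if i != j]


def is_concentrated(gram, threshold=1e-3):
    """Flag a candidate embedding whose kernel values cluster around their mean.

    The variance measure and the threshold are assumptions for illustration:
    a near-constant kernel matrix carries little information about the data,
    so such a candidate could be discarded before any expensive evaluation.
    """
    return pvariance(off_diagonal_entries(gram)) < threshold


# A kernel whose off-diagonal entries are all identical is flagged.
flat = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
# A kernel with well-spread similarities is kept.
spread = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.5], [0.1, 0.5, 1.0]]
print(is_concentrated(flat), is_concentrated(spread))
```

In a search loop, a check of this kind would run before training a classifier on each candidate circuit, so that candidates with a nearly constant kernel are skipped cheaply.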
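Single-image super-resolution can support robotic tasks in environments where a reliable visual stream is required to monitor the mission, handle teleoperation, or study relevant visual details. In this work, we propose an efficient generative adversarial network model for real-time super-resolution. We adopt a tailored version of the original SRGAN architecture together with model quantization to speed up execution on CPU and Edge TPU devices, achieving inference at up to 200 fps. We further optimize our model by distilling its knowledge into a smaller version of the network, obtaining remarkable improvements over the standard training approach. Our experiments show that our fast and lightweight model preserves satisfactory image quality compared with heavier state-of-the-art models. Finally, we conduct experiments on image transmission under bandwidth degradation to highlight the advantages of the proposed system for mobile robotic applications.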
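Domain generalization (DG) studies the ability of a deep learning model to generalize to distributions outside its training distribution. Over the last decade, the literature has been filled with training methodologies that claim to obtain more abstract and robust data representations to cope with domain shift. Recent research has provided a reproducible benchmark for DG, pointing out the effectiveness of naive empirical risk minimization (ERM) compared with existing algorithms. Nevertheless, researchers persist in using the same outdated feature extractors, and no attention has yet been given to the effect of different backbones. In this paper, we go back to the backbones and propose a comprehensive analysis of their intrinsic generalization capabilities, which the research community has so far ignored. We evaluate a wide variety of feature extractors, from standard residual solutions to transformer-based architectures, and find a linear correlation between large-scale single-domain classification accuracy and DG capability. Our extensive experiments show that, by adopting competitive backbones in conjunction with effective data augmentation, plain ERM outperforms recent DG solutions and achieves state-of-the-art accuracy. Moreover, our additional qualitative studies reveal that novel backbones give more similar representations to same-class samples, separating different domains in the feature space. This boost in generalization capability leaves only marginal room for DG algorithms and suggests a new paradigm for investigating the problem, one that places backbones in the spotlight and encourages the development of consistent algorithms on top of them.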
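Precision agriculture is rapidly attracting research aimed at efficiently introducing automation and robotics solutions to support agricultural activities. Robotic navigation in vineyards and orchards offers competitive advantages in autonomously monitoring crops and easily accessing them for harvesting, spraying, and other necessary but time-consuming tasks. Today, autonomous navigation algorithms rely on expensive sensors that also carry a heavy computational cost for data processing. Moreover, vineyard rows are a challenging outdoor scenario in which GPS and visual odometry techniques often struggle to provide reliable positioning information. In this work, we combine Edge AI with deep reinforcement learning to propose a cutting-edge, lightweight solution to the problem of autonomous vineyard navigation that does not rely on precise localization data and replaces task-tailored algorithms with a flexible learning-based approach. We train an end-to-end sensorimotor agent that directly maps noisy depth images and position-agnostic robot state information to velocity commands and guides the robot to the end of a row, continuously adjusting its heading to follow a collision-free central trajectory. Our extensive experiments in realistic simulated vineyards demonstrate the effectiveness of our solution and the generalization capabilities of our agent.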
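The development of precision agriculture has gradually introduced automation into the agricultural process to support and rationalize all activities related to field management. In particular, service robotics plays a predominant role in this evolution by deploying autonomous agents able to navigate fields while executing different tasks, such as monitoring, spraying, and harvesting, without human intervention. In this context, global path planning is the first step of every robotic mission and ensures that navigation is performed efficiently and with complete field coverage. In this paper, we propose a learning-based approach to waypoint generation for planning a navigation path in row-based crops, starting from a top-view map of the region of interest. We present a novel methodology based on a contrastive loss that projects these points into a separable latent space. The proposed deep neural network simultaneously predicts waypoint positions and cluster assignments with two specialized heads in a single forward pass. Extensive experiments on simulated and real-world images demonstrate that the proposed approach effectively solves the waypoint generation problem for both straight and curved row-based crops, overcoming the limitations of previous state-of-the-art methodologies.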
Continual Learning (CL) is a field dedicated to devising algorithms able to achieve lifelong learning. Overcoming the disruption of previously acquired knowledge, a drawback of deep learning models known as catastrophic forgetting, is a hard challenge. Currently, deep learning methods can attain impressive results when the modeled data does not undergo a considerable distributional shift across subsequent learning sessions, but whenever such systems are exposed to this incremental setting, performance drops very quickly. Overcoming this limitation is fundamental: first, it would allow us to build truly intelligent systems that exhibit both stability and plasticity; second, it would remove the onerous need to retrain these architectures from scratch on the updated data. In this thesis, we tackle the problem from multiple directions. In a first study, we show that in rehearsal-based techniques (systems that use a memory buffer), the quantity of data stored in the rehearsal buffer is a more important factor than the quality of the data. Secondly, we present one of the early works on incremental learning for ViT architectures, comparing functional, weight, and attention regularization approaches and proposing an effective novel asymmetric loss. Finally, we present a study on pretraining and how it affects performance in Continual Learning, raising some questions about the effective progress of the field. We conclude with some future directions and closing remarks.
Computational units in artificial neural networks follow a simplified model of biological neurons. In the biological model, the output signal of a neuron runs down the axon, splits along the many branches at its end, and passes identically to all the downward neurons of the network. Each downward neuron uses its copy of this signal as one of the many inputs arriving at its dendrites, integrates them all, and fires an output if the result is above some threshold. In the artificial neural network, this means that the nonlinear filtering of the signal is performed in the upward neuron, so that in practice the same activation is shared by all the downward neurons that use that signal as their input. Dendrites thus play a passive role. We propose a slightly more complex model of the biological neuron, in which dendrites play an active role: the activation at the output of the upward neuron becomes optional, and the signals going through each dendrite instead undergo independent nonlinear filtering before the linear combination. We implement this new model in a ReLU computational unit and discuss its biological plausibility. We compare this new computational unit with the standard one and describe it from a geometrical point of view. We provide a Keras implementation of this unit in fully connected and convolutional layers and estimate the resulting change in FLOPs and number of weights. We then use these layers in ResNet architectures on CIFAR-10, CIFAR-100, Imagenette, and Imagewoof, obtaining performance improvements of up to 1.73% over standard ResNets. Finally, we prove a universal representation theorem for continuous functions on compact sets and show that this new unit has more representational power than its standard counterpart.
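The difference between the two units can be made concrete with a small sketch. This is only one possible reading of the description above, not the authors' Keras implementation: it assumes that "independent nonlinear filtering" means each dendrite applies a ReLU with its own bias to its input before the weighted sum, and the names `standard_unit`, `dendritic_unit`, and `dendrite_biases` are hypothetical.

```python
def relu(x):
    """Rectified linear unit for a single scalar."""
    return max(0.0, x)


def standard_unit(inputs, weights, bias):
    """Standard unit: linear combination first, one shared activation at the output."""
    return relu(sum(w * x for w, x in zip(weights, inputs)) + bias)


def dendritic_unit(inputs, weights, dendrite_biases):
    """Assumed dendritic unit: each input is filtered by its own ReLU
    (with a per-dendrite bias) before the linear combination, and no
    activation is applied at the output."""
    return sum(
        w * relu(x + b)
        for w, x, b in zip(weights, inputs, dendrite_biases)
    )


inputs = [1.0, -2.0]
weights = [0.5, 1.0]
print(standard_unit(inputs, weights, bias=0.0))         # 0.0
print(dendritic_unit(inputs, weights, [0.0, 0.0]))      # 0.5
print(dendritic_unit(inputs, weights, [0.0, 3.0]))      # 1.5
```

In this reading, the per-dendrite biases let two downward neurons threshold the same incoming signal differently, which a single shared output activation cannot do.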
Detecting anomalous data within time series is a highly relevant task in pattern recognition and machine learning, with applications ranging from disease prevention in medicine (e.g., detecting early alterations of the health status before it can clearly be defined as "illness") to the monitoring of industrial plants. In the latter application, detecting anomalies in an industrial plant's status first of all prevents serious damage that would require a long interruption of the production process. Secondly, it permits optimal scheduling of maintenance interventions by limiting them to urgent situations, whereas such interventions typically follow a fixed prudential schedule under which components are replaced well before the end of their expected lifetime. This paper describes a case study on monitoring the status of Laser-guided Vehicle (LGV) batteries, on which we worked as our contribution to project SUPER (Supercomputing Unified Platform, Emilia Romagna), aimed at establishing and demonstrating a regional High-Performance Computing platform that is going to represent the main Italian supercomputing environment for both computing power and data volume.
Recent object detection models for infrared (IR) imagery are based on deep neural networks (DNNs) and require large amounts of labeled training imagery. However, publicly available datasets that can be used for such training are limited in size and diversity. To address this problem, we explore cross-modal style transfer (CMST) to leverage large and diverse color imagery datasets so that they can be used to train DNN-based IR object detectors. We evaluate six contemporary stylization methods on four publicly available IR datasets, the first comparison of its kind, and find that CMST is highly effective for DNN-based detectors. Surprisingly, we find that existing data-driven methods are outperformed by a simple grayscale stylization (an average of the color channels). Our analysis reveals that existing data-driven methods are either too simplistic or introduce significant artifacts into the imagery. To overcome these limitations, we propose meta-learning style transfer (MLST), which learns a stylization by composing and tuning well-behaved analytic functions. We find that MLST leads to more complex stylizations without introducing significant image artifacts and achieves the best overall detector performance on our benchmark datasets.
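The grayscale baseline mentioned above (an average of the color channels) is simple enough to sketch directly. The version below assumes an image stored as rows of `(r, g, b)` tuples; the function name and the choice to replicate the mean across three output channels, so that the result keeps the shape a color-trained detector expects, are illustrative assumptions rather than details taken from the paper.

```python
def grayscale_stylize(image):
    """Replace each pixel's R, G, B values with their unweighted mean.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    The mean is replicated across three channels (an assumption) so
    the output has the same layout as the input.
    """
    stylized = []
    for row in image:
        new_row = []
        for r, g, b in row:
            mean = (r + g + b) / 3.0
            new_row.append((mean, mean, mean))
        stylized.append(new_row)
    return stylized


# One-row image with two pixels.
image = [[(30, 60, 90), (0, 0, 255)]]
print(grayscale_stylize(image))
# [[(60.0, 60.0, 60.0), (85.0, 85.0, 85.0)]]
```

Note that this is an unweighted mean, as the text states, rather than a perceptual luminance weighting of the channels.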
Objective: Accurate visual classification of bladder tissue during Trans-Urethral Resection of Bladder Tumor (TURBT) procedures is essential to improve early cancer diagnosis and treatment. During TURBT interventions, White Light Imaging (WLI) and Narrow Band Imaging (NBI) techniques are used for lesion detection. Each imaging technique provides different visual information that allows clinicians to identify and classify cancerous lesions. Computer vision methods that use both imaging techniques could improve endoscopic diagnosis. We address the challenge of tissue classification when annotations are available in only one domain, in our case WLI, and the endoscopic images form an unpaired dataset, i.e., there is no exact equivalent for every image in both the NBI and WLI domains. Method: We propose a semi-supervised Generative Adversarial Network (GAN)-based method composed of three main components: a teacher network trained on the labeled WLI data, a cycle-consistency GAN to perform unpaired image-to-image translation, and a multi-input student network. To ensure the quality of the synthetic images generated by the proposed GAN, we perform a detailed quantitative and qualitative analysis with the help of specialists. Conclusion: The overall average classification accuracy, precision, and recall obtained with the proposed method for tissue classification are 0.90, 0.88, and 0.89, respectively, while the same metrics obtained in the unlabeled domain (NBI) are 0.92, 0.64, and 0.94, respectively. The quality of the generated images is reliable enough to deceive specialists. Significance: This study shows the potential of using semi-supervised GAN-based classification to improve bladder tissue classification when annotations are limited in multi-domain data.