To better optimize the global food supply chain, robotic solutions are needed to automate tasks currently performed by humans. In particular, phenotyping, quality analysis, and harvesting are open problems in the field of agricultural robotics. Robotic perception is a key challenge for autonomous solutions; for example, scene understanding and object detection are essential prerequisites for any grasping task a robot might undertake. This work provides a brief review of modern robotic perception models and discusses their efficacy within the agri-food domain.
Robotic harvesting of strawberries has attracted much interest in the recent past. Despite many innovations, such systems have yet to reach a level comparable to expert human pickers. The end-effector unit plays a major role in determining the efficiency of these robotic harvesting systems. Although various end-effectors for strawberry harvesting have been reported, researchers could rely on certain parameters to develop new end-effectors. These parameters include the gripping force limit that can be applied to the peduncle for effective grasping, the force required to cut the strawberry peduncle, and so on. Such estimates would aid the design cycle of end-effectors intended to grip and cut the strawberry peduncle during a harvesting action. This paper experimentally estimates and analyses these parameters. It is estimated that the gripping force on the peduncle can be limited to 10 N. This enables the end-effector to grasp strawberries of up to 50 g at manipulation accelerations of 50 m/s² without squeezing the peduncle. Studies on the peduncle cutting force show that a force of 15 N is sufficient to cut a strawberry peduncle at a 30-degree orientation using a blade with a wedge angle of 16.6 degrees.
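The reported limits can be sanity-checked with a simple friction model. A minimal sketch, assuming a two-finger grip with two contact surfaces and an illustrative friction coefficient `MU`; neither assumption comes from the abstract:

```python
# Back-of-the-envelope check of the reported gripping limit.
# Assumptions (not from the abstract): friction coefficient MU and a
# two-finger grip contributing two contact surfaces.
M = 0.050        # strawberry mass in kg (50 g, from the abstract)
A = 50.0         # manipulation acceleration in m/s^2 (from the abstract)
G = 9.81         # gravitational acceleration in m/s^2
MU = 0.3         # assumed peduncle/finger friction coefficient

def required_grip_force(mass, accel, mu=MU, contacts=2):
    """Grip (normal) force needed so friction holds the load:
    mu * contacts * F_grip >= mass * (accel + g)."""
    return mass * (accel + G) / (mu * contacts)

f = required_grip_force(M, A)   # about 4.98 N for mu = 0.3
```

Under these assumptions the required grip force stays well below the 10 N squeezing limit, consistent with the abstract's claim.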
This paper presents a novel control approach for handling object slippage during robotic manipulation motions. Slip is a major cause of failure in many robotic grasping and manipulation tasks. Existing works increase the grip force to avoid or control slip. However, this may be infeasible when (i) the robot cannot increase the grip force because the maximum grip force has already been applied, or (ii) the increased force damages the grasped object, such as soft fruit. Moreover, the robot fixes the grip force when it forms a stable grasp on an object's surface, and changing the grip force during real-time manipulation may not be an effective control policy. We propose a novel control approach for slip avoidance, comprising a learned action-conditioned slip predictor and a constrained optimizer that avoids predicted slip given a desired robot action. We show the effectiveness of the proposed trajectory-adaptation method on a series of real-robot test cases. Our experimental results show that the proposed data-driven predictive controller can control slip for objects unseen during training.
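The predict-then-adapt idea can be sketched as follows; the linear `predict_slip` model, the thresholds, and the speed-scaling search are all illustrative stand-ins for the paper's learned predictor and constrained optimizer:

```python
# Minimal sketch of action-conditioned slip avoidance (not the authors'
# implementation): a learned predictor is stood in for by a toy linear
# model, and the constrained optimization is a simple search over
# scaled actions.

def predict_slip(action_speed, grip_force):
    """Stand-in for the learned action-conditioned slip predictor.
    Returns a slip score in [0, 1]; the real model is data-driven."""
    score = 0.08 * action_speed - 0.05 * grip_force
    return min(max(score, 0.0), 1.0)

def adapt_action(desired_speed, grip_force, slip_limit=0.2, step=0.05):
    """Reduce the commanded speed until predicted slip is within the
    limit, instead of increasing grip force (which may damage soft
    fruit)."""
    speed = desired_speed
    while speed > 0 and predict_slip(speed, grip_force) > slip_limit:
        speed -= step * desired_speed
    return max(speed, 0.0)

safe_speed = adapt_action(desired_speed=10.0, grip_force=2.0)
```

The key design choice mirrored here is adapting the motion rather than the grip: the grip force stays fixed and only the action is constrained.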
This paper presents a novel probabilistic approach to deep robot learning from demonstrations (LfD). Deep movement primitives (DMPs) are deterministic LfD models that directly map visual information to robot trajectories. This paper extends DMPs and presents a deep probabilistic model that maps visual information to a distribution over effective robot trajectories. The architecture yielding the highest level of trajectory accuracy is identified and compared with existing methods. Furthermore, the paper introduces a novel training method for learning domain-specific latent features. We demonstrate the advantages of the proposed probabilistic approach and the novel latent-space learning on a strawberry-picking task in a laboratory setting. The experimental results show that latent-space learning significantly improves model prediction performance. The proposed approach allows sampling trajectories from the distribution and optimizing robot trajectories to satisfy secondary objectives, such as collision avoidance.
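The sample-then-select idea in the last sentence can be sketched as follows; the Gaussian trajectory distribution, the obstacle, and the cost function are illustrative placeholders, not the paper's deep probabilistic model:

```python
# Sketch of sampling robot trajectories from a learned distribution and
# selecting one that satisfies a secondary objective (collision
# avoidance). The Gaussian below is a placeholder for the deep
# probabilistic model's predicted distribution.
import random

random.seed(0)

MEAN_TRAJ = [0.0, 0.2, 0.5, 0.8, 1.0]   # placeholder mean trajectory (1-D)
SIGMA = 0.05                             # placeholder per-waypoint std
OBSTACLE = 0.5                           # hypothetical obstacle position

def sample_trajectory():
    return [random.gauss(m, SIGMA) for m in MEAN_TRAJ]

def collision_cost(traj, obstacle=OBSTACLE):
    # Penalize waypoints that pass close to the obstacle (higher = worse).
    return sum(max(0.0, 0.1 - abs(w - obstacle)) for w in traj)

candidates = [sample_trajectory() for _ in range(100)]
best = min(candidates, key=collision_cost)
```

This is exactly what a deterministic DMP cannot do: with only a single predicted trajectory there is nothing to resample when the nominal path conflicts with a secondary objective.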
We introduce Dreamento (Dream Engineering Toolbox), an open-source Python package for dream engineering using the ZMax wearable (Hypnodyne Corp., Sofia, Bulgaria). Dreamento's main functions are (1) real-time recording, monitoring, analysis, and stimulation within a graphical user interface (GUI), and (2) offline post-processing of the resulting data. In real time, Dreamento is capable of (1) recording data, (2) visualizing data, including power-spectrum analysis and navigation, (3) automatic sleep scoring, (4) sensory stimulation (visual, auditory, and tactile), (5) establishing text-to-speech communication, and (6) managing annotations of automatic and manual events. The offline functions facilitate post-processing of the acquired data, with features for reprocessing the wearable data and integrating it with modalities recorded by non-wearable devices (e.g., electromyography). Although Dreamento's primary application is (lucid) dreaming research, it is open to adaptation for other purposes and measurement modalities.
With the advent of Neural Style Transfer (NST), stylizing an image has become quite popular. A convenient way to extend stylization techniques to videos is to apply them on a per-frame basis. However, such per-frame application usually lacks temporal consistency, manifesting as undesirable flickering artifacts. Most existing approaches for enforcing temporal consistency suffer from one or more of the following drawbacks. They (1) are only suitable for a limited range of stylization techniques, (2) can only be applied in an offline fashion requiring the complete video as input, (3) cannot provide consistency for the task of stylization, or (4) do not provide interactive consistency-control. Note that existing consistent video-filtering approaches aim to completely remove flickering artifacts and thus do not respect any specific consistency-control aspect. For stylization tasks, however, consistency-control is an essential requirement, as a certain amount of flickering can add to the artistic look and feel. Moreover, making this control interactive is paramount from a usability perspective. To meet these requirements, we propose an approach that can stylize video streams while providing interactive consistency-control. Apart from stylization, our approach also supports various other image-processing filters. To achieve interactive performance, we develop a lite optical-flow network that operates at 80 frames per second (FPS) on desktop systems with sufficient accuracy. We show that the final consistent video output using our flow network is comparable to that obtained using a state-of-the-art optical-flow network. Further, we employ an adaptive combination of local and global consistent features and enable interactive selection between the two. Through objective and subjective evaluation, we show that our method is superior to state-of-the-art approaches.
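The consistency-control trade-off can be illustrated with a toy 1-D blending scheme; the `warp` and `blend` functions and the `alpha` weight below are a hedged sketch of flow-based blending in general, not this paper's method:

```python
# Toy 1-D sketch of interactive consistency-control: the current
# stylized frame is blended with the previous output warped by optical
# flow; the weight alpha exposes the flicker/consistency trade-off.

def warp(frame, flow):
    """Shift a 1-D 'frame' by integer per-pixel flow (toy backward warp)."""
    n = len(frame)
    return [frame[min(max(i - flow[i], 0), n - 1)] for i in range(n)]

def blend(stylized, prev_output, flow, alpha):
    """alpha = 0 -> fully consistent (all warped history);
    alpha = 1 -> pure per-frame stylization (maximum flicker)."""
    warped = warp(prev_output, flow)
    return [alpha * s + (1 - alpha) * w for s, w in zip(stylized, warped)]

prev = [0.0, 0.0, 1.0, 1.0]     # previous consistent output
cur = [0.0, 1.0, 1.0, 0.0]      # current per-frame stylized result
flow = [0, 0, 1, 1]             # toy flow: right half moved one step right
out = blend(cur, prev, flow, alpha=0.5)
```

Exposing `alpha` as a live slider is the kind of interactive control the abstract argues for: unlike filters that remove flicker entirely, the user chooses how much per-frame character to keep.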
Machine learning is the dominant approach to artificial intelligence, through which computers learn from data and experience. In the framework of supervised learning, for a computer to learn from data accurately and efficiently, some auxiliary information about the data distribution and target function should be provided to it through the learning model. This notion of auxiliary information relates to the concept of regularization in statistical learning theory. A common feature among real-world datasets is that data domains are multiscale and target functions are well-behaved and smooth. In this paper, we propose a learning model that exploits this multiscale data structure and discuss its statistical and computational benefits. The hierarchical learning model is inspired by the logical and progressive easy-to-hard learning mechanism of human beings and has interpretable levels. The model apportions computational resources according to the complexity of data instances and target functions. This property can have multiple benefits, including higher inference speed and computational savings in training a model for many users or when training is interrupted. We provide a statistical analysis of the learning mechanism using multiscale entropies and show that it can yield significantly stronger guarantees than uniform convergence bounds.
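One common way to apportion computation by instance complexity is a cascade with early exit; this toy sketch illustrates the general idea only and is not the paper's hierarchical model:

```python
# Toy early-exit cascade: easy instances are answered by a cheap
# predictor, hard ones fall through to an expensive one, so average
# inference cost tracks instance complexity.

def cheap_model(x):
    """Confident only on 'easy' inputs far from the boundary at 0."""
    conf = min(abs(x), 1.0)
    return (x > 0, conf)

def expensive_model(x):
    """Stand-in for a costly, accurate model."""
    return (x > 0, 1.0)

def hierarchical_predict(x, threshold=0.5):
    label, conf = cheap_model(x)
    if conf >= threshold:
        return label, "cheap"       # early exit on easy instances
    return expensive_model(x)[0], "expensive"
```

The interruption-tolerance benefit mentioned in the abstract follows naturally: each level of such a hierarchy is a usable predictor on its own.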
Transformer language models (TLMs) are critical for most NLP tasks, but they are difficult to create for low-resource languages because of how much pretraining data they require. In this work, we investigate two techniques for training monolingual TLMs in a low-resource setting: greatly reducing TLM size, and complementing the masked language modeling objective with two linguistically rich supervised tasks (part-of-speech tagging and dependency parsing). Results from 7 diverse languages indicate that our model, MicroBERT, is able to produce marked improvements in downstream task evaluations relative to a typical monolingual TLM pretraining approach. Specifically, we find that monolingual MicroBERT models achieve gains of up to 18% for parser LAS and 11% for NER F1 compared to a multilingual baseline, mBERT, while having less than 1% of its parameter count. We conclude that reducing TLM parameter count and using labeled data for pretraining low-resource TLMs can yield large quality benefits and, in some cases, produce models that outperform multilingual approaches.
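The combined pretraining objective described above, MLM complemented with two supervised tasks, amounts to a weighted multi-task loss; the weights below are illustrative, not taken from the paper:

```python
# Sketch of a joint pretraining objective combining masked language
# modeling with supervised POS-tagging and dependency-parsing losses.
# The task weights are assumptions for illustration.

def combined_pretraining_loss(mlm_loss, pos_loss, parse_loss,
                              w_mlm=1.0, w_pos=0.5, w_parse=0.5):
    """Single scalar objective for joint low-resource pretraining."""
    return w_mlm * mlm_loss + w_pos * pos_loss + w_parse * parse_loss
```

The supervised terms inject linguistic signal that the small pretraining corpus alone cannot provide, which is the core of the MicroBERT recipe.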
Practical applications of mechanical metamaterials often involve solving inverse problems where the objective is to find the (multiple) microarchitectures that give rise to a given set of properties. The limited resolution of additive manufacturing techniques often requires solving such inverse problems for specific sizes. One should, therefore, find multiple microarchitectural designs that exhibit the desired properties for a specimen with given dimensions. Moreover, the candidate microarchitectures should be resistant to fatigue and fracture, meaning that peak stresses should be minimized as well. Such a multi-objective inverse design problem is formidably difficult to solve, but its solution is the key to real-world applications of mechanical metamaterials. Here, we propose a modular approach titled 'Deep-DRAM' (deep learning for the design of random-network metamaterials) that combines four decoupled models: two deep learning models (DLMs), a deep generative model (DGM) based on conditional variational autoencoders (CVAEs), and direct finite element (FE) simulations. Deep-DRAM integrates these models into a unified framework capable of finding many solutions to the multi-objective inverse design problem posed here. The integrated framework first introduces the desired elastic properties to the DGM, which returns a set of candidate designs. The candidate designs, together with the target specimen dimensions, are then passed to the DLM, which predicts their actual elastic properties considering the specimen size. After a filtering step based on the closeness of the actual properties to the desired ones, the last step uses direct FE simulations to identify the designs with the minimum peak stresses.
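The generate, predict, filter, simulate pipeline can be sketched end-to-end with toy stand-ins; `generate_candidates`, `predict_property`, and `fe_peak_stress` below are illustrative placeholders for the DGM, the size-aware DLM, and the FE solver, not the paper's learned components:

```python
# Sketch of the Deep-DRAM filtering pipeline with toy scalar stand-ins
# for the DGM, the size-aware property predictor (DLM), and the FE
# solver; the real components are learned models and simulations.

def generate_candidates(target_modulus, n=5):
    """Stand-in DGM: candidate designs parameterized by one scalar."""
    return [target_modulus + 0.1 * (i - n // 2) for i in range(n)]

def predict_property(design, specimen_size):
    """Stand-in DLM: the realized property drifts with specimen size."""
    return design + 0.01 * specimen_size

def fe_peak_stress(design):
    """Stand-in FE simulation: peak stress grows with stiffness."""
    return 2.0 * design

def deep_dram_pipeline(target, size, tol=0.15):
    candidates = generate_candidates(target)
    # Keep only designs whose size-aware prediction is close to target.
    close = [d for d in candidates
             if abs(predict_property(d, size) - target) <= tol]
    # Among those, pick the design with the minimum peak stress.
    return min(close, key=fe_peak_stress)

best = deep_dram_pipeline(target=1.0, size=2.0)
```

The modularity is the point of the design: because the four models are decoupled, any stage (e.g., the generative model) can be swapped without retraining the others.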
Dual encoders are now the dominant architecture for dense retrieval. Yet, we have little understanding of how they represent text, and why this leads to good performance. In this work, we shed light on this question via distributions over the vocabulary. We propose to interpret the vector representations produced by dual encoders by projecting them into the model's vocabulary space. We show that the resulting distributions over vocabulary tokens are intuitive and contain rich semantic information. We find that this view can explain some of the failure cases of dense retrievers. For example, the inability of models to handle tail entities can be explained via a tendency of the token distributions to forget some of the tokens of those entities. We leverage this insight and propose a simple way to enrich query and passage representations with lexical information at inference time, and show that this significantly improves performance compared to the original model in out-of-domain settings.
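The vocabulary projection itself is straightforward to illustrate: score the dense vector against each token embedding and normalize with a softmax. The tiny vocabulary and 2-d embeddings below are invented for illustration and are not the model's actual embedding matrix:

```python
# Sketch of projecting a dense (dual-encoder) vector into vocabulary
# space: softmax over the vector's dot products with the rows of an
# embedding matrix. Vocabulary and embeddings here are illustrative.
import math

VOCAB = ["cat", "dog", "car"]
EMBED = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0]}

def vocab_projection(vec):
    """Distribution over VOCAB obtained as softmax(E @ vec)."""
    scores = {t: sum(a * b for a, b in zip(EMBED[t], vec)) for t in VOCAB}
    z = sum(math.exp(s) for s in scores.values())
    return {t: math.exp(s) / z for t, s in scores.items()}

dist = vocab_projection([1.0, 0.0])   # a query vector close to "cat"
top_token = max(dist, key=dist.get)
```

Inspecting which tokens carry mass in such distributions is what lets the authors diagnose failure cases like forgotten tail-entity tokens.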