For autonomous robots navigating urban environments, it is important that the robot stay on the designated path of travel (i.e., the footpath) and avoid areas such as grass and garden beds, for safety and social-conformance reasons. This paper presents an autonomous navigation approach for unknown urban environments that combines the use of semantic segmentation and LiDAR data. The proposed approach uses the segmented image masks to create a 3D obstacle map of the environment, from which the boundaries of the footpath are computed. Compared to existing methods, our approach does not require a pre-built map and provides a 3D understanding of the safe region of travel, enabling the robot to plan any path through the footpath. Experiments comparing our method against two alternatives that use only LiDAR or only semantic segmentation show that, overall, our proposed approach achieves a success rate greater than 91% outdoors and greater than 66% indoors. Our method keeps the robot on the safe path of travel at all times and reduces the number of collisions.
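As a concrete illustration of the fusion step described above, the sketch below labels LiDAR points with the segmentation class of the pixel they project to, so that only points on the footpath are treated as traversable. The camera intrinsics `K`, the LiDAR-to-camera extrinsics `T_cam_lidar`, and the `FOOTPATH` class ID are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical class ID from the segmentation model's label set; the
# paper's actual label definitions are not given here.
FOOTPATH = 1

def label_lidar_points(points_lidar, seg_mask, K, T_cam_lidar):
    """Assign each LiDAR point the segmentation class of the pixel it
    projects to. points_lidar: (N, 3) points in the LiDAR frame,
    seg_mask: (H, W) integer class mask, K: (3, 3) camera intrinsics,
    T_cam_lidar: (4, 4) LiDAR-to-camera extrinsic transform."""
    N = points_lidar.shape[0]
    pts_h = np.hstack([points_lidar, np.ones((N, 1))])      # homogeneous coords
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]              # into the camera frame
    in_front = pts_cam[:, 2] > 0.1                          # drop points behind the camera
    uvw = (K @ pts_cam.T).T
    z = np.where(np.abs(uvw[:, 2:3]) < 1e-6, 1e-6, uvw[:, 2:3])
    uv = (uvw[:, :2] / z).astype(int)                       # pixel coordinates
    H, W = seg_mask.shape
    valid = (in_front & (uv[:, 0] >= 0) & (uv[:, 0] < W)
             & (uv[:, 1] >= 0) & (uv[:, 1] < H))
    labels = np.full(N, -1, dtype=int)                      # -1 = outside the image
    labels[valid] = seg_mask[uv[valid, 1], uv[valid, 0]]
    return labels
```

Points labeled `FOOTPATH` delimit the safe region of travel; all other points (grass, garden beds, obstacles) would go into the 3D obstacle map from which the footpath boundaries are extracted.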
When humans work together to complete a joint task, each person builds an internal model of the situation and of how it will evolve. Efficient collaboration depends on how these individual models overlap to form a shared mental model among team members, which is important for collaborative processes in human-robot teams. Developing and maintaining an accurate shared mental model requires bidirectional communication of individual intent and the ability to interpret the intent of other team members. To enable effective human-robot collaboration, this paper presents the design and implementation of a novel joint action framework for human-robot teaming that leverages augmented reality (AR) technology and user eye gaze to enable bidirectional communication of intent. We tested our new framework through a user study with 37 participants and found that our system improves task efficiency, trust, and task fluency. Using AR and eye gaze to enable bidirectional communication is therefore a promising means of improving the core components that influence collaboration between humans and robots.
Humans are very adept at communicating their intent for when and where a handover will occur. However, even state-of-the-art robotic implementations typically lack such communication skills. This study investigates the use of augmented reality to visualize a robot's internal state and intent for human-to-robot handovers. Specifically, we explore the use of visualized 3D models of the object and of the robot's gripper to communicate the robot's estimate of where the object is and the pose in which the robot intends to grasp it. We tested this design via a user study with 16 participants, in which each participant handed a cube-shaped object to the robot 12 times. Results show that communicating robot intent via augmented reality substantially improves the users' perceived experience of the handover. Results also indicate that the benefit of augmented reality is even more pronounced for the perceived safety and fluency of the interaction when the robot makes mistakes in localizing the object.
This article surveys the literature on human-robot object handovers. A handover is a collaborative joint action in which one agent, the giver, gives an object to another agent, the receiver. The physical exchange starts when the receiver first makes contact with the object held by the giver, and ends when the giver fully releases the object to the receiver. However, important cognitive and physical processes begin before the physical exchange, including establishing an implicit agreement on the location and timing of the exchange. Accordingly, we structure our review around the two main phases delimited by these events: 1) the pre-handover phase, and 2) the physical exchange. We focus our analysis on the two actors (giver and receiver) and report the state of the art for robotic givers (robot-to-human handovers) and robotic receivers (human-to-robot handovers). We report a comprehensive list of the qualitative and quantitative metrics commonly used to assess the interaction. While we focus our review on the cognitive level (e.g., prediction, perception, motion planning, learning) and the physical level (e.g., motion, grasping, grasp release) of the handover, we also briefly discuss the concepts of safety, social context, and ergonomics. We compare the behaviors displayed during human-to-human handovers with the state of the art in robotic assistants, and identify the major areas in which robotic assistants must improve to reach performance comparable to human interactions. Finally, we propose a minimal set of metrics that should be used to enable a fair comparison among approaches.
Nowadays, time-stamped web documents related to a general news query flood the Internet, and timeline summarization aims to concisely summarize the evolution trajectory of events along the timeline. Unlike traditional document summarization, timeline summarization needs to model the time-series information of the input events and summarize important events in chronological order. To tackle this challenge, in this paper, we propose a Unified Timeline Summarizer (UTS) that can generate abstractive and extractive timeline summaries in time order. Concretely, in the encoder part, we propose a graph-based event encoder that relates multiple events according to their content dependency and learns a global representation of each event. In the decoder part, to ensure the chronological order of the abstractive summary, we extract the event-level attention during generation, with its sequential information retained, and use it to simulate the evolutionary attention of the ground-truth summary. The event-level attention can also be used to assist extractive summarization, where the extracted sentences likewise come in time sequence. We augment the previous Chinese large-scale timeline summarization dataset and collect a new English timeline dataset. Extensive experiments conducted on these datasets and on the out-of-domain Timeline17 dataset show that UTS achieves state-of-the-art performance in terms of both automatic and human evaluations.
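A minimal sketch of the graph-based event encoder idea: events are connected when their content embeddings are similar, and one round of normalized message passing yields a globally contextualized representation per event. The similarity-threshold graph and the single GCN-style layer are simplifying assumptions; UTS's actual encoder is learned end to end.

```python
import numpy as np

def event_graph_encoder(event_embs, sim_threshold=0.5):
    """One round of message passing over a content-dependency graph.
    event_embs: (E, d) embeddings of the input events (e.g., from any
    sentence encoder). Returns contextualized event representations."""
    # Content-dependency adjacency: connect events with similar embeddings.
    normed = event_embs / np.linalg.norm(event_embs, axis=1, keepdims=True)
    adj = (normed @ normed.T > sim_threshold).astype(float)
    np.fill_diagonal(adj, 1.0)                       # self loops
    # Symmetric normalization, as in a GCN layer.
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    adj_norm = d_inv_sqrt @ adj @ d_inv_sqrt
    return np.tanh(adj_norm @ event_embs)            # aggregated global representation

# Toy usage: four events, 8-dimensional embeddings.
embs = np.random.default_rng(0).normal(size=(4, 8))
print(event_graph_encoder(embs).shape)               # (4, 8)
```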
Brain midline shift (MLS) is one of the most critical factors considered in clinical diagnosis and treatment decision-making for intracranial hemorrhage. Existing computational methods for MLS quantification not only require intensive labeling at millimeter-level precision but also suffer from poor performance due to their dependence on specific landmarks or simplified anatomical assumptions. In this paper, we propose a novel semi-supervised framework to accurately measure the scale of MLS from head CT scans. We formulate the MLS measurement task as a deformation estimation problem and solve it using a few MLS slices with sparse labels. Meanwhile, with the help of diffusion models, we are able to use a large number of unlabeled MLS data and 2793 non-MLS cases for representation learning and regularization. The extracted representation reflects how the image differs from a non-MLS image, and the regularization plays an important role in the sparse-to-dense refinement of the deformation field. Our experiments on a real clinical brain hemorrhage dataset achieve state-of-the-art performance and generate interpretable deformation fields.
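The core supervision signal, posed as sparse deformation estimation, can be sketched as a loss that fits the predicted field only at labeled pixels and relies on a smoothness term to propagate supervision to the rest of the image. This omits the paper's diffusion-based representation learning and regularization; the tensor shapes and `smooth_weight` are illustrative assumptions.

```python
import torch

def sparse_deformation_loss(pred_field, sparse_disp, mask, smooth_weight=0.1):
    """Supervise a dense deformation field with sparse labels.
    pred_field:  (B, 2, H, W) predicted in-plane deformation,
    sparse_disp: (B, 2, H, W) displacement labels, valid only where mask == 1,
    mask:        (B, 1, H, W) binary mask of the sparsely labeled pixels."""
    # Data term: penalize errors only at the sparsely labeled pixels.
    data = ((pred_field - sparse_disp) ** 2 * mask).sum() / mask.sum().clamp(min=1)
    # Smoothness term: propagates the sparse supervision to unlabeled pixels.
    dx = (pred_field[..., :, 1:] - pred_field[..., :, :-1]).abs().mean()
    dy = (pred_field[..., 1:, :] - pred_field[..., :-1, :]).abs().mean()
    return data + smooth_weight * (dx + dy)
```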
Adversarial imitation learning (AIL) has become a popular alternative to supervised imitation learning that reduces the distribution shift suffered by the latter. However, AIL requires effective exploration during an online reinforcement learning phase. In this work, we show that the standard, naive approach to exploration can manifest as a suboptimal local maximum if a policy learned with AIL sufficiently matches the expert distribution without fully learning the desired task. This can be particularly catastrophic for manipulation tasks, where the difference between an expert and a non-expert state-action pair is often subtle. We present Learning from Guided Play (LfGP), a framework in which we leverage expert demonstrations of multiple exploratory, auxiliary tasks in addition to a main task. The addition of these auxiliary tasks forces the agent to explore states and actions that standard AIL may learn to ignore. Additionally, this particular formulation allows for the reusability of expert data between main tasks. Our experimental results in a challenging multitask robotic manipulation domain indicate that LfGP significantly outperforms both AIL and behaviour cloning, while also being more expert sample efficient than these baselines. To explain this performance gap, we provide further analysis of a toy problem that highlights the coupling between a local maximum and poor exploration, and also visualize the differences between the learned models from AIL and LfGP.
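A minimal sketch of the guided-play idea, assuming a uniform task scheduler: each episode the agent is assigned either the main task or an exploratory auxiliary task, and each task carries its own AIL discriminator that produces the reward. The task names and the uniform sampling rule are illustrative assumptions, not LfGP's exact scheduler.

```python
import math
import random

class GuidedPlayScheduler:
    """Per episode, pick either the main task or one of the exploratory
    auxiliary tasks; the policy is conditioned on the chosen task."""
    def __init__(self, main_task, aux_tasks):
        self.tasks = [main_task] + list(aux_tasks)

    def sample_task(self):
        return random.choice(self.tasks)      # uniform here; LfGP can learn this

def ail_reward(discriminator, state, action):
    """Standard AIL-style reward from a per-task discriminator D(s, a),
    where D returns the probability that (s, a) is expert-like."""
    d = discriminator(state, action)
    return -math.log(max(1.0 - d, 1e-8))

# Auxiliary tasks force exploration of states the main task alone may ignore.
scheduler = GuidedPlayScheduler("stack", aux_tasks=["reach", "lift", "move-object"])
task = scheduler.sample_task()
```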
In this work, we introduce a hypergraph representation learning framework called Hypergraph Neural Networks (HNN) that jointly learns hyperedge embeddings along with a set of hyperedge-dependent embeddings for each node in the hypergraph. HNN derives multiple embeddings per node in the hypergraph where each embedding for a node is dependent on a specific hyperedge of that node. Notably, HNN is accurate, data-efficient, flexible with many interchangeable components, and useful for a wide range of hypergraph learning tasks. We evaluate the effectiveness of the HNN framework for hyperedge prediction and hypergraph node classification. We find that HNN achieves an overall mean gain of 7.72% and 11.37% across all baseline models and graphs for hyperedge prediction and hypergraph node classification, respectively.
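The distinguishing property of HNN, one embedding per (node, hyperedge) pair, can be sketched with a toy aggregation: each hyperedge is summarized by its members' features, and a node's embedding within that hyperedge fuses its own features with the hyperedge summary. The mean and concatenation operators below stand in for HNN's learned aggregations.

```python
import numpy as np

def hyperedge_dependent_embeddings(X, hyperedges):
    """Minimal sketch of the HNN idea: each node gets one embedding per
    incident hyperedge. X: (n, d) node features; hyperedges: list of
    node-index lists. Returns {(node, edge_id): embedding}."""
    # Hyperedge embedding: mean of its member nodes' features.
    edge_embs = {e: X[list(members)].mean(axis=0)
                 for e, members in enumerate(hyperedges)}
    out = {}
    for e, members in enumerate(hyperedges):
        for v in members:
            # Hyperedge-dependent node embedding: the node's own features
            # fused with the embedding of this specific hyperedge.
            out[(v, e)] = np.concatenate([X[v], edge_embs[e]])
    return out

X = np.eye(4)                                        # 4 nodes, one-hot features
embs = hyperedge_dependent_embeddings(X, [[0, 1, 2], [2, 3]])
print(embs[(2, 0)], embs[(2, 1)])                    # node 2 differs per hyperedge
```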
Neural fields, also known as coordinate-based or implicit neural representations, have shown a remarkable capability of representing, generating, and manipulating various forms of signals. For video representations, however, mapping pixel-wise coordinates to RGB colors has shown relatively low compression performance and slow convergence and inference speed. Frame-wise video representation, which maps a temporal coordinate to its entire frame, has recently emerged as an alternative method to represent videos, improving compression rates and encoding speed. While promising, it has still failed to reach the performance of state-of-the-art video compression algorithms. In this work, we propose FFNeRV, a novel method for incorporating flow information into frame-wise representations to exploit the temporal redundancy across the frames in videos inspired by the standard video codecs. Furthermore, we introduce a fully convolutional architecture, enabled by one-dimensional temporal grids, improving the continuity of spatial features. Experimental results show that FFNeRV yields the best performance for video compression and frame interpolation among the methods using frame-wise representations or neural fields. To reduce the model size even further, we devise a more compact convolutional architecture using the group and pointwise convolutions. With model compression techniques, including quantization-aware training and entropy coding, FFNeRV outperforms widely-used standard video codecs (H.264 and HEVC) and performs on par with state-of-the-art video compression algorithms.
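A toy frame-wise representation in the spirit described above: a learnable 1D temporal grid is linearly interpolated at time t and decoded into an entire frame by a small convolutional head. All sizes are illustrative assumptions, and FFNeRV's flow-guided aggregation across neighboring frames is omitted for brevity.

```python
import torch
import torch.nn as nn

class FrameWiseVideoRep(nn.Module):
    """Map a temporal coordinate t in [0, 1] to a whole frame: interpolate
    a 1D temporal grid of latent features, then decode convolutionally."""
    def __init__(self, grid_len=16, ch=32, out_hw=(32, 32)):
        super().__init__()
        self.grid = nn.Parameter(torch.randn(grid_len, ch))   # 1D temporal grid
        self.h0, self.w0 = out_hw[0] // 4, out_hw[1] // 4
        self.to_spatial = nn.Linear(ch, ch * self.h0 * self.w0)
        self.decoder = nn.Sequential(                          # convolutional head
            nn.Upsample(scale_factor=2), nn.Conv2d(ch, ch, 3, padding=1), nn.GELU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(ch, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, t):
        # Linear interpolation on the temporal grid improves continuity in t.
        pos = t * (self.grid.shape[0] - 1)
        lo = int(pos); hi = min(lo + 1, self.grid.shape[0] - 1)
        feat = (1 - (pos - lo)) * self.grid[lo] + (pos - lo) * self.grid[hi]
        x = self.to_spatial(feat).view(1, -1, self.h0, self.w0)
        return self.decoder(x)                                 # (1, 3, H, W) frame

frame = FrameWiseVideoRep()(0.37)
print(frame.shape)                                             # torch.Size([1, 3, 32, 32])
```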
Learning fair graph representations for downstream applications is becoming increasingly important, but existing work has mostly focused on improving fairness at the global level by either modifying the graph structure or objective function without taking into account the local neighborhood of a node. In this work, we formally introduce the notion of neighborhood fairness and develop a computational framework for learning such locally fair embeddings. We argue that the notion of neighborhood fairness is more appropriate since GNN-based models operate at the local neighborhood level of a node. Our neighborhood fairness framework has two main components that are flexible for learning fair graph representations from arbitrary data: the first aims to construct fair neighborhoods for any arbitrary node in a graph, and the second enables adaptation of these fair neighborhoods to better capture certain application- or data-dependent constraints, such as allowing neighborhoods to be more biased towards certain attributes or neighbors in the graph. Furthermore, while link prediction has been extensively studied, we are the first to investigate the graph representation learning task of fair link classification. We demonstrate the effectiveness of the proposed neighborhood fairness framework for a variety of graph machine learning tasks including fair link prediction, link classification, and learning fair graph embeddings. Notably, our approach achieves not only better fairness but also increases the accuracy in the majority of cases across a wide variety of graphs, problem settings, and metrics.
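A minimal sketch of the fair-neighborhood construction step for a binary sensitive attribute: a node's neighborhood is resampled so that, in expectation, an `alpha` fraction comes from one group, with `alpha = 0.5` giving a balanced neighborhood and other values acting as the adaptation knob the abstract mentions. The two-group setting and parameter names are illustrative assumptions.

```python
import random

def fair_neighborhood(neighbors, sensitive, k=10, alpha=0.5, rng=random):
    """Resample a node's neighborhood with group-aware proportions.
    neighbors: list of neighbor node IDs; sensitive: dict mapping node ID
    to a binary sensitive attribute; k: size of the fair neighborhood."""
    group0 = [v for v in neighbors if sensitive[v] == 0]
    group1 = [v for v in neighbors if sensitive[v] == 1]
    fair = []
    for _ in range(k):
        # Pick group 0 with probability alpha; fall back if a group is empty.
        pool = group0 if (rng.random() < alpha and group0) else (group1 or group0)
        fair.append(rng.choice(pool))
    return fair

sens = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1}
print(fair_neighborhood([0, 1, 2, 3, 4], sens, k=6, alpha=0.5))
```

The GNN then aggregates over this resampled neighborhood instead of the raw one, so the locally fair structure is what the message passing actually sees.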