安全是空中交通时的主要问题。通过成对分离最小值确保无人驾驶飞机(无人机)之间的飞行安全性,利用冲突检测和分辨方法。现有方法主要处理成对冲突,但由于交通密度的预期增加,可能会发生两个以上的无人机的遇到。在本文中,我们将多UAV冲突解决模型作为多功能加强学习问题。我们实现了一种基于图形神经网络的算法,配合代理可以与共同生成分辨率的操作进行通信。该模型在具有3和4个当前代理的情况下进行评估。结果表明,代理商能够通过合作策略成功解决多UV冲突。
translated by 谷歌翻译
Network models are an essential block of modern networks. For example, they are widely used in network planning and optimization. However, as networks increase in scale and complexity, some models present limitations, such as the assumption of markovian traffic in queuing theory models, or the high computational cost of network simulators. Recent advances in machine learning, such as Graph Neural Networks (GNN), are enabling a new generation of network models that are data-driven and can learn complex non-linear behaviors. In this paper, we present RouteNet-Fermi, a custom GNN model that shares the same goals as queuing theory, while being considerably more accurate in the presence of realistic traffic models. The proposed model predicts accurately the delay, jitter, and loss in networks. We have tested RouteNet-Fermi in networks of increasing size (up to 300 nodes), including samples with mixed traffic profiles -- e.g., with complex non-markovian models -- and arbitrary routing and queue scheduling configurations. Our experimental results show that RouteNet-Fermi achieves similar accuracy as computationally-expensive packet-level simulators and it is able to accurately scale to large networks. For example, the model produces delay estimates with a mean relative error of 6.24% when applied to a test dataset with 1,000 samples, including network topologies one order of magnitude larger than those seen during training.
translated by 谷歌翻译
Reinforcement learning is a machine learning approach based on behavioral psychology. It is focused on learning agents that can acquire knowledge and learn to carry out new tasks by interacting with the environment. However, a problem occurs when reinforcement learning is used in critical contexts where the users of the system need to have more information and reliability for the actions executed by an agent. In this regard, explainable reinforcement learning seeks to provide to an agent in training with methods in order to explain its behavior in such a way that users with no experience in machine learning could understand the agent's behavior. One of these is the memory-based explainable reinforcement learning method that is used to compute probabilities of success for each state-action pair using an episodic memory. In this work, we propose to make use of the memory-based explainable reinforcement learning method in a hierarchical environment composed of sub-tasks that need to be first addressed to solve a more complex task. The end goal is to verify if it is possible to provide to the agent the ability to explain its actions in the global task as well as in the sub-tasks. The results obtained showed that it is possible to use the memory-based method in hierarchical environments with high-level tasks and compute the probabilities of success to be used as a basis for explaining the agent's behavior.
translated by 谷歌翻译
The primary aim of this research was to find a model that best predicts which fallen angel bonds would either potentially rise up back to investment grade bonds and which ones would fall into bankruptcy. To implement the solution, we thought that the ideal method would be to create an optimal machine learning model that could predict bankruptcies. Among the many machine learning models out there we decided to pick four classification methods: logistic regression, KNN, SVM, and NN. We also utilized an automated methods of Google Cloud's machine learning. The results of our model comparisons showed that the models did not predict bankruptcies very well on the original data set with the exception of Google Cloud's machine learning having a high precision score. However, our over-sampled and feature selection data set did perform very well. This could likely be due to the model being over-fitted to match the narrative of the over-sampled data (as in, it does not accurately predict data outside of this data set quite well). Therefore, we were not able to create a model that we are confident that would predict bankruptcies. However, we were able to find value out of this project in two key ways. The first is that Google Cloud's machine learning model in every metric and in every data set either outperformed or performed on par with the other models. The second is that we found that utilizing feature selection did not reduce predictive power that much. This means that we can reduce the amount of data to collect for future experimentation regarding predicting bankruptcies.
translated by 谷歌翻译
Performing 3D dense captioning and visual grounding requires a common and shared understanding of the underlying multimodal relationships. However, despite some previous attempts on connecting these two related tasks with highly task-specific neural modules, it remains understudied how to explicitly depict their shared nature to learn them simultaneously. In this work, we propose UniT3D, a simple yet effective fully unified transformer-based architecture for jointly solving 3D visual grounding and dense captioning. UniT3D enables learning a strong multimodal representation across the two tasks through a supervised joint pre-training scheme with bidirectional and seq-to-seq objectives. With a generic architecture design, UniT3D allows expanding the pre-training scope to more various training sources such as the synthesized data from 2D prior knowledge to benefit 3D vision-language tasks. Extensive experiments and analysis demonstrate that UniT3D obtains significant gains for 3D dense captioning and visual grounding.
translated by 谷歌翻译
Learning how to navigate among humans in an occluded and spatially constrained indoor environment, is a key ability required to embodied agent to be integrated into our society. In this paper, we propose an end-to-end architecture that exploits Socially-Aware Tasks (referred as to Risk and Social Compass) to inject into a reinforcement learning navigation policy the ability to infer common-sense social behaviors. To this end, our tasks exploit the notion of immediate and future dangers of collision. Furthermore, we propose an evaluation protocol specifically designed for the Social Navigation Task in simulated environments. This is done to capture fine-grained features and characteristics of the policy by analyzing the minimal unit of human-robot spatial interaction, called Encounter. We validate our approach on Gibson4+ and Habitat-Matterport3D datasets.
translated by 谷歌翻译
We have developed a model for online continual or lifelong reinforcement learning (RL) inspired on the insect brain. Our model leverages the offline training of a feature extraction and a common general policy layer to enable the convergence of RL algorithms in online settings. Sharing a common policy layer across tasks leads to positive backward transfer, where the agent continuously improved in older tasks sharing the same underlying general policy. Biologically inspired restrictions to the agent's network are key for the convergence of RL algorithms. This provides a pathway towards efficient online RL in resource-constrained scenarios.
translated by 谷歌翻译
Label noise is a significant obstacle in deep learning model training. It can have a considerable impact on the performance of image classification models, particularly deep neural networks, which are especially susceptible because they have a strong propensity to memorise noisy labels. In this paper, we have examined the fundamental concept underlying related label noise approaches. A transition matrix estimator has been created, and its effectiveness against the actual transition matrix has been demonstrated. In addition, we examined the label noise robustness of two convolutional neural network classifiers with LeNet and AlexNet designs. The two FashionMINIST datasets have revealed the robustness of both models. We are not efficiently able to demonstrate the influence of the transition matrix noise correction on robustness enhancements due to our inability to correctly tune the complex convolutional neural network model due to time and computing resource constraints. There is a need for additional effort to fine-tune the neural network model and explore the precision of the estimated transition model in future research.
translated by 谷歌翻译
Double-blind peer review is considered a pillar of academic research because it is perceived to ensure a fair, unbiased, and fact-centered scientific discussion. Yet, experienced researchers can often correctly guess from which research group an anonymous submission originates, biasing the peer-review process. In this work, we present a transformer-based, neural-network architecture that only uses the text content and the author names in the bibliography to atttribute an anonymous manuscript to an author. To train and evaluate our method, we created the largest authorship-identification dataset to date. It leverages all research papers publicly available on arXiv amounting to over 2 million manuscripts. In arXiv-subsets with up to 2,000 different authors, our method achieves an unprecedented authorship attribution accuracy, where up to 95% of papers are attributed correctly. Thanks to our method, we are not only able to predict the author of an anonymous work but we also identify weaknesses of the double-blind review process by finding the key aspects that make a paper attributable. We believe that this work gives precious insights into how a submission can remain anonymous in order to support an unbiased double-blind review process.
translated by 谷歌翻译
Non-negative matrix factorisation (NMF) has been widely used to address the problem of corrupted data in images. The standard NMF algorithm minimises the Euclidean distance between the data matrix and the factorised approximation. Although this method has demonstrated good results, because it employs the squared error of each data point, the standard NMF algorithm is sensitive to outliers. In this paper, we theoretically analyse the robustness of the standard NMF, HCNMF and L2,1-NMF algorithms, and implement sets of experiments to show the robustness on real datasets, namely ORL and Extended YaleB. Our work demonstrates that different amounts of iterations are required for each algorithm to converge. Given the high computational complexity of these algorithms, our final models such as HCNMF and L2,1-NMF model do not successfully converge within the iteration parameters of this paper. Nevertheless, the experimental results still demonstrate the robustness of the aforementioned algorithms to some extent.
translated by 谷歌翻译