This paper presents a new Android malware detection method based on Graph Neural Networks (GNNs) with Jumping Knowledge (JK). Android function call graphs (FCGs) consist of a set of program functions and their inter-procedural calls. Thus, this paper proposes a GNN-based method for Android malware detection by capturing meaningful intra-procedural call path patterns. In addition, a Jumping Knowledge technique is applied to minimize the effect of the over-smoothing problem, which is common in GNNs. The proposed method has been extensively evaluated using two benchmark datasets. The results demonstrate the superiority of our approach over state-of-the-art methods in terms of key classification metrics, which shows the potential of GNNs in Android malware detection and classification.
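For illustration, below is a minimal sketch of a graph classifier with Jumping Knowledge in the style described above, written with PyTorch Geometric; the layer type (GCN), hidden sizes, and mean-pool readout are assumptions of the sketch, not details taken from the paper.

```python
# Minimal sketch (not the paper's exact architecture): a GNN graph classifier over
# function call graphs with Jumping Knowledge, using PyTorch Geometric.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, JumpingKnowledge, global_mean_pool

class FCGClassifier(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim=64, num_layers=3, num_classes=2):
        super().__init__()
        self.convs = torch.nn.ModuleList(
            [GCNConv(in_dim if i == 0 else hidden_dim, hidden_dim) for i in range(num_layers)]
        )
        # Jumping Knowledge aggregates representations from all layers,
        # mitigating over-smoothing in deeper GNNs.
        self.jk = JumpingKnowledge(mode="cat")
        self.lin = torch.nn.Linear(hidden_dim * num_layers, num_classes)

    def forward(self, x, edge_index, batch):
        layer_outputs = []
        for conv in self.convs:
            x = F.relu(conv(x, edge_index))
            layer_outputs.append(x)
        x = self.jk(layer_outputs)       # combine per-layer node embeddings
        x = global_mean_pool(x, batch)   # graph-level readout (one vector per FCG)
        return self.lin(x)               # benign vs. malware logits
```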
A large number of network security breaches in IoT networks have demonstrated the unreliability of current Network Intrusion Detection Systems (NIDSs). Consequently, network interruptions and loss of sensitive data have occurred, which led to an active research area for improving NIDS technologies. In an analysis of related works, it was observed that most researchers aim to obtain better classification results by using a set of untried combinations of Feature Reduction (FR) and Machine Learning (ML) techniques on NIDS datasets. However, these datasets differ in feature sets, attack types, and network design. Therefore, this paper aims to discover whether these techniques can be generalised across various datasets. Six ML models are utilised: a Deep Feed Forward (DFF), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Decision Tree (DT), Logistic Regression (LR), and Naive Bayes (NB). The accuracy of three Feature Extraction (FE) algorithms, Principal Component Analysis (PCA), Auto-encoder (AE), and Linear Discriminant Analysis (LDA), is evaluated using three benchmark datasets: UNSW-NB15, ToN-IoT and CSE-CIC-IDS2018. Although PCA and AE algorithms have been widely used, the determination of their optimal number of extracted dimensions has been overlooked. The results indicate that no single FE method or ML model achieves the best scores across all datasets. The optimal number of extracted dimensions has been identified for each dataset, and LDA degrades the performance of the ML models on two datasets. The variance is used to analyse the extracted dimensions of LDA and PCA. Finally, this paper concludes that the choice of dataset significantly alters the performance of the applied techniques. We believe that a universal (benchmark) feature set is needed to facilitate further advancement and progress of research in this field.
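As a concrete illustration of one (FE, ML) combination of the kind compared here, the sketch below pairs PCA with a Decision Tree and selects the number of extracted dimensions by cross-validation rather than fixing it a priori; the dimension grid and the function name are hypothetical, and dataset loading is assumed to yield a (features, labels) pair.

```python
# Illustrative sketch: PCA feature extraction followed by a Decision Tree, with the
# number of extracted dimensions chosen by cross-validated grid search.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV

def evaluate_pca_dt(X, y):
    pipe = Pipeline([
        ("scale", StandardScaler()),        # flow features vary widely in scale
        ("pca", PCA()),                     # feature extraction step under study
        ("clf", DecisionTreeClassifier()),  # one of the six ML models compared
    ])
    # Sweep the number of extracted dimensions instead of fixing it a priori.
    grid = GridSearchCV(pipe, {"pca__n_components": [2, 4, 8, 16, 24]},
                        cv=5, scoring="accuracy")
    grid.fit(X, y)
    return grid.best_params_["pca__n_components"], grid.best_score_
```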
In reinforcement learning (RL), the goal is to obtain an optimal policy, for which the choice of optimality criterion is fundamentally important. Two major optimality criteria are the average reward and the discounted reward. While the latter is more popular, it is problematic to apply in environments that have no inherent notion of discounting. This motivates us to revisit a) the progression of optimality criteria in dynamic programming, b) the justification for and complications of an artificial discount factor, and c) the benefits of directly maximizing the average reward criterion, which is discounting-free. Our contributions include a thorough discussion of the relationship between the average and discounted rewards, as well as of their pros and cons in RL. We emphasize that average-reward RL methods possess the ingredients and mechanisms for applying discounting-free optimality criteria (Veinott, 1969) to RL.
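For reference, the two criteria can be written as follows (standard notation, not quoted from the paper):

```latex
% Standard definitions of the two optimality criteria discussed above.
\begin{align}
  v_\gamma^\pi(s) &= \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(S_t, A_t)\ \middle|\ S_0 = s,\ \pi\right],
    && 0 \le \gamma < 1 \quad \text{(discounted reward)} \\
  g^\pi(s) &= \lim_{T \to \infty} \frac{1}{T}\,
    \mathbb{E}\!\left[\sum_{t=0}^{T-1} r(S_t, A_t)\ \middle|\ S_0 = s,\ \pi\right]
    && \text{(average reward / gain)}
\end{align}
```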
For continuing environments, reinforcement learning (RL) methods typically maximize a discounted reward criterion with a discount factor close to 1 in order to approximate the average reward (the gain). However, such a criterion only considers the long-run steady-state performance and ignores the transient behaviour in transient states. In this work, we develop a policy gradient method that optimizes the gain, then the bias (which indicates the transient performance and is important for selecting among policies with equal gain). We derive expressions that enable sampling of the gradient of the bias and of its preconditioning Fisher matrix. We further devise an algorithm that solves the gain-then-bias (bi-level) optimization. Its key ingredient is an RL-specific logarithmic barrier function. Experimental results provide insights into the fundamental mechanisms of our proposal.
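In standard (gain, bias) notation, the quantities involved and a schematic barrier-relaxed objective look roughly as follows; this is an illustrative form, not necessarily the paper's exact formulation:

```latex
% Gain and bias in standard notation; the barrier-relaxed objective on the last line
% is a schematic illustration of a gain-then-bias (bi-level) optimization, not the
% paper's exact algorithm.
\begin{align}
  g^{\pi} &= \lim_{T\to\infty} \frac{1}{T}\,
      \mathbb{E}\Big[\textstyle\sum_{t=0}^{T-1} r(S_t, A_t)\ \Big|\ \pi\Big]
      && \text{(gain: steady-state performance)}\\
  b^{\pi}(s) &= \mathbb{E}\Big[\textstyle\sum_{t=0}^{\infty}
      \big(r(S_t, A_t) - g^{\pi}\big)\ \Big|\ S_0 = s,\ \pi\Big]
      && \text{(bias: transient performance)}\\
  \max_{\theta}\ & b^{\pi_\theta}(s_0) + \tau \log\!\big(g^{\pi_\theta} - g_{\mathrm{ref}}\big)
      && \text{(bias maximized under a log-barrier on the gain)}
\end{align}
```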
This paper presents a new Network Intrusion Detection System (NIDS) based on Graph Neural Networks (GNNs). GNNs are a relatively new sub-field of deep neural networks that can leverage the inherent structure of graph-based data. Training and evaluation data for NIDSs are typically represented as flow records, which can naturally be represented in a graph format. This establishes the potential and motivation for exploring GNNs for network intrusion detection, which is the focus of this paper. Current machine learning-based NIDS research only considers network flows in isolation rather than taking their interconnection patterns into account. This is a key limitation in the detection of sophisticated IoT network attacks, such as DDoS and distributed port scan attacks launched by IoT devices. In this paper, we propose a GNN approach that overcomes this limitation and allows capturing the edge features of a graph, in addition to the topological information, for network anomaly detection in IoT networks. To the best of our knowledge, our approach is the first successful, practical, and extensively evaluated application of graph neural networks to the problem of network intrusion detection using flow-based data. Our extensive experimental evaluation on four recent NIDS benchmark datasets shows that our approach outperforms the state-of-the-art in terms of key classification metrics, which demonstrates the potential of GNNs in network intrusion detection and provides motivation for further research.
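A minimal sketch of a message-passing layer that folds edge (flow) features into node updates, in the spirit of the approach described above, is shown below; it uses PyTorch Geometric, and the layer sizes and mean aggregation are illustrative assumptions rather than the paper's architecture.

```python
# Sketch: a message-passing layer whose messages combine neighbour embeddings with
# the per-flow (edge) features, so edge information drives the learned representation.
import torch
from torch_geometric.nn import MessagePassing

class EdgeFeatureConv(MessagePassing):
    def __init__(self, node_dim_in, edge_dim_in, dim_out):
        super().__init__(aggr="mean")  # average messages from neighbouring flows
        self.lin = torch.nn.Linear(node_dim_in + edge_dim_in, dim_out)

    def forward(self, x, edge_index, edge_attr):
        # x: node features (e.g. endpoints), edge_attr: per-flow features
        return self.propagate(edge_index, x=x, edge_attr=edge_attr)

    def message(self, x_j, edge_attr):
        # Concatenate the neighbour's embedding with the flow record on the edge.
        return torch.relu(self.lin(torch.cat([x_j, edge_attr], dim=-1)))
```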
Object detection models commonly deployed on uncrewed aerial systems (UAS) focus on identifying objects in the visible spectrum using Red-Green-Blue (RGB) imagery. However, there is growing interest in fusing RGB with thermal long wave infrared (LWIR) images to increase the performance of object detection machine learning (ML) models. Currently, LWIR ML models have received less research attention, especially for both ground- and air-based platforms, leading to a lack of baseline performance metrics evaluating LWIR, RGB and LWIR-RGB fused object detection models. Therefore, this research contributes such quantitative metrics to the literature. The results show that the ground-based blended RGB-LWIR model exhibited superior performance compared to the RGB or LWIR approaches, achieving a mAP of 98.4%. Additionally, the blended RGB-LWIR model was also the only object detection model to work in both day and night conditions, providing superior operational capabilities. This research additionally contributes a novel labelled training dataset of 12,600 images for RGB, LWIR, and RGB-LWIR fused imagery, collected from ground-based and air-based platforms, enabling further multispectral machine-driven object detection research.
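For illustration, one simple way to produce blended RGB-LWIR imagery of this kind is a weighted per-pixel blend of co-registered frames; the 50/50 weighting and the assumption of pre-registered sensors below are ours, not details from the study.

```python
# Illustrative sketch of blending an RGB frame with a thermal LWIR frame.
import cv2

def blend_rgb_lwir(rgb_path, lwir_path, alpha=0.5):
    rgb = cv2.imread(rgb_path)                          # 3-channel visible image
    lwir = cv2.imread(lwir_path, cv2.IMREAD_GRAYSCALE)  # thermal image
    lwir = cv2.resize(lwir, (rgb.shape[1], rgb.shape[0]))
    lwir_bgr = cv2.cvtColor(lwir, cv2.COLOR_GRAY2BGR)   # match channel count
    # Weighted per-pixel blend; assumes the two sensors are spatially registered.
    return cv2.addWeighted(rgb, alpha, lwir_bgr, 1.0 - alpha, 0)
```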
We study the ability of foundation models to learn representations for classification that are transferable to new, unseen classes. Recent results in the literature show that representations learned by a single classifier over many classes are competitive on few-shot learning problems with representations learned by special-purpose algorithms designed for such problems. We offer an explanation for this phenomenon based on the concept of class-features variability collapse, which refers to the training dynamics of deep classification networks where the feature embeddings of samples belonging to the same class tend to concentrate around their class means. More specifically, we examine the few-shot error of the learned feature map, which is the classification error of the nearest class-center classifier using centers learned from a small number of random samples from each class. Assuming that the classes appearing in the data are selected independently from a distribution, we show that the few-shot error generalizes from the training data to unseen test data, and we provide an upper bound on the expected few-shot error for new classes (selected from the same distribution) using the average few-shot error for the source classes. Additionally, we show that the few-shot error on the training data can be upper bounded using the degree of class-features variability collapse. This suggests that foundation models can provide feature maps that are transferable to new downstream tasks even with limited data available.
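The nearest class-center classifier analysed above can be sketched in a few lines: class centers are estimated from a handful of embedded samples per class, and queries are assigned to the closest center. The sketch operates on pre-computed embeddings from a frozen feature map; the array shapes and function names are assumptions.

```python
# Sketch of a nearest class-center classifier built from a few support samples per class.
import numpy as np

def few_shot_centers(support_feats, support_labels):
    # support_feats: (n, d) embeddings of support samples under the frozen feature map
    # support_labels: (n,) integer class ids
    classes = np.unique(support_labels)
    centers = np.stack([support_feats[support_labels == c].mean(axis=0) for c in classes])
    return classes, centers

def nearest_center_predict(query_feats, classes, centers):
    # Assign each query to the class whose center is nearest in feature space.
    dists = np.linalg.norm(query_feats[:, None, :] - centers[None, :, :], axis=-1)
    return classes[dists.argmin(axis=1)]
```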
One of the main challenges in deep learning-based underwater image enhancement is the limited availability of high-quality training data. Underwater images are difficult to capture and are often of poor quality due to the distortion and loss of colour and contrast in water. This makes it difficult to train supervised deep learning models on large and diverse datasets, which can limit the model's performance. In this paper, we explore an alternative approach to supervised underwater image enhancement. Specifically, we propose a novel unsupervised underwater image enhancement framework that employs a conditional variational autoencoder (cVAE) to train a deep learning model with probabilistic adaptive instance normalization (PAdaIN) and a statistically guided multi-colour space stretch, producing realistic underwater images. The resulting framework, which we call UDnet, is composed of a U-Net as a feature extractor and a PAdaIN to encode the uncertainty. To improve the visual quality of the images generated by UDnet, we use a statistically guided multi-colour space stretch module that ensures visual consistency with the input image and provides an alternative to training with a ground truth image. The proposed model does not need manual human annotation, can learn from a limited amount of data, and achieves state-of-the-art results on underwater images. We evaluated our proposed framework on eight publicly available datasets. The results show that our proposed framework yields competitive performance compared to other state-of-the-art approaches in both quantitative and qualitative metrics. Code is available at https://github.com/alzayats/UDnet .
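A rough sketch of a statistically guided per-channel stretch of the kind used here as a no-reference guide is given below; the mean plus/minus k standard deviations clipping rule is an illustrative choice, not the paper's exact formulation.

```python
# Sketch: per-channel contrast stretch guided by channel statistics.
import numpy as np

def guided_stretch(img, k=2.0):
    # img: float32 array in [0, 1], shape (H, W, C), in some colour space (e.g. RGB or LAB)
    out = np.empty_like(img)
    for c in range(img.shape[-1]):
        ch = img[..., c]
        lo, hi = ch.mean() - k * ch.std(), ch.mean() + k * ch.std()
        # Stretch the central mass of the channel distribution to the full range.
        out[..., c] = np.clip((ch - lo) / max(hi - lo, 1e-6), 0.0, 1.0)
    return out
```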
Multi-modal image-text models such as CLIP and LiT have demonstrated impressive performance on image classification benchmarks, and their zero-shot generalization ability is particularly exciting. While the top-5 zero-shot accuracies of these models are very high, the top-1 accuracies are much lower (over 25% gap in some cases). We investigate the reasons for this performance gap and find that many of the failure cases are caused by ambiguity in the text prompts. First, we develop a simple and efficient zero-shot post-hoc method to identify images whose top-1 prediction is likely to be incorrect, by measuring consistency of the predictions w.r.t. multiple prompts and image transformations. We show that our procedure better predicts mistakes, outperforming the popular max logit baseline on selective prediction tasks. Next, we propose a simple and efficient way to improve accuracy on such uncertain images by making use of the WordNet hierarchy; specifically, we augment the original class by incorporating its parent and children from the semantic label hierarchy, and plug the augmentation into the text prompts. We conduct experiments on both CLIP and LiT models with five different ImageNet-based datasets. For CLIP, our method improves the top-1 accuracy by 17.13% on the uncertain subset and 3.6% on the entire ImageNet validation set. We also show that our method improves across ImageNet shifted datasets and other model architectures such as LiT. Our proposed method is hyperparameter-free, requires no additional model training and can be easily scaled to other large multi-modal architectures.
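The consistency signal can be sketched as follows: an image is flagged as uncertain when its top-1 prediction disagrees across prompt templates (and, analogously, across image transformations). The sketch operates on pre-computed logits; the array layout and the unanimity rule are assumptions, not the paper's exact procedure.

```python
# Sketch: flag images whose top-1 class is inconsistent across prompt templates.
import numpy as np

def flag_uncertain(logits_per_prompt):
    # logits_per_prompt: (num_prompts, num_images, num_classes)
    top1 = logits_per_prompt.argmax(axis=-1)     # (num_prompts, num_images)
    consistent = (top1 == top1[0]).all(axis=0)   # unanimous top-1 across prompts?
    return ~consistent                           # True -> top-1 is likely a mistake
```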
3D reconstruction and novel view synthesis of dynamic scenes from collections of single views recently gained increased attention. Existing work shows impressive results for synthetic setups and forward-facing real-world data, but is severely limited in the training speed and angular range for generating novel views. This paper addresses these limitations and proposes a new method for full 360{\deg} novel view synthesis of non-rigidly deforming scenes. At the core of our method are: 1) An efficient deformation module that decouples the processing of spatial and temporal information for acceleration at training and inference time; and 2) A static module representing the canonical scene as a fast hash-encoded neural radiance field. We evaluate the proposed approach on the established synthetic D-NeRF benchmark, which enables efficient reconstruction from a single monocular view per time-frame, randomly sampled from a full hemisphere. We refer to this form of inputs as monocularized data. To prove its practicality for real-world scenarios, we recorded twelve challenging sequences with human actors by sampling single frames from a synchronized multi-view rig. In both cases, our method trains significantly faster than previous methods (minutes instead of days) while achieving higher visual accuracy for generated novel views. Our source code and data are available at our project page https://graphics.tu-bs.de/publications/kappel2022fast.
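Schematically, the two-module design can be written as a time-dependent warp into a canonical frame followed by a query of a hash-encoded static field (our notation, not the paper's):

```latex
% Schematic of the two-module design described above; notation is ours.
\begin{align}
  \mathbf{x}' &= \mathbf{x} + \Delta_{\phi}(\mathbf{x}, t)
     && \text{(deformation module: time-dependent warp into the canonical frame)}\\
  (\mathbf{c}, \sigma) &= F_{\theta}\big(h(\mathbf{x}'), \mathbf{d}\big)
     && \text{(static module: hash-encoded canonical radiance field)}
\end{align}
```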