Smart Paper Notes

IoT Data Analytics in Dynamic Environments: From An Automated Machine Learning Perspective

Li Yang , Abdallah Shami

Category: Machine Learning

2022-09-16

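With the wide spread of sensors and smart devices in recent years, the data generation speed of Internet of Things (IoT) systems has increased dramatically. In IoT systems, massive volumes of data must be frequently processed, transformed, and analyzed to enable various IoT services and functionalities. Machine learning (ML) approaches have shown their capacity for IoT data analytics. However, applying ML models to IoT data analytics tasks still faces many difficulties and challenges, particularly effective model selection, design/tuning, and updating, which has created a massive demand for experienced data scientists. In addition, the dynamic nature of IoT data may introduce concept drift, causing model performance to degrade. To reduce human effort, automated machine learning (AutoML) has become a popular field that aims to automatically select, construct, tune, and update machine learning models to achieve the best performance on specified tasks. In this paper, we review existing methods for the model selection, tuning, and updating procedures in the AutoML area, in order to identify and summarize the best solutions for each step of applying ML algorithms to IoT data analytics. To justify our findings and help industrial users and researchers better implement AutoML approaches, a case study applying AutoML to IoT anomaly detection problems is presented in this work. Finally, we discuss and classify the challenges and research directions in this field.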

Translated by Google Translate

Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges

Bernd Bischl , Martin Binder , Michel Lang , Tobias Pielok , Jakob Richter , Stefan Coors , Janek Thomas , Theresa Ullmann , Marc Becker , Anne-Laure Boulesteix

Category: Machine Learning (Statistics) | Machine Learning

2021-07-13

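Most machine learning algorithms are configured by one or more hyperparameters that must be chosen carefully and often considerably affect performance. To avoid a time-consuming and irreproducible manual trial-and-error process for finding well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods can be employed, for example, methods based on resampling error estimation for supervised machine learning. After introducing HPO from a general perspective, this paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband, and racing. It gives practical recommendations on important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization. This work is accompanied by an appendix that contains information on specific software packages in R and Python, as well as information and recommended hyperparameter search spaces for specific learning algorithms. We also provide notebooks that demonstrate concepts from this work as supplementary files.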
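
As an illustration of one of the methods named in this abstract, random search can be sketched in a few lines of Python. This is a minimal sketch and not code from the paper: the function `random_search`, the toy objective `toy_validation_error`, and the two hyperparameter names and ranges are all assumptions made for the example. In practice the objective would be a resampling-based error estimate of a real learner.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample configurations uniformly from `space` and keep the best one.

    `space` maps each hyperparameter name to a (low, high) range.
    `objective` takes a configuration dict and returns a value to minimize.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(n_trials):
        # Draw every hyperparameter independently and uniformly from its range.
        config = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_validation_error(config):
    # Hypothetical stand-in for an estimated validation error of a model.
    return (config["learning_rate"] - 0.1) ** 2 + (config["regularization"] - 1.0) ** 2


# Hypothetical search space: names and ranges are illustrative only.
space = {"learning_rate": (0.0, 1.0), "regularization": (0.0, 10.0)}
best_config, best_score = random_search(toy_validation_error, space, n_trials=200)
print(best_config, best_score)
```

Because every trial is independent, the loop parallelizes trivially, which is one reason random search is a common baseline.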

A survey on concept drift adaptation

Category:

Concept drift primarily refers to an online supervised learning scenario in which the relation between the input data and the target variable changes over time. Assuming a general knowledge of supervised learning, in this paper we characterize the adaptive learning process, categorize existing strategies for handling concept drift, give an overview of the most representative, distinct, and popular techniques and algorithms, discuss the evaluation methodology of adaptive algorithms, and present a set of illustrative applications. The survey covers the different facets of concept drift in an integrated way to reflect on the existing scattered state of the art. Thus, it aims to provide a comprehensive introduction to concept drift adaptation for researchers, industry analysts, and practitioners.

A Dependable Hybrid Machine Learning Model for Network Intrusion Detection

Md. Alamin Talukder , Khondokar Fida Hasan , Md. Manowarul Islam , Md Ashraf Uddin , Arnisha Akhter , Mohammand Abu Yousuf , Fares Alharbi , Mohammad Ali Moni

Category: Machine Learning

2022-12-08

Network intrusion detection systems (NIDSs) play an important role in computer network security. There are several detection mechanisms where anomaly-based automated detection outperforms others significantly. Amid the sophistication and growing number of attacks, dealing with large amounts of data is a recognized issue in the development of anomaly-based NIDS. However, do current models meet the needs of today's networks in terms of required accuracy and dependability? In this research, we propose a new hybrid model that combines machine learning and deep learning to increase detection rates while securing dependability. Our proposed method ensures efficient pre-processing by combining SMOTE for data balancing and XGBoost for feature selection. We compared our developed method to various machine learning and deep learning algorithms to find a more efficient algorithm to implement in the pipeline. Furthermore, we chose the most effective model for network intrusion based on a set of benchmarked performance analysis criteria. Our method produces excellent results when tested on two datasets, KDDCUP'99 and CIC-MalMem-2022, with an accuracy of 99.99% and 100% for KDDCUP'99 and CIC-MalMem-2022, respectively, and no overfitting or Type-1 and Type-2 issues.

Survey of Machine Learning Based Intrusion Detection Methods for Internet of Medical Things

Ayoub Si-Ahmed , Mohammed Ali Al-Garadi , Narhimene Boustia

Category: Machine Learning

2022-02-19

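The Internet of Medical Things (IoMT) allows physiological data to be collected with sensors and then transmitted to remote servers, which lets physicians and health professionals analyze the data continuously and permanently and detect diseases at an early stage. However, transmitting the data over wireless communication exposes it to cyberattacks, and the sensitive and private nature of the data may make it a prime target for attackers. Traditional security methods are ineffective on devices with limited storage and computing capacity. On the other hand, using machine learning for intrusion detection can provide an adaptive security response to the requirements of IoMT systems. In this context, a comprehensive survey is given of how machine learning (ML)-based intrusion detection systems address security and privacy issues in IoMT systems. To this end, a generic three-layer architecture of IoMT and the security requirements of IoMT systems are provided. Then, the various threats that can affect IoMT security are presented, and the advantages, disadvantages, methods, and datasets used in each ML-based solution are identified. Finally, some challenges and limitations of applying ML at each layer of IoMT are discussed, which can serve as future research directions.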

Outlier Detection using AI: A Survey

Md Nazmul Kabir Sikder , Feras A. Batarseh

Category: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2021-12-01

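An outlier is an event or observation that is defined as an abnormal activity, intrusion, or suspicious data point that lies at an irregular distance from a population. However, the definition of an outlier event is subjective and depends on the application and the domain (energy, health, wireless networks, etc.). It is important to detect outlier events as carefully as possible to avoid infrastructure failures, because anomalous events can cause severe damage to infrastructure. For instance, an attack on a cyber-physical system such as a microgrid may initiate voltage or frequency instability, thereby damaging a smart inverter, which involves very expensive repairs. Unusual activities in microgrids can be mechanical faults, behavior changes in the system, human or instrument errors, or malicious attacks. Accordingly, and due to its variability, outlier detection (OD) is an ever-growing research field. In this chapter, we discuss the progress of OD methods using AI techniques. For that, the fundamental concepts of each OD model are introduced via multiple categories. The broad range of OD methods is divided into six categories: statistical-based, distance-based, density-based, clustering-based, learning-based, and ensemble methods. For every category, we discuss recent state-of-the-art approaches, their application areas, and their performance. After that, a brief discussion of the advantages, disadvantages, and challenges of each technique is provided, together with recommendations for future research directions. This survey aims to guide the reader to better understand recent progress in OD methods for the assurance of AI.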
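
To make the first of the six categories concrete, a statistical-based detector can be sketched with the modified z-score, which measures distance from the median in units of the median absolute deviation (MAD). This is an illustrative sketch and not a method taken from the survey: the function name `mad_outliers`, the sample readings, and the use of the conventional constants 0.6745 and 3.5 are assumptions of the example.

```python
import statistics


def mad_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds `threshold`."""
    median = statistics.median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        # At least half of the values coincide, so the score is undefined.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - median) / mad > threshold
    ]


# Hypothetical sensor readings with one suspicious spike at index 6.
readings = [50.1, 49.9, 50.0, 50.2, 49.8, 50.1, 62.0, 50.0]
print(mad_outliers(readings))
```

The median and MAD are used instead of the mean and standard deviation because a single extreme reading would otherwise inflate the spread estimate and mask itself.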

Deep Learning for Time Series Anomaly Detection: A Survey

Zahra Zamanzadeh Darban , Geoffrey I. Webb , Shirui Pan , Charu C. Aggarwal , Mahsa Salehi

Category: Machine Learning | Artificial Intelligence

2022-11-09

Time series anomaly detection has applications in a wide range of research fields, including manufacturing and healthcare. The presence of anomalies can indicate novel or unexpected events, such as production faults, system defects, or heart fluttering, and is therefore of particular interest. The large size and complex patterns of time series have led researchers to develop specialised deep learning models for detecting anomalous patterns. This survey focuses on providing a structured and comprehensive review of state-of-the-art time series anomaly detection models that use deep learning. It provides a taxonomy based on the factors that divide anomaly detection models into different categories. Aside from describing the basic anomaly detection technique for each category, the advantages and limitations are also discussed. Furthermore, this study includes examples of deep anomaly detection in time series across various application domains in recent years. It finally summarises open issues in research and challenges faced while adopting deep anomaly detection models.

Machine Learning in Access Control: A Taxonomy and Survey

Mohammad Nur Nobi , Maanak Gupta , Lopamudra Praharaj , Mahmoud Abdelsalam , Ram Krishnan , Ravi Sandhu

Category: Machine Learning

2022-07-04

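A growing body of work has recognized the importance of leveraging machine learning (ML) advancements to address the need for efficient automation in extracting access control attributes, policy mining, policy verification, access decisions, and so on. In this work, we survey and summarize various ML approaches to solving different access control problems. We propose a novel taxonomy of ML model applications in the access control domain. We highlight current limitations and open challenges, such as the lack of public real-world datasets, the administration of ML-based access control systems, and understanding the decisions of black-box ML models, and enumerate future research directions.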

Learning under Concept Drift: A Review

Jie Lu , Anjin Liu , Fan Dong , Feng Gu , Joao Gama , Guangquan Zhang

Category:

2020-04-13

Concept drift describes unforeseeable changes in the underlying distribution of streaming data over time. Concept drift research involves the development of methodologies and techniques for drift detection, understanding and adaptation. Data analysis has revealed that machine learning in a concept drift environment will result in poor learning results if the drift is not addressed. To help researchers identify which research topics are significant and how to apply related techniques in data analysis tasks, it is necessary that a high quality, instructive review of current research developments and trends in the concept drift field is conducted. In addition, due to the rapid development of concept drift in recent years, the methodologies of learning under concept drift have become noticeably systematic, unveiling a framework which has not been mentioned in literature. This paper reviews over 130 high quality publications in concept drift related research areas, analyzes up-to-date developments in methodologies and techniques, and establishes a framework of learning under concept drift including three main components: concept drift detection, concept drift understanding, and concept drift adaptation. This paper lists and discusses 10 popular synthetic datasets and 14 publicly available benchmark datasets used for evaluating the performance of learning algorithms aiming at handling concept drift. Also, concept drift related research directions are covered and discussed. By providing state-of-the-art knowledge, this survey will directly support researchers in their understanding of research developments in the field of learning under concept drift.
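
The drift detection component of this framework can be illustrated with a deliberately simple window-based detector that compares the recent error rate of a model against a reference error rate. This is a hedged sketch and not an algorithm from the review: the class `WindowDriftDetector`, its parameters, and the synthetic error stream are all hypothetical, and published detectors typically use statistical tests in place of the fixed threshold used here.

```python
from collections import deque


class WindowDriftDetector:
    """Flag drift when the recent error rate exceeds the reference rate by `threshold`."""

    def __init__(self, window=50, threshold=0.2):
        self.window = window
        self.threshold = threshold
        self.reference = deque(maxlen=window)  # errors observed right after a (re)start
        self.recent = deque(maxlen=window)     # sliding window over the latest errors

    def update(self, error):
        """Feed one prediction error (0 or 1); return True when drift is flagged."""
        if len(self.reference) < self.window:
            self.reference.append(error)
            return False
        self.recent.append(error)
        if len(self.recent) < self.window:
            return False
        reference_rate = sum(self.reference) / self.window
        recent_rate = sum(self.recent) / self.window
        if recent_rate - reference_rate > self.threshold:
            # Treat the recent window as the new reference and start over.
            self.reference = deque(self.recent, maxlen=self.window)
            self.recent.clear()
            return True
        return False


# Synthetic stream: a 10% error rate for 200 steps, then a 50% error rate.
stream = [1 if i % 10 == 0 else 0 for i in range(200)]
stream += [1 if i % 2 == 0 else 0 for i in range(200)]

detector = WindowDriftDetector(window=50, threshold=0.2)
drift_points = [t for t, error in enumerate(stream) if detector.update(error)]
print(drift_points)
```

In an adaptive learning loop, a flagged drift point would typically trigger the adaptation step, for example retraining the model on recent data.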

Multi-Objective Hyperparameter Optimization -- An Overview

Florian Karl , Tobias Pielok , Julia Moosbauer , Florian Pfisterer , Stefan Coors , Martin Binder , Lennart Schneider , Janek Thomas , Jakob Richter , Michel Lang

Category: Machine Learning | Machine Learning (Statistics)

2022-06-15

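Hyperparameter optimization constitutes a large part of typical modern machine learning workflows. This arises from the fact that machine learning methods and the corresponding preprocessing steps often only yield optimal performance when hyperparameters are properly tuned. But in many applications, we are not only interested in optimizing ML pipelines solely for predictive accuracy; additional metrics or constraints must be considered when determining an optimal configuration, resulting in a multi-objective optimization problem. This is often neglected in practice, due to a lack of knowledge and of readily available software implementations for multi-objective hyperparameter optimization. In this work, we introduce the reader to the basics of multi-objective hyperparameter optimization and motivate its usefulness in applied ML. Furthermore, we provide an extensive survey of existing optimization strategies from the domains of evolutionary algorithms and Bayesian optimization. We illustrate the utility of multi-objective optimization (MOO) in several specific ML applications, considering objectives such as operating conditions, prediction time, sparseness, fairness, interpretability, and robustness.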
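
With more than one objective there is usually no single best configuration, only a set of non-dominated trade-offs (the Pareto front). The sketch below shows how such a front can be extracted from evaluated configurations. It is an illustration under assumed data and not code from the paper: the helper names and the candidate tuples of (validation error, prediction time in milliseconds) are hypothetical, and both objectives are minimized.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical evaluated configurations: (validation error, prediction time in ms).
candidates = [(0.10, 40.0), (0.12, 15.0), (0.20, 5.0), (0.15, 30.0), (0.25, 6.0)]
front = pareto_front(candidates)
print(front)
```

Here (0.15, 30.0) is dropped because (0.12, 15.0) is both more accurate and faster, and (0.25, 6.0) is dropped because (0.20, 5.0) beats it on both objectives. Choosing among the remaining points depends on how the application weighs accuracy against latency.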

A Survey of Open Source Automation Tools for Data Science Predictions

Nicholas Hoell

Category: Machine Learning

2022-08-24

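We present an expository overview of the technical and cultural challenges of developing and adopting automation at various stages of the data science prediction lifecycle, restricting the focus to supervised learning with structured datasets. In addition, we review popular open source Python tools that implement common solution patterns for the automation challenges, and highlight gaps where we feel progress still needs to be made.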

Machine Learning for Microcontroller-Class Hardware -- A Review

Swapnil Sayan Saha , Sandeep Singh Sandha , Mani Srivastava

Category: Machine Learning

2022-05-29

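Advances in machine learning have opened up new opportunities for bringing intelligence to low-end Internet of Things nodes such as microcontrollers. Conventional machine learning deployment has a high memory and compute footprint, which hinders its direct deployment on ultra-resource-constrained microcontrollers. This paper highlights the unique requirements of enabling onboard machine learning for microcontroller-class devices. Researchers use a specialized model development workflow for resource-limited applications to ensure that the compute and latency budget is within the device limits while still maintaining the desired performance. We characterize a widely applicable closed-loop workflow of machine learning model development for microcontroller-class devices and show that several classes of applications adopt a specific instance of it. We present both qualitative and numerical insights into the different stages of model development by showcasing several use cases. Finally, we identify open research challenges and unsolved questions that demand careful consideration going forward.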

Explainable Intrusion Detection Systems (X-IDS): A Survey of Current Methods, Challenges, and Opportunities

Subash Neupane , Jesse Ables , William Anderson , Sudip Mittal , Shahram Rahimi , Ioana Banicescu , Maria Seale

Category: Artificial Intelligence

2022-07-13

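The application of artificial intelligence (AI) and machine learning (ML) to cybersecurity challenges has gained traction in industry and academia, partially as a result of widespread malware attacks on critical systems such as cloud infrastructures and government institutions. Intrusion detection systems (IDS), using some form of AI, have been widely adopted because of their ability to handle vast amounts of data with high prediction accuracy. These systems are hosted in the organizational Cyber Security Operation Center (CSOC) as a defense tool to monitor and detect malicious network flows that would otherwise impact confidentiality, integrity, and availability (CIA). CSOC analysts rely on these systems to make decisions about detected threats. However, IDSs designed using deep learning (DL) techniques are often treated as black-box models and do not provide a justification for their predictions. This creates a barrier for CSOC analysts, as they are unable to improve their decisions based on the model's predictions. One solution to this problem is to design explainable IDS (X-IDS). This survey reviews the state of the art in explainable AI (XAI) for IDS and its current challenges, and discusses how these challenges bear on the design of an X-IDS. In particular, we discuss black-box and white-box approaches comprehensively. We also present the trade-off between these approaches in terms of their performance and their ability to produce explanations. Furthermore, we propose a generic architecture that considers a human in the loop and that can be used as a guideline when designing an X-IDS. Research recommendations are given from three key viewpoints: the need to define explainability for IDS, the need to create explanations tailored to various stakeholders, and the need to design metrics to evaluate explanations.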

Exploring the Use of Data-Driven Approaches for Anomaly Detection in the Internet of Things (IoT) Environment

Eleonora Achiluzzi , Menglu Li , Md Fahd Al Georgy , Rasha Kashef

Category: Machine Learning

2022-12-31

The Internet of Things (IoT) is a system that connects physical computing devices, sensors, software, and other technologies. Data can be collected, transferred, and exchanged with other devices over the network without requiring human interactions. One challenge the development of IoT faces is the existence of anomaly data in the network. Therefore, research on anomaly detection in the IoT environment has become popular and necessary in recent years. This survey provides an overview to understand the current progress of the different anomaly detection algorithms and how they can be applied in the context of the Internet of Things. In this survey, we categorize the widely used anomaly detection machine learning and deep learning techniques in IoT into three types: clustering-based, classification-based, and deep learning based. For each category, we introduce some state-of-the-art anomaly detection methods and evaluate the advantages and limitations of each technique.

When Machine Learning Meets Spectrum Sharing Security: Methodologies and Challenges

Qun Wang , Haijian Sun , Rose Qingyang Hu , Arupjyoti Bhuyan

Category: Machine Learning

2022-01-12

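The exponential growth of Internet-connected systems has generated numerous challenges, such as spectrum shortage, which require efficient spectrum sharing (SS) solutions. Complicated and dynamic SS systems can be exposed to various potential security and privacy issues, requiring protection mechanisms that are adaptive, reliable, and scalable. Machine learning (ML)-based methods have frequently been proposed to address these issues. In this article, we provide a comprehensive survey of recent ML-based SS methods, the most critical security issues, and the corresponding defense mechanisms. In particular, we elaborate on the state-of-the-art methodologies for improving the performance of SS communication systems, including ML-based database-assisted SS networks, ML-based LTE-U networks, ML-based ambient backscatter networks, and other ML-based SS solutions. We also present security issues from the physical layer and the corresponding defense strategies based on ML algorithms, including primary user emulation (PUE) attacks, spectrum sensing data falsification (SSDF) attacks, jamming attacks, eavesdropping attacks, and privacy issues. Finally, an extensive discussion of the open challenges for ML-based SS is also given. This comprehensive review is intended to provide a foundation for, and to facilitate, future studies that explore the potential of emerging ML for coping with increasingly complex SS and its security problems.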

A Survey of Machine Learning for Computer Architecture and Systems

Nan Wu , Yuan Xie

Category: Machine Learning

2021-02-16

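Computer architecture and systems have long been optimized for the efficient execution of machine learning (ML) models. Now, it is time to reconsider the relationship between ML and systems and to let ML transform the way computer architecture and systems are designed. This carries a twofold meaning: improving designers' productivity and completing the virtuous cycle. In this paper, we present a comprehensive review of work that applies ML to computer architecture and system design. First, we consider the typical roles that ML techniques take in architecture/system design, namely fast predictive modeling or design methodology, and perform a high-level taxonomy. Then, we summarize the common problems in computer architecture/system design that can be solved by ML techniques and the typical ML techniques employed to resolve each of them. In addition to emphasizing computer architecture in a narrow sense, we adopt the concept that data centers can be regarded as warehouse-scale computers; computer systems topics such as code generation and compilers are discussed briefly; and we also give attention to how ML techniques can aid and transform design automation. We further provide a future vision of opportunities and potential directions, and envision that applying ML to computer architecture and systems will thrive in the community.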

Explainable AI over the Internet of Things (IoT): Overview, State-of-the-Art and Future Directions

Senthil Kumar Jagatheesaperumal , Quoc-Viet Pham , Rukhsana Ruby , Zhaohui Yang , Chunmei Xu , Zhaoyang Zhang

Category: Artificial Intelligence | Machine Learning

2022-11-02

Explainable Artificial Intelligence (XAI) is transforming the field of Artificial Intelligence (AI) by enhancing the trust of end-users in machines. As the number of connected devices keeps growing, the Internet of Things (IoT) market needs to be trustworthy for end-users. However, the existing literature still lacks a systematic and comprehensive survey on the use of XAI for IoT. To bridge this gap, in this paper, we address the XAI frameworks with a focus on their characteristics and support for IoT. We illustrate the widely used XAI services for IoT applications, such as security enhancement, Internet of Medical Things (IoMT), Industrial IoT (IIoT), and Internet of City Things (IoCT). We also suggest implementation choices of XAI models over IoT systems in these applications with appropriate examples and summarize the key inferences for future work. Moreover, we present the cutting-edge development in edge XAI structures and the support of sixth-generation (6G) communication services for IoT applications, along with key inferences. In a nutshell, this paper constitutes the first holistic compilation on the development of XAI-based frameworks tailored for the demands of future IoT use cases.

Leak Detection in Natural Gas Pipeline Using Machine Learning Models

Adebayo Oshingbesan

Category: Machine Learning

2022-09-21

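Leak detection in natural gas pipelines is an important and persistent problem in the oil and gas industry. This is especially important because pipelines are the most common way of transporting natural gas. This research aims to study the ability of data-driven intelligent models to detect small leaks in a natural gas pipeline using basic operational parameters, and then to compare the intelligent models using existing performance metrics. The project applies the observer design technique to detect leaks in natural gas pipelines using a regression-classification hierarchical model, in which an intelligent model acts as a regressor and a modified logistic regression model acts as a classifier. The project studies five intelligent models (gradient boosting, decision trees, random forest, support vector machine, and artificial neural network) using a pipeline data stream of four weeks. The results show that while the support vector machine and the artificial neural network are better regressors than the others, they do not provide the best leak detection results, owing to their internal complexity and the volume of data used. The random forest and decision tree models are the most sensitive, as they can detect a leak of 0.1% of nominal flow in about 2 hours. All the intelligent models had high reliability, with a zero false alarm rate in the testing phase. The average time to leak detection for all the intelligent models was compared with that of a real-time transient model in the literature. The results show that intelligent models perform relatively well on the leak detection problem. This result suggests that intelligent models could be used alongside a real-time transient model to significantly improve leak detection results.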

Deep Learning-Driven Edge Video Analytics: A Survey

Renjie Xu , Saiedeh Razavi , Rong Zheng

Category: Computer Vision | Machine Learning

2022-11-28

Video, as a key driver in the global explosion of digital information, can create tremendous benefits for human society. Governments and enterprises are deploying innumerable cameras for a variety of applications, e.g., law enforcement, emergency management, traffic control, and security surveillance, all facilitated by video analytics (VA). This trend is spurred by the rapid advancement of deep learning (DL), which enables more precise models for object classification, detection, and tracking. Meanwhile, with the proliferation of Internet-connected devices, massive amounts of data are generated daily, overwhelming the cloud. Edge computing, an emerging paradigm that moves workloads and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new intersection, edge video analytics (EVA), begins to attract widespread attention. Nevertheless, only a few loosely-related surveys exist on this topic. A dedicated venue for collecting and summarizing the latest advances of EVA is highly desired by the community. Besides, the basic concepts of EVA (e.g., definition, architectures, etc.) are ambiguous and neglected by these surveys due to the rapid development of this domain. A thorough clarification is needed to facilitate a consensus on these concepts. To fill in these gaps, we conduct a comprehensive survey of the recent efforts on EVA. In this paper, we first review the fundamentals of edge computing, followed by an overview of VA. The EVA system and its enabling techniques are discussed next. In addition, we introduce prevalent frameworks and datasets to aid future researchers in the development of EVA systems. Finally, we discuss existing challenges and foresee future research directions. We believe this survey will help readers comprehend the relationship between VA and edge computing, and spark new ideas on EVA.

Deep Learning -- A first Meta-Survey of selected Reviews across Scientific Disciplines, their Commonalities, Challenges and Research Impact

Jan Egger , Antonio Pepe , Christina Gsaxner , Yuan Jin , Jianning Li , Roman Kern

Category: Computer Vision | Machine Learning | Neural and Evolutionary Computing

2020-11-16

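Deep learning belongs to the field of artificial intelligence, in which machines perform tasks that typically require some kind of human intelligence. Similar to the basic structure of a brain, a deep learning algorithm consists of an artificial neural network that resembles the biological brain structure. Mimicking the learning process of humans with their senses, deep learning networks are fed with (sensory) data, such as text, images, videos, or sounds. These networks outperform state-of-the-art methods in different tasks and, because of this, the whole field has seen exponential growth in recent years. This growth has resulted in well over 10,000 publications per year in the last few years. For example, for the search term "deep learning" in Q3 2020, a search engine that covers only a subset of all publications in the medical field returns results of which around 90% are from the last three years. Consequently, a complete overview of the field of deep learning is already impossible to obtain and, in the near future, it may become difficult to obtain an overview of even a subfield. However, there are several review articles about deep learning that focus on specific scientific fields or applications, for example, deep learning advances in computer vision or in specific tasks such as object detection. With these surveys as a foundation, the aim of this contribution is to provide a first high-level, categorized meta-survey of selected reviews on deep learning across different scientific disciplines. The categories (computer vision, language processing, medical informatics, and additional works) have been chosen according to the underlying data sources (image, language, medical, mixed). In addition, we review the common architectures, methods, pros, cons, evaluations, challenges, and future directions for every sub-category.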
