Smart Paper Notes

Deep Reinforcement Learning for Wind and Energy Storage Coordination in Wholesale Energy and Ancillary Service Markets

Jinhao Li , Changlong Wang , Hao Wang

Category: Machine Learning

2022-12-27

Global power systems are increasingly reliant on wind energy as a mitigation strategy for climate change. However, the variability of wind energy erodes system reliability, which results in wind being curtailed and, ultimately, in substantial economic losses for wind farm owners. Wind curtailment can be reduced using battery energy storage systems (BESS) that serve as onsite backup sources. Yet this auxiliary role may significantly hamper the BESS's capacity to generate revenue from the electricity market, particularly through energy arbitrage in the Spot market and the provision of frequency control ancillary services (FCAS) in the FCAS markets. Ideal BESS scheduling should effectively balance the BESS's role in absorbing onsite wind curtailment against trading in the electricity market, but this is difficult in practice because of the underlying coordination complexity and the stochastic nature of energy prices and wind generation. In this study, we investigate the bidding strategy of a co-located wind-battery system that participates simultaneously in both the Spot and Regulation FCAS markets. We propose a deep reinforcement learning (DRL)-based approach that decouples the market participation of the wind-battery system into two related Markov decision processes, one for each facility, enabling the BESS to absorb onsite wind curtailment while simultaneously bidding in the wholesale Spot and FCAS markets to maximize overall operational revenues. Using realistic wind farm data, we validate the coordinated bidding strategy for the wind-battery system and find that our strategy generates significantly higher revenue and responds better to wind curtailment than an optimization-based benchmark. Our results show that joint-market bidding can significantly improve the financial performance of wind-battery systems compared to individual market participation.

Proximal Policy Optimization Based Reinforcement Learning for Joint Bidding in Energy and Frequency Regulation Markets

Muhammad Anwar , Changlong Wang , Frits de Nijs , Hao Wang

Category: Artificial Intelligence | Machine Learning

2022-12-13

Driven by the global decarbonization effort, the rapid integration of renewable energy into the conventional electricity grid presents new challenges and opportunities for the battery energy storage system (BESS) participating in the energy market. Energy arbitrage can be a significant source of revenue for the BESS due to the increasing price volatility in the spot market caused by the mismatch between renewable generation and electricity demand. In addition, the Frequency Control Ancillary Services (FCAS) markets established to stabilize the grid can offer higher returns for the BESS due to its capability to respond within milliseconds. Therefore, it is crucial for the BESS to carefully decide how much capacity to assign to each market to maximize the total profit under uncertain market conditions. This paper formulates the bidding problem of the BESS as a Markov Decision Process, which enables the BESS to participate in both the spot market and the FCAS market to maximize profit. Then, Proximal Policy Optimization, a model-free deep reinforcement learning algorithm, is employed to learn the optimal bidding strategy from the dynamic environment of the energy market under a continuous bidding scale. The proposed model is trained and validated using real-world historical data from the Australian National Electricity Market. The results demonstrate that the developed joint bidding strategy in both markets is significantly more profitable than bidding in either market individually.

Battery and Hydrogen Energy Storage Control in a Smart Energy Network with Flexible Energy Demand using Deep Reinforcement Learning

Cephas Samende , Zhong Fan , Jun Cao

Category: Artificial Intelligence | Machine Learning

2022-08-26

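Smart energy networks provide an effective means of accommodating high penetrations of variable renewable energy sources such as solar and wind, which are key to the deep decarbonization of energy production. However, given the variability of renewables as well as of energy demand, it is imperative to develop effective control and energy storage schemes to manage variable energy generation and achieve the desired system economics and environmental goals. In this paper, we introduce a hybrid energy storage system composed of battery and hydrogen energy storage to handle the uncertainties related to electricity prices, renewable energy production and consumption. We aim to improve renewable energy utilization and to minimize energy costs and carbon emissions while ensuring energy reliability and stability within the network. To achieve this, we propose a multi-agent deep deterministic policy gradient approach, a deep reinforcement learning-based control strategy that optimizes the scheduling of the hybrid energy storage system and energy demand in real time. The proposed approach is model-free and does not require explicit knowledge or rigorous mathematical models of the smart energy network environment. Simulation results based on real-world data show that: (i) integrated and optimized operation of the hybrid energy storage system and energy demand reduces carbon emissions by 78.69%, improves cost savings by 23.5% and improves renewable energy utilization by over 13.2% compared to other baseline models, and (ii) the proposed algorithm outperforms state-of-the-art self-learning algorithms such as the deep Q-network.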

Deep Reinforcement Learning Microgrid Optimization Strategy Considering Priority Flexible Demand Side

Jinsong Sang , Hongbin Sun , Lei Kou

Category: Machine Learning | Artificial Intelligence

2022-11-11

As an efficient way to integrate multiple distributed energy resources (DERs) and the user side, a microgrid mainly faces the small-scale volatility, uncertainty and intermittency of DERs, together with demand-side uncertainty. The traditional microgrid has a single form and cannot meet the need for flexible energy dispatch between a complex demand side and the microgrid. In response to this problem, an overall environment comprising wind power, thermostatically controlled loads (TCLs), energy storage systems (ESSs), price-responsive loads and the main grid is proposed. Secondly, centralized control of microgrid operation is convenient for controlling the reactive power and voltage of the distributed power supply and for adjusting the grid frequency. However, flexible loads tend to aggregate and generate peaks during electricity price valleys. Existing research takes into account the power constraints of the microgrid but fails to ensure a sufficient supply of electric energy for each individual flexible load. This paper considers the response priority of each unit component of the TCLs and ESSs on the basis of the overall environment operation of the microgrid, so as to ensure the power supply of the microgrid's flexible loads and save the power input cost to the greatest extent. Finally, the simulation optimization of the environment can be expressed as a Markov decision process. It combines two stages of offline and online operations in the training process. The addition of multiple threads with the lack of historical data learning leads to low learning efficiency. The asynchronous advantage actor-critic with an experience replay memory is added to solve the data correlation and non-stationary distribution problems during training.

Distributed Energy Management and Demand Response in Smart Grids: A Multi-Agent Deep Reinforcement Learning Framework

Amin Shojaeighadikolaei , Arman Ghasemi , Kailani Jones , Yousif Dafalla , Alexandru G. Bardas , Reza Ahmadi , Morteza Haashemi

Category: Machine Learning

2022-11-29

This paper presents a multi-agent Deep Reinforcement Learning (DRL) framework for autonomous control and integration of renewable energy resources into smart power grid systems. In particular, the proposed framework jointly considers demand response (DR) and distributed energy management (DEM) for residential end-users. DR has a widely recognized potential for improving power grid stability and reliability, while at the same time reducing end-users' energy bills. However, conventional DR techniques come with several shortcomings, such as the inability to handle operational uncertainties while incurring end-user disutility, which prevents widespread adoption in real-world applications. The proposed framework addresses these shortcomings by implementing DR and DEM based on a real-time pricing strategy that is achieved using deep reinforcement learning. Furthermore, this framework enables the power grid service provider to leverage distributed energy resources (i.e., PV rooftop panels and battery storage) as dispatchable assets to support the smart grid during peak hours, thus achieving management of distributed energy resources. Simulation results based on the Deep Q-Network (DQN) demonstrate significant improvements in the 24-hour accumulative profit for both prosumers and the power grid service provider, as well as major reductions in the utilization of the power grid reserve generators.

Design and Planning of Flexible Mobile Micro-Grids Using Deep Reinforcement Learning

Cesare Caputo , Michel-Alexandre Cardin , Pudong Ge , Fei Teng , Anna Korre , Ehecatl Antonio del Rio Chanona

Category: Artificial Intelligence

2022-12-08

Ongoing risks from climate change have impacted the livelihood of global nomadic communities, and are likely to lead to increased migratory movements in coming years. As a result, mobility considerations are becoming increasingly important in energy systems planning, particularly to achieve energy access in developing countries. Advanced Plug and Play control strategies have been recently developed with such a decentralized framework in mind, more easily allowing for the interconnection of nomadic communities, both to each other and to the main grid. In light of the above, the design and planning strategy of a mobile multi-energy supply system for a nomadic community is investigated in this work. Motivated by the scale and dimensionality of the associated uncertainties, impacting all major design and decision variables over the 30-year planning horizon, Deep Reinforcement Learning (DRL) is implemented for the design and planning problem tackled. DRL based solutions are benchmarked against several rigid baseline design options to compare expected performance under uncertainty. The results on a case study for ger communities in Mongolia suggest that mobile nomadic energy systems can be both technically and economically feasible, particularly when considering flexibility, although the degree of spatial dispersion among households is an important limiting factor. Key economic, sustainability and resilience indicators such as Cost, Equivalent Emissions and Total Unmet Load are measured, suggesting potential improvements compared to available baselines of up to 25%, 67% and 76%, respectively. Finally, the decomposition of values of flexibility and plug and play operation is presented using a variation of real options theory, with important implications for both nomadic communities and policymakers focused on enabling their energy access.

An intelligent algorithmic trading based on a risk-return reinforcement learning algorithm

Boyi Jin

Category: Machine Learning

2022-08-23

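This paper proposes a novel portfolio optimization model that uses an improved deep reinforcement learning algorithm. The objective function of the optimization model is the weighted sum of the expectation and the value at risk of the portfolio's cumulative return. The proposed algorithm is based on an actor-critic architecture, in which the main task of the critic network is to learn the distribution of the portfolio's cumulative return using quantile regression, while the actor network outputs the optimal portfolio weights by maximizing the objective function above. Meanwhile, we use a linear transformation function to realize short selling of assets. Finally, a multi-process method called Ape-X is used to accelerate deep reinforcement learning training. To validate the proposed approach, we backtest two representative portfolios and observe that the model proposed in this work outperforms the benchmark strategies.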
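The weighted risk-return objective and the quantile-regression critic mentioned in this abstract can be illustrated with a small sketch. It is not the paper's implementation: the function names, the quantile levels `taus`, the tail level `alpha` and the weight `lam` are assumptions made for illustration. The mean of the quantile estimates is used as the expected return, which assumes evenly spaced quantile midpoints, and value at risk is read off as the low-tail return quantile.

```python
def pinball_loss(target, quantile_estimates, taus):
    """Quantile-regression (pinball) loss, averaged over the quantile levels."""
    total = 0.0
    for q, tau in zip(quantile_estimates, taus):
        u = target - q
        # Under-estimates are weighted by tau, over-estimates by (1 - tau).
        total += tau * u if u >= 0 else (tau - 1.0) * u
    return total / len(taus)


def risk_return_objective(quantile_estimates, taus, alpha=0.1, lam=0.5):
    """Weighted sum of expected return and the alpha-level return quantile."""
    expected = sum(quantile_estimates) / len(quantile_estimates)
    # Use the estimated quantile whose level is closest to alpha as the VaR term.
    idx = min(range(len(taus)), key=lambda i: abs(taus[i] - alpha))
    value_at_risk = quantile_estimates[idx]
    return (1.0 - lam) * expected + lam * value_at_risk


# Hypothetical critic output: five quantiles of the cumulative return.
taus = [0.1, 0.3, 0.5, 0.7, 0.9]
quantiles = [-0.2, 0.0, 0.1, 0.2, 0.4]
score = risk_return_objective(quantiles, taus, alpha=0.1, lam=0.5)
loss = pinball_loss(0.15, quantiles, taus)
```

A larger `lam` makes the score more sensitive to the low-tail quantile, so an actor maximizing it would favor portfolios with a less severe downside.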

Joint Energy Dispatch and Unit Commitment in Microgrids Based on Deep Reinforcement Learning

Jiaju Qi , Lei Lei , Kan Zheng , Simon X. Yang

Category: Machine Learning | Artificial Intelligence

2022-06-03

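Microgrids (MGs) with renewable energy are now being applied more and more widely, which creates a strong need for dynamic energy management. In this paper, deep reinforcement learning (DRL) is used to learn an optimal policy for making joint energy dispatch (ED) and unit commitment (UC) decisions in an isolated MG, with the aim of reducing the total power generation cost while ensuring the supply-demand balance. To overcome the challenge of the discrete-continuous hybrid action space that arises from joint ED and UC, we propose a DRL algorithm, the hybrid action finite-horizon DDPG (HAFH-DDPG), which seamlessly integrates two classical DRL algorithms, namely the deep Q-network (DQN) and deep deterministic policy gradient (DDPG), within a finite-horizon dynamic programming (DP) framework. In addition, a diesel generator (DG) selection strategy is presented to support a simplified action space and reduce the computational complexity of the algorithm. Finally, the effectiveness of the proposed algorithm is verified through comparison with several baseline algorithms in experiments on a real-world data set.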

Optimal Planning of Hybrid Energy Storage Systems using Curtailed Renewable Energy through Deep Reinforcement Learning

Dongju Kang , Doeun Kang , Sumin Hwangbo , Haider Niaz , Won Bo Lee , J. Jay Liu , Jonggeol Na

Category: Machine Learning

2022-12-12

Energy management systems (EMS) are becoming increasingly important for utilizing the continuously growing amount of curtailed renewable energy. Promising energy storage systems (ESS), such as batteries and green hydrogen, should be employed to maximize the efficiency of energy stakeholders. However, optimal decision-making, i.e., planning the leveraging between different strategies, is confronted with the complexity and uncertainties of large-scale problems. Here, we propose a deep reinforcement learning (DRL) methodology with a policy-based algorithm to realize real-time optimal ESS planning under uncertainty in the curtailed renewable energy. A quantitative performance comparison showed that the DRL agent outperforms the scenario-based stochastic optimization (SO) algorithm, even with a wide action and observation space. Owing to the uncertainty rejection capability of the DRL, we could confirm robust performance under large uncertainty in the curtailed renewable energy, with maximized net profit and a stable system. Action-mapping was performed to visually assess the action taken by the DRL agent according to the state. The corresponding results confirmed that the DRL agent learns to act much as a human expert would, suggesting that the proposed methodology can be applied reliably.

Renewable energy integration and microgrid energy trading using multi-agent deep reinforcement learning

Daniel J. B. Harrold , Jun Cao , Zhong Fan

Category: Artificial Intelligence | Machine Learning

2021-11-21

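In this paper, multi-agent reinforcement learning is used to control a hybrid energy storage system that reduces the energy costs of a microgrid by maximizing the value of renewable energy and trading. The agents must learn to control three different types of energy storage system under fluctuating demand, dynamic wholesale energy prices and unpredictable renewable energy generation. Two case studies are considered: the first looks at how the energy storage systems can better integrate renewable energy generation under dynamic pricing, and the second at how those same agents can be used alongside an aggregator agent to sell energy to self-interested external microgrids that want to reduce their own energy bills. This work found that centralized learning with decentralized execution of the multi-agent deep deterministic policy gradient and its state-of-the-art variants allowed the multi-agent methods to perform significantly better than control by a single global agent. It was also found that using separate reward functions in the multi-agent approach performed much better than using a single control agent. Being able to trade with other microgrids, rather than only selling back to the utility grid, was also found to greatly increase the grid's savings.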

Recent Advances in Reinforcement Learning in Finance

Ben Hambly , Renyuan Xu , Huining Yang

Category: Machine Learning

2021-12-08

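The rapid changes in the finance industry due to the increasing amount of data have revolutionized the techniques of data processing and data analysis and brought new theoretical and computational challenges. In contrast to classical stochastic control theory and other analytical approaches to financial decision-making problems, which rely heavily on model assumptions, new developments in reinforcement learning (RL) are able to make full use of large amounts of financial data with fewer model assumptions and to improve decisions in complex financial environments. This survey paper aims to review the recent developments and use of RL approaches in finance. We give an introduction to Markov decision processes, which are the setting for many of the commonly used RL approaches. Various algorithms are then introduced, with a focus on value-based and policy-based methods that do not require any model assumptions. Connections are made with neural networks to extend the framework to encompass deep RL algorithms. The survey concludes by discussing the application of these RL algorithms to a variety of decision-making problems in finance, including optimal execution, portfolio optimization, option pricing and hedging, market making, smart order routing, and robo-advising.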

A Deep Reinforcement Learning-Based Charging Scheduling Approach with Augmented Lagrangian for Electric Vehicle

Guibin Chen , Xiaoying Shi

Category: Artificial Intelligence | Machine Learning

2022-09-20

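This paper addresses the problem of optimizing the charging/discharging schedules of electric vehicles (EVs) that participate in demand response (DR). Because there are uncertainties in an EV's remaining energy, its arrival and departure times, and future electricity prices, it is difficult to make charging decisions that minimize the charging cost while guaranteeing that the EV's battery state of charge (SOC) stays within a certain range. To handle this dilemma, the paper formulates the EV charging scheduling problem as a constrained Markov decision process (CMDP). By synergistically combining the augmented Lagrangian method and the soft actor-critic algorithm, the paper proposes a novel safe off-policy reinforcement learning (RL) approach to solve the CMDP. The actor network is updated in a policy-gradient manner using the Lagrangian value function. A double-critic network is adopted to synchronously estimate the action-value function and avoid overestimation bias. The proposed algorithm does not require a strong convexity guarantee for the problem under study and is sample efficient. Comprehensive numerical experiments with real-world electricity prices demonstrate that the proposed algorithm achieves high solution optimality and constraint compliance.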
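The augmented Lagrangian idea named in this abstract can be sketched for a single inequality constraint. This is the standard textbook form, not necessarily the formulation used in the paper: the function names, the penalty parameter `rho` and the example numbers are assumptions made for illustration.

```python
def augmented_lagrangian(cost, violation, lam, rho):
    """Augmented Lagrangian for one inequality constraint `violation <= 0`."""
    shifted = max(0.0, lam + rho * violation)
    return cost + (shifted ** 2 - lam ** 2) / (2.0 * rho)


def update_multiplier(lam, violation, rho):
    """Dual ascent step, projected so the multiplier stays non-negative."""
    return max(0.0, lam + rho * violation)


# Hypothetical numbers: a charging cost of 3.0, with the final state of charge
# 0.5 below its lower bound (a positive violation means the constraint is broken).
lam, rho = 1.0, 2.0
penalized = augmented_lagrangian(cost=3.0, violation=0.5, lam=lam, rho=rho)
lam = update_multiplier(lam, violation=0.5, rho=rho)
```

When the constraint is violated the multiplier grows, so later policy updates weigh the violation more heavily. When it is satisfied with slack, the multiplier shrinks toward zero.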

Reinforcement Learning Based Cooperative P2P Energy Trading between DC Nanogrid Clusters with Wind and PV Energy Resources

Sangkeum Lee , Hojun Jin , Sarvar Hussain Nengroo , Taewook Heo , Yoonmee Doh , Chungho Lee , Dongsoo Har

Category: Machine Learning

2022-09-16

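In replacing fossil fuels with renewable energy resources, the unbalanced production of intermittent wind and photovoltaic (PV) power is a critical issue for peer-to-peer (P2P) power trading. To resolve this problem, this paper introduces a reinforcement learning (RL) technique. For RL, a graph convolutional network (GCN) and a bi-directional long short-term memory (Bi-LSTM) network are jointly applied to P2P power trading between nanogrid clusters based on cooperative game theory. The flexible and reliable DC nanogrid is suitable for integrating renewable energy into the distribution system. Each local nanogrid cluster takes the position of a prosumer, focusing on power production and consumption simultaneously. For the power management of the nanogrid clusters, multi-objective optimization is applied to each local nanogrid cluster using Internet of Things (IoT) technology. Charging/discharging of electric vehicles (EVs) is performed with the intermittent characteristics of wind and PV power production taken into account. RL algorithms including the deep Q-learning network (DQN), deep recurrent Q-learning network (DRQN), Bi-DRQN, proximal policy optimization (PPO), GCN-DQN, GCN-DRQN, GCN-Bi-DRQN and GCN-PPO are used in the simulations. Consequently, the cooperative P2P power trading system maximizes profit by utilizing the time-of-use (ToU) tariff-based electricity cost and the system marginal price (SMP), and minimizes the amount of grid power consumption. Power management of the nanogrid clusters with P2P power trading is simulated on a distribution test feeder in real time, and the proposed GCN-PPO technique reduces the electricity cost of the nanogrid clusters by 36.7%.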

A Reinforcement Learning Approach for the Continuous Electricity Market of Germany: Trading from the Perspective of a Wind Park Operator

Malte Lehna , Björn Hoppmann , René Heinrich , Christoph Scholz

Category: Machine Learning

2021-11-26

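With the growing expansion of renewable energy, intraday electricity markets have become increasingly popular among traders and electric utilities as a way to cope with the induced volatility of the energy supply. Through their short trading horizon and continuous nature, intraday markets offer the ability to adjust trading decisions made in the day-ahead market or to reduce trading risk at short notice. Producers of renewable energy use the intraday market to lower their forecast risk by modifying their offered capacities based on current forecasts. However, the market dynamics are complex because power grids have to remain stable and electricity is only partly storable. Consequently, robust and intelligent trading strategies that can operate in the intraday market are required. In this work, we propose a novel autonomous trading approach based on deep reinforcement learning (DRL) algorithms as a possible solution. For this purpose, we model intraday trading as a Markov decision problem (MDP) and employ the proximal policy optimization (PPO) algorithm as our DRL approach. A simulation framework is introduced that enables trading of the continuous intraday price at a resolution of one-minute steps. We test our framework in a case study from the perspective of a wind park operator. Alongside general trade information, we include both price and wind forecasts. On a test scenario of German intraday trading results from 2018, we are able to outperform multiple baselines by at least 45.24%, showing the advantage of the DRL algorithm. However, we also discuss limitations and enhancements of the DRL agent in order to improve performance in future work.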

RARE: Renewable Energy Aware Resource Management in Datacenters

Vanamala Venkataswamy , Jake Grigsby , Andrew Grimshaw , Yanjun Qi

Category: Artificial Intelligence

2022-11-10

The exponential growth in demand for digital services drives massive datacenter energy consumption and negative environmental impacts. Promoting sustainable solutions to pressing energy and digital infrastructure challenges is crucial. Several hyperscale cloud providers have announced plans to power their datacenters using renewable energy. However, integrating renewables to power the datacenters is challenging because the power generation is intermittent, necessitating approaches to tackle power supply variability. Hand engineering domain-specific heuristics-based schedulers to meet specific objective functions in such complex dynamic green datacenter environments is time-consuming, expensive, and requires extensive tuning by domain experts. The green datacenters need smart systems and system software to employ multiple renewable energy sources (wind and solar) by intelligently adapting computing to renewable energy generation. We present RARE (Renewable energy Aware REsource management), a Deep Reinforcement Learning (DRL) job scheduler that automatically learns effective job scheduling policies while continually adapting to datacenters' complex dynamic environment. The resulting DRL scheduler performs better than heuristic scheduling policies with different workloads and adapts to the intermittent power supply from renewables. We demonstrate DRL scheduler system design parameters that, when tuned correctly, produce better performance. Finally, we demonstrate that the DRL scheduler can learn from and improve upon existing heuristic policies using Offline Learning.

Model-Free Reinforcement Learning for Asset Allocation

Adebayo Oshingbesan , Eniola Ajiboye , Peruth Kamashazi , Timothy Mbaka

Category: Machine Learning

2022-09-21

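Asset allocation (or portfolio management) is the task of determining how to optimally allocate the funds of a finite budget across a range of financial instruments/assets such as stocks. This study investigated the performance of reinforcement learning (RL) applied to portfolio management using model-free deep RL agents. We trained several RL agents on real-world stock prices to learn how to perform asset allocation. We compared the performance of these RL agents against some baseline agents. We also compared the RL agents among themselves to understand which classes of agents performed better. From our analysis, RL agents can perform the task of portfolio management, since they significantly outperformed the baseline agents (random allocation and uniform allocation). Four RL agents (A2C, SAC, PPO and TRPO) outperformed the best baseline, MPT, overall. This shows the ability of RL agents to uncover more profitable trading strategies. Furthermore, there were no significant performance differences between value-based and policy-based RL agents. Actor-critic agents performed better than other types of agents. Likewise, on-policy agents performed better, because they are better at policy evaluation and sample efficiency is not a significant problem in portfolio management. This study shows that RL agents can substantially improve asset allocation, since they outperform strong baselines. Based on our analysis, on-policy, actor-critic RL agents showed the most promise.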

Performance Comparison of Deep RL Algorithms for Energy Systems Optimal Scheduling

Hou Shengren , Edgar Mauricio Salazar , Pedro P. Vergara , Peter Palensky

Category: Machine Learning

2022-08-01

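Taking advantage of their data-driven and model-free features, deep reinforcement learning (DRL) algorithms have the potential to deal with the increasing level of uncertainty caused by the introduction of renewable-based generation. To deal simultaneously with an energy system's operational cost and technical constraints (e.g., the generation-demand balance), DRL algorithms must consider a trade-off when designing the reward function. This trade-off introduces extra hyperparameters that affect the DRL algorithms' performance and their ability to provide feasible solutions. In this paper, a performance comparison of different DRL algorithms, including DDPG, TD3, SAC and PPO, is presented. We aim to provide a fair comparison of these DRL algorithms for the optimal scheduling problem of energy systems. The results show that, compared with a mathematical programming model of the energy system optimal scheduling problem, DRL algorithms are able to provide good-quality solutions in real time, even in unseen operational scenarios. Nevertheless, in the case of large peak consumption, these algorithms failed to provide feasible solutions, which may impede their practical implementation.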
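The reward trade-off this abstract refers to can be made concrete with a minimal sketch. The reward form, the penalty weight `sigma` and the example numbers are assumptions chosen for illustration. They are not taken from the paper.

```python
def penalized_reward(operating_cost, generation, demand, sigma):
    """Negative operating cost minus a weighted power-balance violation."""
    imbalance = abs(generation - demand)
    return -(operating_cost + sigma * imbalance)


# Two hypothetical actions: one is cheap but leaves 2 units of demand unmet,
# the other is more expensive but balances generation and demand exactly.
cheap_infeasible = dict(operating_cost=80.0, generation=48.0, demand=50.0)
costly_feasible = dict(operating_cost=100.0, generation=50.0, demand=50.0)

for sigma in (5.0, 20.0):
    r_cheap = penalized_reward(sigma=sigma, **cheap_infeasible)
    r_feasible = penalized_reward(sigma=sigma, **costly_feasible)
    preferred = "infeasible" if r_cheap > r_feasible else "feasible"
    print(f"sigma={sigma}: cheap={r_cheap}, feasible={r_feasible}, prefers {preferred}")
```

With the small penalty weight the cheaper but infeasible action earns the higher reward, while the larger weight reverses the ordering. This is the kind of sensitivity to extra hyperparameters that the abstract describes.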

Optimal scheduling of island integrated energy systems considering multi-uncertainties and hydrothermal simultaneous transmission: A deep reinforcement learning approach

Yang Li , Fanjin Bu , Yuanzheng Li , Chao Long

Category: Machine Learning

2022-12-27

Multi-uncertainties from power sources and loads have brought significant challenges to the stable supply of various resources on islands. To address these challenges, a comprehensive scheduling framework is proposed by introducing a model-free deep reinforcement learning (DRL) approach based on modeling an island integrated energy system (IES). In response to the shortage of freshwater on islands, in addition to the introduction of seawater desalination systems, a transmission structure of "hydrothermal simultaneous transmission" (HST) is proposed. The essence of the IES scheduling problem is the optimal combination of each unit's output, which is a typical timing control problem and conforms to the Markov decision-making solution framework of deep reinforcement learning. Deep reinforcement learning adapts to various changes and adjusts strategies in a timely manner through the interaction of agents and the environment, avoiding complicated modeling and prediction of multi-uncertainties. The simulation results show that the proposed scheduling framework properly handles multi-uncertainties from power sources and loads, achieves a stable supply of various resources, and performs better than other real-time scheduling methods, especially in terms of computational efficiency. In addition, the HST model constitutes an active exploration of ways to improve the utilization efficiency of island freshwater.

Federated Multi-Agent Deep Reinforcement Learning Approach via Physics-Informed Reward for Multi-Microgrid Energy Management

Yuanzheng Li , Shangyang He , Yang Li , Yang Shi , Zhigang Zeng

Category: Machine Learning

2022-12-29

The utilization of large-scale distributed renewable energy promotes the development of the multi-microgrid (MMG), which raises the need for an effective energy management method that minimizes economic costs and maintains energy self-sufficiency. Multi-agent deep reinforcement learning (MADRL) has been widely used for the energy management problem because of its real-time scheduling ability. However, its training requires massive energy operation data from microgrids (MGs), and gathering these data from different MGs would threaten their privacy and data security. Therefore, this paper tackles this practical yet challenging issue by proposing a federated multi-agent deep reinforcement learning (F-MADRL) algorithm with a physics-informed reward. In this algorithm, the federated learning (FL) mechanism is introduced to train the F-MADRL algorithm, thus ensuring the privacy and security of the data. In addition, a decentralized MMG model is built, and the energy of each participating MG is managed by an agent, which aims to minimize economic costs and maintain energy self-sufficiency according to the physics-informed reward. First, the MGs individually execute self-training based on local energy operation data to train their local agent models. Then, these local models are periodically uploaded to a server and their parameters are aggregated to build a global agent, which is broadcast to the MGs and replaces their local agents. In this way, the experience of each MG agent can be shared while the energy operation data is not explicitly transmitted, thus protecting privacy and ensuring data security. Finally, experiments are conducted on the Oak Ridge National Laboratory distributed energy control communication lab microgrid (ORNL-MG) test system, and comparisons are carried out to verify the effectiveness of introducing the FL mechanism and the superior performance of the proposed F-MADRL.
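The upload, aggregate and broadcast cycle described in this abstract can be sketched as follows. The abstract does not specify the aggregation rule, so plain equal-weight parameter averaging is assumed here. The parameter names and values are hypothetical, and real agents would hold neural-network weights rather than short lists.

```python
def federated_average(local_models):
    """Element-wise mean of the agents' parameters (equal weighting assumed)."""
    n = len(local_models)
    return {
        name: [
            sum(model[name][i] for model in local_models) / n
            for i in range(len(local_models[0][name]))
        ]
        for name in local_models[0]
    }


# Three hypothetical microgrid agents after a round of local self-training.
local_models = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
    {"w": [5.0, 6.0], "b": [2.0]},
]

# Server side: aggregate the uploaded parameters into a global agent.
global_model = federated_average(local_models)

# Broadcast: every microgrid replaces its local agent with a copy of the global one.
local_models = [
    {name: list(values) for name, values in global_model.items()}
    for _ in local_models
]
```

Only model parameters cross the network in this scheme. The local energy operation data used for self-training never leaves each microgrid, which is the privacy property the abstract emphasizes.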

Progress and summary of reinforcement learning on energy management of MPS-EV

Jincheng Hu , Yang Lin , Liang Chu , Zhuoran Hou , Jihan Li , Jingjing Jiang , Yuanjian Zhang

Category: Machine Learning

2022-11-08

The high emissions and low energy efficiency of internal combustion engines (ICE) have become unacceptable under environmental regulations and the energy crisis. As a promising alternative, multi-power source electric vehicles (MPS-EVs) introduce different clean energy systems to improve powertrain efficiency. The energy management strategy (EMS) is a critical technology for MPS-EVs to maximize efficiency, fuel economy, and range. Reinforcement learning (RL) has become an effective methodology for the development of EMS. RL has received continuous attention and research, but there is still a lack of systematic analysis of the design elements of RL-based EMS. To this end, this paper presents an in-depth analysis of the current research on RL-based EMS (RL-EMS) and summarizes its design elements. The paper first summarizes previous applications of RL in EMS from five aspects: algorithm, perception scheme, decision scheme, reward function, and innovative training method. The contribution of advanced algorithms to the training effect is shown, the perception and control schemes in the literature are analyzed in detail, different reward function settings are classified, and innovative training methods and their roles are elaborated. Then, by comparing the development routes of RL and RL-EMS, the paper identifies the gap between advanced RL solutions and existing RL-EMS. Finally, the paper suggests potential development directions for implementing advanced artificial intelligence (AI) solutions in EMS.
