Connectivity-aware path design is crucial to the effective deployment of autonomous Unmanned Aerial Vehicles (UAVs). Reinforcement Learning (RL) algorithms have recently become a popular approach to this type of complex problem, but they suffer from slow convergence. In this paper, we propose a Transfer Learning (TL) approach in which a teacher policy previously trained in an old domain is used to boost the agent's path learning in a new domain. As exploration and training continue, the agent refines the path design in the new domain based on its subsequent interactions with the environment. We evaluate our approach with an old domain at sub-6 GHz and a new domain at millimeter wave (mmWave). The teacher path policy, previously trained at sub-6 GHz, is the solution to a connectivity-aware path problem that we formulate as a constrained Markov Decision Process (CMDP). We employ a Lyapunov-based model-free Deep Q-Network (DQN), which guarantees connectivity constraint satisfaction, to solve the path design at sub-6 GHz. We empirically demonstrate the effectiveness of our approach in different urban environment scenarios. The results show that the proposed approach considerably reduces the training time at mmWave.