In this paper, we place deep Q-learning into a control-oriented perspective and study its learning dynamics with well-established techniques from robust control. We formulate an uncertain linear time-invariant model by means of the neural tangent kernel to describe learning. We show the instability of learning and analyze the agent's behavior in frequency-domain. Then, we ensure convergence via robust controllers acting as dynamical rewards in the loss function. We synthesize three controllers: state-feedback gain scheduling H2, dynamic Hinf, and constant gain Hinf controllers. Setting up the learning agent with a control-oriented tuning methodology is more transparent and has well-established literature compared to the heuristics in reinforcement learning. In addition, our approach does not use a target network and randomized replay memory. The role of the target network is overtaken by the control input, which also exploits the temporal dependency of samples (opposed to a randomized memory buffer). Numerical simulations in different OpenAI Gym environments suggest that the Hinf controlled learning performs slightly better than Double deep Q-learning.
translated by 谷歌翻译
制药行业可以更好地利用其数据资产来通过协作机器学习平台虚拟化药物发现。另一方面,由于参与者的培训数据的意外泄漏,存在不可忽略的风险,因此,对于这样的平台,必须安全和隐私权。本文介绍了在药物发现的临床前阶段进行协作建模的隐私风险评估,以加快有前途的候选药物的选择。在最新推理攻击的简短分类法之后,我们采用并定制了几种基础情况。最后,我们用一些相关的隐私保护技术来描述和实验,以减轻此类攻击。
translated by 谷歌翻译
在本文中,我们引入用于计算各种3D数据集之间的中间表面的新方法,例如点云和/或离散表面。传统上,通过检测计算距离功能的奇点等中间表面来获得,例如脊,三个结等。它需要计算二阶差分特性,并且还必须应用某些种类的启发式。与此相反,我们确定中表面只是计算距离功能本身,这是一种快速简单的方法。我们存在并比较快速扫描方法,矢量距离变换算法,快速行进方法和Dijkstra-Pythagoras方法在查找3D数据集之间的中间表面时的结果。
translated by 谷歌翻译