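While single-agent policy optimization in a fixed environment has recently attracted a lot of research attention in the reinforcement learning community, much less is known theoretically when multiple agents play in a potentially competitive environment. We take steps forward by proposing and analyzing new fictitious play policy optimization algorithms for zero-sum Markov games with structured but unknown transitions. We consider two classes of transition structures: factored independent transitions and single-controller transitions. For both settings, we prove tight $\widetilde{\mathcal{O}}(\sqrt{K})$ regret bounds after $K$ episodes in a two-agent competitive game scenario. The regret of each agent is measured against a potentially adversarial opponent who can choose a single best policy in hindsight after observing the full policy sequence. Our algorithms combine upper confidence bound (UCB)-type optimism with fictitious play, within the scope of simultaneous policy optimization in a non-stationary environment. When both players adopt the proposed algorithms, their overall optimality gap is $\widetilde{\mathcal{O}}(\sqrt{K})$.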
Translated by Google Translate
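Graph neural networks (GNNs) have attracted much attention because of their ability to learn representations from graph-structured data. Although GNNs have been applied successfully in many domains, their optimization is less well studied, and their node classification performance suffers heavily from the long-tailed node degree distribution. This paper focuses on improving the performance of GNNs through normalization. In detail, by studying the long-tailed distribution of node degrees in the graph, we propose a novel normalization method for GNNs, termed ResNorm (\textbf{Res}haping the long-tailed distribution into a normal-like distribution via \textbf{norm}alization). The $scale$ operation of ResNorm reshapes the node-wise standard deviation (NStd) distribution so as to improve the accuracy of tail nodes (\textit{i}.\textit{e}., low-degree nodes). We provide a theoretical interpretation and empirical evidence for understanding the mechanism of this $scale$ operation. Beyond the long-tailed distribution issue, over-smoothing is also a fundamental problem plaguing the community. To this end, we analyze the behavior of the standard shift and prove that it acts as a preconditioner on the weight matrix, increasing the risk of over-smoothing. With the over-smoothing issue in mind, we design a $shift$ operation for ResNorm that simulates a degree-specific parameter strategy at low cost. Extensive experiments validate the effectiveness of ResNorm on several node classification benchmark datasets.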
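Stochastic gradient descent (SGD) is the cornerstone of modern machine learning (ML) systems. Despite its computational efficiency, SGD requires random data access, which is inherently inefficient in systems that rely on block-addressable secondary storage such as HDDs and SSDs, for example TensorFlow/PyTorch and in-DB ML systems operating over large files. To address this impedance mismatch, various data shuffling strategies have been proposed to balance the convergence rate of SGD (which favors randomness) against its I/O performance (which favors sequential access). In this paper, we first conduct a systematic empirical study of existing data shuffling strategies, which reveals that all of them have room for improvement: each suffers in either I/O performance or convergence rate. With this in mind, we propose a simple but novel hierarchical data shuffling strategy, CorgiPile. Compared with existing strategies, CorgiPile avoids a full data shuffle while maintaining a convergence rate for SGD comparable to that obtained with a full shuffle. We provide a non-trivial theoretical analysis of CorgiPile's convergence behavior. We further integrate CorgiPile into PyTorch by designing new parallel/distributed shuffle operators inside a new CorgiPileDataSet API. We also integrate CorgiPile into PostgreSQL by introducing three new physical operators with optimizations. Our experimental results show that CorgiPile achieves a convergence rate comparable to that of full-shuffle SGD for both deep learning and generalized linear models. For deep learning models on the ImageNet dataset, CorgiPile is 1.5x faster than PyTorch with a full data shuffle. For in-DB ML with linear models, CorgiPile is 1.6x-12.8x faster than two state-of-the-art in-DB ML systems, Apache MADlib and Bismarck, on both HDD and SSD.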
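To make the idea of a hierarchical shuffle concrete, here is a minimal Python sketch. The abstract does not spell out the algorithm, so the two levels used here (shuffle the order of storage blocks, then shuffle items inside a small in-memory buffer) are an assumption, and `hierarchical_shuffle`, `blocks`, and `buffer_blocks` are hypothetical names rather than part of the CorgiPile API.

```python
import random
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def hierarchical_shuffle(
    blocks: Sequence[Sequence[T]],
    buffer_blocks: int,
    seed: int = 0,
) -> Iterator[T]:
    """Yield every item exactly once using a two-level shuffle.

    Assumed scheme (not taken from the abstract): block order is
    shuffled first, then items are shuffled inside a bounded buffer.
    """
    if buffer_blocks < 1:
        raise ValueError("buffer_blocks must be at least 1")
    rng = random.Random(seed)

    # Level 1: randomize the order in which blocks are read. Each block
    # is still read as a whole, so storage access stays sequential.
    order = list(range(len(blocks)))
    rng.shuffle(order)

    for start in range(0, len(order), buffer_blocks):
        # Load a few whole blocks into an in-memory buffer.
        buffer: List[T] = []
        for block_id in order[start:start + buffer_blocks]:
            buffer.extend(blocks[block_id])
        # Level 2: shuffle only the items currently in the buffer.
        rng.shuffle(buffer)
        yield from buffer


if __name__ == "__main__":
    # Six blocks of four consecutive integers each.
    data = [list(range(i * 4, i * 4 + 4)) for i in range(6)]
    print(list(hierarchical_shuffle(data, buffer_blocks=2, seed=1)))
```

The buffer size is the knob in this sketch: a larger `buffer_blocks` moves the output closer to a full shuffle at the cost of more memory, while `buffer_blocks=1` only randomizes block order and the order within each block.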
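Policy evaluation based on A/B testing has attracted considerable interest in digital marketing, but such evaluation in ride-sourcing platforms (e.g., Uber and Didi) has not been well studied, primarily because of the complex structure of their temporally and/or spatially dependent experiments. Motivated by policy evaluation in ride-sourcing platforms, the aim of this paper is to establish causal relationships between a platform's policies and outcomes of interest under a switchback design. We propose a novel potential outcome framework based on a temporal varying coefficient decision process (VCDP) model to capture dynamic treatment effects in temporally dependent experiments. We further characterize the average treatment effect by decomposing it into the sum of a direct effect (DE) and an indirect effect (IE). We develop estimation and inference procedures for both DE and IE. Furthermore, we propose a spatio-temporal VCDP to handle spatiotemporally dependent experiments. For both VCDP models, we establish the statistical properties (e.g., weak convergence and asymptotic power) of our estimation and inference procedures. We conduct extensive simulations to investigate the finite-sample performance of the proposed estimation and inference procedures. We examine how our VCDP models can help improve policy evaluation for various dispatching and dispositioning policies at Didi.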
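In this paper, we present a comprehensive, in-depth survey of the literature on reinforcement learning approaches to decision optimization problems in a typical ridesharing system. We cover papers on ride matching, vehicle repositioning, ride-pooling, routing, and dynamic pricing. Most of this literature has appeared in the last few years, and several core challenges remain to be tackled: model complexity, agent coordination, and joint optimization of multiple levers. We therefore also introduce popular datasets and open simulation environments to facilitate further research and development. Finally, we discuss a number of challenges and opportunities for reinforcement learning research in this important domain.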
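A/B testing, or online experimentation, is a standard business strategy for comparing a new product with an old one in the pharmaceutical, technology, and traditional industries. Major challenges arise in online experiments on two-sided marketplace platforms (e.g., Uber), where only one unit receives a sequence of treatments over time. In these experiments, the treatment at a given time affects the current outcome as well as future outcomes. The aim of this paper is to introduce a reinforcement learning framework for carrying out A/B testing in these experiments while characterizing long-term treatment effects. Our proposed testing procedure allows for sequential monitoring and online updating. It is generally applicable to a variety of treatment designs in different industries. In addition, we systematically investigate the theoretical properties (e.g., size and power) of our testing procedure. Finally, we apply our framework to both simulated data and a real-world data example obtained from a technology company to illustrate its advantage over current practice. A Python implementation of our test is available at https://github.com/callmespring/causalrl.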
Accurate determination of a small molecule candidate (ligand) binding pose in its target protein pocket is important for computer-aided drug discovery. Typical rigid-body docking methods ignore the pocket flexibility of protein, while the more accurate pose generation using molecular dynamics is hindered by slow protein dynamics. We develop a tiered tensor transform (3T) algorithm to rapidly generate diverse protein-ligand complex conformations for both pose and affinity estimation in drug screening, requiring neither machine learning training nor lengthy dynamics computation, while maintaining both coarse-grain-like coordinated protein dynamics and atomistic-level details of the complex pocket. The 3T conformation structures we generate are closer to experimental co-crystal structures than those generated by docking software, and more importantly achieve significantly higher accuracy in active ligand classification than traditional ensemble docking using hundreds of experimental protein conformations. 3T structure transformation is decoupled from the system physics, making future usage in other computational scientific domains possible.
For Prognostics and Health Management (PHM) of lithium-ion (Li-ion) batteries, many models have been established to characterize their degradation process. Existing empirical or physical models can reveal important information about the degradation dynamics. However, there are no general and flexible methods for fusing the information represented by those models. The Physics-Informed Neural Network (PINN) is an efficient tool for fusing empirical or physical dynamic models with data-driven models. To take full advantage of the various information sources, we propose a model fusion scheme based on PINN. It is implemented by developing a semi-empirical, semi-physical Partial Differential Equation (PDE) to model the degradation dynamics of Li-ion batteries. When there is little prior knowledge about the dynamics, we leverage the data-driven Deep Hidden Physics Model (DeepHPM) to discover the underlying governing dynamic models. The uncovered dynamics information is then fused with that mined by the surrogate neural network in the PINN framework. Moreover, an uncertainty-based adaptive weighting method is employed to balance the multiple learning tasks when training the PINN. The proposed methods are verified on a public dataset of lithium iron phosphate (LFP)/graphite batteries.
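The abstract does not give the formula behind its uncertainty-based adaptive weighting, so the Python sketch below only illustrates one common form of the idea and should not be read as the authors' method. It assumes each task loss $L_i$ gets a learnable log-variance $s_i$ and that the combined objective is $\sum_i (e^{-s_i} L_i + s_i)$; the function and parameter names are hypothetical.

```python
import math
from typing import Sequence


def uncertainty_weighted_loss(
    losses: Sequence[float],
    log_vars: Sequence[float],
) -> float:
    """Combine task losses as sum_i exp(-s_i) * L_i + s_i.

    Assumed form: s_i is a per-task log-variance. A large s_i
    down-weights a noisy task, and the additive s_i term penalizes
    pushing every weight toward zero.
    """
    if len(losses) != len(log_vars):
        raise ValueError("losses and log_vars must have the same length")
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))


if __name__ == "__main__":
    # Hypothetical loss values for a data-fit term and a PDE-residual term.
    data_loss, pde_loss = 4.0, 0.5
    print(uncertainty_weighted_loss([data_loss, pde_loss], [0.0, 0.0]))
    print(uncertainty_weighted_loss([data_loss, pde_loss],
                                    [math.log(data_loss), math.log(pde_loss)]))
```

For a fixed positive loss $L_i$, the term $e^{-s_i} L_i + s_i$ is minimized at $s_i = \log L_i$, so under this assumed form the effective weight of each task scales like $1/L_i$. In training, the $s_i$ would be optimized jointly with the network parameters rather than set by hand.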
Non-line-of-sight (NLOS) imaging aims to reconstruct three-dimensional hidden scenes from data measured in the line of sight, using photon time-of-flight information encoded in light after multiple diffuse reflections. Under-sampled scanning data can facilitate fast imaging. However, the resulting reconstruction problem becomes a severely ill-posed inverse problem, whose solution is highly likely to be degraded by noise and distortions. In this paper, we propose two novel NLOS reconstruction models based on curvature regularization: the object-domain curvature regularization model and the dual (i.e., signal and object)-domain curvature regularization model. Fast numerical optimization algorithms are developed based on the alternating direction method of multipliers (ADMM) with a backtracking step-size rule, and are further accelerated by a GPU implementation. We evaluate the proposed algorithms on both synthetic and real datasets, where they achieve state-of-the-art performance, especially in the compressed sensing setting. All our code and data are available at https://github.com/Duanlab123/CurvNLOS.
Masked image modeling (MIM) has shown great promise for self-supervised learning (SSL), yet it has been criticized for learning inefficiency. We believe that insufficient utilization of training signals is responsible. To alleviate this issue, we introduce a conceptually simple yet learning-efficient MIM training scheme, termed Disjoint Masking with Joint Distillation (DMJD). For disjoint masking (DM), we sequentially sample multiple masked views per image in a mini-batch under a disjoint constraint, which raises the usage of tokens for reconstruction in each image while keeping the masking rate of each view. For joint distillation (JD), we adopt a dual-branch architecture to predict invisible (masked) and visible (unmasked) tokens, respectively, with superior learning targets. Rooted in orthogonal perspectives on improving training efficiency, DM and JD cooperatively accelerate training convergence without sacrificing the model's generalization ability. Concretely, DM can train ViT with half of the effective training epochs (3.7 times less time-consuming) and still report competitive performance. With JD, our DMJD clearly improves the linear probing classification accuracy over ConvMAE by 5.8%. On fine-grained downstream tasks such as semantic segmentation and object detection, DMJD also shows superior generalization compared with state-of-the-art SSL methods. The code and model will be made public at https://github.com/mx-mark/DMJD.
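The disjoint masking step can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the DMJD implementation: it assumes that "disjoint" means each new view masks tokens no earlier view has masked whenever such tokens remain, and that a view tops up from already-used tokens only when needed to keep its masking rate. The name `disjoint_masks` and its parameters are hypothetical.

```python
import random
from typing import List, Set


def disjoint_masks(
    num_tokens: int,
    num_views: int,
    mask_ratio: float,
    seed: int = 0,
) -> List[Set[int]]:
    """Sample one set of masked token indices per view.

    Assumed rule: views draw from tokens that no earlier view has
    masked, and reuse tokens only when the unused pool runs out, so
    every view keeps the same masking rate.
    """
    if not 0.0 < mask_ratio < 1.0:
        raise ValueError("mask_ratio must be strictly between 0 and 1")
    rng = random.Random(seed)
    num_masked = round(num_tokens * mask_ratio)

    # Tokens not yet masked by any view, in random order.
    unused = list(range(num_tokens))
    rng.shuffle(unused)

    views: List[Set[int]] = []
    for _ in range(num_views):
        mask = set(unused[:num_masked])
        unused = unused[num_masked:]
        if len(mask) < num_masked:
            # Unused pool exhausted: top up with previously masked tokens.
            pool = [t for t in range(num_tokens) if t not in mask]
            mask.update(rng.sample(pool, num_masked - len(mask)))
        views.append(mask)
    return views


if __name__ == "__main__":
    for view in disjoint_masks(num_tokens=16, num_views=2, mask_ratio=0.75):
        print(sorted(view))
```

With a masking ratio of 0.75 and two views, the first view masks 12 of 16 tokens and the second is forced to include the remaining 4, so every token serves as a reconstruction target at least once. Independent random masking gives no such guarantee.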