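We formulate a class of physics-driven deep latent variable models (PDDLVM) to learn parameter-to-solution (forward) and solution-to-parameter (inverse) maps of parametric partial differential equations (PDEs). Our formulation leverages the finite element method (FEM), deep neural networks, and probabilistic modeling to assemble a deep probabilistic framework in which the forward and inverse maps are approximated with coherent uncertainty quantification. Our probabilistic model explicitly incorporates a parametric PDE-based density and a trainable solution-to-parameter network, while the introduced amortized variational family postulates a parameter-to-solution network, all of which are jointly trained. Furthermore, the proposed methodology does not require any expensive PDE solves and is physics-informed only at training time, which allows real-time emulation of PDEs and generation of inverse problem solutions after training, bypassing the need for FEM solve operations while retaining accuracy comparable to FEM solutions. The proposed framework further allows for seamless integration of observed data for solving inverse problems and building generative models. We demonstrate the effectiveness of our method on a nonlinear Poisson problem, elastic shells with complex 3D geometries, and the integration of generic physics-informed neural network (PINN) architectures. After training, we achieve speed-ups of up to three orders of magnitude compared to traditional FEM solvers, while outputting coherent uncertainty estimates.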
Translated by Google Translate
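We analyze the optimized adaptive importance sampler (OAIS) for performing Monte Carlo integration with general proposals. We leverage a classical result which shows that the bias and the mean-squared error (MSE) of importance sampling scale with the $\chi^2$-divergence between the target and the proposal, and we develop a scheme that performs global optimization of the $\chi^2$-divergence. While it is known that this quantity is convex for exponential family proposals, the case of general proposals has been an open problem. We close this gap by utilizing stochastic gradient Langevin dynamics (SGLD) and its underdamped counterpart for the global optimization of the $\chi^2$-divergence, and we derive non-asymptotic bounds for the MSE by leveraging recent results from the non-convex optimization literature. The resulting AIS schemes have explicit theoretical guarantees in the number of iterations.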
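The link between proposal quality and the $\chi^2$-divergence can be illustrated with a minimal sketch. This is not the OAIS scheme itself: it is plain self-normalized importance sampling, and the standard Gaussian target, the unit-variance Gaussian proposals, and the helper names `log_normal_pdf` and `snis` are all assumptions made for the example. For normalized densities, the second moment of the weights under the proposal equals $\chi^2(p \| q) + 1$, so the ratio `rho_hat` below estimates the quantity that the abstract says controls the MSE.

```python
import numpy as np

def log_normal_pdf(x, mean, std):
    """Log-density of a univariate Gaussian N(mean, std^2)."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2.0 * np.pi))

def snis(f, proposal_mean, n, rng):
    """Self-normalized importance sampling for E_p[f], with p = N(0, 1).

    The proposal is q = N(proposal_mean, 1) (an illustrative assumption).
    Returns the estimate and rho_hat, an estimate of chi^2(p || q) + 1.
    """
    x = rng.normal(proposal_mean, 1.0, size=n)
    log_w = log_normal_pdf(x, 0.0, 1.0) - log_normal_pdf(x, proposal_mean, 1.0)
    w = np.exp(log_w)
    estimate = np.sum(w * f(x)) / np.sum(w)
    rho_hat = np.mean(w ** 2) / np.mean(w) ** 2
    return estimate, rho_hat

rng = np.random.default_rng(0)
n = 200_000
# Proposal close to the target versus a proposal shifted further away.
est_near, rho_near = snis(lambda x: x ** 2, proposal_mean=0.3, n=n, rng=rng)
est_far, rho_far = snis(lambda x: x ** 2, proposal_mean=1.0, n=n, rng=rng)
print(f"near proposal: estimate={est_near:.3f}, rho_hat={rho_near:.3f}")
print(f"far proposal:  estimate={est_far:.3f}, rho_hat={rho_far:.3f}")
```

In this toy setting the exact value is $\rho = \exp(\mu^2)$ for a proposal mean $\mu$, so the shifted proposal has the larger $\rho$ and correspondingly noisier estimates. An adaptive scheme would adjust the proposal parameters to reduce this quantity.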
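The recent statistical finite element method (statFEM) provides a coherent statistical framework for synthesising finite element models with observed data. By embedding uncertainty inside the governing equations, finite element solutions are updated to give a posterior distribution which quantifies all sources of uncertainty associated with the model. However, to incorporate all sources of uncertainty, one must integrate over the uncertainty associated with the model parameters, which is the known forward problem of uncertainty quantification. In this paper, we make use of Langevin dynamics to solve the statFEM forward problem, studying the utility of the unadjusted Langevin algorithm (ULA), a Metropolis-free Markov chain Monte Carlo sampler, to build a sample-based characterisation of this otherwise intractable measure. Due to the structure of the statFEM problem, these methods are able to solve the forward problem without explicit full PDE solves, requiring only sparse matrix-vector products. ULA is also gradient-based, and hence provides a scalable approach up to high degrees of freedom. Leveraging the theory behind Langevin-based samplers, we provide theoretical guarantees on sampler performance, demonstrating convergence, for both the prior and the posterior, in the Kullback-Leibler divergence and in the Wasserstein-2 distance, with further results on the effect of preconditioning. Numerical experiments are also provided, for both the prior and the posterior, to demonstrate the efficacy of the sampler, and a Python package is also included.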
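The ULA update mentioned above can be sketched in a few lines. This is a minimal illustration on a toy target, not the statFEM sampler or the accompanying Python package: the standard Gaussian target, the step size, and the function name `ula_sample` are assumptions made for the example. Each iteration takes a gradient step on the potential plus Gaussian noise, with no Metropolis accept/reject correction.

```python
import numpy as np

def ula_sample(grad_potential, x0, step, n_steps, rng):
    """Run the unadjusted Langevin algorithm and return all iterates.

    grad_potential: callable returning the gradient of U, where the
    target density is proportional to exp(-U(x)).
    """
    x = np.asarray(x0, dtype=float).copy()
    samples = np.empty((n_steps, x.size))
    for k in range(n_steps):
        noise = rng.standard_normal(x.size)
        # Euler-Maruyama discretisation of the Langevin SDE,
        # applied without any accept/reject step.
        x = x - step * grad_potential(x) + np.sqrt(2.0 * step) * noise
        samples[k] = x
    return samples

# Toy target (assumption): standard Gaussian, U(x) = ||x||^2 / 2, grad U(x) = x.
rng = np.random.default_rng(0)
samples = ula_sample(lambda x: x, x0=np.zeros(2), step=0.1, n_steps=20_000, rng=rng)
kept = samples[2_000:]  # discard an initial burn-in segment
mean_est = kept.mean(axis=0)
var_est = kept.var(axis=0)
print("mean:", mean_est, "variance:", var_est)
```

Because there is no Metropolis correction, the chain targets a slightly biased distribution. For this Gaussian toy problem the stationary variance is $1/(1 - h/2)$ for step size $h$, which is about 1.05 at $h = 0.1$ rather than exactly 1. In a statFEM-like setting the gradient would involve sparse matrix-vector products instead of the identity map used here.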
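We introduce and analyze a parallel Monte Carlo method for the numerical solution of optimization problems that involve minimizing a cost function consisting of the sum of many individual components. The scheme is a stochastic zeroth-order optimization algorithm that requires only the ability to evaluate small subsets of components of the cost function. It can be depicted as a bank of samplers that generate particle approximations of several sequences of probability measures. These measures are constructed in such a way that they have associated probability density functions whose global maxima coincide with the global minima of the original cost function. The algorithm selects the best-performing sampler and uses it to approximate a global minimum of the cost function. We prove analytically that the resulting estimator converges almost surely to a global minimum of the cost function, and we provide explicit convergence rates in terms of the number of generated Monte Carlo samples and the dimension of the search space. We show, by way of numerical examples, that the algorithm can tackle cost functions with multiple minima or with broad "flat" regions, which are hard to minimize using gradient-based techniques.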
This study focuses on improving the optical character recognition (OCR) data for panels in the COMICS dataset, the largest dataset containing text and images from comic books. To do this, we developed a pipeline for OCR processing and labeling of comic books and created the first text detection and recognition datasets for western comics, called "COMICS Text+: Detection" and "COMICS Text+: Recognition". We evaluated the performance of state-of-the-art text detection and recognition models on these datasets and found significant improvement in word accuracy and normalized edit distance compared to the text in COMICS. We also created a new dataset called "COMICS Text+", which contains the extracted text from the textboxes in the COMICS dataset. Using the improved text data of COMICS Text+ in a previously proposed comics processing model resulted in state-of-the-art performance on cloze-style tasks without changing the model architecture. The COMICS Text+ dataset can be a valuable resource for researchers working on tasks including text detection, recognition, and high-level processing of comics, such as narrative understanding, character relations, and story generation. All the data and inference instructions can be accessed at https://github.com/gsoykan/comics_text_plus.
Diffractive optical networks provide rich opportunities for visual computing tasks since the spatial information of a scene can be directly accessed by a diffractive processor without requiring any digital pre-processing steps. Here we present data class-specific transformations all-optically performed between the input and output fields-of-view (FOVs) of a diffractive network. The visual information of the objects is encoded into the amplitude (A), phase (P), or intensity (I) of the optical field at the input, which is all-optically processed by a data class-specific diffractive network. At the output, an image sensor-array directly measures the transformed patterns, all-optically encrypted using the transformation matrices pre-assigned to different data classes, i.e., a separate matrix for each data class. The original input images can be recovered by applying the correct decryption key (the inverse transformation) corresponding to the matching data class, while applying any other key will lead to loss of information. The class-specificity of these all-optical diffractive transformations creates opportunities where different keys can be distributed to different users; each user can decode the acquired images of only one data class, so that multiple users are served in an all-optically encrypted manner. We numerically demonstrated all-optical class-specific transformations covering A-->A, I-->I, and P-->I transformations using various image datasets. We also experimentally validated the feasibility of this framework by fabricating a class-specific I-->I transformation diffractive network using two-photon polymerization and successfully tested it at 1550 nm wavelength. Data class-specific all-optical transformations provide a fast and energy-efficient method for image and data encryption, enhancing data security and privacy.
Recent advances in distributed artificial intelligence (AI) have led to tremendous breakthroughs in various communication services, from fault-tolerant factory automation to smart cities. When distributed learning is run over a set of wirelessly connected devices, random channel fluctuations and the incumbent services running on the same network impact the performance of both distributed learning and the coexisting service. In this paper, we investigate a mixed service scenario where a distributed AI workflow and ultra-reliable low latency communication (URLLC) services run concurrently over a network. Consequently, we propose a risk sensitivity-based formulation for device selection to minimize the AI training delays during its convergence period while ensuring that the operational requirements of the URLLC service are met. To address this challenging coexistence problem, we transform it into a deep reinforcement learning problem and address it via a framework based on the soft actor-critic algorithm. We evaluate our solution with a realistic and 3GPP-compliant simulator for factory automation use cases. Our simulation results confirm that our solution can significantly decrease the training delay of the distributed AI service while keeping the URLLC availability above its required threshold and close to the scenario where URLLC solely consumes all network resources.
Multispectral imaging has been used for numerous applications in e.g., environmental monitoring, aerospace, defense, and biomedicine. Here, we present a diffractive optical network-based multispectral imaging system trained using deep learning to create a virtual spectral filter array at the output image field-of-view. This diffractive multispectral imager performs spatially-coherent imaging over a large spectrum, and at the same time, routes a pre-determined set of spectral channels onto an array of pixels at the output plane, converting a monochrome focal plane array or image sensor into a multispectral imaging device without any spectral filters or image recovery algorithms. Furthermore, the spectral responsivity of this diffractive multispectral imager is not sensitive to input polarization states. Through numerical simulations, we present different diffractive network designs that achieve snapshot multispectral imaging with 4, 9 and 16 unique spectral bands within the visible spectrum, based on passive spatially-structured diffractive surfaces, with a compact design that axially spans ~72 times the mean wavelength of the spectral band of interest. Moreover, we experimentally demonstrate a diffractive multispectral imager based on a 3D-printed diffractive network that creates at its output image plane a spatially-repeating virtual spectral filter array with 2x2=4 unique bands at terahertz spectrum. Due to their compact form factor and computation-free, power-efficient and polarization-insensitive forward operation, diffractive multispectral imagers can be transformative for various imaging and sensing applications and be used at different parts of the electromagnetic spectrum where high-density and wide-area multispectral pixel arrays are not widely available.
Privacy-preserving inference via edge or encrypted computing paradigms encourages users of machine learning services to confidentially run a model on their personal data for a target task and only share the model's outputs with the service provider; e.g., to activate further services. Nevertheless, despite all confidentiality efforts, we show that a "vicious" service provider can approximately reconstruct its users' personal data by observing only the model's outputs, while keeping the target utility of the model very close to that of an "honest" service provider. We show the possibility of jointly training a target model (to be run at users' side) and an attack model for data reconstruction (to be secretly used at server's side). We introduce the "reconstruction risk": a new measure for assessing the quality of reconstructed data that better captures the privacy risk of such attacks. Experimental results on 6 benchmark datasets show that for low-complexity data types, or for tasks with a larger number of classes, a user's personal data can be approximately reconstructed from the outputs of a single target inference task. We propose a potential defense mechanism that helps to distinguish vicious vs. honest classifiers at inference time. We conclude this paper by discussing current challenges and open directions for future studies. We open-source our code and results, as a benchmark for future work.
Federated learning (FL) is a promising approach to enable the future Internet of vehicles consisting of intelligent connected vehicles (ICVs) with powerful sensing, computing and communication capabilities. We consider a base station (BS) coordinating nearby ICVs to train a neural network in a collaborative yet distributed manner, in order to limit data traffic and privacy leakage. However, due to the mobility of vehicles, the connections between the BS and ICVs are short-lived, which affects the resource utilization of ICVs, and thus, the convergence speed of the training process. In this paper, we propose an accelerated FL-ICV framework, by optimizing the duration of each training round and the number of local iterations, for better convergence performance of FL. We propose a mobility-aware optimization algorithm called MOB-FL, which aims at maximizing the resource utilization of ICVs under short-lived wireless connections, so as to increase the convergence speed. Simulation results based on the beam selection and the trajectory prediction tasks verify the effectiveness of the proposed solution.