In the preoperative setting, digitally reconstructed radiographs (DRRs) are used to solve inverse problems such as slice-to-volume registration and 3D reconstruction. However, in intraoperative imaging, the utility of DRRs is limited by the challenges of generating them in real time and supporting optimization procedures that rely on repeated DRR synthesis. While DRR generation has been accelerated through algorithmic improvements and GPU implementations, DRR-based optimization remains slow because most DRR generators do not offer a straightforward way to obtain gradients with respect to the imaging parameters. To make DRRs interoperable with gradient-based optimization and deep learning frameworks, we reformulate Siddon's method, the most popular ray-tracing algorithm used in DRR generation, as a series of vectorized tensor operations. We implement this vectorized version of Siddon's method in PyTorch, taking advantage of the library's strong automatic differentiation engine to make this DRR generator fully differentiable with respect to its parameters. Additionally, using GPU-accelerated tensor computation enables our vectorized implementation to achieve rendering speeds equivalent to state-of-the-art DRR generators implemented in CUDA and C++. We illustrate the resulting method in the context of slice-to-volume registration. Moreover, our simulations suggest that the loss landscapes for the slice-to-volume registration problem are convex in the neighborhood of the optimal solution, so gradient-based registration promises to be faster than prevailing gradient-free optimization strategies. The proposed DRR generator enables fast computer vision algorithms to support image guidance in minimally invasive procedures. Our implementation is publicly available at https://github.com/v715/diffdrr.
translated by 谷歌翻译
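The reformulation described above can be illustrated with a toy differentiable projector: a 2D image is rotated by a pose parameter using PyTorch's differentiable sampling ops and integrated along one axis, so autograd provides gradients of a registration loss with respect to the pose. This is a minimal sketch under simplified geometry, not the DiffDRR implementation; `drr_1d` and the box phantom are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def drr_1d(image, theta):
    """Toy differentiable projector: rotate a 2D image by theta with a
    differentiable sampler, then integrate along rays (image rows)."""
    c, s = torch.cos(theta), torch.sin(theta)
    rot = torch.stack([
        torch.stack([c, -s, torch.zeros_like(c)]),
        torch.stack([s, c, torch.zeros_like(c)]),
    ])  # 2x3 affine matrix: rotation about the image center
    grid = F.affine_grid(rot[None], [1, 1, *image.shape], align_corners=False)
    rotated = F.grid_sample(image[None, None], grid, align_corners=False)
    return rotated.sum(dim=2).squeeze()  # 1D projection of the rotated image

phantom = torch.zeros(64, 64)
phantom[24:40, 28:36] = 1.0  # simple box phantom

theta = torch.tensor(0.3, requires_grad=True)
target = drr_1d(phantom, torch.tensor(0.0))
loss = F.mse_loss(drr_1d(phantom, theta), target)
loss.backward()  # autograd yields d(loss)/d(theta)
```

Because the whole chain is tensor operations, the same pattern vectorizes over batches of rays and poses, which is the essence of reformulating a ray tracer for autodiff frameworks.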
Incorporating computed tomography (CT) reconstruction operators into differentiable pipelines has proven beneficial in many applications. Such approaches usually focus on the projection data and keep the acquisition geometry fixed. However, precise knowledge of the acquisition geometry is essential for high quality reconstruction results. In this paper, the differentiable formulation of fan-beam CT reconstruction is extended to the acquisition geometry. This allows gradient information from a loss function on the reconstructed image to be propagated into the geometry parameters. As a proof-of-concept experiment, this idea is applied to rigid motion compensation. The cost function is parameterized by a trained neural network which regresses an image quality metric from the motion affected reconstruction alone. Using the proposed method, we are the first to optimize such an autofocus-inspired algorithm based on analytical gradients. The algorithm achieves a reduction in MSE by 35.5% and an improvement in SSIM by 12.6% over the motion affected reconstruction. Next to motion compensation, we see further use cases of our differentiable method for scanner calibration or hybrid techniques employing deep models.
We propose a deep learning method for three-dimensional reconstruction in low-dose helical cone-beam computed tomography. We reconstruct the volume directly, i.e., not from 2D slices, guaranteeing consistency along all axes. In a crucial step beyond prior work, we train our model in a self-supervised manner in the projection domain using noisy 2D projection data, without relying on 3D reference data or the output of a reference reconstruction method. This means the fidelity of our results is not limited by the quality and availability of such data. We evaluate our method on real helical cone-beam projections and simulated phantoms. Our reconstructions are sharper and less noisy than those of previous methods, and several decibels better in quantitative PSNR measurements. When applied to full-dose data, our method produces high-quality results orders of magnitude faster than iterative techniques.
Deformable registration of two-dimensional/three-dimensional (2D/3D) images of abdominal organs is a complicated task because the abdominal organs deform significantly and their contours are not detected in two-dimensional X-ray images. We propose a supervised deep learning framework that achieves 2D/3D deformable image registration between 3D volumes and single-viewpoint 2D projected images. The proposed method learns the translation from the target 2D projection images and the initial 3D volume to 3D displacement fields. In experiments, we registered 3D-computed tomography (CT) volumes to digitally reconstructed radiographs generated from abdominal 4D-CT volumes. For validation, we used 4D-CT volumes of 35 cases and confirmed that the 3D-CT volumes reflecting the nonlinear and local respiratory organ displacement were reconstructed. The proposed method demonstrates performance comparable to conventional methods, with a Dice similarity coefficient of 91.6% for the liver region and 85.9% for the stomach region, while estimating significantly more accurate CT values.
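The last step of such a pipeline, resampling the moving 3D volume through the estimated displacement field, reduces to one differentiable operation in PyTorch. The sketch below is a simplification, not the authors' code: `warp_volume` is a made-up name, and the field is assumed to be expressed directly in normalized [-1, 1] grid coordinates rather than voxel units.

```python
import torch
import torch.nn.functional as F

def warp_volume(volume, displacement):
    """Warp a 3D volume (D, H, W) by a dense displacement field
    (D, H, W, 3) given in normalized [-1, 1] grid coordinates."""
    D, H, W = volume.shape
    # Identity sampling grid in normalized coordinates
    zs = torch.linspace(-1, 1, D)
    ys = torch.linspace(-1, 1, H)
    xs = torch.linspace(-1, 1, W)
    z, y, x = torch.meshgrid(zs, ys, xs, indexing="ij")
    identity = torch.stack([x, y, z], dim=-1)  # grid_sample wants (x, y, z) order
    grid = (identity + displacement)[None]     # (1, D, H, W, 3)
    warped = F.grid_sample(volume[None, None], grid, align_corners=True)
    return warped[0, 0]

vol = torch.rand(16, 16, 16)
zero_disp = torch.zeros(16, 16, 16, 3)
out = warp_volume(vol, zero_disp)  # zero field reproduces the input volume
```

Because `grid_sample` is differentiable with respect to the grid, the registration loss can be backpropagated through this warp into the network that predicts the displacement field.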
In this paper, we propose a new GPU implementation of the Katsevich algorithm for helical CT reconstruction. Our implementation divides the sinograms and reconstructs the CT images pitch by pitch. By exploiting the periodic properties of the parameters of the Katsevich algorithm, our method only needs to compute these parameters once for all pitches, and hence has a low GPU-memory footprint and is very suitable for deep learning. By embedding our implementation into a network, we propose an end-to-end deep network for high-pitch helical CT reconstruction with sparse detectors. Since our network utilizes features extracted from both sinograms and CT images, it can simultaneously reduce the streak artifacts caused by the sparsity of the sinograms and retain fine details in the CT images. Experiments show that our network outperforms related methods in both subjective and objective evaluations.
Recent advances in machine learning have created increasing interest in solving visual computing problems using a class of coordinate-based neural networks that parameterize physical properties of scenes or objects across space and time. These methods, which we call neural fields, have seen successful application in the synthesis of 3D shapes and images, animation of human bodies, 3D reconstruction, and pose estimation. However, due to rapid progress in a short time, many papers exist but a comprehensive review and formulation of the problem has not yet emerged. In this report, we address this limitation by providing context, mathematical grounding, and an extensive review of the literature on neural fields. This report covers research along two dimensions. In the first part, we focus on techniques in neural fields by identifying common components of neural field methods, including different representations, architectures, forward mappings, and generalization methods. In the second part, we focus on applications of neural fields to different problems in visual computing and beyond (e.g., robotics, audio). Our review shows the breadth of topics already covered in visual computing, both historically and in current incarnations, and demonstrates the improved quality, flexibility, and capability brought by neural field methods. Finally, we present a companion website that hosts a living version of this review, which can be continually updated by the community.
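A minimal instance of the coordinate-based networks this report surveys is an MLP that maps spatial coordinates to a signal value and is fit by gradient descent. The architecture and toy target below are illustrative assumptions, not a method from the report:

```python
import torch
import torch.nn as nn

# A tiny neural field: an MLP mapping 2D coordinates to intensities.
field = nn.Sequential(
    nn.Linear(2, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)

# Coordinates on a grid in [-1, 1]^2 and a smooth synthetic target signal
xs = torch.linspace(-1, 1, 32)
y, x = torch.meshgrid(xs, xs, indexing="ij")
coords = torch.stack([x, y], dim=-1).reshape(-1, 2)
target = torch.sin(3 * coords[:, :1]) * torch.cos(3 * coords[:, 1:])

opt = torch.optim.Adam(field.parameters(), lr=1e-3)
initial = nn.functional.mse_loss(field(coords), target).item()
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(field(coords), target)
    loss.backward()
    opt.step()
final = loss.item()  # fitting error decreases as the field learns the signal
```

The representations, forward mappings, and generalization strategies the report catalogs are variations on exactly this pattern: what the coordinates are, what conditions the network, and how the output is rendered into an observation.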
Differentiable renderers provide a direct mathematical link between an object's 3D representation and images of that object. In this work, we develop an approximate differentiable renderer for a compact, interpretable representation, which we call Fuzzy Metaballs. Our approximate renderer focuses on rendering shapes via depth maps and silhouettes. It sacrifices fidelity for utility, producing fast runtimes and high-quality gradient information that can be used to solve vision tasks. Compared to mesh-based differentiable renderers, our method has forward passes that are 5x faster and backward passes that are 30x faster. The depth maps and silhouette images generated by our method are smooth and defined everywhere. In our evaluation of differentiable renderers for pose estimation, we show that our method is the only one comparable to classic techniques. For shape from silhouette, our method performs well using only gradient descent and a per-pixel loss, without any surrogate losses or regularization. These reconstructions work well even on natural video sequences with segmentation artifacts. Project page: https://leonidk.github.io/fuzzy-metaballs
In general, the problem of non-rigid registration is to match two different scans of a dynamic object taken at two different points in time. These scans can undergo both rigid motions and non-rigid deformations. Since new parts of the model may come into view while other parts become occluded between the two scans, the region of overlap is a subset of both scans. In the most general setting, no prior template shape is given, and no markers or explicit feature point correspondences are available. This case is therefore a partial matching problem, under the assumption that consecutive scans are taken with a significant overlapping region [28]. The problem addressed in this paper is to simultaneously map a deforming object in the environment and localize the camera.
Neural networks have shown great potential in compressing volume data for scientific visualization. However, due to the high costs of training and inference, such volumetric neural representations have so far only been applied to offline data processing and non-interactive rendering. In this paper, we demonstrate that by simultaneously leveraging modern GPU tensor cores, a native CUDA neural network framework, and online training, we can achieve high-performance, high-efficiency interactive ray tracing using volumetric neural representations. Moreover, our method is fully generalizable and can adapt to time-varying datasets. We present three strategies for online training, each leveraging a different combination of GPU, CPU, and out-of-core streaming techniques. We also develop three rendering implementations that allow interactive ray tracing to be coupled with real-time volume decoding, sample streaming, and in-shader neural network inference. We demonstrate that our volumetric neural representations can scale up to terascale for regular-grid volume visualization, and can easily support irregular data structures such as OpenVDB, unstructured, AMR, and particle volume data.
We study the performance of CLAIRE, a diffeomorphic multi-node, multi-GPU image registration algorithm and software, in large-scale biomedical imaging applications with billions of voxels. At such resolutions, most existing software packages for diffeomorphic image registration are prohibitively expensive. As a result, practitioners first significantly downsample the original images and then register them with existing tools. Our main contribution is an extensive analysis of the impact of downsampling on registration performance. We study this impact by comparing full-resolution registrations obtained with CLAIRE to lower-resolution registrations for synthetic and real-world imaging datasets. Our results suggest that registration at full resolution can yield superior registration quality, but not always. For example, downsampling a synthetic image from $1024^3$ to $256^3$ decreases the Dice coefficient from 92% to 79%. However, the differences are less pronounced for noisy or low-contrast high-resolution images. CLAIRE allows us not only to register clinically relevant size images in a few seconds but also to register images at unprecedented resolutions in a reasonable amount of time. The highest resolution considered is CLARITY images of size $2816 \times 3016 \times 1162$. To the best of our knowledge, this is the first study of image registration quality at such resolutions.
Physically based rendering of complex scenes can be prohibitively costly with a potentially unbounded and uneven distribution of complexity across the rendered image. The goal of an ideal level of detail (LoD) method is to make rendering costs independent of the 3D scene complexity, while preserving the appearance of the scene. However, current prefiltering LoD methods are limited in the appearances they can support due to their reliance on approximate models and other heuristics. We propose the first comprehensive multi-scale LoD framework for prefiltering 3D environments with complex geometry and materials (e.g., the Disney BRDF), while maintaining the appearance with respect to the ray-traced reference. Using a multi-scale hierarchy of the scene, we perform a data-driven prefiltering step to obtain an appearance phase function and directional coverage mask at each scale. At the heart of our approach is a novel neural representation that encodes this information into a compact latent form that is easy to decode inside a physically based renderer. Once a scene is baked out, our method requires no original geometry, materials, or textures at render time. We demonstrate that our approach compares favorably to state-of-the-art prefiltering methods and achieves considerable savings in memory for complex scenes.
This work investigates the use of robust optimal transport (OT) for shape matching. Specifically, we show that recent OT solvers improve both optimization-based and deep-learning methods for point cloud registration, boosting accuracy at an affordable computational cost. This manuscript starts with a practical overview of modern OT theory. We then provide solutions to the main difficulties of using this framework for shape matching. Finally, we showcase the performance of transport-enhanced registration models on a wide range of challenging tasks: rigid registration for partial shapes; scene flow estimation on the KITTI dataset; and nonparametric registration of lung vascular trees. Our OT-based methods achieve state-of-the-art results on KITTI in terms of accuracy and scalability, as well as on the challenging lung registration task. We also release PVT1010, a new public dataset of 1,010 pairs of lung vascular trees with densely sampled points. This dataset provides a challenging use case for point cloud registration algorithms with highly complex shapes and deformations. Our work demonstrates that robust OT enables fast pre-alignment and fine-tuning for a wide variety of registration models, thereby providing a new key method for the computer vision toolbox. Our code and datasets are available online at: https://github.com/uncbiag/robot.
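The entropy-regularized solvers underlying this line of work can be sketched with the classic Sinkhorn iterations; the version below is a didactic stand-in (dense Gibbs kernel, fixed iteration count), not the scalable solver used in the released code.

```python
import torch

def sinkhorn(a, b, cost, eps=0.1, iters=300):
    """Entropy-regularized OT plan between histograms a and b."""
    K = torch.exp(-cost / eps)  # Gibbs kernel
    u = torch.ones_like(a)
    for _ in range(iters):
        v = b / (K.t() @ u)     # enforce column marginals
        u = a / (K @ v)         # enforce row marginals
    return u[:, None] * K * v[None, :]  # transport plan

# Two small point clouds with uniform weights and a squared-distance cost
x = torch.rand(8, 3)
y = torch.rand(10, 3)
cost = torch.cdist(x, y) ** 2
a = torch.full((8,), 1 / 8)
b = torch.full((10,), 1 / 10)
plan = sinkhorn(a, b, cost)  # soft correspondences between the two clouds
```

The soft correspondences in `plan` can then drive a registration step, e.g. by matching each source point to its barycentric target under the plan.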
This tutorial survey paper reviews several different models for light interaction with volume densities of absorbing, glowing, reflecting, and/or scattering material. They are, in order of increasing realism, absorption only, emission only, emission and absorption combined, single scattering of external illumination without shadows, single scattering with shadows, and multiple scattering. For each model I give the physical assumptions, describe the applications for which it is appropriate, derive the differential or integral equations for light transport, present calculation methods for solving them, and show output images for a data set representing a cloud. Special attention is given to calculation methods for the multiple scattering model.
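The combined emission-absorption model among those surveyed discretizes the light transport integral into a front-to-back compositing loop along each ray. This sketch uses made-up sample values for a single ray through a uniform glowing cloud:

```python
import numpy as np

def emission_absorption(color, density, dt):
    """Front-to-back compositing of the emission-absorption model.

    Discretizes I = integral of c(t) * sigma(t) * exp(-integral of sigma)
    along one ray, given per-sample colors and densities at spacing dt."""
    transmittance = 1.0
    radiance = 0.0
    for c, sigma in zip(color, density):
        alpha = 1.0 - np.exp(-sigma * dt)       # opacity of this segment
        radiance += transmittance * alpha * c   # emitted light reaching the eye
        transmittance *= 1.0 - alpha            # light surviving the segment
    return radiance, transmittance

# A ray of 64 samples through a uniform, unit-color medium
color = np.ones(64)
density = np.full(64, 0.5)
radiance, T = emission_absorption(color, density, dt=0.1)
```

For a uniform unit-color medium the sum telescopes, so the accumulated radiance and the remaining transmittance add to one, a useful sanity check for any implementation of this model.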
Synthesizing photo-realistic images and videos is at the heart of computer graphics and has been the focus of decades of research. Traditionally, synthetic images of a scene are generated using rendering algorithms such as rasterization or ray tracing, which take representations of geometry and material properties as input. Collectively, these inputs define the actual scene and what is rendered, and are referred to as the scene representation (where a scene consists of one or more objects). Example scene representations are triangle meshes with accompanying textures (e.g., created by an artist), point clouds (e.g., from a depth sensor), volumetric grids (e.g., from a CT scan), or implicit surface functions (e.g., truncated signed distance fields). The reconstruction of such a scene representation from observations using differentiable rendering losses is known as inverse graphics or inverse rendering. Neural rendering is closely related, combining ideas from classical computer graphics and machine learning to create algorithms for synthesizing images from real-world observations. Neural rendering is a leap towards the goal of synthesizing photo-realistic image and video content. In recent years, we have seen immense progress in this field through hundreds of publications that show different ways to inject learnable components into the rendering pipeline. This state-of-the-art report on advances in neural rendering focuses on methods that combine classical rendering principles with learned 3D scene representations, now often referred to as neural scene representations. A key advantage of these methods is that they are 3D-consistent by design, enabling applications such as novel viewpoint synthesis of a captured scene. In addition to methods that handle static scenes, we also cover neural scene representations for modeling non-rigidly deforming objects...
We introduce a method to render Neural Radiance Fields (NeRFs) in real time using PlenOctrees, an octree-based 3D representation which supports view-dependent effects. Our method can render 800×800 images at more than 150 FPS, which is over 3000 times faster than conventional NeRFs. We do so without sacrificing quality while preserving the ability of NeRFs to perform free-viewpoint rendering of scenes with arbitrary geometry and view-dependent effects. Real-time performance is achieved by pre-tabulating the NeRF into a PlenOctree. In order to preserve view-dependent effects such as specularities, we factorize the appearance via closed-form spherical basis functions. Specifically, we show that it is possible to train NeRFs to predict a spherical harmonic representation of radiance, removing the viewing direction as an input to the neural network. Furthermore, we show that PlenOctrees can be directly optimized to further minimize the reconstruction loss, which leads to equal or better quality compared to competing methods. Moreover, this octree optimization step can be used to reduce the training time, as we no longer need to wait for the NeRF training to converge fully. Our real-time neural rendering approach may potentially enable new applications such as 6-DOF industrial and product visualizations, as well as next generation AR/VR systems. PlenOctrees are amenable to in-browser rendering as well; please visit the project page for the interactive online demo, as well as video and code: https://alexyu.net/plenoctrees.
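The closed-form spherical basis factorization can be sketched with low-degree real spherical harmonics: each leaf stores SH coefficients per color channel, and shading reduces to a dot product with the basis evaluated at the view direction. The per-leaf storage layout here is an assumption for the example; the constants are the standard real SH normalizations up to degree 1.

```python
import numpy as np

def sh_basis(d):
    """Real spherical harmonics up to degree 1 for a unit direction
    d = (x, y, z); higher degrees follow the same pattern."""
    x, y, z = d
    return np.array([
        0.28209479177387814,     # Y_0^0 (constant term)
        0.4886025119029199 * y,  # Y_1^{-1}
        0.4886025119029199 * z,  # Y_1^0
        0.4886025119029199 * x,  # Y_1^1
    ])

def shade(sh_coeffs, d):
    """sh_coeffs: (3, 4) RGB coefficients stored at a leaf; returns the
    view-dependent RGB color for view direction d."""
    return sh_coeffs @ sh_basis(d)

# A leaf whose only nonzero coefficient is the constant term shades to
# the same color from every direction (view-independent white here).
coeffs = np.zeros((3, 4))
coeffs[:, 0] = 1.0 / 0.28209479177387814
rgb = shade(coeffs, np.array([0.0, 0.0, 1.0]))
```

Removing the viewing direction from the network input and shading this way is what lets the whole field be baked into a lookup structure without losing specular effects.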
Advances in multi-spectral detectors are causing a paradigm shift in X-ray computed tomography (CT). Spectral information acquired from these detectors can be used to extract volumetric material composition maps of the object of interest. If the materials and their spectral responses are known a priori, the image reconstruction step is fairly straightforward. If they are not known, however, the maps as well as the responses need to be estimated jointly. A conventional workflow in spectral CT involves performing volume reconstruction followed by material decomposition, or vice versa. However, these methods inherently suffer from the ill-posedness of the joint reconstruction problem. To resolve this issue, we propose a dictionary-based joint reconstruction and unmixing method for spectral tomography (ADJUST). Our formulation relies on forming a dictionary of spectral signatures of materials common in CT and on prior knowledge of the number of materials present in the object. In particular, we decompose the spectral volume linearly in terms of spatial material maps, a spectral dictionary, and an indicator of the materials for the dictionary elements. We propose a memory-efficient accelerated alternating proximal gradient method to find an approximate solution to the resulting bi-convex problem. From numerical demonstrations on several synthetic phantoms, we observe that ADJUST performs exceedingly well compared to other state-of-the-art methods. Additionally, we address the robustness of ADJUST against limited measurement patterns.
Differentiable rendering aims to compute the derivative of the image rendering function with respect to the rendering parameters. This paper presents a novel algorithm for 6-DoF pose estimation through gradient-based optimization using a differentiable rendering pipeline. We emphasize two key contributions: (1) instead of solving the conventional 2D to 3D correspondence problem and computing reprojection errors, images (rendered using the 3D model) are compared only in the 2D feature space via sparse 2D feature correspondences. (2) Instead of an analytical image formation model, we compute an approximate local gradient of the rendering process through online learning. The learning data consists of image features extracted from multi-viewpoint renders at small perturbations in the pose neighborhood. The gradients are propagated through the rendering pipeline for the 6-DoF pose estimation using nonlinear least squares. This gradient-based optimization regresses directly upon the pose parameters by aligning the 3D model to reproduce a reference image shape. Using representative experiments, we demonstrate the application of our approach to pose estimation in proximity operations.
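The idea of replacing an analytical image-formation gradient with a locally estimated one can be sketched with a central-difference Jacobian feeding a Gauss-Newton step. The paper learns this local gradient online from perturbed renders; finite differences and the toy feature-producing "renderer" below are simpler stand-ins, not the authors' method.

```python
import numpy as np

def numeric_jacobian(render, pose, h=1e-4):
    """Central-difference Jacobian of a feature-producing render
    function with respect to the pose vector."""
    f0 = render(pose)
    J = np.zeros((f0.size, pose.size))
    for k in range(pose.size):
        dp = np.zeros_like(pose)
        dp[k] = h
        J[:, k] = (render(pose + dp) - render(pose - dp)) / (2 * h)
    return J

def gauss_newton_step(render, pose, target):
    """One nonlinear-least-squares update aligning rendered features
    to the reference image's features."""
    r = render(pose) - target          # feature residual
    J = numeric_jacobian(render, pose)
    delta, *_ = np.linalg.lstsq(J, -r, rcond=None)
    return pose + delta

# Toy 'renderer': maps a 2-DoF pose to 3 feature coordinates (linear,
# so a single Gauss-Newton step recovers the pose exactly).
render = lambda p: np.array([p[0] + p[1], p[0] - p[1], 2 * p[0]])
target = render(np.array([1.0, -2.0]))
pose = gauss_newton_step(render, np.zeros(2), target)
```

For a real renderer the residual is nonlinear, so the step is iterated, and the learned local gradient replaces the finite-difference probe to keep the number of renders manageable.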
DIVeR builds on the key ideas of NeRF and its variants (a density model and volume rendering) to learn 3D object models that can be rendered realistically from small numbers of images. In contrast to all previous NeRF methods, DIVeR uses deterministic rather than stochastic estimates of the volume rendering integral. DIVeR's representation is a voxel-based field of features. To compute the volume rendering integral, a ray is broken into intervals, one per voxel; the components of the volume rendering integral are estimated from the features of each interval using an MLP, and the components are aggregated. As a result, DIVeR can render thin translucent structures that are missed by other integrators. Furthermore, DIVeR's representation has relatively exposed semantics compared to other such methods: moving feature vectors around in the voxel space results in natural edits. Extensive qualitative and quantitative comparisons to current state-of-the-art methods show that DIVeR produces models that (1) render at or above state-of-the-art quality, (2) are very small without being baked, (3) render very fast without being baked, and (4) can be edited in natural ways.
Neural Radiance Fields (NeRF) is a popular method in data-driven 3D reconstruction. Owing to its simplicity and high-quality rendering, many NeRF applications are being developed. However, NeRF's major limitation is its slow speed. Many attempts have been made to speed up NeRF training and inference, including intricate code-level optimization and caching, the use of sophisticated data structures, and amortization through multi-task and meta learning. In this work, we revisit the basic building blocks of NeRF through the lens of classic techniques that predate it. We propose Voxel-Accelerated NeRF (VaxNeRF), which integrates NeRF with visual hull, a classic 3D reconstruction technique that only requires binary foreground-background pixel labels per image. Visual hull, which can be optimized in about 10 seconds, can provide coarse in-out scene separation to omit a substantial number of network evaluations in NeRF. We provide a clean, fully pythonic, JAX-based implementation on top of the popular JAXNeRF codebase, consisting of only about 30 lines of code changes and a modular visual hull subroutine, and achieve about a 2-8x speedup over the highly performant JAXNeRF baseline with zero degradation in rendering quality. With sufficient compute, this effectively brings down full NeRF training from hours to 30 minutes. We hope that VaxNeRF, a careful combination of a classic technique with a deep method (that arguably replaced it), can empower and accelerate new NeRF extensions and applications, with its simplicity, portability, and reliable performance gains. The code is available at https://github.com/naruya/vaxnerf.
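Visual-hull carving itself is a few lines: a voxel survives only if it falls inside the foreground silhouette in every view. The sketch below assumes orthographic projections along the three coordinate axes for brevity (an assumption for illustration; the actual method uses the real camera models of the training views):

```python
import numpy as np

def carve(masks_xyz, n=32):
    """Keep a voxel only if it projects into the foreground mask of
    every (axis-aligned, orthographic) view."""
    occ = np.ones((n, n, n), dtype=bool)
    mx, my, mz = masks_xyz  # (n, n) binary foreground masks per axis
    occ &= mx[None, :, :]   # carve with the view along x
    occ &= my[:, None, :]   # carve with the view along y
    occ &= mz[:, :, None]   # carve with the view along z
    return occ

# Three identical circular silhouettes carve an (approximate) ball.
n = 32
disk = np.fromfunction(
    lambda i, j: (i - n / 2) ** 2 + (j - n / 2) ** 2 < (n / 4) ** 2, (n, n)
)
hull = carve([disk, disk, disk], n)
```

Rays can then skip samples falling in the carved-out (False) region, which is where the reported speedup over the plain NeRF baseline comes from.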
We propose a differentiable sphere tracing algorithm to bridge the gap between inverse graphics methods and the recently proposed deep learning based implicit signed distance function. Due to the nature of the implicit function, the rendering process requires tremendous function queries, which is particularly problematic when the function is represented as a neural network. We optimize both the forward and backward passes of our rendering layer to make it run efficiently with affordable memory consumption on a commodity graphics card. Our rendering method is fully differentiable such that losses can be directly computed on the rendered 2D observations, and the gradients can be propagated backwards to optimize the 3D geometry. We show that our rendering method can effectively reconstruct accurate 3D shapes from various inputs, such as sparse depth and multi-view images, through inverse optimization. With the geometry based reasoning, our 3D shape prediction methods show excellent generalization capability and robustness against various noises. * Work done while Shaohui Liu was an academic guest at ETH Zurich.
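The sphere-tracing loop at the heart of this method steps along each ray by the distance the SDF reports, which is always a safe step size. Here an analytic sphere stands in for the network-predicted signed distance function:

```python
import numpy as np

def sphere_trace(sdf, origin, direction, max_steps=64, eps=1e-4, t_max=100.0):
    """March along a ray, stepping by the SDF value, until the surface
    (sdf < eps) is hit or the march escapes the scene."""
    t = 0.0
    for _ in range(max_steps):
        p = origin + t * direction
        d = sdf(p)
        if d < eps:
            return t   # hit: distance along the ray to the surface
        t += d         # the SDF value is a guaranteed-safe step
        if t > t_max:
            break
    return None        # miss

# Unit sphere at the origin as an analytic stand-in for the implicit network
unit_sphere = lambda p: np.linalg.norm(p) - 1.0
t_hit = sphere_trace(unit_sphere,
                     np.array([0.0, 0.0, -3.0]),
                     np.array([0.0, 0.0, 1.0]))
```

The paper's contribution is making both directions of this loop efficient and differentiable when `sdf` is a neural network, so that losses on the rendered depth or silhouette can be backpropagated to the 3D geometry.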