Machine learning models are typically evaluated by computing similarity with reference annotations and trained by maximizing similarity with them. Especially in the biomedical domain, annotations are subjective and suffer from low inter- and intra-rater reliability. Since annotations only reflect one annotator's interpretation of the real world, this can lead to sub-optimal predictions even though the model achieves high similarity scores. Here, the theoretical concept of Peak Ground Truth (PGT) is introduced. PGT marks the point beyond which an increase in similarity with the reference annotation stops translating into better Real World Model Performance (RWMP). Additionally, a quantitative technique to approximate PGT by computing inter- and intra-rater reliability is proposed. Finally, three categories of PGT-aware strategies for evaluating and improving model performance are reviewed.
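As a rough illustration of the proposed approximation, the sketch below estimates a PGT-style ceiling as the mean pairwise Dice overlap between annotations of the same image. The function names and the choice of Dice are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np
from itertools import combinations

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity between two binary masks."""
    inter = np.logical_and(a, b).sum()
    denom = a.sum() + b.sum()
    return 2.0 * inter / denom if denom > 0 else 1.0

def rater_reliability(masks: list) -> float:
    """Mean pairwise Dice across annotations of the same image.

    Masks from different raters yield inter-rater reliability;
    repeated masks from one rater yield intra-rater reliability.
    The resulting ceiling approximates PGT: model-vs-reference
    scores above it no longer imply better RWMP.
    """
    scores = [dice(a, b) for a, b in combinations(masks, 2)]
    return float(np.mean(scores))

# Toy example: three raters annotate the same image, each flipping
# a few percent of pixels relative to a shared underlying structure.
rng = np.random.default_rng(0)
base = rng.random((64, 64)) > 0.5
raters = [np.logical_xor(base, rng.random((64, 64)) > 0.95) for _ in range(3)]
print(f"Estimated PGT ceiling (Dice): {rater_reliability(raters):.3f}")
```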
Segmenting the fine structure of the mouse brain on magnetic resonance (MR) images is critical for delineating morphological regions, analyzing brain function, and understanding their relationships. Compared to a single MRI modality, multimodal MRI data provide complementary tissue features that can be exploited by deep learning models, resulting in better segmentation results. However, multimodal mouse brain MRI data are often lacking, making automatic segmentation of mouse brain fine structure a very challenging task. To address this issue, it is necessary to fuse multimodal MRI data to produce distinct contrasts in different brain structures. Hence, we propose a novel disentangled and contrastive GAN-based framework, named MouseGAN++, to synthesize multiple MR modalities from a single one in a structure-preserving manner, thus improving segmentation performance by imputing missing modalities and fusing multiple modalities. Our results demonstrate that the translation performance of our method outperforms the state-of-the-art methods. Using the subsequently learned modality-invariant information as well as the modality-translated images, MouseGAN++ can segment fine brain structures with average Dice coefficients of 90.0% (T2w) and 87.9% (T1w), respectively, achieving around +10% performance improvement compared to the state-of-the-art algorithms. Our results demonstrate that MouseGAN++, as a simultaneous image synthesis and segmentation method, can be used to fuse cross-modality information in an unpaired manner and yield more robust performance in the absence of multimodal data. We release our method as a mouse brain structural segmentation tool for free academic usage at https://github.com/yu02019.
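A minimal sketch of the inference-time idea, with hypothetical one-layer stand-ins for the actual networks (the real MouseGAN++ uses deep disentangled content/style encoders trained with contrastive and adversarial losses): impute the missing modality from modality-invariant content, then fuse everything for segmentation.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins; only the data flow is the point here.
content_enc = nn.Conv3d(1, 8, 3, padding=1)   # modality-invariant content
decoder_t2 = nn.Conv3d(8, 1, 3, padding=1)    # synthesizes the missing T2w
segmenter = nn.Conv3d(10, 4, 3, padding=1)    # fuses T1w + fake T2w + content

t1 = torch.randn(1, 1, 32, 32, 32)            # only T1w was acquired
content = content_enc(t1)                     # shared anatomical features
t2_fake = decoder_t2(content)                 # imputed T2w modality
fused = torch.cat([t1, t2_fake, content], dim=1)
structure_logits = segmenter(fused)           # per-voxel structure scores
```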
Light-sheet fluorescence microscopy (LSFM) is a cutting-edge volumetric imaging technique that allows three-dimensional imaging of mesoscopic samples with decoupled illumination and detection paths. Although the selective excitation scheme of such a microscope provides intrinsic optical sectioning that minimizes out-of-focus fluorescence background and photodamage to the sample, it is prone to light absorption and scattering effects, which result in uneven illumination and striping artifacts in the images. To address this issue, we propose in this paper a blind stripe artifact removal algorithm for LSFM, called DeStripe, which combines a self-supervised graph neural network with an unfolded Hessian prior. Specifically, inspired by the desirable property of the Fourier transform of condensing striping information into isolated values in the frequency domain, DeStripe first localizes the potentially corrupted Fourier coefficients by exploiting the structural difference between unidirectional stripe artifacts and the more isotropic foreground image. The affected Fourier coefficients are then fed into a graph neural network for recovery, with Hessian regularization unrolled to further ensure that structures in the standard image space are well preserved. Since stripe-free LSFM ground truth is barely available under standard image acquisition protocols, DeStripe is equipped with a Self2Self denoising loss term that enables artifact removal without access to stripe-free ground-truth images. Competitive experimental results demonstrate the efficacy of DeStripe in recovering corrupted biomarkers in LSFM with both synthetic and real stripe artifacts.
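The frequency-domain localization step can be illustrated as follows. This is a crude stand-in under stated assumptions: DeStripe feeds the flagged coefficients into a graph neural network with an unfolded Hessian prior, whereas the sketch simply damps outlier coefficients on the stripe axis.

```python
import numpy as np

def suppress_stripe_coeffs(img: np.ndarray, k: float = 3.0) -> np.ndarray:
    """Crude frequency-domain stripe suppression (illustration only).

    Vertical stripes concentrate energy along the horizontal axis of
    the centered 2D spectrum; coefficients on that axis whose magnitude
    is an outlier relative to the median are flagged and damped.
    """
    F = np.fft.fftshift(np.fft.fft2(img))
    cy, cx = np.array(F.shape) // 2
    row = np.abs(F[cy, :]).copy()
    row[cx] = 0                                # leave the DC component alone
    med = np.median(row)
    mad = np.median(np.abs(row - med)) + 1e-8
    outliers = np.abs(row - med) / mad > k     # likely stripe coefficients
    outliers[cx] = False
    F[cy, outliers] *= 0.1                     # damp instead of zeroing
    return np.real(np.fft.ifft2(np.fft.ifftshift(F)))

# Toy image: smooth foreground plus additive vertical stripes.
y, x = np.mgrid[0:128, 0:128]
img = np.exp(-((x - 64) ** 2 + (y - 64) ** 2) / 800) + 0.2 * np.sin(x * 0.8)
clean = suppress_stripe_coeffs(img)
```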
Classical multiple instance learning (MIL) methods are usually based on the assumption that instances are identically and independently distributed, thus neglecting the potentially rich contextual information beyond individual entities. On the other hand, Transformers with global self-attention modules have been proposed to model the interdependencies among all instances. However, in this paper we question: is global relation modeling with self-attention necessary, or can we appropriately restrict self-attention computations to local regimes in large-scale whole slide images (WSIs)? We propose a general-purpose local attention graph-based Transformer for MIL (LA-MIL), which introduces an inductive bias by explicitly contextualizing instances in adaptive local regimes of arbitrary size. Additionally, an efficiently adapted loss function enables our approach to learn expressive WSI embeddings for the joint analysis of multiple biomarkers. We demonstrate that LA-MIL achieves state-of-the-art gastrointestinal cancer prediction, outperforming existing models on important biomarkers such as microsatellite instability for colorectal cancer. Our findings suggest that local self-attention suffices to model dependencies on par with global modules. Our LA-MIL implementation is available at https://github.com/agentdr1/la_mil.
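The core idea of restricting self-attention to local regimes can be sketched with a k-nearest-neighbour attention mask over patch coordinates. The single-head, projection-free layer below is a simplification for illustration, not the paper's exact module.

```python
import torch

def local_attention(feats: torch.Tensor, coords: torch.Tensor, k: int = 8):
    """Self-attention restricted to each patch's k nearest neighbours.

    feats:  (N, D) instance features from a WSI
    coords: (N, 2) patch positions used to define locality
    """
    N, D = feats.shape
    dist = torch.cdist(coords, coords)              # (N, N) spatial distances
    knn = dist.topk(k + 1, largest=False).indices   # self + k neighbours
    mask = torch.full((N, N), float("-inf"))
    mask.scatter_(1, knn, 0.0)                      # allow only local edges
    q, kk, v = feats, feats, feats                  # single head, no projections
    attn = torch.softmax(q @ kk.T / D ** 0.5 + mask, dim=-1)
    return attn @ v

feats = torch.randn(100, 64)        # 100 patches, 64-d features
coords = torch.rand(100, 2) * 1000  # patch centres in pixels
out = local_attention(feats, coords)  # locally contextualized instances
```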
Annotating microscopy images for nuclei segmentation is laborious and time-consuming. To leverage the few existing annotations, also across multiple modalities, we propose a novel microscopy-style augmentation technique based on a generative adversarial network (GAN). Unlike other style transfer methods, it can handle not only different cell assay types and lighting conditions but also different imaging modalities, such as bright-field and fluorescence microscopy. Using disentangled representations for content and style, we can preserve the structure of the original image while altering its style during augmentation. We evaluate our data augmentation on the 2018 Data Science Bowl dataset, which comprises various cell assays, lighting conditions, and imaging modalities. With our augmentation, the segmentation accuracy of two top-ranked Mask R-CNN-based nuclei segmentation algorithms from the competition increases significantly. Thus, our augmentation technique renders downstream tasks more robust to test-data heterogeneity and helps counteract class imbalance without resampling minority classes.
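A minimal sketch of the content/style split behind the augmentation, using hypothetical one-layer stand-ins for the trained GAN's encoders and decoder; only the idea of keeping the content code while resampling the style code is taken from the abstract.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins; the real networks are trained adversarially.
content_enc = nn.Conv2d(1, 8, 3, padding=1)   # structure (nuclei layout)
style_dim = 4
decoder = nn.Conv2d(8 + style_dim, 1, 3, padding=1)

def style_augment(img: torch.Tensor) -> torch.Tensor:
    """Keep the image's content code, swap in a randomly sampled style."""
    c = content_enc(img)                              # (B, 8, H, W)
    s = torch.randn(img.shape[0], style_dim, 1, 1)    # new style code
    s = s.expand(-1, -1, img.shape[2], img.shape[3])  # broadcast spatially
    return decoder(torch.cat([c, s], dim=1))

batch = torch.randn(2, 1, 64, 64)   # e.g. bright-field crops
augmented = style_augment(batch)    # same structures, new "modality" look
```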
In recent years, arbitrary image style transfer has attracted more and more attention. Given a pair of content and style images, the goal is to produce a stylized image that retains the content of the former while capturing the style patterns of the latter. However, it is difficult to maintain a good trade-off between content details and style features. When an image is stylized with sufficient style patterns, the content details may be damaged, and sometimes the objects in the image can no longer be clearly distinguished. For this reason, we present a new transformer-based method named STT for image style transfer, together with an edge loss that noticeably enhances content details and avoids the blurred results caused by excessive rendering of style features. Qualitative and quantitative experiments demonstrate that STT achieves performance comparable to state-of-the-art image style transfer methods while alleviating the content leak problem.
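One plausible realization of such an edge loss, assuming Sobel gradients as the edge extractor (the abstract does not specify the exact operator): penalize the distance between the edge maps of the stylized output and the content image.

```python
import torch
import torch.nn.functional as F

def sobel_edges(img: torch.Tensor) -> torch.Tensor:
    """Per-channel edge magnitude via Sobel filters."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    ky = kx.T.contiguous()
    C = img.shape[1]
    wx = kx.view(1, 1, 3, 3).repeat(C, 1, 1, 1)
    wy = ky.view(1, 1, 3, 3).repeat(C, 1, 1, 1)
    gx = F.conv2d(img, wx, padding=1, groups=C)
    gy = F.conv2d(img, wy, padding=1, groups=C)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

def edge_loss(stylized: torch.Tensor, content: torch.Tensor) -> torch.Tensor:
    """Penalize the stylized output's edges drifting from the content's."""
    return F.l1_loss(sobel_edges(stylized), sobel_edges(content))

content = torch.rand(1, 3, 256, 256)    # content image
stylized = torch.rand(1, 3, 256, 256)   # would come from the STT decoder
loss = edge_loss(stylized, content)     # added to the usual style losses
```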
In recent years, the Transformer architecture has shown its superiority in the video-based person re-identification task. Inspired by video representation learning, these methods mainly focus on designing modules to extract informative spatial and temporal features. However, they are still limited in extracting local attributes and global identity information, which are critical for the person re-identification task. In this paper, we propose a novel Multi-Stage Spatial-Temporal Aggregation Transformer (MSTAT) with two newly designed proxy embedding modules to address the above issue. Specifically, MSTAT consists of three stages that encode the attribute-associated, the identity-associated, and the attribute-identity-associated information from the video clips, respectively, achieving a holistic perception of the input person. We combine the outputs of all the stages for the final identification. In practice, to save computational cost, Spatial-Temporal Aggregation (STA) modules are first adopted in each stage to conduct the self-attention operations along the spatial and temporal dimensions separately. We further introduce the Attribute-Aware and Identity-Aware Proxy embedding modules (AAP and IAP) to extract informative and discriminative feature representations at different stages. All of them are realized by employing newly designed self-attention operations with specific meanings. Moreover, temporal patch shuffling is also introduced to further improve the robustness of the model. Extensive experimental results demonstrate the effectiveness of the proposed modules in extracting informative and discriminative information from the videos, and illustrate that MSTAT achieves state-of-the-art accuracy on various standard benchmarks.
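One plausible reading of the temporal patch shuffling augmentation is sketched below: perturb the frame order of each spatial patch independently so that appearance is kept while temporal order is disturbed, pushing the model toward order-robust identity features. This is an interpretation, not the paper's exact procedure.

```python
import torch

def temporal_patch_shuffle(tokens: torch.Tensor) -> torch.Tensor:
    """Shuffle frame order independently for each spatial patch.

    tokens: (B, T, N, D) video patch embeddings, T frames, N patches.
    """
    B, T, N, D = tokens.shape
    perm = torch.argsort(torch.rand(B, T, N), dim=1)   # per-patch frame order
    perm = perm.unsqueeze(-1).expand(-1, -1, -1, D)    # (B, T, N, D) index
    return torch.gather(tokens, 1, perm)

tokens = torch.randn(2, 8, 49, 192)        # 8 frames, 7x7 patches, 192-d
shuffled = temporal_patch_shuffle(tokens)  # applied only during training
```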
We propose a novel approach to self-supervised learning of point cloud representations by differentiable neural rendering. Motivated by the fact that informative point cloud features should be able to encode rich geometry and appearance cues and render realistic images, we train a point-cloud encoder within a purpose-designed point-based neural renderer by comparing the rendered images with real images on massive RGB-D data. The learned point-cloud encoder can be easily integrated into various downstream tasks, including not only high-level tasks like 3D detection and segmentation, but also low-level tasks like 3D reconstruction and image synthesis. Extensive experiments on various tasks demonstrate the superiority of our approach compared to existing pre-training methods.
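A heavily simplified sketch of render-based pre-training, with hypothetical stand-ins throughout (the shallow encoder, the orthographic splat, and the 1x1 "renderer" head are all assumptions): encode a point cloud, render it to an image, and supervise with a paired real photo so that gradients flow back into the encoder.

```python
import torch
import torch.nn as nn

encoder = nn.Linear(3, 16)    # hypothetical per-point feature encoder
to_rgb = nn.Conv2d(16, 3, 1)  # hypothetical shallow "neural renderer" head

def render(points, feats, H=32, W=32):
    """Orthographic splat of point features onto an HxW feature image."""
    img = torch.zeros(1, feats.shape[1], H, W)
    u = ((points[:, 0] + 1) / 2 * (W - 1)).long().clamp(0, W - 1)
    v = ((points[:, 1] + 1) / 2 * (H - 1)).long().clamp(0, H - 1)
    img[0, :, v, u] = feats.T   # last point per pixel wins (toy splatting)
    return to_rgb(img)

points = torch.rand(500, 3) * 2 - 1   # a synthetic RGB-D-style cloud
target = torch.rand(1, 3, 32, 32)     # the "real" paired image
feats = encoder(points)
loss = (render(points, feats) - target).abs().mean()
loss.backward()                       # gradients reach the point encoder
```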
Collaboration among industrial Internet of Things (IoT) devices and edge networks is essential to support computation-intensive deep neural network (DNN) inference services which require low delay and high accuracy. Sampling rate adaptation, which dynamically configures the sampling rates of industrial IoT devices according to network conditions, is key to minimizing the service delay. In this paper, we investigate the collaborative DNN inference problem in industrial IoT networks. To capture the channel variation and task arrival randomness, we formulate the problem as a constrained Markov decision process (CMDP). Specifically, sampling rate adaptation, inference task offloading, and edge computing resource allocation are jointly considered to minimize the average service delay while guaranteeing the long-term accuracy requirements of different inference services. Since the CMDP cannot be directly solved by general reinforcement learning (RL) algorithms due to the intractable long-term constraints, we first transform the CMDP into an MDP by leveraging the Lyapunov optimization technique. Then, a deep RL-based algorithm is proposed to solve the MDP. To expedite the training process, an optimization subroutine is embedded in the proposed algorithm to directly obtain the optimal edge computing resource allocation. Extensive simulation results are provided to demonstrate that the proposed RL-based algorithm can significantly reduce the average service delay while preserving long-term inference accuracy with a high probability.
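An illustrative virtual-queue view of the Lyapunov transformation that turns the CMDP's long-term accuracy constraint into per-slot terms. All names and numbers below are assumptions, not values from the paper; a real agent would choose sampling rates and offloading decisions via the proposed deep RL algorithm.

```python
import random

random.seed(0)
V = 10.0          # delay/accuracy trade-off weight
acc_req = 0.90    # long-term average accuracy requirement
Q = 0.0           # virtual queue: accumulated accuracy deficit
total_cost = 0.0

for t in range(1000):
    # Simulated per-slot outcomes of some policy.
    acc_t = random.uniform(0.85, 0.97)
    delay_t = random.uniform(5.0, 20.0)

    # Drift-plus-penalty cost the per-slot policy minimizes.
    total_cost += V * delay_t + Q * (acc_req - acc_t)

    # Queue stability <=> the long-term accuracy constraint is met.
    Q = max(Q + acc_req - acc_t, 0.0)

print(f"avg drift-plus-penalty cost: {total_cost / 1000:.2f}")
print(f"final accuracy-deficit queue: {Q:.3f}")
```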
The traditional statistical inference is static, in the sense that the estimate of the quantity of interest does not affect the future evolution of the quantity. In some sequential estimation problems, however, the future values of the quantity to be estimated depend on the estimate of its current value. This type of estimation problem has been formulated as the dynamic inference problem. In this work, we formulate the Bayesian learning problem for dynamic inference, where the unknown quantity-generation model is assumed to be randomly drawn according to a random model parameter. We derive the optimal Bayesian learning rules, both offline and online, to minimize the inference loss. Moreover, learning for dynamic inference can serve as a meta problem, such that all familiar machine learning problems, including supervised learning, imitation learning, and reinforcement learning, can be cast as its special cases or variants. Gaining a good understanding of this unifying meta problem thus sheds light on a broad spectrum of machine learning problems as well.
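A toy sketch of online Bayesian learning for dynamic inference under stated assumptions: the quantity evolves as a function of our current estimate, the unknown model parameter indexes two candidate dynamics (an invented example, not the paper's setup), and the posterior-mean estimate is updated after each observed transition.

```python
import math
import random

random.seed(0)
cands = [(0.9, 0.05), (0.5, 0.40)]  # candidate (a, b) in x' = a*x + b*est
post = [0.5, 0.5]                   # prior over the model parameter
sigma = 0.1
a_true, b_true = cands[1]           # nature draws the second model

x, est = 1.0, 0.0
for t in range(50):
    # Online Bayes-optimal estimate under squared loss: posterior mean.
    est = sum(p * (a * x + b * est) for p, (a, b) in zip(post, cands))
    # The true quantity evolves using OUR estimate (dynamic inference).
    x_next = a_true * x + b_true * est + random.gauss(0, sigma)
    # Posterior update from the observed transition.
    liks = [math.exp(-((x_next - (a * x + b * est)) ** 2) / (2 * sigma ** 2))
            for a, b in cands]
    Z = sum(p * l for p, l in zip(post, liks))
    post = [p * l / Z for p, l in zip(post, liks)]
    x = x_next

print(f"posterior over candidate dynamics: {post}")
```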