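Diffusion models are a class of generative models that, when trained on natural image datasets, outperform other generative models at creating realistic images. We introduce DISPR, a diffusion-based model for solving the inverse problem of predicting three-dimensional (3D) cell shape from two-dimensional (2D) single-cell microscopy images. Using the 2D microscopy image as a prior, DISPR is conditioned to predict realistic 3D shape reconstructions. To showcase the applicability of DISPR as a data augmentation tool in a feature-based single-cell classification task, we extract morphological features from cells grouped into six highly imbalanced classes. Adding features from DISPR predictions to the three minority classes improved the macro F1 score from $F1_\text{macro} = 55.2 \pm 4.6\%$ to $F1_\text{macro} = 72.2 \pm 4.9\%$. Since our method is the first to employ a diffusion-based model in this context, we demonstrate that diffusion models can be applied to inverse problems in 3D, and that they learn to reconstruct 3D shapes with realistic morphological features from 2D microscopy images.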
translated by 谷歌翻译
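The macro F1 score quoted in the abstract above weights every class equally, which is why it is sensitive to performance on minority classes. A minimal sketch in plain Python follows; the function name `macro_f1` and the toy labels are illustrative assumptions, not data or code from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels: class "a" dominates, class "b" is a minority.
y_true = ["a", "a", "a", "a", "b", "b"]
y_pred = ["a", "a", "a", "a", "a", "b"]
score = macro_f1(y_true, y_pred)  # per-class F1: a = 8/9, b = 2/3
```

A single missed minority sample pulls the minority-class F1 down to 2/3, and the macro average passes that penalty on at full weight.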
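Deep learning-based classification of rare anemia disorders is challenged by the lack of training data and instance-level annotations. Multiple instance learning (MIL) has proven to be an effective solution, yet it suffers from low accuracy and limited explainability. Although the inclusion of attention mechanisms has addressed these issues, their effectiveness depends heavily on the number and diversity of cells in the training samples. Consequently, machine learning still performs poorly at classifying rare anemia disorders from blood samples. In this paper, we propose an interpretable pooling method to address these limitations. By benefiting from instance-level information in negative bags (i.e., homogeneous cells from healthy individuals), our approach increases the contribution of anomalous instances. We show that our strategy outperforms standard MIL classification algorithms and provides a meaningful explanation behind its decisions. Moreover, it can indicate anomalous instances of rare blood diseases that were not seen during the training phase.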
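For orientation, the attention mechanisms mentioned in the abstract above typically pool a bag of instances with softmax-normalized weights. The sketch below shows that generic attention pooling only; it is not the paper's proposed pooling method, and the scores are assumed to come from a learned scoring network that is omitted here.

```python
import math


def attention_pool(instances, scores):
    """Pool instance feature vectors into one bag vector.

    instances: list of equal-length feature vectors, one per cell.
    scores: one unnormalized attention score per instance (assumed input).
    """
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift for numerical stability
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(instances[0])
    bag = [sum(w * x[d] for w, x in zip(weights, instances)) for d in range(dim)]
    return bag, weights


# Hypothetical bag of three cells with 2-D features; the third scores highest.
cells = [[1.0, 0.0], [1.0, 0.2], [5.0, 3.0]]
bag, weights = attention_pool(cells, scores=[0.0, 0.0, 2.0])
```

The weights double as a per-instance explanation: the cell with the largest weight contributed most to the bag representation.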
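Diagnosing hematological malignancies requires identifying and classifying white blood cells in peripheral blood smears. Domain shifts caused by different laboratory procedures, staining, illumination, and microscope settings hamper the reusability of recently developed machine learning methods on data collected from different sites. Here, we propose a cross-domain adapted autoencoder to extract features from three different datasets of single white blood cells scanned from peripheral blood smears. The autoencoder is based on an R-CNN architecture, which allows it to focus on the relevant white blood cell and eliminate artifacts in the image. To evaluate the quality of the extracted features, we use a simple random forest to classify single cells. We show that, thanks to the rich features extracted by the autoencoder trained on only one of the datasets, the random forest classifier performs well on the unseen datasets and outperforms oracle networks in the cross-domain task. Our results suggest that this unsupervised approach can be employed in more complicated diagnostic and prognostic tasks without the need to add expensive expert labels to unseen data.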
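Light-sheet fluorescence microscopy (LSFM) is a cutting-edge volumetric imaging technique that allows three-dimensional imaging of mesoscopic samples with decoupled illumination and detection paths. Although the selective excitation scheme of such a microscope provides intrinsic optical sectioning that minimizes out-of-focus fluorescence background and sample photodamage, it is prone to light absorption and scattering effects, which lead to uneven illumination and striping artifacts in the images. To tackle this issue, we propose in this paper a blind stripe artifact removal algorithm for LSFM, called Destripe, which combines a self-supervised spatio-spectral graph neural network with an unfolded Hessian prior. Specifically, inspired by the desirable property of the Fourier transform of condensing striping information into isolated values in the frequency domain, Destripe first localizes the potentially corrupted Fourier coefficients by exploiting the structural difference between unidirectional stripe artifacts and more isotropic foreground images. The affected Fourier coefficients can then be fed into a graph neural network for recovery, with an unrolled Hessian regularization to further ensure that structures in the standard image space are well preserved. Since stripe-free LSFM images barely exist in practice under a standard image acquisition protocol, Destripe is equipped with a Self2Self denoising loss term, which enables artifact removal without access to stripe-free ground-truth images. Competitive experimental results demonstrate the efficacy of Destripe in recovering corrupted biomarkers in LSFM with both synthetic and real stripe artifacts.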
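Reconstructing 3D objects from 2D images is challenging both for our brains and for machine learning algorithms. To support this spatial reasoning task, contextual information about the overall shape of an object is critical. However, such information is not captured by established loss terms (e.g., Dice loss). We propose to complement geometrical shape information by including multi-scale topological features, such as connected components, cycles, and voids, in the reconstruction loss. Our method uses cubical complexes to calculate topological features of 3D volume data and employs an optimal transport distance to guide the reconstruction process. This topology-aware loss is fully differentiable, computationally efficient, and can be added to any neural network. We demonstrate the utility of our loss by incorporating it into SHAPR, a model for predicting the 3D shape of individual cells from 2D microscopy images. Using a hybrid loss that leverages both the geometrical and the topological information of single objects to assess their shape, we find that topological information substantially improves reconstruction quality, highlighting its ability to extract more relevant features from image datasets.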
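The claim that overlap-based losses such as Dice miss shape context can be illustrated with a toy case. The sketch below uses 1D binary masks rather than 3D volumes, and simply counts connected runs instead of computing persistent homology on cubical complexes; both simplifications are assumptions made for brevity, and the masks are made up.

```python
def dice_loss(pred, target):
    """1 - Dice coefficient for two equally long binary (0/1) sequences."""
    overlap = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * overlap / total if total else 1.0)


def count_components(mask):
    """Number of runs of consecutive 1s in a 1D binary mask."""
    return sum(
        1 for i, v in enumerate(mask) if v == 1 and (i == 0 or mask[i - 1] == 0)
    )


target = [1, 1, 1, 1, 0, 0]
solid = [1, 1, 1, 0, 0, 0]   # one connected piece
broken = [1, 1, 0, 1, 0, 0]  # same overlap, but split into two pieces

same_loss = dice_loss(solid, target) == dice_loss(broken, target)
pieces = (count_components(solid), count_components(broken))
```

Both predictions receive the same Dice loss of 1/7, although only one of them matches the target's single connected component. A topological term is meant to separate such cases.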
Neural networks are increasingly applied in safety-critical domains, so their verification is gaining importance. A large class of recent algorithms for proving input-output relations of feed-forward neural networks is based on linear relaxations and symbolic interval propagation. However, due to variable dependencies, the approximations deteriorate with increasing network depth. In this paper we present DPNeurifyFV, a novel branch-and-bound solver for ReLU networks with low-dimensional input space that is based on symbolic interval propagation with fresh variables and input splitting. A new heuristic for choosing the fresh variables makes it possible to ameliorate the dependency problem, while our novel splitting heuristic, in combination with several other improvements, speeds up the branch-and-bound procedure. We evaluate our approach on the airborne collision avoidance networks ACAS Xu and demonstrate runtime improvements compared to state-of-the-art tools.
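The dependency problem mentioned in the abstract above is easiest to see with plain (non-symbolic) interval propagation. The sketch below is that simpler technique, not DPNeurifyFV's symbolic propagation with fresh variables; the tiny network and all names are hypothetical.

```python
def affine_bounds(lower, upper, weights, bias):
    """Propagate elementwise input bounds through y = W x + b."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            # A positive weight maps lower to lower; a negative one swaps them.
            lo += w * l if w >= 0 else w * u
            hi += w * u if w >= 0 else w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi


def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


# Hypothetical network: h1 = h2 = relu(x), output z = h1 - h2, with x in [-1, 1].
h_lo, h_hi = relu_bounds(*affine_bounds([-1.0], [1.0], [[1.0], [1.0]], [0.0, 0.0]))
z_lo, z_hi = affine_bounds(h_lo, h_hi, [[1.0, -1.0]], [0.0])
```

The true output is always 0, but the computed bounds are [-1, 1], because the intervals forget that both hidden units depend on the same input. Symbolic bounds that keep track of such dependencies are meant to reduce this kind of over-approximation.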
Insects as pollinators play a key role in ecosystem management and world food production. However, insect populations are declining, creating a global need for insect monitoring. Existing methods analyze video or time-lapse images of insects in nature, but the analysis is challenging since insects are small objects in complex and dynamic scenes of natural vegetation. The current paper provides a dataset of primarily honeybees visiting three different plant species during two months of the summer period. The dataset consists of more than 700,000 time-lapse images from multiple cameras, including more than 100,000 annotated images. The paper presents a new pipeline for detecting insects in time-lapse RGB images. The pipeline consists of a two-step process. First, the time-lapse RGB images are preprocessed to enhance insects in the images. We propose a new preprocessing enhancement method: Motion-Informed-enhancement. The technique uses motion and colors to enhance insects in images. The enhanced images are subsequently fed into a convolutional neural network (CNN) object detector. Motion-Informed-enhancement improves the deep learning object detectors You Only Look Once (YOLO) and Faster Region-based Convolutional Neural Networks (Faster R-CNN). Using Motion-Informed-enhancement, the YOLO detector improves its average micro F1-score from 0.49 to 0.71, and the Faster R-CNN detector improves its average micro F1-score from 0.32 to 0.56 on our dataset. Our datasets are published at: https://vision.eng.au.dk/mie/
Reliable application of machine learning-based decision systems in the wild is one of the major challenges currently investigated by the field. A large portion of established approaches aims to detect erroneous predictions by assigning confidence scores. This confidence may be obtained by quantifying the model's predictive uncertainty, learning explicit scoring functions, or assessing whether the input is in line with the training distribution. Curiously, while these approaches all claim to address the same eventual goal of detecting failures of a classifier in real-life application, they currently constitute largely separated research fields with individual evaluation protocols, which either exclude a substantial part of relevant methods or ignore large parts of relevant failure sources. In this work, we systematically reveal current pitfalls caused by these inconsistencies and derive requirements for a holistic and realistic evaluation of failure detection. To demonstrate the relevance of this unified perspective, we present a large-scale empirical study that, for the first time, enables benchmarking of confidence scoring functions w.r.t. all relevant methods and failure sources. The revelation of a simple softmax response baseline as the overall best performing method underlines the drastic shortcomings of current evaluation in the abundance of publicized research on confidence scoring. Code and trained models are at https://github.com/IML-DKFZ/fd-shifts.
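The softmax response baseline named in the abstract above is commonly understood as the maximum softmax probability of a classifier's output. A minimal sketch follows, assuming raw logits are available; the threshold of 0.5 and the helper `flag_failure` are illustrative assumptions and are not taken from the linked repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_response(logits):
    """Confidence score: the largest softmax probability."""
    return max(softmax(logits))


def flag_failure(logits, threshold=0.5):
    """Flag a prediction as a likely failure when confidence is below threshold."""
    return softmax_response(logits) < threshold


confident = softmax_response([2.0, 1.0, 0.1])  # one class clearly ahead
uncertain = softmax_response([0.0, 0.0, 0.0])  # uniform over three classes
```

In practice the threshold would be chosen on held-out data, trading off how many correct predictions are rejected against how many failures slip through.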
Pre-trained protein language models have demonstrated significant applicability in different protein engineering tasks. A common use of the latent representations of these pre-trained transformer models is to mean-pool across residue positions, reducing the feature dimensions for downstream tasks such as predicting biophysical properties or other functional behaviours. In this paper we provide a two-fold contribution to machine learning (ML) driven drug design. Firstly, we demonstrate the power of sparsity by promoting penalization of pre-trained transformer models to secure more robust and accurate melting temperature (Tm) prediction of single-chain variable fragments, with a mean absolute error of 0.23 °C. Secondly, we demonstrate the power of framing our prediction problem in a probabilistic framework. Specifically, we advocate for adopting probabilistic frameworks, especially in the context of ML-driven drug design.
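The mean pooling described in the abstract above collapses a per-residue embedding matrix of shape (sequence length, embedding dimension) into a single vector. A minimal sketch in plain Python; the 4-residue, 3-dimensional embeddings are made-up numbers, not output of any real protein language model.

```python
def mean_pool(residue_embeddings):
    """Average per-residue vectors into one fixed-size sequence vector."""
    n = len(residue_embeddings)
    dim = len(residue_embeddings[0])
    return [sum(vec[d] for vec in residue_embeddings) / n for d in range(dim)]


# Hypothetical embeddings: 4 residues, 3 features each.
embeddings = [
    [1.0, 2.0, 3.0],
    [3.0, 2.0, 1.0],
    [2.0, 2.0, 2.0],
    [2.0, 2.0, 6.0],
]
pooled = mean_pool(embeddings)
```

The pooled vector has the same length regardless of sequence length, which is what makes it convenient as input to a downstream regressor, at the cost of discarding position-specific information.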
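Unsupervised sentiment analysis is traditionally performed by counting the words in a text that are stored in a sentiment lexicon and then assigning a label according to the proportion of positive and negative words registered. While these "counting" methods are considered beneficial because they rate a text deterministically, their classification rates decrease when the analyzed texts are short or when the vocabulary differs from what the lexicon considers the default. The model proposed in this paper, called Lex2Sent, is an unsupervised sentiment analysis method that improves the classification of sentiment lexicon methods. For this purpose, a Doc2Vec model is trained to determine the distances between document embeddings and the embeddings of the positive and negative parts of a sentiment lexicon. These distances are then evaluated over multiple executions of Doc2Vec on resampled documents and averaged to perform the classification task. For the three benchmark datasets considered in this paper, the proposed Lex2Sent outperforms every evaluated lexicon, including state-of-the-art lexica such as VADER or the Opinion Lexicon, in terms of classification rate.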
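The traditional "counting" approach that the abstract above contrasts with can be sketched in a few lines. This is the baseline idea only, not Lex2Sent; the two tiny word sets are hypothetical stand-ins for a real sentiment lexicon.

```python
# Hypothetical miniature lexicon; real lexica contain thousands of entries.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "poor", "hate"}


def lexicon_label(text):
    """Label a text by comparing counts of positive and negative lexicon words."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos == neg:
        return "neutral"  # includes the case where no lexicon word occurs
    return "positive" if pos > neg else "negative"


long_review = lexicon_label("great plot but poor acting and bad pacing")
short_review = lexicon_label("nice film")  # no lexicon hit at all
```

The second call shows the weakness named in the abstract: a short text whose vocabulary falls outside the lexicon produces no signal, so the deterministic count cannot decide.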