Obtaining 3D object representations is important for creating photo-realistic simulators and collecting assets for AR/VR applications. Neural fields have shown their effectiveness in learning a continuous volumetric representation of a scene from 2D images, but acquiring object representations from these models with weak supervision remains an open challenge. In this paper we introduce LaTeRF, a method for extracting an object of interest from a scene given 2D images of the whole scene with known camera poses, a natural language description of the object, and a small number of point labels of object and non-object points in the input images. To faithfully extract the object from the scene, LaTeRF extends the NeRF formulation with an additional "objectness" probability at each 3D point. Additionally, we leverage the rich latent space of a pre-trained CLIP model combined with our differentiable object renderer to inpaint the occluded parts of the object. We demonstrate high-fidelity object extraction on both synthetic and real datasets and justify our design choices through an extensive ablation study.
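A minimal PyTorch sketch of the core idea, an extra per-point "objectness" probability gating the standard NeRF compositing, may help make this concrete; the variable names and the exact gating form are illustrative assumptions, not LaTeRF's actual code:

```python
import torch

def render_object(rgb, sigma, obj_logit, deltas):
    """Volume-render only the object by gating each sample's density with
    its predicted objectness probability (a sketch of the idea only).

    rgb:       (N_rays, N_samples, 3) per-sample colors
    sigma:     (N_rays, N_samples)    per-sample densities
    obj_logit: (N_rays, N_samples)    per-sample objectness logits
    deltas:    (N_rays, N_samples)    distances between adjacent samples
    """
    obj_prob = torch.sigmoid(obj_logit)           # objectness in [0, 1]
    sigma_obj = sigma * obj_prob                  # suppress non-object density
    alpha = 1.0 - torch.exp(-sigma_obj * deltas)  # standard NeRF alpha
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=-1),
        dim=-1)[:, :-1]                           # accumulated transmittance
    weights = alpha * trans
    return (weights[..., None] * rgb).sum(dim=-2) # composited object color
```

In this reading, the point labels would supervise the objectness probabilities directly, while the CLIP-based loss acts on images produced by the object-only renderer.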
In addition to the public health crisis it has caused, the COVID-19 pandemic has led to the shutdown and closure of workplaces, with an estimated total cost of more than $16 trillion. Given the long hours an average person spends in buildings and indoor environments, this research article proposes data-driven control strategies for designing optimal indoor airflow that minimizes occupants' exposure to viral pathogens in built environments. A general control framework is put forward for designing an optimal velocity field, and proximal policy optimization (PPO), a reinforcement learning algorithm, is employed to solve the control problem in a data-driven fashion. The same framework is used for the optimal placement of disinfectants to neutralize viral pathogens, as an alternative to airflow design when the latter is practically infeasible or hard to implement. We show, via simulation experiments, that the control agent learns the optimal policy in both scenarios within a reasonable time. The proposed data-driven control framework will have significant societal and economic benefits by laying the foundation for an improved methodology for designing case-specific infection-control guidelines that can be realized with affordable ventilation devices and disinfectants.
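Since the approach leans on proximal policy optimization, a short sketch of PPO's clipped surrogate objective (the published update rule, not this paper's specific setup) may be useful; the airflow or disinfectant-placement simulator would supply the states, actions, and rewards:

```python
import torch

def ppo_clip_loss(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective of PPO (Schulman et al., 2017). The
    environment-specific pieces (velocity-field actions, exposure-based
    rewards) are outside this sketch."""
    ratio = torch.exp(log_probs_new - log_probs_old)  # importance ratio
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()      # maximize => minimize negative
```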
WSD (Word Sense Disambiguation) is the task of identifying which sense of a word is meant in a sentence or other segment of text. Researchers have worked on this task for years (e.g., Pustejovsky, 2002), but it remains challenging even for state-of-the-art (SOTA) language models (LMs). TempoWiC, a new dataset introduced by Loureiro et al. (2022b), focuses on the fact that word meanings change over time. Their best baseline achieves 70.33% macro-F1. In this work, we use two different losses simultaneously to train RoBERTa-based classification models. We also improve our model by using another similar dataset to generalize better. Our best configuration beats their best baseline by 4.23% and reaches 74.56% macro-F1.
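The abstract does not spell out which two losses are combined, so the sketch below pairs cross-entropy with a cosine-embedding term purely as an assumed stand-in for training a classifier with two objectives at once:

```python
import torch.nn.functional as F

def joint_loss(logits, labels, embeddings, alpha=0.5):
    """Two losses applied simultaneously (an illustrative pairing, not
    necessarily the paper's). `alpha` balances the two objectives."""
    ce = F.cross_entropy(logits, labels)
    # pull same-label neighbors together, push different-label ones apart
    target = (labels[:-1] == labels[1:]).float() * 2 - 1  # +1 same, -1 different
    cos = F.cosine_embedding_loss(embeddings[:-1], embeddings[1:], target)
    return alpha * ce + (1 - alpha) * cos
```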
Computer vision and machine learning are playing an increasingly important role in computer-assisted diagnosis; however, the application of deep learning to medical imaging faces challenges in data availability and data imbalance, and it is especially important that models for medical imaging be built to be trustworthy. Therefore, we propose TRUDLMIA, a trustworthy deep learning framework for medical image analysis, which adopts a modular design, leverages self-supervised pre-training, and utilizes a novel surrogate loss function. Experimental evaluations indicate that models generated from the framework are both trustworthy and high-performing. It is anticipated that the framework will support researchers and clinicians in advancing the use of deep learning for dealing with public health crises, including COVID-19.
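The framework's surrogate loss is not specified in the abstract; the sketch below only illustrates the modular two-stage pattern it describes, with a label-smoothed cross-entropy as a placeholder (module names and the placeholder loss are assumptions):

```python
import torch
import torch.nn as nn

class TwoStageClassifier(nn.Module):
    """Hypothetical wiring of the modular design described above: a
    self-supervised pre-trained backbone is reused and a task head is
    trained on top of its (frozen) features."""

    def __init__(self, ssl_backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = ssl_backbone      # e.g., from SimCLR/MoCo pre-training
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        with torch.no_grad():             # keep pre-trained features fixed
            feats = self.backbone(x)
        return self.head(feats)

# Placeholder for the paper's (unspecified) surrogate loss.
surrogate_loss = nn.CrossEntropyLoss(label_smoothing=0.1)
```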
Existing statistical methods can be used to estimate a policy, or a mapping from covariates to decisions, which can then instruct decision makers. There is great interest in using such data-driven policies in healthcare. In healthcare, however, it is often important to explain to the healthcare provider, and to the patient, how a new policy differs from the current standard of care. This end is facilitated if one can pinpoint the aspects (i.e., parameters) of the policy that change most when moving from the standard of care to the new, suggested policy. To this end, we adapt ideas from Trust Region Policy Optimization. In our work, however, unlike in Trust Region Policy Optimization, the difference between the suggested policy and standard of care is required to be sparse, aiding with interpretability. In particular, we trade off between maximizing expected reward and minimizing the $L_1$ norm divergence between the parameters of the two policies. This yields "relative sparsity," where, as a function of a tuning parameter, $\lambda$, we can approximately control the number of parameters in our suggested policy that differ from their counterparts in the standard of care. We develop our methodology for the observational data setting. We propose a problem-specific criterion for selecting $\lambda$, perform simulations, and illustrate our method with a real, observational healthcare dataset, deriving a policy that is easy to explain in the context of the current standard of care. Our work promotes the adoption of data-driven decision aids, which have great potential to improve health outcomes.
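As a minimal sketch of the stated trade-off, assuming a differentiable expected-reward estimator (all names here are illustrative):

```python
import torch

def relative_sparsity_objective(theta, theta_soc, expected_reward, lam):
    """Maximize expected reward while penalizing the L1 divergence between
    the suggested policy's parameters `theta` and the standard-of-care
    parameters `theta_soc`; `lam` is the tuning parameter from the text."""
    l1_divergence = torch.sum(torch.abs(theta - theta_soc))
    return expected_reward(theta) - lam * l1_divergence  # objective to maximize
```

Because the L1 penalty induces exact zeros in the parameter difference, increasing $\lambda$ pins more coordinates of the suggested policy to their standard-of-care values, which is what yields the relative sparsity.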
Considering the spectral properties of images, we propose a new self-attention mechanism with greatly reduced computational complexity, down to a linear rate. To better preserve edges while promoting similarity within objects, we propose individualized processing over different frequency bands. In particular, we study the case where the process operates solely on low-frequency components. Through an ablation study, we show that low-frequency self-attention can achieve performance very close to, or better than, full-frequency self-attention, even without retraining the network. Accordingly, we design and embed novel plug-and-play modules into the head of a CNN network, which we refer to as FsaNet. The frequency self-attention 1) takes low-frequency coefficients as input, 2) can be mathematically equivalent to spatial-domain self-attention with linear structures, and 3) simplifies the token mapping ($1\times1$ convolution) stage and the token mixing stage simultaneously. We show that frequency self-attention requires $87.29\% \sim 90.04\%$ less memory, $96.13\% \sim 98.07\%$ fewer FLOPs, and $97.56\% \sim 98.18\%$ less run time than regular self-attention. Compared to other ResNet101-based self-attention networks, FsaNet achieves a new state-of-the-art result ($83.0\%$ mIoU) on the Cityscapes test set and competitive results on ADE20K and VOCaug.
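A rough PyTorch sketch of attention restricted to a low-frequency DCT block is given below; it uses plain scaled dot-product attention and does not reproduce FsaNet's specific token-mapping/mixing simplifications (the `keep` block size is an assumption):

```python
import math
import torch

def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix."""
    k = torch.arange(n).float()
    basis = torch.cos(math.pi * (k[None, :] + 0.5) * k[:, None] / n)
    basis[0] *= 1 / math.sqrt(2)
    return basis * math.sqrt(2 / n)

def low_freq_self_attention(x, keep=8):
    """Transform a (B, C, H, W) feature map with a 2D DCT, attend only
    over the keep x keep low-frequency block, and transform back."""
    B, C, H, W = x.shape
    Dh, Dw = dct_matrix(H).to(x), dct_matrix(W).to(x)
    freq = Dh @ x @ Dw.T                         # 2D DCT per channel
    tokens = freq[..., :keep, :keep].reshape(B, C, keep * keep).transpose(1, 2)
    attn = torch.softmax(tokens @ tokens.transpose(1, 2) / math.sqrt(C), dim=-1)
    mixed = (attn @ tokens).transpose(1, 2).reshape(B, C, keep, keep)
    freq[..., :keep, :keep] = mixed              # write refined low band back
    return Dh.T @ freq @ Dw                      # inverse DCT (orthonormal)
```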
Since the radiology reports and studies required for clinical practice are written and stored as free-text narratives, it is difficult to extract relevant information from them for further analysis. In this context, natural language processing (NLP) techniques can facilitate automatic information extraction and the transformation of free text into structured data. In recent years, deep learning (DL) based models have been applied to NLP experiments with encouraging results. Although DL models based on artificial neural networks (ANNs) and convolutional neural networks (CNNs) have significant potential, these models still face limitations for implementation in clinical practice. Transformers, another new DL architecture, have been increasingly used to improve this process. Therefore, in this study, we propose a transformer-based fine-grained named entity recognition (NER) architecture for clinical information extraction. We collected 88 abdominal ultrasound reports in free-text format and annotated them according to an information schema we developed. The text-to-text transfer transformer (T5) model and SciFive, a pre-trained domain-specific adaptation of the T5 model, were fine-tuned to extract entities and relations and to transform the input into a structured format. Our transformer-based models in this study outperformed previously applied approaches such as ANN and CNN models, with ROUGE-1, ROUGE-2, ROUGE-L, and BLEU scores of 0.816, 0.668, 0.528, and 0.743, respectively, while providing an interpretable structured report.
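As a minimal illustration of the text-to-text setup, using the public Hugging Face T5 API (the task prefix and output schema below are invented placeholders, not the paper's annotation schema):

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Free-text report in, structured string out (schema is illustrative).
report = "Liver measures 16 cm with coarse echotexture. No focal lesion."
inputs = tokenizer("extract entities: " + report, return_tensors="pt")
labels = tokenizer("organ: liver | size: 16 cm | echotexture: coarse",
                   return_tensors="pt").input_ids

loss = model(**inputs, labels=labels).loss  # seq2seq fine-tuning objective
loss.backward()                             # one training step (optimizer omitted)
```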
As its core computation, a self-attention mechanism gauges pairwise correlations across the entire input sequence. Despite favorable performance, computing these pairwise correlations is prohibitively costly. While recent work has shown the benefits of runtime pruning of elements with low attention scores, the quadratic complexity of the self-attention mechanism and its demands on on-chip memory capacity are overlooked. This work addresses these constraints by architecting an accelerator, called Sprint, which leverages the inherent parallelism of ReRAM crossbar arrays to compute attention scores in an approximate manner. Our design prunes low attention scores using lightweight analog thresholding circuitry within the ReRAM, enabling Sprint to fetch only a small subset of relevant data into on-chip memory. To mitigate potential negative repercussions for model accuracy, Sprint re-computes the attention scores for the few fetched data in digital. The combined in-memory pruning and on-chip recompute of the relevant attention scores enables Sprint to transform the quadratic complexity into a merely linear one. In addition, we identify and leverage a dynamic spatial locality between adjacent attention operations, even after pruning, which eliminates costly yet redundant data fetches. We evaluate our proposed technique on various state-of-the-art transformer models. On average, Sprint yields 7.5x speedup and 19.6x energy reduction when a total of 16KB on-chip memory is used, while remaining virtually on par with the iso-accuracy of the baseline models (on average 0.36% degradation).
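In software terms, the prune-then-recompute flow looks roughly like the sketch below; a single exact score computation stands in for both the approximate in-ReRAM analog pass and the digital recompute, so only the thresholding behavior is illustrated (`tau` is an assumed knob):

```python
import torch

def thresholded_attention(q, k, v, tau=0.0):
    """Sketch of prune-then-recompute attention. In the hardware, an
    approximate analog pass produces the scores that are thresholded, and
    only surviving entries are fetched and recomputed digitally."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    pruned = scores.masked_fill(scores <= tau, float("-inf"))  # drop weak pairs
    # keep at least the max per row so softmax never sees an all -inf row
    keep_max = torch.zeros_like(pruned).scatter(
        -1, scores.argmax(-1, keepdim=True), 1).bool()
    pruned = torch.where(keep_max, scores, pruned)
    return torch.softmax(pruned, dim=-1) @ v
```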
Several chronic lung diseases, such as idiopathic pulmonary fibrosis (IPF), are characterized by abnormal dilatation of the airways. Quantification of airway features on computed tomography (CT) can help characterize disease progression. Physics-based airway measurement algorithms have been developed, but have had limited success due to the diversity of airway morphology seen in clinical practice. Supervised learning methods are also not feasible due to the high cost of obtaining precise airway annotations. We propose synthesizing airways by style transfer using perceptual losses to train our model, the Airway Transfer Network (ATN). We compare the ATN model with a state-of-the-art GAN network (simGAN) by a) qualitative assessment, and b) evaluating the ability of ATN- and simGAN-based CT airway metrics to predict mortality in 113 patients with IPF. ATN was shown to be quicker and easier to train than simGAN. ATN-based airway measurements were also found to be consistently more robust than simGAN-derived airway metrics on IPF CTs. Using perceptual losses to refine synthetic data via a transformation network is a realistic alternative to GAN-based methods for clinical CT analysis of idiopathic pulmonary fibrosis. Our source code can be found at https://github.com/ashkanpakzad/atn, which is compatible with the existing open-source airway analysis framework AirQuant.
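A generic perceptual loss of the kind used to train such style-transfer networks is sketched below; the VGG16 backbone and the layer cut are common choices assumed here, not details taken from the ATN paper:

```python
import torch
import torchvision

class PerceptualLoss(torch.nn.Module):
    """Compare synthetic and real patches in a pre-trained network's
    feature space rather than pixel space (an illustrative sketch)."""

    def __init__(self, layer=8):
        super().__init__()
        vgg = torchvision.models.vgg16(weights="IMAGENET1K_V1").features[:layer]
        for p in vgg.parameters():
            p.requires_grad_(False)           # fixed feature extractor
        self.features = vgg.eval()

    def forward(self, synthetic, real):
        return torch.nn.functional.mse_loss(self.features(synthetic),
                                            self.features(real))
```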
Time series models often deal with extreme events and anomalies, both of which are prevalent in real-world datasets. Such models often need to provide careful probabilistic forecasts, which are vital in risk management for extreme events such as hurricanes and pandemics. However, automatically detecting and learning to use extreme events and anomalies in large-scale datasets is challenging and typically requires manual effort. Hence, we propose an anomaly-aware forecasting framework that leverages the effects of previously seen anomalies to improve its prediction accuracy during and after the presence of extreme events. Specifically, the framework automatically extracts anomalies and incorporates them through an attention mechanism to increase its accuracy for future extreme events. Moreover, the framework employs a dynamic uncertainty optimization algorithm that reduces the uncertainty of the forecasts in an online manner. The proposed framework demonstrates consistently superior accuracy, with less uncertainty, than current prediction models on three datasets with different varieties of anomalies.
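A rough sketch of the attention-based incorporation of extracted anomalies follows, with the dimensions, the use of nn.MultiheadAttention, and fusion by concatenation all assumed for illustration:

```python
import torch
import torch.nn as nn

class AnomalyAttention(nn.Module):
    """Attend from the current hidden state over embeddings of previously
    extracted anomalies, and fuse the summary into the forecast."""

    def __init__(self, d_model=64):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.out = nn.Linear(2 * d_model, 1)  # fuse state + anomaly context

    def forward(self, hidden, anomaly_memory):
        # hidden: (B, 1, d); anomaly_memory: (B, N_anomalies, d)
        context, _ = self.attn(hidden, anomaly_memory, anomaly_memory)
        return self.out(torch.cat([hidden, context], dim=-1))  # point forecast
```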