Although a lot of work has been done on developing models for Visual Question Answering, the ability of these models to relate the question to the image features remains less explored. We present an empirical study of different feature extraction methods with different loss functions. We propose new datasets for the task of Visual Question Answering, in which multiple image inputs share a single ground truth, and benchmark our results on them. Our final model, which uses ResNet + RCNN image features and BERT embeddings and is inspired by stacked attention networks, achieves 39% word accuracy and 99% image accuracy on the CLEVR + TinyImageNet dataset.
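To make the pipeline concrete, here is a minimal sketch (not the authors' code) of how pre-extracted ResNet/RCNN region features and a BERT question embedding could be fused with stacked attention; all layer sizes, the number of attention hops, and the answer-vocabulary size are illustrative assumptions.

# Minimal stacked-attention VQA head over region features and a BERT question vector.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StackedAttentionVQA(nn.Module):
    def __init__(self, img_dim=2048, q_dim=768, hid=512, n_answers=1000, hops=2):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, hid)
        self.q_proj = nn.Linear(q_dim, hid)
        self.att_layers = nn.ModuleList([nn.Linear(hid, 1) for _ in range(hops)])
        self.classifier = nn.Linear(hid, n_answers)

    def forward(self, regions, question):
        # regions: (B, R, img_dim) RCNN/ResNet region features
        # question: (B, q_dim) BERT [CLS] embedding
        v = torch.tanh(self.img_proj(regions))        # (B, R, hid)
        u = torch.tanh(self.q_proj(question))         # (B, hid)
        for att in self.att_layers:
            scores = att(v * u.unsqueeze(1))          # (B, R, 1)
            alpha = F.softmax(scores, dim=1)          # attention over regions
            u = u + (alpha * v).sum(dim=1)            # refine the query each hop
        return self.classifier(u)                     # answer logits

logits = StackedAttentionVQA()(torch.randn(4, 36, 2048), torch.randn(4, 768))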
Prostate cancer is the most common cancer in men worldwide and the second leading cause of cancer death in the United States. One of the prognostic features in prostate cancer is the Gleason grading of histopathology images. The Gleason grade is assigned by pathologists based on tumor architecture in Hematoxylin and Eosin (H&E) stained whole slide images (WSI). This process is time-consuming and has known interobserver variability. In the past few years, deep learning algorithms have been used to analyze histopathology images, delivering promising results for grading prostate cancer. However, most of these algorithms rely on fully annotated datasets, which are expensive to generate. In this work, we propose a novel weakly-supervised algorithm to classify prostate cancer grades. The proposed algorithm consists of three steps: (1) extracting discriminative areas in a histopathology image by employing a Transformer-based Multiple Instance Learning (MIL) algorithm, (2) representing the image by constructing a graph from the discriminative patches, and (3) classifying the image into its Gleason grade with a Graph Convolutional Neural Network (GCN) based on the gated attention mechanism. We evaluated our algorithm using publicly available datasets, including TCGA-PRAD, PANDA, and the Gleason 2019 challenge dataset. We also cross-validated the algorithm on an independent dataset. Results show that the proposed model achieved state-of-the-art performance on the Gleason grading task in terms of accuracy, F1 score, and Cohen's kappa. The code is available at https://github.com/NabaviLab/Prostate-Cancer.
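As a rough illustration of the three-step structure, the skeleton below sketches a Transformer-based MIL scorer, a crude graph over the top-scoring patches, and a GCN with gated-attention pooling; layer sizes, the number of selected patches, and the graph construction are assumptions, not the released implementation.

# (1) MIL scoring of patches, (2) graph over discriminative patches, (3) gated-attention GCN.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransformerMIL(nn.Module):
    def __init__(self, dim=512, heads=4):
        super().__init__()
        enc = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers=2)
        self.score = nn.Linear(dim, 1)

    def forward(self, patches):                 # (B, N, dim) patch embeddings
        h = self.encoder(patches)
        return self.score(h).squeeze(-1), h     # per-patch scores, features

class GatedAttentionGCN(nn.Module):
    def __init__(self, dim=512, n_grades=5):
        super().__init__()
        self.gcn = nn.Linear(dim, dim)
        self.att_v = nn.Linear(dim, 128)
        self.att_u = nn.Linear(dim, 128)
        self.att_w = nn.Linear(128, 1)
        self.head = nn.Linear(dim, n_grades)

    def forward(self, x, adj):                  # x: (K, dim), adj: (K, K) normalized
        h = F.relu(self.gcn(adj @ x))           # one graph-convolution step
        gate = torch.tanh(self.att_v(h)) * torch.sigmoid(self.att_u(h))
        alpha = torch.softmax(self.att_w(gate), dim=0)    # gated attention weights
        return self.head((alpha * h).sum(dim=0))          # slide-level grade logits

mil = TransformerMIL()
scores, feats = mil(torch.randn(1, 200, 512))
topk = scores[0].topk(16).indices                         # discriminative patches
x = feats[0, topk]
adj = torch.eye(16) + (torch.cdist(x, x) < 5.0).float()   # crude neighborhood graph
print(GatedAttentionGCN()(x, adj / adj.sum(1, keepdim=True)).shape)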
Due to their high activation sparsity and use of accumulate (AC) operations instead of expensive multiply-and-accumulate (MAC) operations, neuromorphic spiking neural networks (SNNs) have emerged as a promising low-power alternative to traditional DNNs for several computer vision (CV) applications. However, most existing SNNs require multiple time steps for acceptable inference accuracy, hindering real-time deployment and increasing spiking activity and, consequently, energy consumption. Recent works proposed direct encoding, which feeds the analog pixel values directly into the first layer of the SNN in order to significantly reduce the number of time steps. Although the overhead of the first-layer MACs with direct encoding is negligible for deep SNNs, and CV processing with SNNs is efficient, the data transfer between the image sensor and the downstream processing costs significant bandwidth and may dominate the total energy. To mitigate this concern, we propose an in-sensor computing hardware-software co-design framework for SNNs targeting image recognition tasks. Our approach reduces the bandwidth between sensing and processing by 12-96x and the resulting total energy by 2.32x compared to traditional CV processing, with a 3.8% reduction in accuracy on ImageNet.
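The sketch below illustrates the direct-encoding idea under assumed details (it is not the paper's implementation): the analog image passes through the first convolution (MACs) once, while the later layers operate on binary spikes, so they only need accumulates, over a small number of time steps.

# Direct encoding: analog pixels drive the first conv layer; later layers see only spikes.
import torch
import torch.nn as nn

class SpikingLayer(nn.Module):
    """Integrate-and-fire neuron: membrane accumulates input, spikes above threshold."""
    def __init__(self, threshold=1.0):
        super().__init__()
        self.threshold = threshold
        self.mem = None

    def forward(self, x):
        self.mem = x if self.mem is None else self.mem + x    # accumulate only
        spikes = (self.mem >= self.threshold).float()
        self.mem = self.mem - spikes * self.threshold          # soft reset
        return spikes

conv1 = nn.Conv2d(3, 16, 3, padding=1)     # analog first layer (MACs)
conv2 = nn.Conv2d(16, 32, 3, padding=1)    # downstream layer, sees binary spikes
lif1, lif2 = SpikingLayer(), SpikingLayer()

image = torch.rand(1, 3, 32, 32)           # analog pixel values
drive = conv1(image)                       # constant input current, computed once
T = 4                                      # few time steps thanks to direct encoding
outputs = []
for _ in range(T):
    s1 = lif1(drive)                       # binary spike map
    outputs.append(lif2(conv2(s1)))        # 0/1 inputs, so AC-dominated compute
rate = torch.stack(outputs).mean(0)        # spike-rate output over T steps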
Next-generation sequencing technologies have enhanced the scope of the Internet-of-Things (IoT) to include genomics for personalized medicine through the increased availability of an abundance of genome data collected from heterogeneous sources at a reduced cost. Given the sheer magnitude of the collected data and the significant challenges posed by the presence of highly similar genomic structure across species, there is a need for robust, scalable analysis platforms to extract actionable knowledge, such as the presence of potentially zoonotic pathogens. The emergence of zoonotic diseases from novel pathogens, such as the influenza virus in 1918 and SARS-CoV-2 in 2019, which can jump species barriers and lead to pandemics, underscores the need for scalable metagenome analysis. In this work, we propose MG2Vec, a deep learning-based solution that uses the transformer network as its backbone to learn robust features from raw metagenome sequences for downstream biomedical tasks such as targeted and generalized pathogen detection. Extensive experiments on four increasingly challenging, yet realistic diagnostic settings show that the proposed approach can help detect pathogens from uncurated, real-world clinical samples with minimal human supervision in the form of labels. Further, we demonstrate that the learned representations can generalize to completely unrelated pathogens across diseases and species for large-scale metagenome analysis. We provide a comprehensive evaluation of a novel representation learning framework for metagenome-based disease diagnostics with deep learning and offer a way forward for extracting and using robust vector representations from low-cost next-generation sequencing to develop generalizable diagnostic tools.
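As one plausible instantiation (the exact MG2Vec tokenization and architecture may differ), the sketch below embeds a raw read by splitting it into overlapping k-mers and pooling a small transformer encoder's output into a fixed-length vector; k, the vocabulary, and model sizes are assumptions.

# Tokenize a read into k-mers and encode it with a transformer into a fixed-length vector.
import itertools
import torch
import torch.nn as nn

K = 4
VOCAB = {"".join(p): i for i, p in enumerate(itertools.product("ACGT", repeat=K))}

def tokenize(read: str) -> torch.Tensor:
    kmers = [read[i:i + K] for i in range(len(read) - K + 1)]
    return torch.tensor([VOCAB[k] for k in kmers if k in VOCAB])

class ReadEncoder(nn.Module):
    def __init__(self, dim=128, heads=4, layers=2):
        super().__init__()
        self.embed = nn.Embedding(len(VOCAB), dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)

    def forward(self, tokens):                     # tokens: (B, L) k-mer ids
        h = self.encoder(self.embed(tokens))
        return h.mean(dim=1)                       # fixed-length read representation

read = "ACGTTGCAACGTAGCTAGCTTACG"
vec = ReadEncoder()(tokenize(read).unsqueeze(0))   # (1, 128) feature for downstream tasks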
We propose a new model-based offline RL framework, called Adversarial Models for Offline Reinforcement Learning (ARMOR), which can robustly learn policies to improve upon an arbitrary baseline policy regardless of data coverage. Based on the concept of relative pessimism, ARMOR is designed to optimize for the worst-case relative performance when facing uncertainty. In theory, we prove that the learned policy of ARMOR never degrades the performance of the baseline policy with any admissible hyperparameter, and can learn to compete with the best policy within data coverage when the hyperparameter is well tuned and the baseline policy is supported by the data. Such a robust policy improvement property makes ARMOR especially suitable for building real-world learning systems, because in practice ensuring no performance degradation is imperative before considering any benefit learning can bring.
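Using our own notation (a schematic statement, not verbatim from the paper), the relative-pessimism objective behind ARMOR can be written as

\[
  \hat{\pi} \;=\; \arg\max_{\pi \in \Pi}\; \min_{M \in \mathcal{M}_{\mathcal{D}}} \big[\, J_M(\pi) - J_M(\pi_{\mathrm{ref}}) \,\big],
\]

where $\mathcal{M}_{\mathcal{D}}$ denotes the set of models consistent with the offline dataset $\mathcal{D}$, $J_M(\pi)$ is the return of policy $\pi$ in model $M$, and $\pi_{\mathrm{ref}}$ is the baseline policy. Since choosing $\pi = \pi_{\mathrm{ref}}$ makes the inner objective zero, the optimizer can never be worse than the baseline under the worst admissible model, which is the robust policy improvement property stated above.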
Tuberculosis (TB), an infectious bacterial disease, is a significant cause of death, especially in low-income countries, with an estimated ten million new cases reported globally in $2020$. While TB is treatable, non-adherence to the medication regimen is a significant cause of morbidity and mortality. Thus, proactively identifying patients at risk of dropping off their medication regimen enables corrective measures to mitigate adverse outcomes. Using a proxy measure of extreme non-adherence and a dataset of nearly $700,000$ patients from four states in India, we formulate and solve the machine learning (ML) problem of early prediction of non-adherence based on a custom rank-based metric. We train ML models and evaluate against baselines, achieving a $\sim 100\%$ lift over rule-based baselines and $\sim 214\%$ over a random classifier, taking into account country-wide large-scale future deployment. We deal with various issues in the process, including data quality, high-cardinality categorical data, low target prevalence, distribution shift, variation across cohorts, algorithmic fairness, and the need for robustness and explainability. Our findings indicate that risk stratification of non-adherent patients is a viable, deployable-at-scale ML solution.
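The snippet below illustrates, under our own simplifying assumptions, how a lift-style metric over a random classifier can be computed at a top-k cutoff; the paper's custom rank-based metric differs in detail.

# Lift at a top-k cutoff: prevalence of true positives among the model's top-ranked
# k% of patients divided by the overall prevalence (what a random ranking achieves).
import numpy as np

def lift_at_k(y_true, scores, k_frac=0.1):
    n = len(y_true)
    k = max(1, int(n * k_frac))
    top = np.argsort(scores)[::-1][:k]      # highest-risk k% by model score
    precision_top = y_true[top].mean()      # prevalence within the top-k
    base_rate = y_true.mean()               # prevalence captured by a random classifier
    return precision_top / base_rate

rng = np.random.default_rng(0)
y = rng.binomial(1, 0.05, size=10_000)      # low target prevalence (~5%), as in the data
scores = y * rng.normal(1.0, 1.0, 10_000) + rng.normal(0.0, 1.0, 10_000)
print(f"lift@10%: {lift_at_k(y, scores):.2f}x over random")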
Deep neural networks (DNNs) are typically designed as a sequential cascade of differentiable blocks/layers with a prediction module connected only to the last layer. DNNs can instead be attached with prediction modules at multiple points along the backbone, where inference can stop at an intermediate stage without passing through all the modules. The last exit point may offer a better prediction error but also involves more computational resources and latency. An exit point that is "optimal" in terms of both prediction error and cost is desirable. The optimal exit point may depend on the latent distribution of the tasks and may change from one task type to another. During neural inference, the ground truth of instances may not be available, so the error rate at each exit point cannot be estimated. Hence one faces the problem of selecting the optimal exit in an unsupervised setting. Prior works tackled this problem in an offline, supervised setting, assuming that enough labeled data is available to estimate the error rate at each exit point and tune the parameters for better accuracy. However, pre-trained DNNs are often deployed in new domains for which a large amount of ground truth may not be available. We model the problem of exit selection as an unsupervised online learning problem and use bandit theory to identify the optimal exit point. Specifically, we focus on ElasticBERT, a pre-trained multi-exit DNN, and demonstrate that it "nearly" satisfies the Strong Dominance (SD) property, making it possible to learn the optimal exit in an online setting without knowing the ground truth labels. We develop an upper confidence bound (UCB) based algorithm named UEE-UCB that provably achieves sub-linear regret under the SD property. Our method thus provides an adaptive means of learning domain-specific optimal exit points in multi-exit DNNs. We empirically validate our algorithm on the IMDb and Yelp datasets.
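The following is a minimal UCB-style sketch of unsupervised exit selection; the surrogate reward used here (agreement with the final exit minus a latency cost) is our illustrative assumption, whereas UEE-UCB's reward definition and regret guarantee rest on the strong-dominance property described above.

# UCB over exit points with an unsupervised surrogate reward (no ground-truth labels).
import math
import random

def ucb_exit_selection(predict_at_exits, n_exits, costs, rounds=1000, c=2.0):
    counts = [0] * n_exits
    means = [0.0] * n_exits
    for t in range(1, rounds + 1):
        if t <= n_exits:
            arm = t - 1                                       # play each exit once
        else:
            arm = max(range(n_exits),
                      key=lambda i: means[i] + math.sqrt(c * math.log(t) / counts[i]))
        preds = predict_at_exits()                            # predictions at all exits
        reward = float(preds[arm] == preds[-1]) - costs[arm]  # unsupervised surrogate
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]     # running mean update
    return max(range(n_exits), key=lambda i: means[i])

# Toy simulation: exit 0 is cheap but noisy, later exits agree with the final exit.
def toy_model():
    label = random.randint(0, 1)
    return [label if random.random() > p else 1 - label for p in (0.4, 0.1, 0.0)]

best = ucb_exit_selection(toy_model, n_exits=3, costs=[0.0, 0.05, 0.2])
print("selected exit:", best)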
Records of incoming/outgoing vehicles are key information for root-cause analysis to combat security breaches in various sensitive organizations. RFID tagging hampers the scalability of vehicle tracking solutions on both the logistics and technology fronts. For example, requiring every incoming vehicle (departmental or private) to be RFID-tagged is a severe constraint, and coupling video analytics with RFID to detect anomalous vehicle movement is non-trivial. We leverage publicly available implementations of computer vision algorithms to develop an interpretable vehicle tracking algorithm using the finite-state-machine formalism. The state machine takes input from cascaded object detection and optical character recognition (OCR) models for its state transitions. We evaluated the proposed method on 75 video clips of 285 vehicles from our system deployment site. We observed that the detection rate is most affected by the speed and the type of the vehicle. The highest detection rate is achieved when vehicle movement at the checkpoint is restricted in a manner similar to RFID-tagged checkpoints. We further analyzed 700 vehicle tracking predictions on live data and found that most vehicle-number prediction errors are caused by illegible text, image blur, text occlusion, and out-of-vocabulary characters. Toward system deployment and performance enhancement, we expect our ongoing system monitoring to provide evidence for establishing an improved vehicle SOP at the security checkpoint and to drive the fine-tuning of the deployed computer vision models and the state machine, establishing the proposed approach as a promising alternative to RFID tagging.
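A hedged sketch of the finite-state-machine formalism is shown below: transitions are driven by events emitted from the cascaded detector and OCR. The states, event names, and example plate number are illustrative, not the deployed configuration.

# Finite state machine driven by detection/OCR events; logs a plate once the vehicle exits.
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    VEHICLE_DETECTED = auto()
    PLATE_READ = auto()
    LOGGED = auto()

TRANSITIONS = {
    (State.IDLE, "vehicle_in_frame"): State.VEHICLE_DETECTED,
    (State.VEHICLE_DETECTED, "ocr_success"): State.PLATE_READ,
    (State.VEHICLE_DETECTED, "vehicle_lost"): State.IDLE,
    (State.PLATE_READ, "vehicle_exits_frame"): State.LOGGED,
}

def run(events):
    state, log = State.IDLE, []
    for name, payload in events:
        state = TRANSITIONS.get((state, name), state)   # ignore invalid transitions
        if state is State.LOGGED:
            log.append(payload)                         # record the recognized plate
            state = State.IDLE                          # ready for the next vehicle
    return log

# Events as they might be produced by the detector + OCR cascade on one clip.
events = [("vehicle_in_frame", None), ("ocr_success", "KA01AB1234"),
          ("vehicle_exits_frame", "KA01AB1234")]
print(run(events))                                      # -> ['KA01AB1234']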
Thermal analysis provides deeper insight into the behavior of electronic chips under different temperature scenarios and enables faster design exploration. However, obtaining detailed and accurate on-chip thermal profiles using FEM or CFD is very time-consuming. There is therefore an urgent need to speed up on-chip thermal solutions to address various system scenarios. In this paper, we propose a thermal machine learning (ML) solver to accelerate chip thermal simulation. The thermal ML solver is an extension of the recent novel approach CoAEMLSim (Composable Autoencoder-based Machine Learning Simulator), with modifications to the solution algorithm to handle both constant and distributed HTC. The proposed method is validated against commercial solvers such as Ansys MAPDL, as well as the recent ML baseline UNet, under different scenarios to demonstrate its enhanced accuracy, scalability, and generalizability.
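As a generic illustration only (CoAEMLSim's actual architecture and solution algorithm are not reproduced here), the sketch below shows a surrogate that maps an on-chip power map plus a constant or spatially distributed HTC map to a predicted temperature field; all shapes and layer sizes are assumptions.

# Encoder-decoder surrogate: (power map, HTC map) -> predicted on-chip temperature field.
import torch
import torch.nn as nn

class ThermalSurrogate(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(2, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
        )

    def forward(self, power_map, htc_map):
        x = torch.stack([power_map, htc_map], dim=1)      # two input channels
        return self.decoder(self.encoder(x)).squeeze(1)   # predicted temperature field

power = torch.rand(1, 64, 64)                 # per-tile power density
htc = torch.full((1, 64, 64), 0.3)            # constant HTC; a distributed map also works
temperature = ThermalSurrogate()(power, htc)  # (1, 64, 64) thermal profile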
Modeling what makes an advertisement persuasive, i.e., eliciting the desired response from a consumer, is critical to the study of propaganda, social psychology, and marketing. Despite its importance, computational modeling of persuasion in computer vision is still in its infancy, primarily due to the lack of benchmark datasets that provide persuasion-strategy labels for ads. Motivated by the persuasion literature in social psychology and marketing, we introduce an extensive vocabulary of persuasion strategies and build the first ad image corpus annotated with persuasion strategies. We then formulate the task of persuasion strategy prediction with multi-modal learning, for which we design a multi-task attention fusion model that leverages other ad-understanding tasks to predict persuasion strategies. Additionally, we conduct a real-world case study on 1600 advertising campaigns of 30 Fortune 500 companies, in which we use our model's predictions to analyze which strategies work with different demographics (age and gender). The dataset also provides image segmentation masks that label the persuasion strategies in the corresponding ad images on the test split. We publicly release our code and dataset at https://midas-research.github.io/persuasion-avertisements/.
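The sketch below gives one assumed form of multi-task attention fusion (details are illustrative, not the released model): ad-image region features are attended by the ad-text embedding, and the fused representation feeds a persuasion-strategy head plus a hypothetical auxiliary ad-understanding head (topic prediction). Dimensions and class counts are assumptions.

# Cross-attention fusion of image regions and ad text with a main and an auxiliary head.
import torch
import torch.nn as nn

class MultiTaskAttentionFusion(nn.Module):
    def __init__(self, img_dim=2048, txt_dim=768, hid=512,
                 n_strategies=20, n_topics=38):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, hid)
        self.txt_proj = nn.Linear(txt_dim, hid)
        self.cross_att = nn.MultiheadAttention(hid, num_heads=4, batch_first=True)
        self.strategy_head = nn.Linear(hid, n_strategies)   # main task
        self.topic_head = nn.Linear(hid, n_topics)          # auxiliary ad-understanding task

    def forward(self, regions, text):
        v = self.img_proj(regions)                 # (B, R, hid) image region features
        q = self.txt_proj(text).unsqueeze(1)       # (B, 1, hid) ad-text query
        fused, _ = self.cross_att(q, v, v)         # text attends over image regions
        fused = fused.squeeze(1)
        return self.strategy_head(fused), self.topic_head(fused)

strategies, topics = MultiTaskAttentionFusion()(
    torch.randn(2, 36, 2048), torch.randn(2, 768))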