Intelligent Paper Notes

Real-time Emotion and Gender Classification using Ensemble CNN

Abhinav Lahariya , Varsha Singh , Uma Shanker Tiwary

Category: Computer Vision

2021-11-15

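Analysing the expressions on a person's face plays a vital role in identifying that person's emotions and behaviour. Recognizing these expressions automatically is a crucial component of natural human-machine interfaces. Research in this field therefore has a wide range of applications, including biometric authentication, surveillance systems, and emotion-to-emoticon conversion on social media platforms. Another application is conducting customer satisfaction surveys: large corporations invest heavily in collecting feedback and running surveys, yet fail to obtain fair responses. Emotion and gender recognition through facial gestures is a technology that aims to improve product and service performance by monitoring customer behaviour through their evaluations. In the past few years, a wide variety of advances have been made in feature extraction mechanisms, face detection, and expression classification techniques. This paper implements an ensemble CNN for building a real-time system that can detect a person's emotion and gender. Experimental results show an accuracy of 68% for emotion classification into 7 classes (angry, fear, sad, happy, surprise, neutral, disgust) on the FER-2013 dataset and 95% for gender classification (male or female) on the IMDB dataset. Our work can predict emotion and gender for single-face images as well as multi-face images. In addition, when input is given through a webcam, the complete pipeline takes less than 0.5 seconds to generate results.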
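One common way to combine several CNNs into an ensemble is soft voting, that is, averaging the per-class probabilities of the members and taking the most likely class. The abstract does not state which combination rule the authors use, so the sketch below is an assumption, and the probability vectors are made-up placeholders standing in for CNN outputs.

```python
# The seven FER-2013 emotion classes named in the abstract.
EMOTIONS = ["angry", "fear", "sad", "happy", "surprise", "neutral", "disgust"]


def soft_vote(prob_vectors):
    """Average per-class probabilities over models and return (argmax, mean)."""
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    mean = [sum(p[c] for p in prob_vectors) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: mean[c])
    return best, mean


# Hypothetical softmax outputs of three ensemble members for one face crop.
member_outputs = [
    [0.05, 0.05, 0.10, 0.60, 0.10, 0.05, 0.05],
    [0.10, 0.05, 0.05, 0.40, 0.30, 0.05, 0.05],
    [0.05, 0.10, 0.05, 0.50, 0.20, 0.05, 0.05],
]
label_index, mean_probs = soft_vote(member_outputs)
predicted_emotion = EMOTIONS[label_index]
```

Averaging probabilities rather than counting hard votes lets a confident member outweigh two uncertain ones, which is one reason soft voting is a frequent default for small ensembles.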

Translated by Google Translate

Collective Intelligent Strategy for Improved Segmentation of COVID-19 from CT

Surochita Pal Das , Sushmita Mitra , B. Uma Shankar

Category: Computer Vision

2022-12-23

The devastation caused by the coronavirus pandemic makes it imperative to design automated techniques for a fast and accurate detection. We propose a novel non-invasive tool, using deep learning and imaging, for delineating COVID-19 infection in lungs. The Ensembling Attention-based Multi-scaled Convolution network (EAMC), employing Leave-One-Patient-Out (LOPO) training, exhibits high sensitivity and precision in outlining infected regions along with assessment of severity. The Attention module combines contextual with local information, at multiple scales, for accurate segmentation. Ensemble learning integrates heterogeneity of decision through different base classifiers. The superiority of EAMC, even with severe class imbalance, is established through comparison with existing state-of-the-art learning models over four publicly-available COVID-19 datasets. The results are suggestive of the relevance of deep learning in providing assistive intelligence to medical practitioners, when they are overburdened with patients as in pandemics. Its clinical significance lies in its unprecedented scope in providing low-cost decision-making for patients lacking specialized healthcare at remote locations.
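Leave-One-Patient-Out training means that each fold holds out every sample belonging to one patient, so no patient contributes to both training and testing. The sketch below illustrates only this splitting step; the patient IDs are hypothetical and the function name is not taken from the paper.

```python
def leave_one_patient_out(patient_ids):
    """Yield (held_out_patient, train_indices, test_indices) for each patient."""
    for held_out in sorted(set(patient_ids)):
        train = [i for i, p in enumerate(patient_ids) if p != held_out]
        test = [i for i, p in enumerate(patient_ids) if p == held_out]
        yield held_out, train, test


# Hypothetical patient ID attached to each CT slice in a dataset.
slice_patients = ["p1", "p1", "p2", "p3", "p3", "p3"]
folds = list(leave_one_patient_out(slice_patients))
```

Splitting by patient rather than by slice avoids leakage between near-identical neighbouring slices of the same scan.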


Analysis and application of multispectral data for water segmentation using machine learning

Shubham Gupta , Uma D. , Ramachandra Hebbar

Category: Computer Vision

2022-12-16

Monitoring water is a complex task due to its dynamic nature, added pollutants, and land build-up. The availability of high-resolution data from Sentinel-2 multispectral products makes implementing remote sensing applications feasible. However, overutilizing or underutilizing the multispectral bands of the product can lead to inferior performance. In this work, we compare the performance of ten of the thirteen bands available in a Sentinel-2 product for water segmentation using eight machine learning algorithms. We find that the shortwave infrared bands (B11 and B12) perform best for segmenting water bodies. B11 achieves an overall accuracy of $71\%$ while B12 achieves $69\%$ across all algorithms on the test site. We also find that the Support Vector Machine (SVM) algorithm is the most favourable for single-band water segmentation. The SVM achieves an overall accuracy of $69\%$ across the tested bands over the given test site. Finally, to demonstrate the effectiveness of choosing the right amount of data, we use only B11 reflectance data to train an artificial neural network, BandNet. Even with a basic architecture, BandNet is comparable to known architectures for semantic and water segmentation, achieving a $92.47$ mIOU on the test site. BandNet requires only a fraction of the time and resources to train and run inference, making it suitable for deployment in web applications that monitor water bodies in localized regions. Our codebase is available at https://github.com/IamShubhamGupto/BandNet.
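The two metrics quoted above, overall accuracy and mean intersection-over-union (mIoU), can be computed from flattened prediction and ground-truth masks as sketched below. This is an illustrative sketch, not the authors' evaluation code; the tiny masks are invented, and the result is on a 0 to 1 scale (multiply by 100 for a percentage).

```python
def overall_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the ground truth."""
    correct = sum(1 for p, t in zip(pred, truth) if p == t)
    return correct / len(truth)


def mean_iou(pred, truth, classes=(0, 1)):
    """Average per-class intersection-over-union, skipping absent classes."""
    ious = []
    for cls in classes:
        inter = sum(1 for p, t in zip(pred, truth) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, truth) if p == cls or t == cls)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Hypothetical flattened masks: 1 = water, 0 = background.
truth_mask = [1, 1, 1, 0, 0, 0, 0, 0]
pred_mask = [1, 1, 0, 0, 0, 0, 0, 1]
accuracy = overall_accuracy(pred_mask, truth_mask)
miou = mean_iou(pred_mask, truth_mask)
```

In this toy case accuracy is higher than mIoU because the background class dominates the pixel count, which is why segmentation work often reports both.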


ORCa: Glossy Objects as Radiance Field Cameras

Kushagra Tiwary , Askhat Dave , Nikhil Behari , Tzofi Klinghoffer , Ashok Veeraraghavan , Ramesh Raskar

Category: Computer Vision | Artificial Intelligence

2022-12-08

Reflections on glossy objects contain valuable and hidden information about the surrounding environment. By converting these objects into cameras, we can unlock exciting applications, including imaging beyond the camera's field-of-view and from seemingly impossible vantage points, e.g. from reflections on the human eye. However, this task is challenging because reflections depend jointly on object geometry, material properties, the 3D environment, and the observer viewing direction. Our approach converts glossy objects with unknown geometry into radiance-field cameras to image the world from the object's perspective. Our key insight is to convert the object surface into a virtual sensor that captures cast reflections as a 2D projection of the 5D environment radiance field visible to the object. We show that recovering the environment radiance fields enables depth and radiance estimation from the object to its surroundings in addition to beyond field-of-view novel-view synthesis, i.e. rendering of novel views that are only directly-visible to the glossy object present in the scene, but not the observer. Moreover, using the radiance field we can image around occluders caused by close-by objects in the scene. Our method is trained end-to-end on multi-view images of the object and jointly estimates object geometry, diffuse radiance, and the 5D environment radiance field.


1st Workshop on Maritime Computer Vision (MaCVi) 2023: Challenge Results

Benjamin Kiefer , Matej Kristan , Janez Perš , Lojze Žust , Fabio Poiesi , Fabio Augusto de Alcantara Andrade , Alexandre Bernardino , Matthew Dawkins , Jenni Raitoharju , Yitong Quan

Category: Computer Vision | Artificial Intelligence | Machine Learning | Robotics

2022-11-24

The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses, and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi.


Programmable and Customized Intelligence for Traffic Steering in 5G Networks Using Open RAN Architectures

Andrea Lacava , Michele Polese , Rajarajan Sivaraj , Rahul Soundrarajan , Bhawani Shanker Bhati , Tarunjeet Singh , Tommaso Zugno , Francesca Cuomo , Tommaso Melodia

Category: Artificial Intelligence

2022-09-28

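5G and beyond mobile networks will support heterogeneous use cases at an unprecedented scale, demanding automated control and optimization of network functionalities customized to the needs of individual users. Such fine-grained control of the Radio Access Network (RAN) is not possible with the current cellular architecture. To fill this gap, the Open RAN paradigm and its specification introduce an open architecture with abstractions that enable closed-loop control and provide data-driven, intelligent optimization of the RAN at the user level. This is achieved through custom RAN control applications (i.e., xApps) deployed on the near-real-time RAN Intelligent Controller (near-RT RIC) at the edge of the network. Despite these premises, the research community currently lacks a sandbox for building data-driven xApps and creating large-scale datasets for effective AI training. In this paper, we address this by introducing ns-O-RAN, a software framework that integrates a real-world, production-grade near-RT RIC with a 3GPP-based simulated environment on ns-3, enabling the development of xApps, automated large-scale data collection, and testing of deep reinforcement learning-driven control policies for optimization at the user level. In addition, we propose the first user-specific O-RAN Traffic Steering (TS) intelligent handover framework. It uses Random Ensemble Mixture, combined with a state-of-the-art convolutional neural network architecture, to optimally assign a serving base station to each user in the network. Our TS xApp, trained on more than 40 million data points collected by ns-O-RAN, runs on the near-RT RIC and controls its base stations. We evaluate its performance in a large-scale deployment, showing that xApp-based handover improves throughput and spectral efficiency by an average of 50% over traditional handover heuristics, with less mobility overhead.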


Comparing Methods for Extractive Summarization of Call Centre Dialogue

Alexandra N. Uma , Dmitry Sityaev

Category: Natural Language Processing | Artificial Intelligence

2022-09-06

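This paper presents the results of evaluating several text summarisation techniques for the purpose of producing call summaries for contact centre solutions. We focus specifically on extractive summarisation methods, as they do not require any labelled data and are quick and easy to implement for production use. We compare several such methods experimentally by using them to produce summaries of calls, and evaluate these summaries both objectively (using ROUGE-L) and subjectively (by aggregating the judgements of several annotators). We find that TopicSum and Lead-N outperform the other summarisation methods, whilst BERTSum receives comparatively lower scores in both subjective and objective evaluations. The results demonstrate that even simple heuristics-based methods such as Lead-N can produce meaningful and useful summaries of call centre dialogues.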
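To make the simplest baseline concrete, the sketch below implements Lead-N (keep the first N utterances) and scores the result with a ROUGE-L F1 based on the longest common subsequence of tokens. It is a minimal illustration, not the authors' pipeline: whitespace tokenisation, lowercasing, the F1 weighting and the example call are all assumptions.

```python
def lead_n(utterances, n):
    """Extractive baseline: keep the first n utterances of the call."""
    return utterances[:n]


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L as the F1 of LCS-based precision and recall."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    lcs = lcs_length(cand_tokens, ref_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical call transcript and human-written reference summary.
call = [
    "hello i want to cancel my order",
    "sure i can help with that",
    "thank you goodbye",
]
summary = " ".join(lead_n(call, 1))
score = rouge_l_f1(summary, "customer wants to cancel an order")
```

Lead-N needs no training data at all, which matches the abstract's motivation for preferring extractive methods in production.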


Introducing dynamical constraints into representation learning

Dedi Wang , Yihang Wang , Luke Evans , Pratyush Tiwary

Category: Machine Learning

2022-09-02

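While representation learning has been central to the rise of machine learning and artificial intelligence, a key problem remains in making the learned representations meaningful. The typical approach is to regularize the learned representation through prior probability distributions. However, such priors are usually unavailable or ad hoc. To address this, we propose a dynamics-constrained representation learning framework. Instead of using predefined probabilities, we restrict the latent representation to follow specific dynamics, which is a more natural constraint for representation learning in dynamical systems. Our belief stems from a fundamental observation in physics: although different systems can have different marginalized probability distributions, they typically obey the same dynamics, such as Newton's and Schrödinger's equations. We validate our framework on different systems, including a real-world fluorescent DNA movie dataset. We show that our algorithm can uniquely identify an uncorrelated, isometric and meaningful latent representation.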


DCNNV-19: Uma rede neural convolucional profunda para detecção de COVID-19 em tomografias computadorizadas torácicas

Victor Felipe Reis-Silva

Category: Computer Vision

2022-08-18

This technical report proposes the use of a deep convolutional neural network as a preliminary diagnostic method for analysing chest computed tomography images from patients with symptoms of Severe Acute Respiratory Syndrome (SARS) and suspected COVID-19, especially in cases where a delayed RT-PCR result and the absence of urgent care could lead to serious temporary, long-term, or permanent health damage. The model was trained on 83,391 images, validated on 15,297, and tested on 22,185, achieving an F1-score of 98%, a Cohen's kappa of 97.59%, an accuracy of 98.4%, and a loss of 5.09%. These results attest to a fast, highly accurate automated classification that delivers results in less time than the current gold-standard exam, the Real-Time Reverse-Transcriptase Polymerase Chain Reaction (RT-PCR).
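For readers unfamiliar with the reported metrics, the sketch below shows how an F1-score and Cohen's kappa are derived from a two-class confusion matrix. The counts are invented for illustration and are not the report's data; whether the report's task is binary is also an assumption made here for simplicity.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cohens_kappa(tp, fp, fn, tn):
    """Agreement between prediction and truth, corrected for chance."""
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    # Chance agreement from the row and column marginals of the matrix.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical confusion-matrix counts (not taken from the report).
f1 = f1_score(tp=40, fp=10, fn=10)
kappa = cohens_kappa(tp=40, fp=10, fn=10, tn=40)
```

Kappa is lower than raw accuracy in this toy case (0.6 versus 0.8) because half of the agreement would be expected by chance with balanced classes.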


Thermodynamics of Interpretation

Shams Mehdi , Pratyush Tiwary

Category: Machine Learning

2022-06-27

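Over the past few years, different types of data-driven Artificial Intelligence (AI) techniques have been widely adopted in various domains of science for generating predictive black-box models. However, because of their black-box nature, it is crucial to establish trust in these models before accepting them. One way of achieving this is to implement a post-hoc interpretation scheme that can put forward the reasons behind a black-box model's prediction. In this work, we propose a classical thermodynamics-inspired approach for this purpose: Thermodynamically Explainable Representations of AI and other black-box Paradigms (TERP). TERP works by constructing a linear, local surrogate model that approximates the behaviour of the black-box model within a small neighbourhood around the instance being explained. By employing a simple forward feature selection Monte Carlo algorithm, TERP assigns an interpretation free energy score to all possible surrogate models in order to choose an optimal interpretation. Additionally, we validate TERP as a generally applicable method by successfully interpreting four different classes of black-box models trained on datasets from relevant domains, including classifying images, predicting heart disease, and classifying biomolecular conformations.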
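The first ingredient described above, a linear surrogate fitted in a small neighbourhood of one instance, can be sketched with proximity-weighted least squares. This is only an illustration of that generic idea: the stand-in black box, the grid of perturbations and the Gaussian kernel are assumptions, and TERP's free-energy scoring and Monte Carlo feature selection are not implemented here.

```python
import math


def black_box(x):
    """Stand-in nonlinear model; in practice this is any opaque predictor."""
    return x[0] ** 2 + 3.0 * x[1]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def local_linear_surrogate(model, instance, offsets, kernel_width):
    """Fit a proximity-weighted linear model around `instance`.

    Returns [intercept, slope_0, slope_1, ...], with slopes expressed
    relative to offsets from the instance.
    """
    dim = len(instance)
    xtwx = [[0.0] * (dim + 1) for _ in range(dim + 1)]
    xtwy = [0.0] * (dim + 1)
    for delta in offsets:
        row = [1.0] + list(delta)
        y = model([xi + di for xi, di in zip(instance, delta)])
        # Gaussian kernel: perturbations closer to the instance weigh more.
        weight = math.exp(-sum(d * d for d in delta) / kernel_width ** 2)
        for i in range(dim + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(dim + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    return solve_linear_system(xtwx, xtwy)


# Symmetric grid of small perturbations around the instance (1.0, 0.0).
steps = [-0.02, -0.01, 0.0, 0.01, 0.02]
grid = [(a, b) for a in steps for b in steps]
coefficients = local_linear_surrogate(black_box, [1.0, 0.0], grid, kernel_width=0.05)
```

For this stand-in model the fitted slopes recover the local gradient (2 for the first feature, 3 for the second), which is the sense in which the surrogate explains the prediction near the chosen instance.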
