Smart Paper Notes

Timestamp-Supervised Action Segmentation with Graph Convolutional Networks

Hamza Khan , Sanjay Haresh , Awais Ahmed , Shakeeb Siddiqui , Andrey Konin , M. Zeeshan Zia , Quoc-Huy Tran

Category: Computer Vision

2022-06-30

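We introduce a novel approach for temporal action segmentation with timestamp supervision. Our main contribution is a graph convolutional network, which is learned end to end and exploits both frame features and the connections between neighboring frames to generate dense framewise labels from sparse timestamp labels. The generated dense framewise labels can then be used to train the segmentation model. In addition, we propose a framework for alternating learning of the segmentation model and the graph convolutional model, which first initializes and then iteratively refines the learned models. Detailed experiments on four public datasets, namely 50 Salads, GTEA, Breakfast, and Desktop Assembly, show that our method outperforms a multi-layer perceptron baseline and performs on par with or better than the state of the art in temporal action segmentation under timestamp supervision.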
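
As a rough illustration of how sparse timestamp labels can be spread to every frame through connections between neighboring frames, the sketch below repeatedly averages class scores over a chain graph (each frame linked to its previous and next frame) while keeping the annotated frames fixed. This is plain label propagation using only the neighbor-aggregation step of a graph convolution, with no learned weights and no frame features, so it is not the network described in the abstract. The function name `propagate_timestamp_labels` and the example labels are hypothetical.

```python
def propagate_timestamp_labels(n_frames, timestamps, n_iters=200):
    """Spread sparse {frame_index: label} annotations to all frames.

    Assumption: a chain graph with self-loops and unweighted averaging,
    standing in for the learned graph convolution of the paper.
    """
    classes = sorted(set(timestamps.values()))
    column = {c: j for j, c in enumerate(classes)}

    def clamp(scores):
        # Annotated frames always keep their one-hot ground-truth label.
        for t, c in timestamps.items():
            scores[t] = [1.0 if j == column[c] else 0.0
                         for j in range(len(classes))]

    scores = [[0.0] * len(classes) for _ in range(n_frames)]
    clamp(scores)
    for _ in range(n_iters):
        new_scores = []
        for t in range(n_frames):
            # Neighborhood of frame t: itself plus the adjacent frames.
            nbrs = [u for u in (t - 1, t, t + 1) if 0 <= u < n_frames]
            new_scores.append([sum(scores[u][j] for u in nbrs) / len(nbrs)
                               for j in range(len(classes))])
        clamp(new_scores)
        scores = new_scores
    # Dense labels: the highest-scoring class for each frame.
    return [classes[max(range(len(classes)), key=lambda j: row[j])]
            for row in scores]


dense = propagate_timestamp_labels(10, {2: "cut", 7: "mix"})
print(dense)
```

With one annotated frame per action, every unlabeled frame ends up with the label of the annotation whose influence reaches it most strongly, which here is the nearer one.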

Translated by Google Translate

Unsupervised Activity Segmentation by Joint Representation Learning and Online Clustering

Sateesh Kumar , Sanjay Haresh , Awais Ahmed , Andrey Konin , M. Zeeshan Zia , Quoc-Huy Tran

Category: Computer Vision

2021-05-27

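We present a novel approach for unsupervised activity segmentation that uses video frame clustering as a pretext task and performs representation learning and online clustering simultaneously. This contrasts with prior work, in which representation learning and clustering are usually performed sequentially. We exploit the temporal information in videos by employing temporal optimal transport. In particular, we incorporate a temporal regularization term, which preserves the temporal order of the activity, into the standard optimal transport module used to compute pseudo-label cluster assignments. The temporal optimal transport module enables our approach to learn effective representations for unsupervised activity segmentation. Furthermore, previous methods require storing the learned features for the entire dataset before clustering them offline, whereas our approach processes one mini-batch at a time in an online manner. Extensive evaluations on three public datasets (50 Salads, YouTube Instructions, and Breakfast) and on our own dataset (Desktop Assembly) show that our approach performs on par with or better than previous unsupervised activity segmentation methods while having significantly lower memory requirements.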
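
To make the idea of temporally regularized optimal transport more concrete, here is a minimal sketch under several assumptions: frame-to-cluster similarity scores are combined with a Gaussian-shaped prior that favors assignments near the diagonal (early frames to early clusters), and Sinkhorn iterations balance the resulting matrix. The abstract does not give the exact formulation, so the prior shape, the names `temporal_prior`, `sinkhorn_assign`, and `pseudo_labels`, and the parameters `sigma`, `rho`, and `eps` are all illustrative choices rather than the authors' method.

```python
import math


def temporal_prior(n_frames, n_clusters, sigma=0.3):
    """Prior that peaks where a frame's relative time matches a cluster's
    relative position (assumed Gaussian shape)."""
    prior = []
    for i in range(n_frames):
        row = []
        for j in range(n_clusters):
            d = abs((i + 0.5) / n_frames - (j + 0.5) / n_clusters)
            row.append(math.exp(-d * d / (2 * sigma * sigma)))
        prior.append(row)
    return prior


def sinkhorn_assign(scores, prior, rho=1.0, eps=0.1, n_iters=50):
    """Soft assignments from similarity scores plus a temporal prior.

    Alternately rescales columns and rows so that clusters receive equal
    total mass and every frame carries mass 1 / n_frames.
    """
    n, k = len(scores), len(scores[0])
    q = [[math.exp((scores[i][j] + rho * prior[i][j]) / eps)
          for j in range(k)] for i in range(n)]
    for _ in range(n_iters):
        for j in range(k):  # each column sums to 1 / k
            col = sum(q[i][j] for i in range(n))
            for i in range(n):
                q[i][j] /= col * k
        for i in range(n):  # each row sums to 1 / n
            row = sum(q[i])
            for j in range(k):
                q[i][j] /= row * n
    return q


def pseudo_labels(q):
    """Hard pseudo-label per frame: the cluster with the largest mass."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in q]


# With uninformative scores, the temporal prior alone orders the clusters.
n, k = 6, 3
uniform_scores = [[0.5] * k for _ in range(n)]
q = sinkhorn_assign(uniform_scores, temporal_prior(n, k))
print(pseudo_labels(q))
```

Because the computation only needs the scores of the frames currently in hand, a routine like this can be applied per mini-batch, which is in the spirit of the online processing the abstract describes.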

Domain-Specific Priors and Meta Learning for Few-Shot First-Person Action Recognition

Huseyin Coskun , Zeeshan Zia , Bugra Tekin , Federica Bogo , Nassir Navab , Federico Tombari , Harpreet Sawhney

Category: Computer Vision

2019-07-22

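The lack of large-scale annotated real-world datasets makes transfer learning a necessity for video activity understanding. We aim to develop an effective method for few-shot transfer learning for first-person action classification. We leverage independently trained local visual cues to learn representations that can be transferred from a source domain to a different target domain using only a few examples. The visual cues we employ include object-object interactions, hand grasps, and motion within regions that are a function of hand locations. We employ a meta-learning-based framework to extract the distinctive and domain-invariant components of the deployed visual cues. This enables the transfer of action classification models across public datasets captured with diverse scene and action configurations. We present comparative results for our transfer learning method and report results superior to state-of-the-art action classification approaches for both inter-class and inter-dataset transfer.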

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Category: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

Objective Surgical Skills Assessment and Tool Localization: Results from the MICCAI 2021 SimSurgSkill Challenge

Aneeq Zia , Kiran Bhattacharyya , Xi Liu , Ziheng Wang , Max Berniker , Satoshi Kondo , Emanuele Colleoni , Dimitris Psychogyios , Yueming Jin , Jinfan Zhou

Category: Computer Vision

2022-12-08

Timely and effective feedback within surgical training plays a critical role in developing the skills required to perform safe and efficient surgery. Feedback from expert surgeons, while especially valuable in this regard, is challenging to acquire due to their typically busy schedules, and may be subject to biases. Formal assessment procedures like OSATS and GEARS attempt to provide objective measures of skill, but remain time-consuming. With advances in machine learning there is an opportunity for fast and objective automated feedback on technical skills. The SimSurgSkill 2021 challenge (hosted as a sub-challenge of EndoVis at MICCAI 2021) aimed to promote and foster work in this endeavor. Using virtual reality (VR) surgical tasks, competitors were tasked with localizing instruments and predicting surgical skill. Here we summarize the winning approaches and how they performed. Using this publicly available dataset and results as a springboard, future work may enable more efficient training of surgeons with advances in surgical data science. The dataset can be accessed from https://console.cloud.google.com/storage/browser/isi-simsurgskill-2021.

Nostradamus: Weathering Worth

Alapan Chaudhuri , Zeeshan Ahmed , Ashwin Rao , Shivansh Subramanian , Shreyas Pradhan , Abhishek Mittal

Category: Machine Learning

2022-12-08

Nostradamus, inspired by the French astrologer and reputed seer, is a detailed study exploring relations between environmental factors and changes in the stock market. In this paper, we analyze associative correlation and causation between environmental elements and stock prices based on the US financial market, global climate trends, and daily weather records to demonstrate significant relationships between climate and stock price fluctuation. Our analysis covers short and long-term rises and dips in company stock performances. Lastly, we take four natural disasters as a case study to observe their effect on the emotional state of people and their influence on the stock market.

SafeSpace MFNet: Precise and Efficient MultiFeature Drone Detection Network

Mahnoor Dil , Misha Urooj Khan , Muhammad Zeshan Alam , Farooq Alam Orakazi , Zeeshan Kaleem , Chau Yuen

Category: Computer Vision

2022-11-30

Unmanned aerial vehicles (UAVs) are growing in popularity because they enable services such as traffic monitoring, emergency communications, deliveries, and surveillance. However, unauthorized use of UAVs (a.k.a. drones) may violate security and privacy protocols at security-sensitive national and international institutions. Addressing these challenges requires fast, efficient, and precise UAV detection, irrespective of harsh weather conditions, the presence of other objects, and UAV size, to enable SafeSpace. Recently, there has been significant progress in using the latest deep learning models, but those models have shortcomings in terms of computational complexity, precision, and scalability. To overcome these limitations, we propose a precise and efficient multiscale and multifeature UAV detection network for SafeSpace, i.e., MultiFeatureNet (MFNet), an improved version of the popular object detection algorithm YOLOv5s. In MFNet, we make multiple changes to the backbone and neck of the YOLOv5s network to focus on the various small and ignored features required for accurate and fast UAV detection. To further improve accuracy and to target specific situations and multiscale UAVs, we divide MFNet into small (S), medium (M), and large (L) variants: these are combinations of filters of various sizes in the convolution and bottleneckCSP layers, which reside in the backbone and neck of the architecture. This classification helps reduce computational cost by training the model on a specific feature map rather than on all the features. The dataset and code are available as open source: github.com/ZeeshanKaleem/MultiFeatureNet.

TF-Net: Deep Learning Empowered Tiny Feature Network for Night-time UAV Detection

Maham Misbah , Misha Urooj Khan , Zhaohui Yang , Zeeshan Kaleem

Category: Computer Vision

2022-11-29

Technological advancements have normalized the use of unmanned aerial vehicles (UAVs) in every sector, from military to commercial, but they also pose serious security concerns due to their enhanced functionalities and easy access to private and highly secured areas. Several incidents involving UAVs have raised security concerns, leading to research on UAV detection. Visual techniques are widely adopted for UAV detection, but they perform poorly at night, in complex backgrounds, and in adverse weather conditions. Therefore, a robust night vision-based drone detection system is required that can tackle this problem efficiently. Infrared cameras are increasingly used for nighttime surveillance due to their wide applications in night vision equipment. This paper uses a deep learning-based TinyFeatureNet (TF-Net), which is an improved version of YOLOv5s, to accurately detect UAVs at night using infrared (IR) images. In the proposed TF-Net, we introduce architectural changes in the neck and backbone of YOLOv5s. We also simulated four different YOLOv5 models (s, m, n, l) and the proposed TF-Net for a fair comparison. The results showed better performance for the proposed TF-Net in terms of precision, IoU, GFLOPS, model size, and FPS compared to YOLOv5s. TF-Net yielded the best results with 95.7% precision, 84% mAP, and 44.8% IoU.
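
For readers unfamiliar with the IoU figure quoted above, intersection over union measures how well a predicted bounding box overlaps a ground-truth box: the area of their intersection divided by the area of their union. The sketch below assumes axis-aligned boxes given as `(x1, y1, x2, y2)` corner coordinates; the function name `box_iou` is hypothetical, and the abstract does not say how the reported IoU was aggregated over the dataset.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes sharing a 1x1 corner: IoU = 1 / (4 + 4 - 1) = 1/7.
print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```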

A canonical correlation-based framework for performance analysis of radio access networks

Furqan Ahmed , Muhammad Zeeshan Asghar , Jyri Hämäläinen

Category: Artificial Intelligence

2022-09-29

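Data-driven optimization and machine learning-based performance diagnostics of radio access networks entail challenges that arise not only from the nature of the underlying data sources but also from complex spatio-temporal relationships and interdependencies between cells caused by user mobility and varying traffic patterns. We discuss how to use multivariate analysis to study these configuration and performance management datasets and to identify relationships between cells in terms of key performance indicators (KPIs). To this end, we leverage a novel framework based on canonical correlation analysis (CCA), which is a highly effective method not only for dimensionality reduction but also for analyzing relationships across different multivariate datasets. As a case study, we discuss an energy-saving use case based on cell shutdown in commercial cellular networks, in which we apply CCA to analyze the impact of capacity cell shutdown on the KPIs of the coverage cell in the same sector. Data from an LTE network is used to analyze the example case. We conclude that CCA is a viable approach for identifying key relationships not only between network planning and configuration data but also with dynamic performance data, paving the way for endeavors such as dimensionality reduction, performance analysis, and root cause analysis for performance diagnostics.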

Learning Citywide Patterns of Life from Trajectory Monitoring

Mark Tenzer , Zeeshan Rasheed , Khurram Shafique

Category: Machine Learning | Neural and Evolutionary Computing

2022-06-30

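The recent proliferation of real-world human mobility datasets has catalyzed geospatial and transportation research in trajectory prediction, demand forecasting, travel time estimation, and anomaly detection. However, these datasets also enable, more broadly, a descriptive analysis of complex human mobility systems. We formally define patterns-of-life analysis as a natural, explainable extension of online unsupervised anomaly detection, in which we not only monitor a data stream for anomalies but also explicitly extract normal patterns over time. To learn patterns of life, we adapt Grow When Required (GWR), from research in computational biology and neurorobotics, to the new domain of geospatial analysis. This biologically inspired neural network, related to self-organizing maps (SOM), incrementally builds a set of "memories", or prototype traffic patterns, as it iterates over the GPS stream. It then compares each new observation with its prior experience, inducing online, unsupervised clustering and anomaly detection on the data. We mine patterns of interest from the Porto taxi dataset, including major public holidays and newly discovered transportation anomalies, such as festivals and concerts, which to our knowledge have not been acknowledged or reported in prior work. We anticipate that the ability to incrementally learn normal and anomalous road transportation behavior will be useful in many domains, including smart cities, autonomous vehicles, and urban planning and management.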
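
The core loop described above (compare each new observation with stored prototypes, then either refine the closest one or grow a new one and flag the observation as novel) can be sketched in a heavily simplified form. The code below is not Grow When Required itself: it omits the network edges, activity thresholds, and habituation counters that GWR uses, and the function name `online_prototypes` and the parameters `novelty_radius` and `learning_rate` are assumptions made for illustration.

```python
import math


def online_prototypes(stream, novelty_radius=1.0, learning_rate=0.2):
    """Grow a set of prototypes while scanning a stream of points once.

    Returns the prototypes and, per observation, a pair
    (index of matched prototype, whether the observation was novel).
    """
    prototypes = []  # learned "memories" of typical observations
    events = []
    for x in stream:
        # Find the closest existing prototype, if any.
        best, best_dist = None, float("inf")
        for idx, p in enumerate(prototypes):
            dist = math.dist(x, p)
            if dist < best_dist:
                best, best_dist = idx, dist
        if best is None or best_dist > novelty_radius:
            # Nothing similar has been seen: store a new prototype.
            prototypes.append(list(x))
            events.append((len(prototypes) - 1, True))
        else:
            # Familiar pattern: nudge the matched prototype toward it.
            p = prototypes[best]
            for d in range(len(p)):
                p[d] += learning_rate * (x[d] - p[d])
            events.append((best, False))
    return prototypes, events


stream = [(0, 0), (0.1, 0), (5, 5), (0, 0.2), (5.1, 4.9), (20, 20)]
prototypes, events = online_prototypes(stream)
print(len(prototypes), [novel for _, novel in events])
```

In this toy stream, the first visit to each region creates a prototype and is flagged as novel, while later nearby points are absorbed as normal behavior.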
