Smart Paper Notes

Open Arms: Open-Source Arms, Hands & Control

David Hanson , Alishba Imran , Gerardo Morales , Vytas Krisciunas , Aditya Sagi , Aman Malali , Rushali Mohbe , Raviteja Upadrashta

Categories: Robotics | Artificial Intelligence

2022-05-20

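Open Arms is a novel open-source platform of realistic, human-like robotic hand and arm hardware with 28 degrees of freedom (DoF), designed to extend the capabilities and accessibility of humanoid robotic grasping and manipulation. The Open Arms framework includes an open SDK and development environment, simulation tools, and application development tools for building and operating Open Arms. This paper describes the hands' control, sensing, mechanisms, aesthetic design, and manufacturing, along with their real-world application in a teleoperated nursing robot. From 2015 to 2022, the authors designed and established the manufacturing of Open Arms as a low-cost, high-functionality robotic arm hardware and software framework, serving both humanoid robot applications and the urgent need for low-cost prosthetics, as part of the Hanson Robotics Sophia robot platform. Using consumer-product manufacturing techniques, we set out to define modular, low-cost techniques that approximate the dexterity and sensitivity of the human hand. To demonstrate the dexterity and control of our hands, we present a Generative Grasping Residual CNN (GGR-CNN) model that generates robust antipodal grasps from input images of various objects at real-time speed (22 ms). Using our model architecture, we achieved a state-of-the-art accuracy of 92.4% on the standard Cornell Grasping Dataset, which contains a diverse set of household objects.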

translated by Google Translate

Hand-breathe: Non-Contact Monitoring of Breathing Abnormalities from Hand Palm

Kawish Pervez , Waqas Aman , M. Mahboob Ur Rahman , M. Wasim Nawaz , Qammer H. Abbasi

Categories: Machine Learning

2022-12-12

In the post-COVID-19 world, radio frequency (RF)-based non-contact methods, e.g., software-defined radio (SDR)-based methods, have emerged as promising candidates for intelligent remote sensing of human vitals, and could help contain contagious viruses such as COVID-19. To this end, this work uses universal software radio peripheral (USRP)-based SDRs along with classical machine learning (ML) methods to design a non-contact method for monitoring different breathing abnormalities. Under our proposed method, a subject rests his/her hand on a table between the transmit and receive antennas while an orthogonal frequency division multiplexing (OFDM) signal passes through the hand. The receiver then extracts the channel frequency response (essentially, fine-grained wireless channel state information) and feeds it to various ML algorithms, which classify the different breathing abnormalities. Among all classifiers, the linear SVM classifier achieved the highest accuracy, 88.1%. To train the ML classifiers in a supervised manner, data were collected through real-time experiments on 4 subjects in a lab environment. For label generation, the breathing of the subjects was divided into three classes: normal, fast, and slow breathing. In addition to our proposed method (where only a hand is exposed to RF signals), we also implemented and tested the state-of-the-art method (where the full chest is exposed to RF radiation). The performance comparison of the two methods reveals a trade-off: the accuracy of our proposed method is slightly lower, but it results in minimal body exposure to RF radiation compared to the benchmark method.
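
The channel-frequency-response step in the abstract above can be illustrated with a small sketch. The abstract does not say how the receiver estimates the response, so the per-subcarrier least-squares estimate H[k] = Y[k] / X[k] used here is an assumption, as are the function names `estimate_cfr` and `magnitude_features`, the pilot symbols, and the toy channel values. The resulting magnitudes are the kind of feature vector that could be passed to a classifier such as a linear SVM.

```python
def estimate_cfr(rx_pilots, tx_pilots):
    """Least-squares estimate per subcarrier: H[k] = Y[k] / X[k] (assumed method)."""
    return [y / x for y, x in zip(rx_pilots, tx_pilots)]


def magnitude_features(cfr):
    """Turn a complex channel response into real-valued magnitude features."""
    return [abs(h) for h in cfr]


# Hypothetical known pilot symbols (QPSK-like) on four subcarriers.
tx = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
# Hypothetical channel, standing in for the effect of the hand on the signal.
true_h = [0.9 + 0.1j, 0.8 - 0.2j, 0.7 + 0.3j, 0.6 - 0.4j]
# Noise-free received pilots: Y[k] = H[k] * X[k].
rx = [h * x for h, x in zip(true_h, tx)]

h_hat = estimate_cfr(rx, tx)
features = magnitude_features(h_hat)
```

In a real capture the received pilots would include noise, so the estimate would only approximate the channel.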

Explain to me like I am five -- Sentence Simplification Using Transformers

Aman Agarwal

Categories: Natural Language Processing

2022-12-08

Sentence simplification aims at making the structure of text easier to read and understand while maintaining its original meaning. This can be helpful for people with disabilities, new language learners, or those with low literacy. Simplification often involves removing difficult words and rephrasing the sentence. Previous research has focused on tackling this task either by using external linguistic databases for simplification or by using control tokens for desired fine-tuning of sentences. In this paper, however, we use only pre-trained transformer models. We experiment with a combination of GPT-2 and BERT models, achieving the best SARI score of 46.80 on the Mechanical Turk dataset, which is significantly better than previous state-of-the-art results. The code can be found at https://github.com/amanbasu/sentence-simplification.

Improved Deep Neural Network Generalization Using m-Sharpness-Aware Minimization

Kayhan Behdin , Qingquan Song , Aman Gupta , David Durfee , Ayan Acharya , Sathiya Keerthi , Rahul Mazumder

Categories: Machine Learning

2022-12-07

Modern deep learning models are over-parameterized, where the optimization setup strongly affects the generalization performance. A key element of reliable optimization for these systems is the modification of the loss function. Sharpness-Aware Minimization (SAM) modifies the underlying loss function to guide descent methods towards flatter minima, which arguably have better generalization abilities. In this paper, we focus on a variant of SAM known as mSAM, which, during training, averages the updates generated by adversarial perturbations across several disjoint shards of a mini-batch. Recent work suggests that mSAM can outperform SAM in terms of test accuracy. However, a comprehensive empirical study of mSAM is missing from the literature -- previous results have mostly been limited to specific architectures and datasets. To that end, this paper presents a thorough empirical evaluation of mSAM on various tasks and datasets. We provide a flexible implementation of mSAM and compare the generalization performance of mSAM to the performance of SAM and vanilla training on different image classification and natural language processing tasks. We also conduct careful experiments to understand the computational cost of training with mSAM, its sensitivity to hyperparameters and its correlation with the flatness of the loss landscape. Our analysis reveals that mSAM yields superior generalization performance and flatter minima, compared to SAM, across a wide range of tasks without significantly increasing computational costs.
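
The shard-averaging idea described above can be sketched as follows. This is a minimal illustration under assumptions that are not in the abstract: a toy linear least-squares model, the usual SAM perturbation of size `rho` along the normalized gradient, and hypothetical names (`grad_mse`, `sam_grad`, `msam_step`) and hyperparameter values. It is not the authors' implementation.

```python
import math


def grad_mse(w, batch):
    """Gradient of the mean squared error of a linear model y ~ w . x."""
    g = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for i, xi in enumerate(x):
            g[i] += 2.0 * err * xi / len(batch)
    return g


def sam_grad(w, batch, rho):
    """SAM-style gradient: evaluate the gradient at adversarially perturbed weights."""
    g = grad_mse(w, batch)
    norm = math.sqrt(sum(gi * gi for gi in g)) or 1.0  # guard against a zero gradient
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    return grad_mse(w_adv, batch)


def msam_step(w, batch, m, rho=0.05, lr=0.1):
    """One mSAM-style step: average the SAM gradients of m disjoint shards."""
    shards = [batch[i::m] for i in range(m)]  # disjoint shards covering the batch
    avg = [0.0] * len(w)
    for shard in shards:
        g = sam_grad(w, shard, rho)
        avg = [a + gi / m for a, gi in zip(avg, g)]
    return [wi - lr * ai for wi, ai in zip(w, avg)]


# Toy mini-batch generated by y = 2*x0 - 1*x1 (hypothetical data).
data = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0), ((2.0, 1.0), 3.0)]
w = [0.0, 0.0]
for _ in range(200):
    w = msam_step(w, data, m=2)
```

Setting `m=1` reduces the step to plain SAM on the whole mini-batch, which makes the two variants easy to compare in this toy setting.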

Double U-Net for Super-Resolution and Segmentation of Live Cell Images

Mayur Bhandary , J. Patricio Reyes , Eylul Ertay , Aman Panda

Categories: Computer Vision

2022-12-05

Accurate segmentation of live cell images has broad applications in clinical and research contexts. Deep learning methods have been able to perform cell segmentation with high accuracy; however, developing machine learning models to do this requires access to high-fidelity images of live cells. Such images are often unavailable because of resource constraints, such as limited access to high-performance microscopes, or because of the nature of the studied organisms. Segmentation of low-resolution images of live cells is a difficult task. This paper proposes a method for segmenting live cells in low-resolution images by performing super-resolution as a pre-processing step in the segmentation pipeline.

Embedding Synthetic Off-Policy Experience for Autonomous Driving via Zero-Shot Curricula

Eli Bronstein , Sirish Srinivasan , Supratik Paul , Aman Sinha , Matthew O'Kelly , Payam Nikdel , Shimon Whiteson

Categories: Robotics | Artificial Intelligence | Machine Learning

2022-12-02

ML-based motion planning is a promising approach to produce agents that exhibit complex behaviors, and automatically adapt to novel environments. In the context of autonomous driving, it is common to treat all available training data equally. However, this approach produces agents that do not perform robustly in safety-critical settings, an issue that cannot be addressed by simply adding more data to the training set - we show that an agent trained using only a 10% subset of the data performs just as well as an agent trained on the entire dataset. We present a method to predict the inherent difficulty of a driving situation given data collected from a fleet of autonomous vehicles deployed on public roads. We then demonstrate that this difficulty score can be used in a zero-shot transfer to generate curricula for an imitation-learning based planning agent. Compared to training on the entire unbiased training dataset, we show that prioritizing difficult driving scenarios both reduces collisions by 15% and increases route adherence by 14% in closed-loop evaluation, all while using only 10% of the training data.
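
The prioritization step described above can be pictured with a short sketch. It assumes a difficulty score is already available for every logged scenario; the abstract's learned difficulty predictor is not reproduced here. The function `select_hardest`, the scenario ids, and the synthetic scores are hypothetical, and only the 10% fraction comes from the abstract.

```python
def select_hardest(scenarios, difficulty, fraction=0.10):
    """Return the `fraction` of scenarios with the highest difficulty scores."""
    k = max(1, round(len(scenarios) * fraction))
    ranked = sorted(scenarios, key=difficulty, reverse=True)
    return ranked[:k]


# Hypothetical fleet log: (scenario id, predicted difficulty in [0, 1)).
fleet_log = [(f"run-{i:03d}", (i * 37 % 100) / 100.0) for i in range(100)]

# Keep only the hardest 10% as the training curriculum.
curriculum = select_hardest(fleet_log, difficulty=lambda s: s[1])
```

A hard cutoff is only one possible choice; sampling scenarios with probability proportional to their score would also prioritize difficult cases while keeping some easy ones.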

Named Entity Recognition in Indian court judgments

Prathamesh Kalamkar , Astha Agarwal , Aman Tiwari , Smita Gupta , Saurabh Karn , Vivek Raghavan

Categories: Natural Language Processing | Artificial Intelligence

2022-11-07

Identification of named entities in legal texts is an essential building block for developing other legal Artificial Intelligence applications. Named entities in legal texts are slightly different from, and more fine-grained than, commonly used named entities such as Person, Organization, and Location. In this paper, we introduce a new corpus of 46,545 annotated legal named entities mapped to 14 legal entity types. A baseline model for extracting legal named entities from judgment text is also developed.

Language Models of Code are Few-Shot Commonsense Learners

Aman Madaan , Shuyan Zhou , Uri Alon , Yiming Yang , Graham Neubig

Categories: Natural Language Processing | Machine Learning

2022-10-13

We address the general task of structured commonsense reasoning: given a natural language input, the goal is to generate a graph, such as an event graph or a reasoning graph. To employ large language models (LMs) for this task, existing approaches "serialize" the output graph as a flat list of nodes and edges. Although feasible, these serialized graphs deviate strongly from the natural language corpora that LMs were pre-trained on, hindering LMs from generating them correctly. In this paper, we show that when we instead frame structured commonsense reasoning tasks as code generation tasks, pre-trained LMs of code are better structured commonsense reasoners than LMs of natural language, even when the downstream task does not involve source code at all. We demonstrate our approach across three diverse structured commonsense reasoning tasks. In all these natural language tasks, we show that with our approach a code generation LM (CODEX) outperforms natural-language LMs that are fine-tuned on the target task (e.g., T5), as well as other strong LMs such as GPT-3 in the few-shot setting.
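
The contrast between a flat serialization and a code framing can be sketched as follows. The abstract does not give the exact target format, so the class layout, the `Node` name in the generated text, the helper functions `to_flat` and `to_code`, and the example graph are all hypothetical. The sketch only shows that the same graph can be rendered either as an edge list or as syntactically valid Python.

```python
import ast


def to_flat(edges):
    """Flat serialization: a list of 'source -> target' edges."""
    return "; ".join(f"{a} -> {b}" for a, b in edges)


def to_code(name, edges):
    """Render the same graph as Python source text (illustrative format only)."""
    nodes = []
    for a, b in edges:
        for n in (a, b):
            if n not in nodes:
                nodes.append(n)  # keep first-seen order
    var = {n: f"step{i}" for i, n in enumerate(nodes)}
    lines = [f"class {name}:", "    def __init__(self):"]
    lines += [f"        self.{var[n]} = Node({n!r})" for n in nodes]
    lines += [f"        self.{var[a]}.children.append(self.{var[b]})" for a, b in edges]
    return "\n".join(lines)


# Hypothetical event graph for the goal "bake a cake".
edges = [
    ("find recipe", "buy ingredients"),
    ("buy ingredients", "preheat oven"),
    ("preheat oven", "bake cake"),
]
flat_text = to_flat(edges)
code_text = to_code("BakeACake", edges)
ast.parse(code_text)  # raises SyntaxError if the rendered text is not valid Python
```

In a few-shot setup, rendered examples like `code_text` would form the prompt, and the model's completion would be parsed back into nodes and edges.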

3D Reconstruction using Structured Light from off-the-shelf components

Aman Gajendra Jain , Dr. Shital Chiddarwar

Categories: Computer Vision

2022-09-24

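Coordinate measuring machines (CMMs) have been the benchmark for accuracy in measuring solid objects for nearly 50 years or more. With the advent of 3D scanning technology, however, the accuracy and density of the generated point clouds have taken over. In this project, we not only compare the different algorithms that can be used in 3D scanning software, but also build our own 3D scanner from off-the-shelf components such as a camera and a projector. Our objectives are: (1) to develop a prototype 3D scanner that performs at optimal accuracy over a wide range of object types; (2) to minimize cost by using off-the-shelf components; and (3) to come very close to the accuracy of a CMM.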

Text and Patterns: For Effective Chain of Thought, It Takes Two to Tango

Aman Madaan , Amir Yazdanbakhsh

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-09-16

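Reasoning is a key pillar of human cognition and intelligence. Over the past decade, we have witnessed dramatic gains in natural language processing and an unprecedented scaling of large language models. Recent work has characterized the ability of few-shot prompting techniques, such as chain of thought, to emulate human reasoning in large language models. This hallmark feature of few-shot prompting, combined with ever-scaling language models, has opened a vista of possibilities for solving various tasks, such as math word problems, code completion, and commonsense reasoning. Chain-of-thought (CoT) prompting further pushes model performance by supplying intermediate steps and urging the model to follow the same process. Despite its compelling performance, the origin of the reasoning capability in these models remains little explored. This work takes preliminary steps towards a deeper understanding of the reasoning mechanisms in large language models. Our work centers on querying the model while controlling for all but one of the components in a prompt: symbols, patterns, and text. We then analyze the performance differences across the queries. Our results suggest that the presence of factual patterns in a prompt is not necessary for the success of CoT. Nonetheless, we show empirically that relying on patterns alone is also insufficient for high-quality results. We posit that text carries commonsense knowledge and meaning. Our exhaustive empirical analysis provides qualitative examples of the symbiotic relationship between text and patterns. This systematic understanding of CoT enables us to devise a concise chain of thought, dubbed CCoT, in which text and patterns are pruned to retain only their key roles while delivering an on-par or higher task solve rate.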
