Smart Paper Notes

NeuCASL: From Logic Design to System Simulation of Neuromorphic Engines

Dharanidhar Dang , Amitash Nanda , Bill Lin , Debashis Sahoo

Category: Artificial Intelligence

2022-08-06

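With Moore's law saturating and Dennard scaling hitting a wall, traditional von Neumann systems cannot deliver the GFLOPS/watt that compute-intensive algorithms such as CNNs require. Recent trends in unconventional computing give hope of designing highly energy-efficient computing systems for such algorithms. Neuromorphic computing is one promising approach, with its brain-inspired circuitry, use of emerging technologies, and low-power nature. Researchers have used a variety of novel technologies, such as memristors, silicon photonics, FinFETs, and carbon nanotubes, to demonstrate neuromorphic computers. However, a flexible CAD tool that starts from neuromorphic logic design and extends up to architectural simulation has yet to be demonstrated to support the rise of this promising paradigm. In this project, we aim to build NeuCASL, an open-source, Python-based, full-system CAD framework for neuromorphic logic design, circuit simulation, and system performance and reliability estimation. To the best of our knowledge, it is the first of its kind.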

Translated by Google Translate

LiteCON: An All-Photonic Neuromorphic Accelerator for Energy-efficient Deep Learning (Preprint)

Dharanidhar Dang , Bill Lin , Debashis Sahoo

Category: Machine Learning

2022-06-28

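Deep learning is highly pervasive in today's data-intensive era. In particular, convolutional neural networks (CNNs) are widely adopted in a variety of fields for their superior accuracy. However, computing deep CNNs on traditional CPUs and GPUs brings several performance and energy pitfalls. Several novel approaches based on ASICs, FPGAs, and resistive-memory devices have recently been demonstrated with promising results. Most of them target only the inference (testing) phase of deep learning. Attempts to design a full-fledged deep learning accelerator capable of both training and inference have been very limited, owing to the highly compute- and memory-intensive nature of the training phase. In this paper, we propose LiteCON, a novel analog photonic CNN accelerator. LiteCON uses silicon microdisk-based convolution, memristor-based memory, and dense wavelength-division multiplexing for energy-efficient and ultrafast deep learning. We evaluate LiteCON using a commercial CAD framework (IPKISS) on deep learning benchmark models including LeNet and VGG-Net. Compared to the state of the art, LiteCON improves CNN throughput, energy efficiency, and computational efficiency by 32x, 37x, and 5x, respectively, with trivial accuracy degradation.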


Neural Collapse in Deep Linear Network: From Balanced to Imbalanced Data

Hien Dang , Tan Nguyen , Tho Tran , Hung Tran , Nhat Ho

Category: Machine Learning | Machine Learning (Statistics)

2023-01-01

Modern deep neural networks have achieved superhuman performance in tasks ranging from image classification to game playing. Surprisingly, these varied and complex systems, with massive numbers of parameters, exhibit the same remarkable structural properties in their last-layer features and classifiers across canonical datasets. This phenomenon, known as "Neural Collapse," was discovered empirically by Papyan et al. [Papyan20]. Recent papers have shown theoretically that the global solutions to the network training problem under a simplified "unconstrained feature model" exhibit this phenomenon. We take a step further and prove that Neural Collapse occurs in deep linear networks for the popular mean squared error (MSE) and cross-entropy (CE) losses. Furthermore, we extend our research to imbalanced data for the MSE loss and present the first geometric analysis of Neural Collapse in this setting.
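
One of the structural properties usually associated with Neural Collapse on balanced data is that the last-layer class means approach a simplex equiangular tight frame (ETF): all class vectors have equal norm, and every pair has cosine similarity -1/(K-1) for K classes. The abstract does not spell this geometry out, and the imbalanced setting analyzed in the paper need not follow it, so the following is only a minimal sketch of the balanced-case target. The function names are illustrative, not taken from the paper.

```python
import math


def simplex_etf(num_classes):
    """Rows of sqrt(K/(K-1)) * (I - (1/K) * ones), one row per class."""
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def norm(vector):
    return math.sqrt(sum(x * x for x in vector))


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (norm(u) * norm(v))


if __name__ == "__main__":
    k = 4
    etf = simplex_etf(k)
    # Every class vector has unit norm.
    print([round(norm(row), 6) for row in etf])
    # Every pair of distinct class vectors has cosine -1/(K-1).
    print(round(cosine(etf[0], etf[1]), 6), round(-1 / (k - 1), 6))
```

Comparing the cosine matrix of measured class means against this target is one simple way to check how close a trained network is to the collapsed configuration.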


Uniform Sequence Better: Time Interval Aware Data Augmentation for Sequential Recommendation

Yizhou Dang , Enneng Yang , Guibing Guo , Linying Jiang , Xingwei Wang , Xiaoxiao Xu , Qinghui Sun , Hong Liu

Category: Machine Learning

2022-12-16

Sequential recommendation is an important task that predicts the next item a user will access based on a sequence of interacted items. Most existing works learn user preference as the transition pattern from the previous item to the next one, ignoring the time interval between the two items. However, we observe that the time intervals in a sequence may vary significantly, which makes user modeling ineffective because of preference drift. We conducted an empirical study to validate this observation and found that a sequence with uniformly distributed time intervals (denoted as a uniform sequence) is more beneficial for performance than one with greatly varying time intervals. Therefore, we propose to augment sequence data from the perspective of time intervals, which has not been studied in the literature. Specifically, we design five operators (Ti-Crop, Ti-Reorder, Ti-Mask, Ti-Substitute, Ti-Insert) that transform the original non-uniform sequence into a uniform sequence by taking the variance of the time intervals into account. Then, we devise a control strategy to apply data augmentation to item sequences of different lengths. Finally, we implement these improvements on a state-of-the-art model, CoSeRec, and validate our approach on four real datasets. The experimental results show that our approach performs significantly better than 11 competing methods. Our implementation is available at https://github.com/KingGugu/TiCoSeRec.
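
The abstract says the operators take the variance of time intervals into account but does not define them. As a rough illustration only, the sketch below measures how uniform a sequence is by the variance of its consecutive time gaps and picks the contiguous window with the lowest variance. This selection rule is an assumption made for the example, not the authors' Ti-Crop operator, and `interval_variance` and `most_uniform_window` are hypothetical names.

```python
from statistics import pvariance


def time_intervals(timestamps):
    """Gaps between consecutive interactions."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def interval_variance(timestamps):
    """Population variance of the gaps; 0.0 means perfectly uniform spacing."""
    gaps = time_intervals(timestamps)
    return float(pvariance(gaps)) if gaps else 0.0


def most_uniform_window(timestamps, length):
    """Start index of the contiguous window whose gaps vary the least.

    Hypothetical selection rule for illustration; ties go to the earliest window.
    """
    if length > len(timestamps):
        raise ValueError("window longer than sequence")
    starts = range(len(timestamps) - length + 1)
    return min(starts, key=lambda s: interval_variance(timestamps[s:s + length]))


if __name__ == "__main__":
    timestamps = [0, 40, 41, 42, 43, 90]  # e.g. hours since first interaction
    start = most_uniform_window(timestamps, 4)
    print(start, timestamps[start:start + 4])
```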


AUC Maximization for Low-Resource Named Entity Recognition

Ngoc Dang Nguyen , Wei Tan , Wray Buntine , Richard Beare , Changyou Chen , Lan Du

Category: Natural Language Processing | Machine Learning

2022-12-09

Current work in named entity recognition (NER) uses either cross entropy (CE) or conditional random fields (CRF) as the objective/loss functions to optimize the underlying NER model. Both of these traditional objective functions for the NER problem generally produce adequate performance when the data distribution is balanced and there are sufficient annotated training examples. But since NER is inherently an imbalanced tagging problem, the model performance under the low-resource settings could suffer using these standard objective functions. Based on recent advances in area under the ROC curve (AUC) maximization, we propose to optimize the NER model by maximizing the AUC score. We give evidence that by simply combining two binary-classifiers that maximize the AUC score, significant performance improvement over traditional loss functions is achieved under low-resource NER settings. We also conduct extensive experiments to demonstrate the advantages of our method under the low-resource and highly-imbalanced data distribution settings. To the best of our knowledge, this is the first work that brings AUC maximization to the NER setting. Furthermore, we show that our method is agnostic to different types of NER embeddings, models and domains. The code to replicate this work will be provided upon request.
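
The abstract does not state the exact AUC objective used. As a hedged illustration of the general idea, the sketch below computes the empirical AUC as the fraction of correctly ordered positive/negative pairs, together with a generic pairwise squared-hinge surrogate that can be minimized in its place. The surrogate and the `margin` parameter are assumptions chosen for the example, not the paper's formulation.

```python
def empirical_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def pairwise_auc_surrogate(pos_scores, neg_scores, margin=1.0):
    """Mean squared hinge on score differences; zero when every positive
    outscores every negative by at least `margin`."""
    losses = [
        max(0.0, margin - (p - n)) ** 2
        for p in pos_scores
        for n in neg_scores
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Scores a binary tagger might assign to entity tokens and non-entity tokens.
    entity_scores = [0.9, 0.8]
    other_scores = [0.1, 0.85]
    print(empirical_auc(entity_scores, other_scores))
    print(pairwise_auc_surrogate(entity_scores, other_scores))
```

Because the loss is defined over positive/negative pairs rather than individual tokens, it does not depend on the class ratio in the way a per-token cross-entropy average does, which is the usual motivation for AUC-style objectives on imbalanced data.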


OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models

Jinze Bai , Rui Men , Hao Yang , Xuancheng Ren , Kai Dang , Yichang Zhang , Xiaohuan Zhou , Peng Wang , Sinan Tan , An Yang

Category: Computer Vision | Artificial Intelligence | Natural Language Processing | Machine Learning

2022-12-08

Generalist models, which are capable of performing diverse multi-modal tasks in a task-agnostic way within a single model, have been explored recently. Although they offer a possible route toward general-purpose AI, existing generalist models are still at an early stage, with limited modality and task coverage. To empower multi-modal task-scaling and speed up this line of research, we release a generalist model learning system, OFASys, built on top of a declarative task interface named multi-modal instruction. At the core of OFASys is the idea of decoupling multi-modal task representations from the underlying model implementations. In OFASys, a task involving multiple modalities can be defined declaratively, even with just a single line of code. The system automatically generates task plans from such instructions for training and inference. It also facilitates multi-task training for diverse multi-modal workloads. As a starting point, we provide presets of 7 different modalities and 23 highly diverse example tasks in OFASys, with which we also develop a first-of-its-kind single model, OFA+, that can handle text, image, speech, video, and motion data. The single OFA+ model achieves 95% of the performance on average with only 16% of the parameters of 15 task-finetuned models, showcasing the performance reliability of multi-modal task-scaling provided by OFASys. Available at https://github.com/OFA-Sys/OFASys


MIMO-DBnet: Multi-channel Input and Multiple Outputs DOA-aware Beamforming Network for Speech Separation

Yanjie Fu , Haoran Yin , Meng Ge , Longbiao Wang , Gaoyan Zhang , Jianwu Dang , Chengyun Deng , Fei Wang

Category: Machine Learning

2022-12-07

Recently, many deep-learning-based beamformers have been proposed for multi-channel speech separation. Nevertheless, most of them rely on extra cues known in advance, such as speaker features, face images, or directional information. In this paper, we propose MIMO-DBnet, an end-to-end beamforming network for direction-guided speech separation given only the mixture signal. Specifically, we design a multi-channel input and multiple outputs architecture to predict the direction-of-arrival-based embeddings and beamforming weights for each source. The precisely estimated directional embedding provides effective spatial discrimination guidance for the neural beamformer to offset the effect of phase wrapping, thus allowing more accurate reconstruction of the two sources' speech signals. Experiments show that the proposed MIMO-DBnet not only achieves a solid, comprehensive improvement over baseline systems, but also maintains its performance on high-frequency bands when phase wrapping occurs.
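
The abstract does not give the beamforming equation. In the standard filter-and-sum formulation, the separated signal in each time-frequency bin is the inner product of per-channel complex weights with the multi-channel mixture, y = w^H x. The sketch below applies that textbook rule to a single bin, assuming the weights have already been predicted by some network; `apply_beamformer` is an illustrative name, not part of MIMO-DBnet.

```python
def apply_beamformer(weights, mixture_bin):
    """Filter-and-sum for one time-frequency bin: y = sum_c conj(w_c) * x_c.

    weights      -- one complex weight per microphone channel
    mixture_bin  -- the complex STFT value of the mixture at each channel
    """
    if len(weights) != len(mixture_bin):
        raise ValueError("need one weight per channel")
    return sum(w.conjugate() * x for w, x in zip(weights, mixture_bin))


if __name__ == "__main__":
    # Two microphones: the target arrives in phase on both channels, while an
    # interferer arrives with opposite sign on the second channel.
    target = 1 + 0j
    interferer = 0 + 1j
    mixture = [target + interferer, target - interferer]
    # Averaging the channels keeps the in-phase target and cancels the interferer.
    print(apply_beamformer([0.5 + 0j, 0.5 + 0j], mixture))
```

For a multiple-output system, the same operation would be repeated with a separate weight vector per source.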


SODA: A Natural Language Processing Package to Extract Social Determinants of Health for Cancer Studies

Zehao Yu , Xi Yang , Chong Dang , Prakash Adekkanattu , Braja Gopal Patra , Yifan Peng , Jyotishman Pathak , Debbie L. Wilson , Ching-Yuan Chang , Wei-Hsuan Lo-Ciganic

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-06

Objective: We aim to develop an open-source natural language processing (NLP) package, SODA (i.e., SOcial DeterminAnts), with pre-trained transformer models to extract social determinants of health (SDoH) for cancer patients, examine the generalizability of SODA to a new disease domain (i.e., opioid use), and evaluate the extraction rate of SDoH using cancer populations. Methods: We identified SDoH categories and attributes and developed an SDoH corpus using clinical notes from a general cancer cohort. We compared four transformer-based NLP models for extracting SDoH, examined the generalizability of the NLP models to a cohort of patients prescribed opioids, and explored customization strategies to improve performance. We applied the best NLP model to extract 19 categories of SDoH from the breast (n=7,971), lung (n=11,804), and colorectal cancer (n=6,240) cohorts. Results and Conclusion: We developed a corpus of notes from 629 cancer patients with annotations of 13,193 SDoH concepts/attributes from 19 categories of SDoH. The Bidirectional Encoder Representations from Transformers (BERT) model achieved the best strict/lenient F1 scores of 0.9216 and 0.9441 for SDoH concept extraction, and 0.9617 and 0.9626 for linking attributes to SDoH concepts. Fine-tuning the NLP models using new annotations from opioid use patients improved the strict/lenient F1 scores from 0.8172/0.8502 to 0.8312/0.8679. The extraction rates among the 19 categories of SDoH varied greatly: 10 SDoH could be extracted from >70% of cancer patients, but 9 SDoH had a low extraction rate (<70% of cancer patients). The SODA package with pre-trained transformer models is publicly available at https://github.com/uf-hobiinformatics-lab/SDoH_SODA.
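
The abstract reports strict and lenient F1 scores without defining them. In clinical concept extraction these commonly mean exact-boundary matching versus any-overlap matching of spans with the same label; the sketch below assumes that convention, which may differ in detail from the evaluation script the authors used. Spans are hypothetical `(start, end, label)` tuples with an exclusive end offset.

```python
def exact_match(gold_span, pred_span):
    """Strict criterion: same boundaries and same label."""
    return gold_span == pred_span


def overlap_match(gold_span, pred_span):
    """Lenient criterion: same label and at least one shared character."""
    same_label = gold_span[2] == pred_span[2]
    return same_label and gold_span[0] < pred_span[1] and pred_span[0] < gold_span[1]


def f1_score(gold, predicted, match):
    """F1 where precision counts matched predictions and recall matched gold spans."""
    if not gold or not predicted:
        return 0.0
    matched_pred = sum(any(match(g, p) for g in gold) for p in predicted)
    matched_gold = sum(any(match(g, p) for p in predicted) for g in gold)
    precision = matched_pred / len(predicted)
    recall = matched_gold / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = [(0, 5, "Smoking"), (10, 15, "Employment")]
    predicted = [(0, 5, "Smoking"), (11, 15, "Employment"), (20, 25, "Smoking")]
    print("strict F1: ", f1_score(gold, predicted, exact_match))
    print("lenient F1:", f1_score(gold, predicted, overlap_match))
```

Under this convention the lenient score can never be lower than the strict one, which is consistent with the pairs of numbers reported above.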


AI-driven Mobile Apps: an Explorative Study

Yinghua Li , Xueqi Dang , Haoye Tian , Tiezhu Sun , Zhijie Wang , Lei Ma , Jacques Klein , Tegawende F. Bissyande

Category: Artificial Intelligence

2022-12-03

Recent years have witnessed an astonishing explosion in mobile applications powered by AI technologies. The rapid growth of AI frameworks enables the transition of AI technologies to mobile devices, significantly promoting the adoption of AI apps (i.e., apps that integrate AI into their functions) on smartphones. In this paper, we conduct the most extensive empirical study on 56,682 published AI apps from three perspectives: dataset characteristics, development issues, and user feedback and privacy. To this end, we build an automated AI app identification tool, AI Discriminator, that detects eligible AI apps from 7,259,232 mobile apps. First, we carry out a dataset analysis, in which we explore the large AndroZoo repository to identify AI apps and their core characteristics. Subsequently, we pinpoint key issues in AI app development (e.g., model protection). Finally, we focus on user reviews and user privacy protection. Our paper provides several notable findings. Among the most important, we reveal insufficient model protection, evidenced by the lack of model encryption, and demonstrate the risk of users' private data being leaked. We have published our large-scale AI app datasets to inspire further research.


Disentangled Generation with Information Bottleneck for Few-Shot Learning

Zhuohang Dang , Jihong Wang , Minnan Luo , Chengyou Jia , Caixia Yan , Qinghua Zheng

Category: Computer Vision

2022-11-29

Few-shot learning (FSL), which aims to classify unseen classes with few samples, is challenging due to data scarcity. Although various generative methods have been explored for FSL, the entangled generation process of these methods exacerbates the distribution shift in FSL, greatly limiting the quality of the generated samples. To address these challenges, we propose a novel Information Bottleneck (IB) based Disentangled Generation Framework for FSL, termed DisGenIB, that can simultaneously guarantee the discrimination and diversity of the generated samples. Specifically, we formulate a novel framework with an information bottleneck that applies to both disentangled representation learning and sample generation. Unlike existing IB-based methods, which can hardly exploit priors, DisGenIB can effectively utilize priors to further facilitate disentanglement. We further prove in theory that some previous generative and disentanglement methods are special cases of DisGenIB, which demonstrates its generality. Extensive experiments on challenging FSL benchmarks confirm the effectiveness and superiority of DisGenIB, together with the validity of our theoretical analyses. Our code will be open-sourced upon acceptance.
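
For orientation, the classical information bottleneck objective that IB-based methods build on can be written as follows. This is the generic formulation, given here as background under the usual notation (input X, learned representation Z, target Y, trade-off weight beta > 0); the abstract does not state the specific objective used by DisGenIB.

```latex
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)
```

The first term compresses the representation by limiting what Z retains about X, while the second rewards Z for staying informative about Y.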
