如今,提出了几种深度学习方法来应对癫痫发作预测的挑战。但是,由于其大型硬件和相应的高功率消耗,这些方法仍然无法作为可植入或有效的可穿戴设备的一部分实现。他们通常需要复杂的功能提取过程,用于存储高精度参数的大存储器和复杂的算术计算,从而大大增加了所需的硬件资源。此外,可用的预测性能差,因为它们直接从图像识别应用程序中采用网络体系结构无法准确考虑EEG信号的特征。我们在本文中提出了一个适合二进制卷积神经网络(BSDCNN)的硬件友好网络,用于癫痫发作预测。 BSDCNN利用1D卷积内核来提高预测性能。除了第一层外,所有参数均已二进制以减少所需的计算和存储。在美国癫痫社会癫痫发作预测挑战(AES)数据集和CHB-MIT方面,曲线,灵敏度和虚假预测率的总面积达到0.915、89.26%,0.117/h和0.970,94.69%,0.095/h。所提出的体系结构的表现优于最新作品,同时提供了7.2和25.5倍的参数和计算大小。
translated by 谷歌翻译
代码生成旨在从自然语言描述中自动生成代码段。通常,主流代码生成方法依赖大量的配对培训数据,包括自然语言描述和代码。但是,在某些特定领域的情况下,很难为代码生成建立如此大的配对语料库,因为没有直接可用的配对数据,并且需要大量精力来手动编写代码说明来构建高质量的培训数据集。由于培训数据有限,生成模型不能经过良好的训练,并且可能过于拟合,从而使该模型对现实世界的使用不满意。为此,在本文中,我们提出了一种任务增强方法,该方法通过扩展原始的Tranx模型来支持suptoken级代码生成,将域知识通过辅助任务和亚键入tranx模型纳入代码生成模型。为了验证我们提出的方法,我们收集了一个真实的代码生成数据集并在其上进行实验。我们的实验结果表明,亚句级Tranx模型在我们的数据集中优于原始Tranx模型和变压器模型,并且在我们的任务增强方法的帮助下,Subtoken-Tranx的确切匹配精度可显着提高12.75 \%。多个代码类别的模型性能满足了工业系统应用程序的要求。我们提出的方法已由阿里巴巴的\ emph {bizcook}平台采用。据我们所知,这是在工业开发环境中采用的第一个领域代码生成系统。
translated by 谷歌翻译
虽然某些工作尝试从UI屏幕截图中智能生成前端代码,但在Sketch中使用UI设计草稿可能更方便,这是一种流行的UI设计软件,因为我们可以直接访问多模式UI信息,例如层,位置,位置,位置,位置,位置,,,,位置,位置,位置,,位置,位置,位置,位置,,位置,位置,位置,位置,位置,,位置,位置,位置,位置,位置,位置,位置,位置,位置,位置,位置,位置,位置,位置,位置,位置,位置类型大小和视觉图像。但是,如果所有这些层都参与了代码生成,则分散的层可能会降低代码质量,而不会合并为整个部分。在本文中,我们提出了一条管道,以自动合并碎片层。我们首先为UI草稿的图层树构造图表,并根据视觉特征和图形神经网络检测所有碎片层。然后,基于规则的算法旨在合并零碎的层。通过在新构建的数据集上的实验,我们的方法可以在UI设计草案中检索最碎片的层,并在检测任务中实现87%的准确性,并在简单且一般的情况下开发了后处理算法以聚集关联层。
translated by 谷歌翻译
我们提出了一种使用加权节点的联合学习方法,可以在其中修改权重以在单独的验证集上优化模型的性能。该问题被称为双重优化,其中内部问题是加权节点的联合学习问题,外部问题着重于基于从内部问题返回的模型的验证性能优化权重。沟通效率的联合优化算法旨在解决此双重优化问题。在遇到错误的假设下,我们分析了输出模型的概括性能,并识别我们的方法在理论上优于训练模型,而仅在本地训练和使用静态且均匀分布的权重进行联合学习。
translated by 谷歌翻译
Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes with limited several support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support/query features based on a Transformer-like framework. Our key insights are two folds: Firstly, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Secondly, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice from two aspects, i.e., feature-level and instance-level. In particular, we firstly design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After the above steps, the novel classes can be improved significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarking results on the COCO dataset for FSIS, gFSIS, and iFSIS settings, our method achieves a competitive performance compared to existing approaches across different shots, e.g., we boost nAP by noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.
translated by 谷歌翻译
In this chapter, we review and discuss the transformation of AI technology in HCI/UX work and assess how AI technology will change how we do the work. We first discuss how AI can be used to enhance the result of user research and design evaluation. We then discuss how AI technology can be used to enhance HCI/UX design. Finally, we discuss how AI-enabled capabilities can improve UX when users interact with computing systems, applications, and services.
translated by 谷歌翻译
As one of the most important psychic stress reactions, micro-expressions (MEs), are spontaneous and transient facial expressions that can reveal the genuine emotions of human beings. Thus, recognizing MEs (MER) automatically is becoming increasingly crucial in the field of affective computing, and provides essential technical support in lie detection, psychological analysis and other areas. However, the lack of abundant ME data seriously restricts the development of cutting-edge data-driven MER models. Despite the recent efforts of several spontaneous ME datasets to alleviate this problem, it is still a tiny amount of work. To solve the problem of ME data hunger, we construct a dynamic spontaneous ME dataset with the largest current ME data scale, called DFME (Dynamic Facial Micro-expressions), which includes 7,526 well-labeled ME videos induced by 671 participants and annotated by more than 20 annotators throughout three years. Afterwards, we adopt four classical spatiotemporal feature learning models on DFME to perform MER experiments to objectively verify the validity of DFME dataset. In addition, we explore different solutions to the class imbalance and key-frame sequence sampling problems in dynamic MER respectively on DFME, so as to provide a valuable reference for future research. The comprehensive experimental results show that our DFME dataset can facilitate the research of automatic MER, and provide a new benchmark for MER. DFME will be published via https://mea-lab-421.github.io.
translated by 谷歌翻译
Face Anti-spoofing (FAS) is essential to secure face recognition systems from various physical attacks. However, recent research generally focuses on short-distance applications (i.e., phone unlocking) while lacking consideration of long-distance scenes (i.e., surveillance security checks). In order to promote relevant research and fill this gap in the community, we collect a large-scale Surveillance High-Fidelity Mask (SuHiFiMask) dataset captured under 40 surveillance scenes, which has 101 subjects from different age groups with 232 3D attacks (high-fidelity masks), 200 2D attacks (posters, portraits, and screens), and 2 adversarial attacks. In this scene, low image resolution and noise interference are new challenges faced in surveillance FAS. Together with the SuHiFiMask dataset, we propose a Contrastive Quality-Invariance Learning (CQIL) network to alleviate the performance degradation caused by image quality from three aspects: (1) An Image Quality Variable module (IQV) is introduced to recover image information associated with discrimination by combining the super-resolution network. (2) Using generated sample pairs to simulate quality variance distributions to help contrastive learning strategies obtain robust feature representation under quality variation. (3) A Separate Quality Network (SQN) is designed to learn discriminative features independent of image quality. Finally, a large number of experiments verify the quality of the SuHiFiMask dataset and the superiority of the proposed CQIL.
translated by 谷歌翻译
When using LiDAR semantic segmentation models for safety-critical applications such as autonomous driving, it is essential to understand and improve their robustness with respect to a large range of LiDAR corruptions. In this paper, we aim to comprehensively analyze the robustness of LiDAR semantic segmentation models under various corruptions. To rigorously evaluate the robustness and generalizability of current approaches, we propose a new benchmark called SemanticKITTI-C, which features 16 out-of-domain LiDAR corruptions in three groups, namely adverse weather, measurement noise and cross-device discrepancy. Then, we systematically investigate 11 LiDAR semantic segmentation models, especially spanning different input representations (e.g., point clouds, voxels, projected images, and etc.), network architectures and training schemes. Through this study, we obtain two insights: 1) We find out that the input representation plays a crucial role in robustness. Specifically, under specific corruptions, different representations perform variously. 2) Although state-of-the-art methods on LiDAR semantic segmentation achieve promising results on clean data, they are less robust when dealing with noisy data. Finally, based on the above observations, we design a robust LiDAR segmentation model (RLSeg) which greatly boosts the robustness with simple but effective modifications. It is promising that our benchmark, comprehensive analysis, and observations can boost future research in robust LiDAR semantic segmentation for safety-critical applications.
translated by 谷歌翻译
Panoptic Part Segmentation (PPS) unifies panoptic segmentation and part segmentation into one task. Previous works utilize separated approaches to handle thing, stuff, and part predictions without shared computation and task association. We aim to unify these tasks at the architectural level, designing the first end-to-end unified framework named Panoptic-PartFormer. Moreover, we find the previous metric PartPQ biases to PQ. To handle both issues, we make the following contributions: Firstly, we design a meta-architecture that decouples part feature and things/stuff feature, respectively. We model things, stuff, and parts as object queries and directly learn to optimize all three forms of prediction as a unified mask prediction and classification problem. We term our model as Panoptic-PartFormer. Secondly, we propose a new metric Part-Whole Quality (PWQ) to better measure such task from both pixel-region and part-whole perspectives. It can also decouple the error for part segmentation and panoptic segmentation. Thirdly, inspired by Mask2Former, based on our meta-architecture, we propose Panoptic-PartFormer++ and design a new part-whole cross attention scheme to further boost part segmentation qualities. We design a new part-whole interaction method using masked cross attention. Finally, the extensive ablation studies and analysis demonstrate the effectiveness of both Panoptic-PartFormer and Panoptic-PartFormer++. Compared with previous Panoptic-PartFormer, our Panoptic-PartFormer++ achieves 2% PartPQ and 3% PWQ improvements on the Cityscapes PPS dataset and 5% PartPQ on the Pascal Context PPS dataset. On both datasets, Panoptic-PartFormer++ achieves new state-of-the-art results with a significant cost drop of 70% on GFlops and 50% on parameters. Our models can serve as a strong baseline and aid future research in PPS. Code will be available.
translated by 谷歌翻译