COVID-19大流行已经暴露了全球医疗服务的脆弱性,增加了开发新颖的工具来提供快速且具有成本效益的筛查和诊断的需求。临床报告表明,Covid-19感染可能导致心脏损伤,心电图(ECG)可以作为Covid-19的诊断生物标志物。这项研究旨在利用ECG信号自动检测COVID-19。我们提出了一种从ECG纸记录中提取ECG信号的新方法,然后将其送入一维卷积神经网络(1D-CNN)中,以学习和诊断疾病。为了评估数字信号的质量,标记了基于纸张的ECG图像中的R峰。之后,将从每个图像计算的RR间隔与相应数字化信号的RR间隔进行比较。 COVID-19 ECG图像数据集上的实验表明,提出的数字化方法能够正确捕获原始信号,平均绝对误差为28.11 ms。我们提出的1D-CNN模型在数字化的心电图信号上进行了训练,允许准确识别患有COVID-19和其他受试者的个体,分类精度为98.42%,95.63%和98.50%,用于分类COVID-19 vs.正常,与正常人分类, COVID-19与异常心跳和Covid-19和其他类别分别与其他阶级。此外,提出的方法还为多分类任务实现了高级的性能。我们的发现表明,经过数字化的心电图信号训练的深度学习系统可以作为诊断Covid-19的潜在工具。
translated by 谷歌翻译
尽管在自动语音识别(ASR)中最近的表现方法增加了,但这种方法并不能确保其输出的适当套管和标点符号。这个问题对自然语言处理(NLP)算法和人类的理解都有重大影响。对于原始文本输入的预处理管道,必须进行资本化和标点符号恢复。对于越南人等低资源语言,此任务的公共数据集很少。在本文中,我们为越南人的资本化和标点符号恢复贡献了一个公共数据集;并提出了两个名为intercappunc的任务的联合模型。越南数据集的实验结果显示了我们联合模型的有效性与单个模型和先前的联合学习模型相比。我们在https://github.com/anhtunguyen98/jointcappund上公开发布数据集和模型的实现
translated by 谷歌翻译
Automatically generated static code warnings suffer from a large number of false alarms. Hence, developers only take action on a small percent of those warnings. To better predict which static code warnings should not be ignored, we suggest that analysts need to look deeper into their algorithms to find choices that better improve the particulars of their specific problem. Specifically, we show here that effective predictors of such warnings can be created by methods that locally adjust the decision boundary (between actionable warnings and others). These methods yield a new high water-mark for recognizing actionable static code warnings. For eight open-source Java projects (cassandra, jmeter, commons, lucene-solr, maven, ant, tomcat, derby) we achieve perfect test results on 4/8 datasets and, overall, a median AUC (area under the true negatives, true positives curve) of 92%.
translated by 谷歌翻译
在机器学习模型道德偏见已经成为软件工程界关注的一个问题。大多数现有软件工程的作品集中在模型寻找道德偏见,而不是修复它。发现偏差后,下一步就是缓解。在此之前研究人员主要是试图利用监督的方法来实现公平。与值得信赖的地面实况然而,在现实世界中,获得的数据是具有挑战性的,也基本事实可以包含人为偏差。半监督学习是一种机器学习技术,其中,递增地,标记的数据被用于生成伪标签中的数据的剩余部分(然后全部数据被用于模型训练)。在这项工作中,我们采用四种常用的半监督技术作为伪贴标创造公平分类模型。我们的框架,公平SSL,需要标记的数据的一个非常小的量(10%)作为输入,并为未标记的数据生成伪标签。然后,我们综合生成新的数据点,以平衡基础类,并提议Chakraborty等人的保护属性的训练数据。在2021年FSE最后,分类模型被训练在平衡伪标记的数据和测试数据进行了验证。实验十项数据集和三个学生后,我们发现,公平SSL实现了性能先进设备,最先进的三个偏置抑制算法类似。这就是说,公平SSL的明显优势在于,它仅需要10%的标记的训练数据。据我们所知,这是在半监督技术被用来针对SE型号ML道德偏见争第一SE工作。
translated by 谷歌翻译
Benefiting from the intrinsic supervision information exploitation capability, contrastive learning has achieved promising performance in the field of deep graph clustering recently. However, we observe that two drawbacks of the positive and negative sample construction mechanisms limit the performance of existing algorithms from further improvement. 1) The quality of positive samples heavily depends on the carefully designed data augmentations, while inappropriate data augmentations would easily lead to the semantic drift and indiscriminative positive samples. 2) The constructed negative samples are not reliable for ignoring important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) by mining the intrinsic supervision information in the high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in two views. Moreover, to construct semantic meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function to pull close the samples from the same cluster while pushing away those from other clusters by maximizing and minimizing the cross-view cosine similarity between positive and negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with the existing state-of-the-art algorithms.
translated by 谷歌翻译
Recent studies have shown that using an external Language Model (LM) benefits the end-to-end Automatic Speech Recognition (ASR). However, predicting tokens that appear less frequently in the training set is still quite challenging. The long-tail prediction problems have been widely studied in many applications, but only been addressed by a few studies for ASR and LMs. In this paper, we propose a new memory augmented lookup dictionary based Transformer architecture for LM. The newly introduced lookup dictionary incorporates rich contextual information in training set, which is vital to correctly predict long-tail tokens. With intensive experiments on Chinese and English data sets, our proposed method is proved to outperform the baseline Transformer LM by a great margin on both word/character error rate and tail tokens error rate. This is achieved without impact on the decoding efficiency. Overall, we demonstrate the effectiveness of our proposed method in boosting the ASR decoding performance, especially for long-tail tokens.
translated by 谷歌翻译
We present a Machine Learning (ML) study case to illustrate the challenges of clinical translation for a real-time AI-empowered echocardiography system with data of ICU patients in LMICs. Such ML case study includes data preparation, curation and labelling from 2D Ultrasound videos of 31 ICU patients in LMICs and model selection, validation and deployment of three thinner neural networks to classify apical four-chamber view. Results of the ML heuristics showed the promising implementation, validation and application of thinner networks to classify 4CV with limited datasets. We conclude this work mentioning the need for (a) datasets to improve diversity of demographics, diseases, and (b) the need of further investigations of thinner models to be run and implemented in low-cost hardware to be clinically translated in the ICU in LMICs. The code and other resources to reproduce this work are available at https://github.com/vital-ultrasound/ai-assisted-echocardiography-for-low-resource-countries.
translated by 谷歌翻译
Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, but the quality bar for medical and clinical applications is high. Today, attempts to assess models' clinical knowledge typically rely on automated evaluations on limited benchmarks. There is no standard to evaluate model predictions and reasoning across a breadth of tasks. To address this, we present MultiMedQA, a benchmark combining six existing open question answering datasets spanning professional medical exams, research, and consumer queries; and HealthSearchQA, a new free-response dataset of medical questions searched online. We propose a framework for human evaluation of model answers along multiple axes including factuality, precision, possible harm, and bias. In addition, we evaluate PaLM (a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM, on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA, MedMCQA, PubMedQA, MMLU clinical topics), including 67.6% accuracy on MedQA (US Medical License Exam questions), surpassing prior state-of-the-art by over 17%. However, human evaluation reveals key gaps in Flan-PaLM responses. To resolve this we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, recall of knowledge, and medical reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal important limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLM models for clinical applications.
translated by 谷歌翻译
Summary quality assessment metrics have two categories: reference-based and reference-free. Reference-based metrics are theoretically more accurate but are limited by the availability and quality of the human-written references, which are both difficulty to ensure. This inspires the development of reference-free metrics, which are independent from human-written references, in the past few years. However, existing reference-free metrics cannot be both zero-shot and accurate. In this paper, we propose a zero-shot but accurate reference-free approach in a sneaky way: feeding documents, based upon which summaries generated, as references into reference-based metrics. Experimental results show that this zero-shot approach can give us the best-performing reference-free metrics on nearly all aspects on several recently-released datasets, even beating reference-free metrics specifically trained for this task sometimes. We further investigate what reference-based metrics can benefit from such repurposing and whether our additional tweaks help.
translated by 谷歌翻译
Ultra-fine entity typing (UFET) predicts extremely free-formed types (e.g., president, politician) of a given entity mention (e.g., Joe Biden) in context. State-of-the-art (SOTA) methods use the cross-encoder (CE) based architecture. CE concatenates the mention (and its context) with each type and feeds the pairs into a pretrained language model (PLM) to score their relevance. It brings deeper interaction between mention and types to reach better performance but has to perform N (type set size) forward passes to infer types of a single mention. CE is therefore very slow in inference when the type set is large (e.g., N = 10k for UFET). To this end, we propose to perform entity typing in a recall-expand-filter manner. The recall and expand stages prune the large type set and generate K (K is typically less than 256) most relevant type candidates for each mention. At the filter stage, we use a novel model called MCCE to concurrently encode and score these K candidates in only one forward pass to obtain the final type prediction. We investigate different variants of MCCE and extensive experiments show that MCCE under our paradigm reaches SOTA performance on ultra-fine entity typing and is thousands of times faster than the cross-encoder. We also found MCCE is very effective in fine-grained (130 types) and coarse-grained (9 types) entity typing. Our code is available at \url{https://github.com/modelscope/AdaSeq/tree/master/examples/MCCE}.
translated by 谷歌翻译