现代软件系统和产品越来越依赖机器学习模型,以基于与用户和系统的交互进行数据驱动的决策,例如计算基础架构。对于更广泛的采用,这种做法必须(i)容纳没有ML背景的软件工程师,并提供(ii)提供优化产品目标的机制。在这项工作中,我们描述了一般原则和特定的端到端毫升平台,为决策和反馈集合提供易于使用的API。循环仪支持从在线数据收集到模拟培训,部署,推理的完整端到端ML生命周期,并扩展支持和调整产品目标的评估和调整。我们概述了平台架构和生产部署的整体影响 - 循环仪当前托管700毫升型号,每秒达到600万决定。我们还描述了学习曲线并总结了平台采用者的经验。
translated by 谷歌翻译
Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve the generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. We then propose a new method to improve Mixup based on the novel insight. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves Mixup across various datasets using a variety of architectures, for instance, exhibiting an improvement over Mixup by 0.8% in ImageNet top-1 accuracy.
translated by 谷歌翻译
Identifying named entities such as a person, location or organization, in documents can highlight key information to readers. Training Named Entity Recognition (NER) models requires an annotated data set, which can be a time-consuming labour-intensive task. Nevertheless, there are publicly available NER data sets for general English. Recently there has been interest in developing NER for legal text. However, prior work and experimental results reported here indicate that there is a significant degradation in performance when NER methods trained on a general English data set are applied to legal text. We describe a publicly available legal NER data set, called E-NER, based on legal company filings available from the US Securities and Exchange Commission's EDGAR data set. Training a number of different NER algorithms on the general English CoNLL-2003 corpus but testing on our test collection confirmed significant degradations in accuracy, as measured by the F1-score, of between 29.4\% and 60.4\%, compared to training and testing on the E-NER collection.
translated by 谷歌翻译
Despite the current success of multilingual pre-training, most prior works focus on leveraging monolingual data or bilingual parallel data and overlooked the value of trilingual parallel data. This paper presents \textbf{Tri}angular Document-level \textbf{P}re-training (\textbf{TRIP}), which is the first in the field to extend the conventional monolingual and bilingual pre-training to a trilingual setting by (i) \textbf{Grafting} the same documents in two languages into one mixed document, and (ii) predicting the remaining one language as the reference translation. Our experiments on document-level MT and cross-lingual abstractive summarization show that TRIP brings by up to 3.65 d-BLEU points and 6.2 ROUGE-L points on three multilingual document-level machine translation benchmarks and one cross-lingual abstractive summarization benchmark, including multiple strong state-of-the-art (SOTA) scores. In-depth analysis indicates that TRIP improves document-level machine translation and captures better document contexts in at least three characteristics: (i) tense consistency, (ii) noun consistency and (iii) conjunction presence.
translated by 谷歌翻译
Machine Learning (ML) approaches have been used to enhance the detection capabilities of Network Intrusion Detection Systems (NIDSs). Recent work has achieved near-perfect performance by following binary- and multi-class network anomaly detection tasks. Such systems depend on the availability of both (benign and malicious) network data classes during the training phase. However, attack data samples are often challenging to collect in most organisations due to security controls preventing the penetration of known malicious traffic to their networks. Therefore, this paper proposes a Deep One-Class (DOC) classifier for network intrusion detection by only training on benign network data samples. The novel one-class classification architecture consists of a histogram-based deep feed-forward classifier to extract useful network data features and use efficient outlier detection. The DOC classifier has been extensively evaluated using two benchmark NIDS datasets. The results demonstrate its superiority over current state-of-the-art one-class classifiers in terms of detection and false positive rates.
translated by 谷歌翻译
There exists unexplained diverse variation within the predefined colon cancer stages using only features either from genomics or histopathological whole slide images as prognostic factors. Unraveling this variation will bring about improved in staging and treatment outcome, hence motivated by the advancement of Deep Neural Network libraries and different structures and factors within some genomic dataset, we aggregate atypical patterns in histopathological images with diverse carcinogenic expression from mRNA, miRNA and DNA Methylation as an integrative input source into an ensemble deep neural network for colon cancer stages classification and samples stratification into low or high risk survival groups. The results of our Ensemble Deep Convolutional Neural Network model show an improved performance in stages classification on the integrated dataset. The fused input features return Area under curve Receiver Operating Characteristic curve (AUC ROC) of 0.95 compared with AUC ROC of 0.71 and 0.68 obtained when only genomics and images features are used for the stage's classification, respectively. Also, the extracted features were used to split the patients into low or high risk survival groups. Among the 2548 fused features, 1695 features showed a statistically significant survival probability differences between the two risk groups defined by the extracted features.
translated by 谷歌翻译
We present Pre-trained Machine Reader (PMR), a novel method to retrofit Pre-trained Language Models (PLMs) into Machine Reading Comprehension (MRC) models without acquiring labeled data. PMR is capable of resolving the discrepancy between model pre-training and downstream fine-tuning of existing PLMs, and provides a unified solver for tackling various extraction tasks. To achieve this, we construct a large volume of general-purpose and high-quality MRC-style training data with the help of Wikipedia hyperlinks and design a Wiki Anchor Extraction task to guide the MRC-style pre-training process. Although conceptually simple, PMR is particularly effective in solving extraction tasks including Extractive Question Answering and Named Entity Recognition, where it shows tremendous improvements over previous approaches especially under low-resource settings. Moreover, viewing sequence classification task as a special case of extraction task in our MRC formulation, PMR is even capable to extract high-quality rationales to explain the classification process, providing more explainability of the predictions.
translated by 谷歌翻译
To meet the fairly high safety and reliability requirements in practice, the state of health (SOH) estimation of Lithium-ion batteries (LIBs), which has a close relationship with the degradation performance, has been extensively studied with the widespread applications of various electronics. The conventional SOH estimation approaches with digital twin are end-of-cycle estimation that require the completion of a full charge/discharge cycle to observe the maximum available capacity. However, under dynamic operating conditions with partially discharged data, it is impossible to sense accurate real-time SOH estimation for LIBs. To bridge this research gap, we put forward a digital twin framework to gain the capability of sensing the battery's SOH on the fly, updating the physical battery model. The proposed digital twin solution consists of three core components to enable real-time SOH estimation without requiring a complete discharge. First, to handle the variable training cycling data, the energy discrepancy-aware cycling synchronization is proposed to align cycling data with guaranteeing the same data structure. Second, to explore the temporal importance of different training sampling times, a time-attention SOH estimation model is developed with data encoding to capture the degradation behavior over cycles, excluding adverse influences of unimportant samples. Finally, for online implementation, a similarity analysis-based data reconstruction has been put forward to provide real-time SOH estimation without requiring a full discharge cycle. Through a series of results conducted on a widely used benchmark, the proposed method yields the real-time SOH estimation with errors less than 1% for most sampling times in ongoing cycles.
translated by 谷歌翻译
Neglected tropical diseases (NTDs) continue to affect the livelihood of individuals in countries in the Southeast Asia and Western Pacific region. These diseases have been long existing and have caused devastating health problems and economic decline to people in low- and middle-income (developing) countries. An estimated 1.7 billion of the world's population suffer one or more NTDs annually, this puts approximately one in five individuals at risk for NTDs. In addition to health and social impact, NTDs inflict significant financial burden to patients, close relatives, and are responsible for billions of dollars lost in revenue from reduced labor productivity in developing countries alone. There is an urgent need to better improve the control and eradication or elimination efforts towards NTDs. This can be achieved by utilizing machine learning tools to better the surveillance, prediction and detection program, and combat NTDs through the discovery of new therapeutics against these pathogens. This review surveys the current application of machine learning tools for NTDs and the challenges to elevate the state-of-the-art of NTDs surveillance, management, and treatment.
translated by 谷歌翻译
Cyber intrusion attacks that compromise the users' critical and sensitive data are escalating in volume and intensity, especially with the growing connections between our daily life and the Internet. The large volume and high complexity of such intrusion attacks have impeded the effectiveness of most traditional defence techniques. While at the same time, the remarkable performance of the machine learning methods, especially deep learning, in computer vision, had garnered research interests from the cyber security community to further enhance and automate intrusion detections. However, the expensive data labeling and limitation of anomalous data make it challenging to train an intrusion detector in a fully supervised manner. Therefore, intrusion detection based on unsupervised anomaly detection is an important feature too. In this paper, we propose a three-stage deep learning anomaly detection based network intrusion attack detection framework. The framework comprises an integration of unsupervised (K-means clustering), semi-supervised (GANomaly) and supervised learning (CNN) algorithms. We then evaluated and showed the performance of our implemented framework on three benchmark datasets: NSL-KDD, CIC-IDS2018, and TON_IoT.
translated by 谷歌翻译