本文介绍了一种增强学习方法,以更好地概括有关工作店调度问题(JSP)的启发式调度规则。 JSP上的当前模型并不关注概括,尽管正如我们在这项工作中所显示的那样,这是对问题进行更好的启发式方法的关键。改善概括的一种众所周知的技术是使用课程学习(CL)学习日益复杂的实例。但是,正如文献中许多作品所表明的那样,在不同问题大小之间传递学习技能时,这种技术可能会遭受灾难性的遗忘。为了解决这个问题,我们引入了一种新颖的对抗性课程学习(ACL)策略,该策略在学习过程中动态调整了难度级别以重新审视最坏情况的实例。这项工作还提出了一个深度学习模型来解决JSP,这是e var的W.R.T.作业定义和尺寸不可能。对Taillard和Demirkol的实例进行了实验,表明所提出的方法显着改善了JSP上的最新模型。它的平均最佳差距从Taillard的实例中的平均最佳差距从19.35 \%降低到10.46 \%,而Demirkol的实例中的平均最佳差距从38.43 \%降低到18.85%。我们的实施可在线提供。
translated by 谷歌翻译
文本到图像综合的目标是生成与给定文本描述匹配的视觉现实图像。在实践中,人类注释的标题在同一图像中具有很大的内容方差和单词的选择。相同图像的标题之间的语言差异导致偏离地面真理的合成图像。为了解决这个问题,我们提出了一种对比的学习方法来提高质量,增强合成图像的语义一致性。在预先预测阶段,我们利用对比的学习方法来学习对应于相同图像的标题的一致文本表示。此外,在GaN训练的以下阶段,我们采用对比学习方法来增强来自与相同图像相关的标题的所生成的图像之间的一致性。我们分别评估了我们在数据集幼崽和Coco上的两个流行文本到图像综合模型,ATTNGAN和DM-GAN的方法。实验结果表明,我们的方法可以有效地提高三个度量的合成图像的质量:是,FID和R精度。特别是,在挑战的Coco DataSet上,我们的方法将FID显着地通过29.60%的Attngan来增强29.60%,并在DM-GaN中达到21.96%。
translated by 谷歌翻译
The performance of the Deep Learning (DL) models depends on the quality of labels. In some areas, the involvement of human annotators may lead to noise in the data. When these corrupted labels are blindly regarded as the ground truth (GT), DL models suffer from performance deficiency. This paper presents a method that aims to learn a confident model in the presence of noisy labels. This is done in conjunction with estimating the uncertainty of multiple annotators. We robustly estimate the predictions given only the noisy labels by adding entropy or information-based regularizer to the classifier network. We conduct our experiments on a noisy version of MNIST, CIFAR-10, and FMNIST datasets. Our empirical results demonstrate the robustness of our method as it outperforms or performs comparably to other state-of-the-art (SOTA) methods. In addition, we evaluated the proposed method on the curated dataset, where the noise type and level of various annotators depend on the input image style. We show that our approach performs well and is adept at learning annotators' confusion. Moreover, we demonstrate how our model is more confident in predicting GT than other baselines. Finally, we assess our approach for segmentation problem and showcase its effectiveness with experiments.
translated by 谷歌翻译
Landing an unmanned aerial vehicle unmanned aerial vehicle (UAV) on top of an unmanned surface vehicle (USV) in harsh open waters is a challenging problem, owing to forces that can damage the UAV due to a severe roll and/or pitch angle of the USV during touchdown. To tackle this, we propose a novel model predictive control (MPC) approach enabling a UAV to land autonomously on a USV in these harsh conditions. The MPC employs a novel objective function and an online decomposition of the oscillatory motion of the vessel to predict, attempt, and accomplish the landing during near-zero tilt of the landing platform. The nonlinear prediction of the motion of the vessel is performed using visual data from an onboard camera. Therefore, the system does not require any communication with the USV or a control station. The proposed method was analyzed in numerous robotics simulations in harsh and extreme conditions and further validated in various real-world scenarios.
translated by 谷歌翻译
We develop theory and methods that use the graph Laplacian to analyze the geometry of the underlying manifold of point clouds. Our theory provides theoretical guarantees and explicit bounds on the functional form of the graph Laplacian, in the case when it acts on functions defined close to singularities of the underlying manifold. We also propose methods that can be used to estimate these geometric properties of the point cloud, which are based on the theoretical guarantees.
translated by 谷歌翻译
Nearly all jurisdictions in the United States require a professional license exam, commonly referred to as "the Bar Exam," as a precondition for law practice. To even sit for the exam, most jurisdictions require that an applicant completes at least seven years of post-secondary education, including three years at an accredited law school. In addition, most test-takers also undergo weeks to months of further, exam-specific preparation. Despite this significant investment of time and capital, approximately one in five test-takers still score under the rate required to pass the exam on their first try. In the face of a complex task that requires such depth of knowledge, what, then, should we expect of the state of the art in "AI?" In this research, we document our experimental evaluation of the performance of OpenAI's `text-davinci-003` model, often-referred to as GPT-3.5, on the multistate multiple choice (MBE) section of the exam. While we find no benefit in fine-tuning over GPT-3.5's zero-shot performance at the scale of our training data, we do find that hyperparameter optimization and prompt engineering positively impacted GPT-3.5's zero-shot performance. For best prompt and parameters, GPT-3.5 achieves a headline correct rate of 50.3% on a complete NCBE MBE practice exam, significantly in excess of the 25% baseline guessing rate, and performs at a passing rate for both Evidence and Torts. GPT-3.5's ranking of responses is also highly-correlated with correctness; its top two and top three choices are correct 71% and 88% of the time, respectively, indicating very strong non-entailment performance. While our ability to interpret these results is limited by nascent scientific understanding of LLMs and the proprietary nature of GPT, we believe that these results strongly suggest that an LLM will pass the MBE component of the Bar Exam in the near future.
translated by 谷歌翻译
The future of population-based breast cancer screening is likely personalized strategies based on clinically relevant risk models. Mammography-based risk models should remain robust to domain shifts caused by different populations and mammographic devices. Modern risk models do not ensure adaptation across vendor-domains and are often conflated to unintentionally rely on both precursors of cancer and systemic/global mammographic information associated with short- and long-term risk, respectively, which might limit performance. We developed a robust, cross-vendor model for long-term risk assessment. An augmentation-based domain adaption technique, based on flavorization of mammographic views, ensured generalization to an unseen vendor-domain. We trained on samples without diagnosed/potential malignant findings to learn systemic/global breast tissue features, called mammographic texture, indicative of future breast cancer. However, training so may cause erratic convergence. By excluding noise-inducing samples and designing a case-control dataset, a robust ensemble texture model was trained. This model was validated in two independent datasets. In 66,607 Danish women with flavorized Siemens views, the AUC was 0.71 and 0.65 for prediction of interval cancers within two years (ICs) and from two years after screening (LTCs), respectively. In a combination with established risk factors, the model's AUC increased to 0.68 for LTCs. In 25,706 Dutch women with Hologic-processed views, the AUCs were not different from the AUCs in Danish women with flavorized views. The results suggested that the model robustly estimated long-term risk while adapting to an unseen processed vendor-domain. The model identified 8.1% of Danish women accounting for 20.9% of ICs and 14.2% of LTCs.
translated by 谷歌翻译
Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, but the quality bar for medical and clinical applications is high. Today, attempts to assess models' clinical knowledge typically rely on automated evaluations on limited benchmarks. There is no standard to evaluate model predictions and reasoning across a breadth of tasks. To address this, we present MultiMedQA, a benchmark combining six existing open question answering datasets spanning professional medical exams, research, and consumer queries; and HealthSearchQA, a new free-response dataset of medical questions searched online. We propose a framework for human evaluation of model answers along multiple axes including factuality, precision, possible harm, and bias. In addition, we evaluate PaLM (a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM, on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA, MedMCQA, PubMedQA, MMLU clinical topics), including 67.6% accuracy on MedQA (US Medical License Exam questions), surpassing prior state-of-the-art by over 17%. However, human evaluation reveals key gaps in Flan-PaLM responses. To resolve this we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, recall of knowledge, and medical reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal important limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLM models for clinical applications.
translated by 谷歌翻译
In this work, a method for obtaining pixel-wise error bounds in Bayesian regularization of inverse imaging problems is introduced. The proposed method employs estimates of the posterior variance together with techniques from conformal prediction in order to obtain coverage guarantees for the error bounds, without making any assumption on the underlying data distribution. It is generally applicable to Bayesian regularization approaches, independent, e.g., of the concrete choice of the prior. Furthermore, the coverage guarantees can also be obtained in case only approximate sampling from the posterior is possible. With this in particular, the proposed framework is able to incorporate any learned prior in a black-box manner. Guaranteed coverage without assumptions on the underlying distributions is only achievable since the magnitude of the error bounds is, in general, unknown in advance. Nevertheless, experiments with multiple regularization approaches presented in the paper confirm that in practice, the obtained error bounds are rather tight. For realizing the numerical experiments, also a novel primal-dual Langevin algorithm for sampling from non-smooth distributions is introduced in this work.
translated by 谷歌翻译
The goal of this paper is to detect objects by exploiting their interrelationships. Rather than relying on predefined and labeled graph structures, we infer a graph prior from object co-occurrence statistics. The key idea of our paper is to model object relations as a function of initial class predictions and co-occurrence priors to generate a graph representation of an image for improved classification and bounding box regression. We additionally learn the object-relation joint distribution via energy based modeling. Sampling from this distribution generates a refined graph representation of the image which in turn produces improved detection performance. Experiments on the Visual Genome and MS-COCO datasets demonstrate our method is detector agnostic, end-to-end trainable, and especially beneficial for rare object classes. What is more, we establish a consistent improvement over object detectors like DETR and Faster-RCNN, as well as state-of-the-art methods modeling object interrelationships.
translated by 谷歌翻译