Heating in private households is a major contributor to the emissions generated today. Heat pumps are a promising alternative for heat generation and are a key technology in achieving our goals of the German energy transformation and to become less dependent on fossil fuels. Today, the majority of heat pumps in the field are controlled by a simple heating curve, which is a naive mapping of the current outdoor temperature to a control action. A more advanced control approach is model predictive control (MPC) which was applied in multiple research works to heat pump control. However, MPC is heavily dependent on the building model, which has several disadvantages. Motivated by this and by recent breakthroughs in the field, this work applies deep reinforcement learning (DRL) to heat pump control in a simulated environment. Through a comparison to MPC, it could be shown that it is possible to apply DRL in a model-free manner to achieve MPC-like performance. This work extends other works which have already applied DRL to building heating operation by performing an in-depth analysis of the learned control strategies and by giving a detailed comparison of the two state-of-the-art control methods.
translated by 谷歌翻译
The release of ChatGPT, a language model capable of generating text that appears human-like and authentic, has gained significant attention beyond the research community. We expect that the convincing performance of ChatGPT incentivizes users to apply it to a variety of downstream tasks, including prompting the model to simplify their own medical reports. To investigate this phenomenon, we conducted an exploratory case study. In a questionnaire, we asked 15 radiologists to assess the quality of radiology reports simplified by ChatGPT. Most radiologists agreed that the simplified reports were factually correct, complete, and not potentially harmful to the patient. Nevertheless, instances of incorrect statements, missed key medical findings, and potentially harmful passages were reported. While further studies are needed, the initial insights of this study indicate a great potential in using large language models like ChatGPT to improve patient-centered care in radiology and other medical domains.
translated by 谷歌翻译
In recent years, several metrics have been developed for evaluating group fairness of rankings. Given that these metrics were developed with different application contexts and ranking algorithms in mind, it is not straightforward which metric to choose for a given scenario. In this paper, we perform a comprehensive comparative analysis of existing group fairness metrics developed in the context of fair ranking. By virtue of their diverse application contexts, we argue that such a comparative analysis is not straightforward. Hence, we take an axiomatic approach whereby we design a set of thirteen properties for group fairness metrics that consider different ranking settings. A metric can then be selected depending on whether it satisfies all or a subset of these properties. We apply these properties on eleven existing group fairness metrics, and through both empirical and theoretical results we demonstrate that most of these metrics only satisfy a small subset of the proposed properties. These findings highlight limitations of existing metrics, and provide insights into how to evaluate and interpret different fairness metrics in practical deployment. The proposed properties can also assist practitioners in selecting appropriate metrics for evaluating fairness in a specific application.
translated by 谷歌翻译
In recent years distributional reinforcement learning has produced many state of the art results. Increasingly sample efficient Distributional algorithms for the discrete action domain have been developed over time that vary primarily in the way they parameterize their approximations of value distributions, and how they quantify the differences between those distributions. In this work we transfer three of the most well-known and successful of those algorithms (QR-DQN, IQN and FQF) to the continuous action domain by extending two powerful actor-critic algorithms (TD3 and SAC) with distributional critics. We investigate whether the relative performance of the methods for the discrete action space translates to the continuous case. To that end we compare them empirically on the pybullet implementations of a set of continuous control tasks. Our results indicate qualitative invariance regarding the number and placement of distributional atoms in the deterministic, continuous action setting.
translated by 谷歌翻译
Artificial Intelligence (AI) has become commonplace to solve routine everyday tasks. Because of the exponential growth in medical imaging data volume and complexity, the workload on radiologists is steadily increasing. We project that the gap between the number of imaging exams and the number of expert radiologist readers required to cover this increase will continue to expand, consequently introducing a demand for AI-based tools that improve the efficiency with which radiologists can comfortably interpret these exams. AI has been shown to improve efficiency in medical-image generation, processing, and interpretation, and a variety of such AI models have been developed across research labs worldwide. However, very few of these, if any, find their way into routine clinical use, a discrepancy that reflects the divide between AI research and successful AI translation. To address the barrier to clinical deployment, we have formed MONAI Consortium, an open-source community which is building standards for AI deployment in healthcare institutions, and developing tools and infrastructure to facilitate their implementation. This report represents several years of weekly discussions and hands-on problem solving experience by groups of industry experts and clinicians in the MONAI Consortium. We identify barriers between AI-model development in research labs and subsequent clinical deployment and propose solutions. Our report provides guidance on processes which take an imaging AI model from development to clinical implementation in a healthcare institution. We discuss various AI integration points in a clinical Radiology workflow. We also present a taxonomy of Radiology AI use-cases. Through this report, we intend to educate the stakeholders in healthcare and AI (AI researchers, radiologists, imaging informaticists, and regulators) about cross-disciplinary challenges and possible solutions.
translated by 谷歌翻译
In this paper, we present a modular methodology that combines state-of-the-art methods in (stochastic) machine learning with traditional methods in rule learning to provide efficient and scalable algorithms for the classification of vast data sets, while remaining explainable. Apart from evaluating our approach on the common large scale data sets MNIST, Fashion-MNIST and IMDB, we present novel results on explainable classifications of dental bills. The latter case study stems from an industrial collaboration with Allianz Private Krankenversicherungs-Aktiengesellschaft which is an insurance company offering diverse services in Germany.
translated by 谷歌翻译
Key Point Analysis(KPA) is a relatively new task in NLP that combines summarization and classification by extracting argumentative key points (KPs) for a topic from a collection of texts and categorizing their closeness to the different arguments. In our work, we focus on the legal domain and develop methods that identify and extract KPs from premises derived from texts of judgments. The first method is an adaptation to an existing state-of-the-art method, and the two others are new methods that we developed from scratch. We present our methods and examples of their outputs, as well a comparison between them. The full evaluation of our results is done in the matching task -- match between the generated KPs to arguments (premises).
translated by 谷歌翻译
Active learning as a paradigm in deep learning is especially important in applications involving intricate perception tasks such as object detection where labels are difficult and expensive to acquire. Development of active learning methods in such fields is highly computationally expensive and time consuming which obstructs the progression of research and leads to a lack of comparability between methods. In this work, we propose and investigate a sandbox setup for rapid development and transparent evaluation of active learning in deep object detection. Our experiments with commonly used configurations of datasets and detection architectures found in the literature show that results obtained in our sandbox environment are representative of results on standard configurations. The total compute time to obtain results and assess the learning behavior can thereby be reduced by factors of up to 14 when comparing with Pascal VOC and up to 32 when comparing with BDD100k. This allows for testing and evaluating data acquisition and labeling strategies in under half a day and contributes to the transparency and development speed in the field of active learning for object detection.
translated by 谷歌翻译
This paper describes an evaluation of Automated Theorem Proving (ATP) systems on problems taken from the QMLTP library of first-order modal logic problems. Principally, the problems are translated to higher-order logic in the TPTP languages using an embedding approach, and solved using higher-order logic ATP systems. Additionally, the results from native modal logic ATP systems are considered, and compared with those from the embedding approach. The conclusions are that (i) The embedding process is reliable and successful. (ii) The choice of backend ATP system can significantly impact the performance of the embedding approach. (iii) Native modal logic ATP systems outperform the embedding approach. (iv) The embedding approach can cope with a wider range modal logics than the native modal systems considered.
translated by 谷歌翻译
We deal with a challenging scheduling problem on parallel machines with sequence-dependent setup times and release dates from a real-world application of semiconductor work-shop production. There, jobs can only be processed by dedicated machines, thus few machines can determine the makespan almost regardless of how jobs are scheduled on the remaining ones. This causes problems when machines fail and jobs need to be rescheduled. Instead of optimising only the makespan, we put the individual machine spans in non-ascending order and lexicographically minimise the resulting tuples. This achieves that all machines complete as early as possible and increases the robustness of the schedule. We study the application of Answer-Set Programming (ASP) to solve this problem. While ASP eases modelling, the combination of timing constraints and the considered objective function challenges current solving technology. The former issue is addressed by using an extension of ASP by difference logic. For the latter, we devise different algorithms that use multi-shot solving. To tackle industrial-sized instances, we study different approximations and heuristics. Our experimental results show that ASP is indeed a promising KRR paradigm for this problem and is competitive with state-of-the-art CP and MIP solvers. Under consideration in Theory and Practice of Logic Programming (TPLP).
translated by 谷歌翻译