This paper focuses on designing efficient models with low parameters and FLOPs for dense predictions. Even though CNN-based lightweight methods have achieved stunning results after years of research, trading-off model accuracy and constrained resources still need further improvements. This work rethinks the essential unity of efficient Inverted Residual Block in MobileNetv2 and effective Transformer in ViT, inductively abstracting a general concept of Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance though sharing the same framework. Motivated by this phenomenon, we deduce a simple yet efficient modern \textbf{I}nverted \textbf{R}esidual \textbf{M}obile \textbf{B}lock (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependency and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase \textbf{E}fficient \textbf{MO}del (EMO) based only on a series of iRMBs for dense applications. Massive experiments on ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods, \eg, our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1 that surpass \textbf{SoTA} CNN-/Transformer-based models, while trading-off the model accuracy and efficiency well.
translated by 谷歌翻译
电线杆和建筑物边缘经常是城市道路上可观察到的对象,为各种计算机视觉任务提供了可靠的提示。为了重复提取它们作为特征并在离散激光镜头框架之间进行注册,我们提出了第一个基于学习的功能分割和LIDAR点云中3D线的描述模型。为了训练我们的模型,而无需耗时和乏味的数据标记过程,我们首先生成了目标线基本外观的合成原始图,并构建一个迭代线自动标记的过程,以逐步完善真实激光扫描的线路标签。我们的分割模型可以在任意规模的扰动下提取线,我们使用共享的EDGECONV编码层共同训练两个分割和描述符头。基于模型,我们可以在没有初始转换提示的情况下构建一个高度可用的全局注册模块,用于点云注册。实验表明,我们基于线的注册方法对基于最先进的方法的方法具有很高的竞争力。我们的代码可在https://github.com/zxrzju/superline3d.git上找到。
translated by 谷歌翻译
RICCI流是用于在riemannian歧管中发展度量的局部微分方程,使其更加规则。然而,在大多数情况下,RICCI流体倾向于发展奇点并导致溶液的分歧。在本文中,我们提出了线性近的欧几里德度量来帮助歧管微手术,这意味着我们证明了在RICCI-DECURCK流下靠近线性近欧几里德度量的度量的动态稳定性和收敛性。在实践中,从信息几何和镜像下降观点来看,我们为线性接近欧几里德歧管上的神经网络提供了最陡的渐变流程。在神经网络的培训过程中,我们观察到其度量还将经常融合到线性近的欧几里德公制,这与Ricci-Demurck流程下线性接近欧几里德歧管的收敛行为一致。
translated by 谷歌翻译
在推荐系统中,一个普遍的挑战是冷门问题,在系统中,相互作用非常有限。为了应对这一挑战,最近,许多作品将元优化的想法介绍到建议方案中,即学习仅通过过去的几个交互项目来学习用户偏好。核心想法是为所有用户学习全局共享的元启动参数,并分别为每个用户迅速调整其本地参数。他们的目的是在各种用户的偏好学习中得出一般知识,以便通过博学的先验和少量培训数据迅速适应未来的新用户。但是,以前的作品表明,推荐系统通常容易受到偏见和不公平的影响。尽管元学习成功地通过冷启动提高了推荐性能,但公平性问题在很大程度上被忽略了。在本文中,我们提出了一个名为Clover的全面的公平元学习框架,以确保元学习的推荐模型的公平性。我们系统地研究了三种公平性 - 个人公平,反事实公平和推荐系统中的群体公平,并建议通过多任务对抗学习方案满足所有三种类型。我们的框架提供了一种通用的培训范式,适用于不同的元学习推荐系统。我们证明了三叶草对三个现实世界数据集的代表性元学习用户偏好估计器的有效性。经验结果表明,三叶草可以实现全面的公平性,而不会恶化整体的冷淡建议性能。
translated by 谷歌翻译
从任意堕落状态中起床是一种基本的人类技能。现有的学习这种技能的方法通常会产生高度动态和不稳定的起床动作,这不像人类的起床策略,或者基于跟踪记录的人类起床运动。在本文中,我们提出了一种使用强化学习的分阶段方法,而无需求助于运动捕获数据。该方法首先利用了强大的字符模型,从而有助于发现解决方案模式。然后,第二阶段学会了调整控制策略,以逐步与角色的较弱版本一起使用。最后,第三阶段学习控制政策,这些政策可以以较慢的速度重现较弱的起床动作。我们表明,在多个运行中,该方法可以发现各种各样的起床策略,并以各种速度执行它们。结果通常会产生采用最终站立策略的策略,这些策略是从所有初始状态中看到的恢复动作所共有的。但是,我们还发现了对俯卧和仰卧初始堕落状态的不同策略的政策。学识渊博的起床控制策略通常具有明显的静态稳定性,即,在起床运动过程中,它们可以在各个点停下来。我们进一步测试了新的限制场景的方法,例如在演员表中有一条腿和手臂。
translated by 谷歌翻译
Supervised Question Answering systems (QA systems) rely on domain-specific human-labeled data for training. Unsupervised QA systems generate their own question-answer training pairs, typically using secondary knowledge sources to achieve this outcome. Our approach (called PIE-QG) uses Open Information Extraction (OpenIE) to generate synthetic training questions from paraphrased passages and uses the question-answer pairs as training data for a language model for a state-of-the-art QA system based on BERT. Triples in the form of <subject, predicate, object> are extracted from each passage, and questions are formed with subjects (or objects) and predicates while objects (or subjects) are considered as answers. Experimenting on five extractive QA datasets demonstrates that our technique achieves on-par performance with existing state-of-the-art QA systems with the benefit of being trained on an order of magnitude fewer documents and without any recourse to external reference data sources.
translated by 谷歌翻译
Transformer has achieved impressive successes for various computer vision tasks. However, most of existing studies require to pretrain the Transformer backbone on a large-scale labeled dataset (e.g., ImageNet) for achieving satisfactory performance, which is usually unavailable for medical images. Additionally, due to the gap between medical and natural images, the improvement generated by the ImageNet pretrained weights significantly degrades while transferring the weights to medical image processing tasks. In this paper, we propose Bootstrap Own Latent of Transformer (BOLT), a self-supervised learning approach specifically for medical image classification with the Transformer backbone. Our BOLT consists of two networks, namely online and target branches, for self-supervised representation learning. Concretely, the online network is trained to predict the target network representation of the same patch embedding tokens with a different perturbation. To maximally excavate the impact of Transformer from limited medical data, we propose an auxiliary difficulty ranking task. The Transformer is enforced to identify which branch (i.e., online/target) is processing the more difficult perturbed tokens. Overall, the Transformer endeavours itself to distill the transformation-invariant features from the perturbed tokens to simultaneously achieve difficulty measurement and maintain the consistency of self-supervised representations. The proposed BOLT is evaluated on three medical image processing tasks, i.e., skin lesion classification, knee fatigue fracture grading and diabetic retinopathy grading. The experimental results validate the superiority of our BOLT for medical image classification, compared to ImageNet pretrained weights and state-of-the-art self-supervised learning approaches.
translated by 谷歌翻译
Knowledge graph embedding (KGE), which maps entities and relations in a knowledge graph into continuous vector spaces, has achieved great success in predicting missing links in knowledge graphs. However, knowledge graphs often contain incomplete triples that are difficult to inductively infer by KGEs. To address this challenge, we resort to analogical inference and propose a novel and general self-supervised framework AnKGE to enhance KGE models with analogical inference capability. We propose an analogical object retriever that retrieves appropriate analogical objects from entity-level, relation-level, and triple-level. And in AnKGE, we train an analogy function for each level of analogical inference with the original element embedding from a well-trained KGE model as input, which outputs the analogical object embedding. In order to combine inductive inference capability from the original KGE model and analogical inference capability enhanced by AnKGE, we interpolate the analogy score with the base model score and introduce the adaptive weights in the score function for prediction. Through extensive experiments on FB15k-237 and WN18RR datasets, we show that AnKGE achieves competitive results on link prediction task and well performs analogical inference.
translated by 谷歌翻译
Digital engineering transformation is a crucial process for the engineering paradigm shifts in the fourth industrial revolution (4IR), and artificial intelligence (AI) is a critical enabling technology in digital engineering transformation. This article discusses the following research questions: What are the fundamental changes in the 4IR? More specifically, what are the fundamental changes in engineering? What is digital engineering? What are the main uncertainties there? What is trustworthy AI? Why is it important today? What are emerging engineering paradigm shifts in the 4IR? What is the relationship between the data-intensive paradigm and digital engineering transformation? What should we do for digitalization? From investigating the pattern of industrial revolutions, this article argues that ubiquitous machine intelligence (uMI) is the defining power brought by the 4IR. Digitalization is a condition to leverage ubiquitous machine intelligence. Digital engineering transformation towards Industry 4.0 has three essential building blocks: digitalization of engineering, leveraging ubiquitous machine intelligence, and building digital trust and security. The engineering design community at large is facing an excellent opportunity to bring the new capabilities of ubiquitous machine intelligence and trustworthy AI principles, as well as digital trust, together in various engineering systems design to ensure the trustworthiness of systems in Industry 4.0.
translated by 谷歌翻译
Surgical robot automation has attracted increasing research interest over the past decade, expecting its huge potential to benefit surgeons, nurses and patients. Recently, the learning paradigm of embodied AI has demonstrated promising ability to learn good control policies for various complex tasks, where embodied AI simulators play an essential role to facilitate relevant researchers. However, existing open-sourced simulators for surgical robot are still not sufficiently supporting human interactions through physical input devices, which further limits effective investigations on how human demonstrations would affect policy learning. In this paper, we study human-in-the-loop embodied intelligence with a new interactive simulation platform for surgical robot learning. Specifically, we establish our platform based on our previously released SurRoL simulator with several new features co-developed to allow high-quality human interaction via an input device. With these, we further propose to collect human demonstrations and imitate the action patterns to achieve more effective policy learning. We showcase the improvement of our simulation environment with the designed new features and tasks, and validate state-of-the-art reinforcement learning algorithms using the interactive environment. Promising results are obtained, with which we hope to pave the way for future research on surgical embodied intelligence. Our platform is released and will be continuously updated in the website: https://med-air.github.io/SurRoL/
translated by 谷歌翻译