This paper aims to provide an unsupervised modelling approach that allows for a more flexible representation of text embeddings. It jointly encodes the words and the paragraphs as individual matrices of arbitrary column dimension with unit Frobenius norm. The representation is also linguistically motivated with the introduction of a novel similarity metric. The proposed modelling and the novel similarity metric exploits the matrix structure of embeddings. We then go on to show that the same matrices can be reshaped into vectors of unit norm and transform our problem into an optimization problem over the spherical manifold. We exploit manifold optimization to efficiently train the matrix embeddings. We also quantitatively verify the quality of our text embeddings by showing that they demonstrate improved results in document classification, document clustering, and semantic textual similarity benchmark tests.
translated by 谷歌翻译
在本文中,我们通过推断在歧管上的迭代来提出一种简单的加速度方案,用于利曼梯度方法。我们显示何时从Riemannian梯度下降法生成迭代元素,加速方案是渐近地达到最佳收敛速率,并且比最近提出的Riemannian Nesterov加速梯度方法在计算上更有利。我们的实验验证了新型加速策略的实际好处。
translated by 谷歌翻译
最近,已经显示,与流行的基于Kullback Leibler(KL)的正则化不同,基于最佳运输(OT)的最大平均差异(MMD)正则化导致了对估计样品复杂性的无维度。另一方面,分别使用总变异和基于KL的正规化来定义有趣的指标类别(GHK)等有趣的指标类别和高斯 - 赫林格 - 坎托维奇(GHK)指标。但是,如果可以使用样品有效的MMD正则化定义适当的指标,则是一个空旷的问题。在这项工作中,我们不仅弥合了这一差距,而且进一步考虑了基于积分概率指标(IPM)的通用正规化家族,其中包括MMD作为特殊情况。我们提出了新颖的IPM正规化$ P $ - WASSERSTEIN风格的OT配方,并证明它们确实诱导了指标。尽管其中一些新型指标可以解释为IPM的虚拟卷积,但有趣的是,事实证明是GW和GHK指标的IPM-Analogues。最后,我们提出了基于样品的有限公式,用于估计平方-MMD正则化度量和相应的barycenter。我们从经验上研究了拟议指标的其他理想特性,并显示了它们在各种机器学习应用中的适用性。
translated by 谷歌翻译
We study the problem of preserving privacy while still providing high utility in sequential decision making scenarios in a changing environment. We consider abruptly changing environment: the environment remains constant during periods and it changes at unknown time instants. To formulate this problem, we propose a variant of multi-armed bandits called non-stationary stochastic corrupt bandits. We construct an algorithm called SW-KLUCB-CF and prove an upper bound on its utility using the performance measure of regret. The proven regret upper bound for SW-KLUCB-CF is near-optimal in the number of time steps and matches the best known bound for analogous problems in terms of the number of time steps and the number of changes. Moreover, we present a provably optimal mechanism which can guarantee the desired level of local differential privacy while providing high utility.
translated by 谷歌翻译
The existing methods for video anomaly detection mostly utilize videos containing identifiable facial and appearance-based features. The use of videos with identifiable faces raises privacy concerns, especially when used in a hospital or community-based setting. Appearance-based features can also be sensitive to pixel-based noise, straining the anomaly detection methods to model the changes in the background and making it difficult to focus on the actions of humans in the foreground. Structural information in the form of skeletons describing the human motion in the videos is privacy-protecting and can overcome some of the problems posed by appearance-based features. In this paper, we present a survey of privacy-protecting deep learning anomaly detection methods using skeletons extracted from videos. We present a novel taxonomy of algorithms based on the various learning approaches. We conclude that skeleton-based approaches for anomaly detection can be a plausible privacy-protecting alternative for video anomaly detection. Lastly, we identify major open research questions and provide guidelines to address them.
translated by 谷歌翻译
People living with dementia often exhibit behavioural and psychological symptoms of dementia that can put their and others' safety at risk. Existing video surveillance systems in long-term care facilities can be used to monitor such behaviours of risk to alert the staff to prevent potential injuries or death in some cases. However, these behaviours of risk events are heterogeneous and infrequent in comparison to normal events. Moreover, analyzing raw videos can also raise privacy concerns. In this paper, we present two novel privacy-protecting video-based anomaly detection approaches to detect behaviours of risks in people with dementia. We either extracted body pose information as skeletons and use semantic segmentation masks to replace multiple humans in the scene with their semantic boundaries. Our work differs from most existing approaches for video anomaly detection that focus on appearance-based features, which can put the privacy of a person at risk and is also susceptible to pixel-based noise, including illumination and viewing direction. We used anonymized videos of normal activities to train customized spatio-temporal convolutional autoencoders and identify behaviours of risk as anomalies. We show our results on a real-world study conducted in a dementia care unit with patients with dementia, containing approximately 21 hours of normal activities data for training and 9 hours of data containing normal and behaviours of risk events for testing. We compared our approaches with the original RGB videos and obtained an equivalent area under the receiver operating characteristic curve performance of 0.807 for the skeleton-based approach and 0.823 for the segmentation mask-based approach. This is one of the first studies to incorporate privacy for the detection of behaviours of risks in people with dementia.
translated by 谷歌翻译
Federated Deep Learning frameworks can be used strategically to monitor Land Use locally and infer environmental impacts globally. Distributed data from across the world would be needed to build a global model for Land Use classification. The need for a Federated approach in this application domain would be to avoid transfer of data from distributed locations and save network bandwidth to reduce communication cost. We use a Federated UNet model for Semantic Segmentation of satellite and street view images. The novelty of the proposed architecture is the integration of Knowledge Distillation to reduce communication cost and response time. The accuracy obtained was above 95% and we also brought in a significant model compression to over 17 times and 62 times for street View and satellite images respectively. Our proposed framework has the potential to be a game-changer in real-time tracking of climate change across the planet.
translated by 谷歌翻译
We introduce SketchySGD, a stochastic quasi-Newton method that uses sketching to approximate the curvature of the loss function. Quasi-Newton methods are among the most effective algorithms in traditional optimization, where they converge much faster than first-order methods such as SGD. However, for contemporary deep learning, quasi-Newton methods are considered inferior to first-order methods like SGD and Adam owing to higher per-iteration complexity and fragility due to inexact gradients. SketchySGD circumvents these issues by a novel combination of subsampling, randomized low-rank approximation, and dynamic regularization. In the convex case, we show SketchySGD with a fixed stepsize converges to a small ball around the optimum at a faster rate than SGD for ill-conditioned problems. In the non-convex case, SketchySGD converges linearly under two additional assumptions, interpolation and the Polyak-Lojaciewicz condition, the latter of which holds with high probability for wide neural networks. Numerical experiments on image and tabular data demonstrate the improved reliability and speed of SketchySGD for deep learning, compared to standard optimizers such as SGD and Adam and existing quasi-Newton methods.
translated by 谷歌翻译
我们研究了在约束强化学习中有效探索的后验抽样方法。或者,对于现有算法,我们提出了两种简单的算法,这些算法在统计上更有效,更简单地实现和计算便宜。第一种算法基于CMDP的线性公式,第二算法利用CMDP的鞍点公式。我们的经验结果表明,尽管具有简单性,但后取样可实现最先进的表现,在某些情况下,采样明显优于乐观算法。
translated by 谷歌翻译
更多数据有助于我们推广到任务。但是实际数据集可以包含分布(OOD)数据;这可以以异质性的形式出现,例如类内变异性,也可以以时间变化或概念漂移的形式出现。我们在此类问题上展示了一种反直觉现象:任务的概括误差可能是OOD样本数量的非单调函数;少数OOD样品可以改善概括,但是如果OOD样品的数量超出了阈值,则概括误差可能会恶化。我们还表明,如果我们知道哪些样品是OOD,则使用目标和OOD样品之间的加权目标确保概括误差单调减少。我们使用线性分类器在CIFAR-10上的合成数据集和中型神经网络上使用线性分类器演示和分析了此问题。
translated by 谷歌翻译