Smart Paper Notes

Fair Ranking with Noisy Protected Attributes

Anay Mehrotra , Nisheeth K. Vishnoi

Categories: Machine Learning | Machine Learning (Statistics)

2022-11-30

The fair-ranking problem, which asks to rank a given set of items to maximize utility subject to group fairness constraints, has received attention in the fairness, information retrieval, and machine learning literature. Recent works, however, observe that errors in socially-salient (including protected) attributes of items can significantly undermine fairness guarantees of existing fair-ranking algorithms and raise the problem of mitigating the effect of such errors. We study the fair-ranking problem under a model where socially-salient attributes of items are randomly and independently perturbed. We present a fair-ranking framework that incorporates group fairness requirements along with probabilistic information about perturbations in socially-salient attributes. We provide provable guarantees on the fairness and utility attainable by our framework and show that it is information-theoretically impossible to significantly beat these guarantees. Our framework works for multiple non-disjoint attributes and a general class of fairness constraints that includes proportional and equal representation. Empirically, we observe that, compared to baselines, our algorithm outputs rankings with higher fairness, and has a similar or better fairness-utility trade-off compared to baselines.
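
To make the notion of a group fairness constraint on a ranking concrete, here is a minimal sketch that checks a proportional-representation upper bound on every prefix of a ranking. It is not the framework from the paper: the function name `prefix_violations`, the per-group cap `ceil(upper_frac[g] * k)`, and the assumption that group labels are known exactly (no noise) are all illustrative choices.

```python
from math import ceil


def prefix_violations(ranking, group_of, upper_frac):
    """Return (k, group, count, cap) for every prefix that breaks a cap.

    ranking    : list of item ids, best first
    group_of   : dict mapping item id -> group label (assumed noise-free here)
    upper_frac : dict mapping group label -> max allowed fraction of any prefix
    """
    counts = {}
    violations = []
    for k, item in enumerate(ranking, start=1):
        group = group_of[item]
        counts[group] = counts.get(group, 0) + 1
        # Check every group seen so far against its cap for the top-k prefix.
        for g, count in counts.items():
            cap = ceil(upper_frac[g] * k)
            if count > cap:
                violations.append((k, g, count, cap))
    return violations


group_of = {"a": "A", "b": "A", "c": "B", "d": "B"}
caps = {"A": 0.5, "B": 0.5}
print(prefix_violations(["a", "b", "c", "d"], group_of, caps))
print(prefix_violations(["a", "c", "b", "d"], group_of, caps))
```

In the first ranking the top two positions are both from group A, which exceeds the cap of one. The interleaved ranking satisfies the cap at every prefix. When `group_of` contains errors, a check like this can pass on the recorded labels while failing on the true ones, which is the failure mode the abstract describes.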

Translated by Google Translate

Re-Analyze Gauss: Bounds for Private Matrix Approximation via Dyson Brownian Motion

Oren Mangoubi , Nisheeth K. Vishnoi

Categories: Machine Learning | Machine Learning (Statistics)

2022-11-11

Given a symmetric matrix $M$ and a vector $\lambda$, we present new bounds on the Frobenius-distance utility of the Gaussian mechanism for approximating $M$ by a matrix whose spectrum is $\lambda$, under $(\varepsilon,\delta)$-differential privacy. Our bounds depend on both $\lambda$ and the gaps in the eigenvalues of $M$, and hold whenever the top $k+1$ eigenvalues of $M$ have sufficiently large gaps. When applied to the problems of private rank-$k$ covariance matrix approximation and subspace recovery, our bounds yield improvements over previous bounds. Our bounds are obtained by viewing the addition of Gaussian noise as a continuous-time matrix Brownian motion. This viewpoint allows us to track the evolution of eigenvalues and eigenvectors of the matrix, which are governed by stochastic differential equations discovered by Dyson. These equations allow us to bound the utility as the square-root of a sum-of-squares of perturbations to the eigenvectors, as opposed to a sum of perturbation bounds obtained via Davis-Kahan-type theorems.


Private Matrix Approximation and Geometry of Unitary Orbits

Oren Mangoubi , Yikai Wu , Satyen Kale , Abhradeep Guha Thakurta , Nisheeth K. Vishnoi

Categories: Machine Learning | Machine Learning (Statistics)

2022-07-06

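Consider the following optimization problem: given $n \times n$ matrices $A$ and $\Lambda$, maximize $\langle A, U \Lambda U^* \rangle$, where $U$ varies over the unitary group $\mathrm{U}(n)$. This problem seeks to approximate $A$ by a matrix whose spectrum is the same as that of $\Lambda$ and, by setting $\Lambda$ to an appropriate diagonal matrix, one can recover matrix approximation problems such as PCA and rank-$k$ approximation. We study the problem of designing differentially private algorithms for this optimization problem in settings where the matrix $A$ is constructed using users' private data. We give efficient and private algorithms that come with upper and lower bounds on the approximation error. Our results unify and improve upon several prior works on private matrix approximation problems. They rely on extensions of packing/covering number bounds for Grassmannians to unitary orbits, which should be of independent interest.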


Faster Sampling from Log-Concave Distributions over Polytopes via a Soft-Threshold Dikin Walk

Oren Mangoubi , Nisheeth K. Vishnoi

Categories: Machine Learning | Machine Learning (Statistics)

2022-06-19

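We consider the problem of sampling from a $d$-dimensional log-concave distribution $\pi(\theta) \propto e^{-f(\theta)}$ constrained to a polytope $K$ defined by $m$ inequalities. Our main result is a "soft-threshold" variant of the Dikin walk Markov chain that requires at most $O((md + dL^2R^2) \times md^{\omega-1} \log(\frac{w}{\delta}))$ arithmetic operations to sample from $\pi$ within error $\delta > 0$ in total variation distance from a $w$-warm start, where $L$ is the Lipschitz constant of $f$, $K$ is contained in a ball of radius $R$ and contains a ball of smaller radius $r$, and $\omega$ is the matrix-multiplication constant. When a warm start is not available, this implies an improvement of $\tilde{O}(d^{3.5-\omega})$ arithmetic operations on the previous best bound for sampling from $\pi$ within total variation error $\delta$, in the setting where $K$ is given by $m = O(d)$ inequalities and $LR = O(\sqrt{d})$. Our algorithm also improves by a factor of $d^2$ arithmetic operations on the best previous bound in this setting, which was obtained for a different version of the Dikin walk algorithm. Plugging our Dikin walk Markov chain into the post-processing algorithm of Mangoubi and Vishnoi (2021), we achieve further improvements in the dependence of the running time when $K$ is a polytope.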


Selection in the Presence of Implicit Bias: The Advantage of Intersectional Constraints

Anay Mehrotra , Bary S. R. Pradelski , Nisheeth K. Vishnoi

Categories: Artificial Intelligence | Machine Learning | Machine Learning (Statistics)

2022-02-03

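In selection processes such as hiring, promotion, and college admissions, implicit bias toward socially-salient attributes of candidates such as race, gender, or sexual orientation is known to produce persistent inequality and to reduce aggregate utility for the decision maker. Interventions such as the Rooney Rule and its generalizations, which require the decision maker to select at least a specified number of individuals from each affected group, have been proposed to mitigate the adverse effects of implicit bias in selection. Recent works have established that such lower-bound constraints can be very effective in improving aggregate utility when each individual belongs to at most one affected group. However, in several settings, individuals may belong to multiple affected groups and, consequently, face more extreme implicit bias due to this intersectionality. We consider independently drawn utilities and show that, in the intersectional case, the aforementioned non-intersectional constraints can only recover part of the total utility achievable in the absence of implicit bias. On the other hand, we show that if one includes appropriate lower-bound constraints on the intersections, almost all of the utility achievable in the absence of implicit bias can be recovered. Thus, intersectional constraints can offer a significant advantage over a reductionist, dimension-by-dimension, non-intersectional approach to reducing inequality.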
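
The lower-bound (Rooney Rule style) constraint described above can be sketched for the simple case where each candidate belongs to exactly one group. This is an illustration only, not the paper's method: the function `select_with_lower_bounds`, the tuple layout, and the example scores are hypothetical, and the sketch does not handle intersecting groups, which is the case the abstract is actually about.

```python
def select_with_lower_bounds(candidates, k, lower_bounds):
    """Pick k candidates by observed score, subject to per-group minimums.

    candidates   : list of (name, group, observed_score); one group per candidate
    k            : number of candidates to select
    lower_bounds : dict mapping group -> minimum number that must be selected
    """
    if sum(lower_bounds.values()) > k:
        raise ValueError("lower bounds exceed k")
    # Indices sorted from highest to lowest observed score.
    order = sorted(range(len(candidates)),
                   key=lambda i: candidates[i][2], reverse=True)
    remaining = dict(lower_bounds)
    chosen = set()
    # First pass: satisfy each group's minimum with its best-scoring members.
    for i in order:
        group = candidates[i][1]
        if remaining.get(group, 0) > 0:
            chosen.add(i)
            remaining[group] -= 1
    if any(v > 0 for v in remaining.values()):
        raise ValueError("not enough candidates to meet a lower bound")
    # Second pass: fill the remaining slots with the best candidates overall.
    for i in order:
        if len(chosen) >= k:
            break
        chosen.add(i)
    return [candidates[i][0] for i in order if i in chosen]


pool = [("a", "X", 0.9), ("b", "X", 0.8), ("c", "X", 0.7),
        ("d", "Y", 0.4), ("e", "Y", 0.3)]
print(select_with_lower_bounds(pool, 3, {}))
print(select_with_lower_bounds(pool, 3, {"Y": 1}))
```

If the observed scores of group Y are deflated by bias, the unconstrained selection picks only from X, while the constrained one is forced to include the best-scoring Y candidate.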


Fairness for AUC via Feature Augmentation

Hortense Fong , Vineet Kumar , Anay Mehrotra , Nisheeth K. Vishnoi

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2021-11-24

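We study fairness in the context of classification where performance is measured by the area under the curve (AUC) of the receiver operating characteristic. AUC is commonly used when both Type I (false positive) and Type II (false negative) errors are important. However, the same classifier can have significantly varying AUCs for different protected groups and, in real-world applications, it is often desirable to reduce such cross-group differences. We address the problem of how to select additional features so as to most improve the AUC of the disadvantaged group. Our results establish that the unconditional variance of features does not inform us about AUC fairness, but the class-conditional variance does. Using this connection, we develop a novel approach, fairAUC, based on feature augmentation (adding features) to mitigate bias between identifiable groups. We evaluate fairAUC on synthetic and real-world (COMPAS) datasets and find that it significantly improves AUC for the disadvantaged group relative to benchmarks that maximize overall AUC and minimize bias between groups.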
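
The cross-group AUC gap mentioned above can be measured with a few lines of plain Python. This is a minimal sketch for illustration, not fairAUC itself: the helper names `auc` and `auc_by_group` and the toy data are assumptions. AUC is computed here as the probability that a random positive example scores above a random negative one, with ties counted as one half.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_group(scores, labels, groups):
    """Compute AUC separately within each group."""
    result = {}
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        result[g] = auc([scores[i] for i in idx], [labels[i] for i in idx])
    return result


# Toy data (made up): the same scorer separates classes well in A, poorly in B.
scores = [0.9, 0.8, 0.3, 0.2, 0.6, 0.7, 0.5, 0.1]
labels = [1, 1, 0, 0, 1, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(auc_by_group(scores, labels, groups))
```

On this toy data group A has an AUC of 1.0 and group B an AUC of 0.5, the kind of disparity the abstract aims to reduce by adding features.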


Sampling from Log-Concave Distributions with Infinity-Distance Guarantees and Applications to Differentially Private Optimization

Oren Mangoubi , Nisheeth K. Vishnoi

Categories: Machine Learning | Machine Learning (Statistics)

2021-11-07

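For a $d$-dimensional log-concave distribution $\pi(\theta) \propto e^{-f(\theta)}$ on a polytope $K$, we consider the problem of outputting samples from a distribution $\nu$ which is $O(\varepsilon)$-close to $\pi$ in infinity-distance $\sup_{\theta \in K} |\log \frac{\nu(\theta)}{\pi(\theta)}|$. Such samplers with infinity-distance guarantees are specifically desired for differentially private optimization, as traditional sampling algorithms that come with total-variation distance or KL divergence bounds are insufficient to guarantee differential privacy. Our main result is an algorithm that outputs a point from a distribution $O(\varepsilon)$-close to $\pi$ in infinity-distance and requires $O((md + dL^2R^2) \times (LR + d\log(\frac{Rd + LRd}{\varepsilon r})) \times md^{\omega-1})$ arithmetic operations, where $f$ is $L$-Lipschitz, $K$ is defined by $m$ inequalities, is contained in a ball of radius $R$ and contains a ball of smaller radius $r$, and $\omega$ is the matrix-multiplication constant. In particular, this runtime is logarithmic in $\frac{1}{\varepsilon}$ and significantly improves on prior works. Technically, we depart from prior works that construct Markov chains on a $\frac{1}{\varepsilon^2}$-discretization of $K$ to achieve a sample with $O(\varepsilon)$ infinity-distance error, and present a method to convert continuous samples from $K$ with total-variation bounds into samples with infinity-distance bounds. To achieve improved dependence on $d$, we present a "soft-threshold" version of the Dikin walk, which may be of independent interest. Plugging our algorithm into the framework of the exponential mechanism yields similar improvements in the running time of $\varepsilon$-pure differentially private algorithms for optimization problems such as empirical risk minimization of Lipschitz-convex functions and low-rank approximation, while still achieving the tightest known utility bounds.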
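
The gap between total-variation and infinity-distance guarantees can be seen on a tiny example. The sketch below uses discrete distributions over three outcomes as a stand-in for the continuous densities in the abstract; the function names and the numbers are illustrative assumptions.

```python
from math import inf, log


def tv_distance(nu, pi):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(nu[x] - pi[x]) for x in pi)


def infinity_distance(nu, pi):
    """sup over outcomes of |log(nu(x) / pi(x))|; infinite if supports differ."""
    worst = 0.0
    for x in pi:
        if nu[x] == 0 or pi[x] == 0:
            if nu[x] != pi[x]:
                return inf  # one distribution assigns mass where the other has none
            continue
        worst = max(worst, abs(log(nu[x] / pi[x])))
    return worst


pi = {"a": 0.5, "b": 0.499, "c": 0.001}
nu = {"a": 0.5, "b": 0.5, "c": 0.0}  # drops the rare outcome "c" entirely

print(tv_distance(nu, pi))        # about 0.001
print(infinity_distance(nu, pi))  # inf
```

Here `nu` is within about 0.001 of `pi` in total variation, yet its infinity-distance is unbounded because it never outputs `"c"`. A pure differential privacy guarantee bounds probability ratios on every outcome, so a small total-variation error alone does not preserve it.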


Fair Classification with Adversarial Perturbations

L. Elisa Celis , Anay Mehrotra , Nisheeth K. Vishnoi

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2021-06-10

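We study fair classification in the presence of an omniscient adversary that, given an $\eta$, is allowed to choose an arbitrary $\eta$-fraction of the training samples and arbitrarily perturb their protected attributes. The motivation comes from settings in which protected attributes can be incorrect due to strategic misreporting, malicious actors, or errors in imputation, and in which prior approaches that make stochastic or independence assumptions on the errors may not satisfy their guarantees. Our main contribution is an optimization framework to learn fair classifiers in this adversarial setting that comes with provable guarantees on accuracy and fairness. Our framework works with multiple and non-binary protected attributes, is designed for the large class of linear-fractional fairness metrics, and can also handle perturbations of features other than the protected attributes. We prove near-tightness of our framework's guarantees for natural hypothesis classes: no algorithm can have significantly better accuracy, and any algorithm with better fairness must have lower accuracy. Empirically, we evaluate the classifiers produced by our framework for statistical rate against adversarial perturbations.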


A Convergent and Dimension-Independent Min-Max Optimization Algorithm

Vijay Keswani , Oren Mangoubi , Sushant Sachdeva , Nisheeth K. Vishnoi

Categories: Machine Learning | Machine Learning (Statistics)

2020-06-22

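We study a variant of a recently introduced min-max optimization framework in which the max-player is constrained to update its parameters in a greedy manner until it reaches a first-order stationary point. Our equilibrium definition for this framework depends on a proposal distribution that the min-player uses to choose directions in which to update its parameters. We show that, given a smooth and bounded nonconvex-nonconcave objective function, access to any proposal distribution for the min-player's updates, and a stochastic gradient oracle for the max-player, our algorithm converges to the aforementioned approximate local equilibrium in a number of iterations that does not depend on the dimension. The equilibrium point found by our algorithm depends on the proposal distribution, and when applying our algorithm to train GANs we choose the proposal distribution to be a distribution of stochastic gradients. We empirically evaluate our algorithm on challenging nonconvex-nonconcave test functions and on loss functions arising in GAN training. Our algorithm converges on these test functions and, when used to train GANs, trains stably on synthetic and real-world datasets and avoids mode collapse.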


Computing the Performance of A New Adaptive Sampling Algorithm Based on The Gittins Index in Experiments with Exponential Rewards

James K. He , Sofía S. Villar , Lida Mavrogonatou

Categories: Machine Learning

2023-01-03

Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can simultaneously attain optimality and computational efficiency goals, and it has recently been used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially-distributed rewards. We report its performance in simulated 2-armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI modified design shows operating characteristics comparable in learning (e.g. statistical power) but substantially better in earning (e.g. direct benefits). This illustrates the potential that designs using a GI approach to allocate participants have to improve participant benefits, increase efficiencies, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
