ML-fairness-gym: A Tool for Exploring Long-Term Impacts of Machine Learning Systems
Machine Learning With Google

Assessment Methods

Common methods for assessing the fairness of machine learning systems involve evaluating disparities in error metrics on static datasets for various inputs to the system.

Indeed, many existing ML fairness toolkits (e.g., AIF360, fairlearn, fairness-indicators, fairness-comparison) provide tools for performing such error-metric based analysis on existing datasets.
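As a rough illustration of this kind of error-metric analysis, the sketch below computes true positive and false positive rates per group on a small static dataset and reports the gap between groups. The data, the group names, and the helper rates_by_group are made up for illustration; this is not the API of any of the toolkits named above.

```python
from collections import defaultdict


def rates_by_group(labels, predictions, groups):
    """Return {group: {"tpr": ..., "fpr": ...}} for binary labels and predictions."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for y, y_hat, g in zip(labels, predictions, groups):
        if y == 1:
            key = "tp" if y_hat == 1 else "fn"
        else:
            key = "fp" if y_hat == 1 else "tn"
        counts[g][key] += 1

    result = {}
    for g, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        result[g] = {
            "tpr": c["tp"] / positives if positives else float("nan"),
            "fpr": c["fp"] / negatives if negatives else float("nan"),
        }
    return result


# Hypothetical static dataset: true labels, model predictions, group membership.
labels = [1, 1, 0, 0, 1, 1, 0, 0]
predictions = [1, 1, 0, 1, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = rates_by_group(labels, predictions, groups)
tpr_gap = abs(rates["a"]["tpr"] - rates["b"]["tpr"])
print(rates, tpr_gap)
```

The analysis is a snapshot: it says nothing about how the decisions would change the data the system sees later.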

The Anomalies

There are cases (e.g., systems with active data collection or significant feedback loops) where the context in which the algorithm operates is critical for understanding its impact. In these cases, the fairness of algorithmic decisions ideally would be analyzed with greater consideration for the environmental and temporal context than error metric-based techniques allow.

ML-Fairness Gym

In order to facilitate algorithmic development with this broader context, we have released ML-fairness-gym, a set of components for building simple simulations that explore potential long-run impacts of deploying machine learning-based decision systems in social environments. In “Fairness is not Static: Deeper Understanding of Long Term Fairness via Simulation Studies” we demonstrate how the ML-fairness-gym can be used to research the long-term effects of automated decision systems on a number of established problems from current machine learning fairness literature.

Deficiencies in Static Dataset Analysis

A standard practice in machine learning for assessing the impact of a scenario like the lending problem, in which a bank decides which applicants receive loans, is to reserve a portion of the data as a “test set” and use it to calculate relevant performance metrics. Fairness is then assessed by looking at how those performance metrics differ across salient groups. However, it is well understood that there are two main issues with using test sets like this in systems with feedback. First, if test sets are generated from existing systems, they may be incomplete or reflect the biases inherent to those systems. Second, actions informed by the output of the ML system can have effects that may influence its future input.
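The incompleteness of a test set generated by an existing system can be seen in a small sketch. It assumes a past lending policy that approved only applicants at or above a score threshold, so repayment outcomes were recorded only for those applicants. All scores, outcomes, and the threshold are invented for illustration.

```python
# Hypothetical applicants: (credit_score, would_repay). Values are made up.
population = [
    (520, 0), (540, 1), (560, 0), (580, 1),
    (620, 1), (640, 1), (660, 0), (680, 1),
]

# Assumed cutoff used by the existing (historical) lending policy.
HISTORICAL_THRESHOLD = 600


def logged_outcomes(applicants, threshold):
    """Keep only applicants the past policy approved; only their outcomes were observed."""
    return [(score, repaid) for score, repaid in applicants if score >= threshold]


def repayment_rate(rows):
    """Fraction of applicants in rows who repaid."""
    return sum(repaid for _, repaid in rows) / len(rows)


logged = logged_outcomes(population, HISTORICAL_THRESHOLD)
full_rate = repayment_rate(population)
logged_rate = repayment_rate(logged)
unobserved = len(population) - len(logged)

print(f"repayment rate, whole population: {full_rate}")
print(f"repayment rate, logged data only: {logged_rate}")
print(f"applicants with no recorded outcome: {unobserved}")
```

A test set drawn from the logged rows contains no applicants below the old threshold, so it cannot say how a policy with a lower threshold would perform on them.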

ML-fairness-gym as a Simulation Tool for Long-Term Analysis

The ML-fairness-gym simulates sequential decision making using OpenAI’s Gym framework. In this framework, agents interact with simulated environments in a loop. At each step, an agent chooses an action that then affects the environment’s state. The environment then reveals an observation that the agent uses to inform its subsequent actions. Environments model the system and dynamics of the problem, and observations serve as data to the agent, which can be encoded as a machine learning system.
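The loop described above can be sketched in plain Python. The classes ToyLendingEnv and ThresholdAgent are hypothetical and are not part of ML-fairness-gym or Gym; they only loosely follow the classic Gym convention of reset() returning an observation and step(action) returning an observation, a reward, a done flag, and an info dictionary.

```python
class ToyLendingEnv:
    """Hypothetical environment: one loan applicant is presented per step."""

    def __init__(self, applicants):
        self._applicants = applicants  # list of (score, will_repay)
        self._index = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self._index = 0
        return self._observe()

    def _observe(self):
        # The agent sees the score but not whether the applicant will repay.
        score, _ = self._applicants[self._index]
        return {"score": score}

    def step(self, action):
        """Apply action (1 = lend, 0 = reject); return (observation, reward, done, info)."""
        _, will_repay = self._applicants[self._index]
        reward = 0
        if action == 1:
            reward = 1 if will_repay else -1
        self._index += 1  # the environment's state advances
        done = self._index >= len(self._applicants)
        observation = None if done else self._observe()
        return observation, reward, done, {}


class ThresholdAgent:
    """Hypothetical agent: lends whenever the observed score meets a fixed threshold."""

    def __init__(self, threshold):
        self.threshold = threshold

    def act(self, observation):
        return 1 if observation["score"] >= self.threshold else 0


def run_episode(env, agent):
    """Run the agent-environment loop until the episode ends; return total reward."""
    observation = env.reset()
    total_reward, done = 0, False
    while not done:
        action = agent.act(observation)
        observation, reward, done, _ = env.step(action)
        total_reward += reward
    return total_reward


env = ToyLendingEnv([(700, True), (550, False), (650, False), (720, True)])
total_reward = run_episode(env, ThresholdAgent(threshold=600))
print(total_reward)
```

In this toy version the state is only a position in a fixed list. A simulation of long-term effects would also let each action change what later observations look like, for example by updating an applicant's score after a repayment or a default.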

Extending the Analysis to the Long-Term

Since Liu et al.’s original formulation of the lending problem examined only the short-term consequences of the bank’s policies — including short-term profit-maximizing policies (called the max reward agent) and policies subject to an equality of opportunity (EO) constraint — we use the ML-fairness-gym to extend the analysis to the long-term (many steps) via simulation.

The Results

Our long-term analysis found two results. First, as found by Liu et al., the equal opportunity agent (EO agent) overlends to the disadvantaged group (group 2, which initially has a lower average credit score) by sometimes applying a lower threshold for the group than would be applied by the max reward agent.

Second, enforcing an equal opportunity constraint, which equalizes the true positive rate (TPR) between groups at each step, does not equalize TPR in aggregate over the simulation.
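One way to see how per-step equality can fail to carry over to the aggregate is a small worked example. The counts below are invented and are not results from the paper: both groups have the same true positive rate at every step, but the number of actual positives shifts between steps differently for each group, so the pooled rates diverge.

```python
# Per-step counts for each group: (true positives, actual positives).
# Made-up numbers chosen so that the per-step rates match across groups.
steps = {
    "group_1": [(50, 100), (9, 10)],
    "group_2": [(5, 10), (90, 100)],
}


def per_step_tpr(counts):
    """True positive rate computed separately at each step."""
    return [tp / positives for tp, positives in counts]


def aggregate_tpr(counts):
    """True positive rate computed over all steps pooled together."""
    total_tp = sum(tp for tp, _ in counts)
    total_positives = sum(positives for _, positives in counts)
    return total_tp / total_positives


for group, counts in steps.items():
    print(group, per_step_tpr(counts), round(aggregate_tpr(counts), 3))
```

Both groups have a rate of 0.5 at the first step and 0.9 at the second, yet the pooled rate is 59/110 for group 1 and 95/110 for group 2, because each group's positives are concentrated in a different step.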

Conclusion

Our paper extends the analysis of two other scenarios that have been previously studied in the academic ML fairness literature. The ML-fairness-gym framework is also flexible enough to simulate and explore problems where “fairness” is under-explored. For example, in a supporting paper, “Fair treatment allocations in social networks,” we explore a stylized version of epidemic control, which we call the precision disease control problem, to better understand notions of fairness across individuals and communities in a social network.

