Article
Demographically-Informed Prediction Discrepancy Index: Early Warnings of Demographic Biases for Unlabeled Populations
Mansilla, Lucas Andrés
; Claucich, Estanislao
; Echeveste, Rodrigo Sebastián
; Milone, Diego Humberto
; Ferrante, Enzo
Publication date:
02/2024
Publisher:
MIT Press
Journal:
Transactions on Machine Learning Research
ISSN:
2835-8856
Language:
English
Resource type:
Published article
Abstract
An ever-growing body of work has shown that machine learning systems can be systematically biased against certain sub-populations defined by attributes like race or gender. Data imbalance and under-representation of certain populations in the training datasets have been identified as potential causes behind this phenomenon. However, understanding whether data imbalance with respect to a specific demographic group may result in biases for a given task and model class is not simple. An approach to answering this question is to perform controlled experiments, where several models are trained with different imbalance ratios and then their performance is evaluated on the target population. However, in the absence of ground-truth annotations at deployment for an unseen population, most fairness metrics cannot be computed. In this work, we explore an alternative method to study potential bias issues based on the output discrepancy of pools of models trained on different demographic groups. Models within a pool are otherwise identical in terms of architecture, hyper-parameters, and training scheme. Our hypothesis is that the output consistency between models may serve as a proxy to anticipate biases concerning demographic groups. In other words, if models tailored to different demographic groups produce inconsistent predictions, then biases are more prone to appear in the task under analysis. We formulate the Demographically-Informed Prediction Discrepancy Index (DIPDI) and validate our hypothesis in numerical experiments using both synthetic and real-world datasets. Our work sheds light on the relationship between model output discrepancy and demographic biases and provides a means to anticipate potential bias issues in the absence of ground-truth annotations. Indeed, we show how DIPDI could provide early warnings about potential demographic biases when deploying machine learning models on new and unlabeled populations that exhibit demographic shifts.
Keywords:
Biases, Unsupervised Methods, Machine Learning
Collections
Articles (SINC(I))
Articles from INST. DE INVESTIGACION EN SEÑALES, SISTEMAS E INTELIGENCIA COMPUTACIONAL
Citation
Mansilla, Lucas Andrés; Claucich, Estanislao; Echeveste, Rodrigo Sebastián; Milone, Diego Humberto; Ferrante, Enzo; Demographically-Informed Prediction Discrepancy Index: Early Warnings of Demographic Biases for Unlabeled Populations; MIT Press; Transactions on Machine Learning Research; 2-2024; 1-24