dc.contributor.author
Alemany, Laura Alonso  
dc.contributor.author
Benotti, Luciana  
dc.contributor.author
Maina, Hernán Javier  
dc.contributor.author
Lucía Gonzalez  
dc.contributor.author
Rajngewerc, Mariela  
dc.contributor.author
Martínez, Lautaro  
dc.contributor.author
Sánchez, Jorge  
dc.contributor.author
Schilman, Mauro  
dc.contributor.author
Ivetta, Guido  
dc.contributor.author
Halvorsen, Alexia  
dc.contributor.author
Mata Rojo, Amanda  
dc.contributor.author
Bordon, Matías  
dc.contributor.author
Busaniche, Beatriz  
dc.date.available
2023-12-01T14:41:13Z  
dc.date.issued
2023-03  
dc.identifier.citation
Alemany, Laura Alonso; Benotti, Luciana; Maina, Hernán Javier; Lucía Gonzalez; Rajngewerc, Mariela; et al.; A methodology to characterize bias and harmful stereotypes in natural language processing in Latin America; Cornell University; arXiv; 3-2023; 1-24  
dc.identifier.issn
2331-8422  
dc.identifier.uri
http://hdl.handle.net/11336/218993  
dc.description.abstract
Automated decision-making systems, especially those based on natural language processing, are pervasive in our lives. They are not only behind the internet search engines we use daily, but also take on more critical roles: selecting candidates for a job, determining suspects of a crime, diagnosing autism and more. Such automated systems make errors, which may be harmful in many ways, be it because of the severity of the consequences (as in health issues) or because of the sheer number of people they affect. When errors made by an automated system affect one population more than others, we call the system biased. Most modern natural language technologies are based on artifacts obtained from enormous volumes of text using machine learning, namely language models and word embeddings. Since they are created by applying subsymbolic machine learning, mostly artificial neural networks, they are opaque and practically uninterpretable by direct inspection, thus making it very difficult to audit them. In this paper we present a methodology that spells out how social scientists, domain experts, and machine learning experts can collaboratively explore biases and harmful stereotypes in word embeddings and large language models. Our methodology is based on the following principles: 1. focus on the linguistic manifestations of discrimination in word embeddings and language models, not on the mathematical properties of the models; 2. reduce the technical barrier for discrimination experts; 3. characterize through a qualitative exploratory process in addition to a metric-based approach; 4. address mitigation as part of the training process, not as an afterthought.  
dc.format
application/pdf  
dc.language.iso
eng  
dc.publisher
Cornell University  
dc.rights
info:eu-repo/semantics/openAccess  
dc.rights.uri
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/  
dc.subject
Natural Language Processing  
dc.subject
Language models  
dc.subject
Bias  
dc.subject
Stereotypes and Discrimination  
dc.subject.classification
Other Computer and Information Sciences  
dc.subject.classification
Computer and Information Sciences  
dc.subject.classification
NATURAL AND EXACT SCIENCES  
dc.title
A methodology to characterize bias and harmful stereotypes in natural language processing in Latin America  
dc.type
info:eu-repo/semantics/article  
dc.type
info:ar-repo/semantics/artículo  
dc.type
info:eu-repo/semantics/publishedVersion  
dc.date.updated
2023-11-28T14:57:11Z  
dc.journal.pagination
1-24  
dc.journal.pais
United States  
dc.journal.ciudad
Cornell  
dc.description.fil
Fil: Alemany, Laura Alonso. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina. Fundación Via Libre; Argentina  
dc.description.fil
Fil: Benotti, Luciana. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Fundación Via Libre; Argentina. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física. Sección Física; Argentina  
dc.description.fil
Fil: Maina, Hernán Javier. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Fundación Via Libre; Argentina. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina  
dc.description.fil
Fil: Lucía Gonzalez. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina. Fundación Via Libre; Argentina  
dc.description.fil
Fil: Rajngewerc, Mariela. Fundación Via Libre; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física. Sección Ciencias de la Computación; Argentina  
dc.description.fil
Fil: Martínez, Lautaro. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina. Fundación Via Libre; Argentina  
dc.description.fil
Fil: Sánchez, Jorge. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina  
dc.description.fil
Fil: Schilman, Mauro. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina  
dc.description.fil
Fil: Ivetta, Guido. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina  
dc.description.fil
Fil: Halvorsen, Alexia. Fundación Via Libre; Argentina  
dc.description.fil
Fil: Mata Rojo, Amanda. Fundación Via Libre; Argentina  
dc.description.fil
Fil: Bordon, Matías. Fundación Via Libre; Argentina. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina  
dc.description.fil
Fil: Busaniche, Beatriz. Fundación Via Libre; Argentina  
dc.journal.title
arXiv  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/url/https://arxiv.org/abs/2207.06591v3  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/doi/https://doi.org/10.48550/arXiv.2207.06591