Show simple item record

dc.contributor.author
Fornaciari, Tommaso  
dc.contributor.author
Cagnina, Leticia Cecilia  
dc.contributor.author
Rosso, Paolo  
dc.contributor.author
Poesio, Massimo  
dc.date.available
2021-10-14T18:37:36Z  
dc.date.issued
2020-12  
dc.identifier.citation
Fornaciari, Tommaso; Cagnina, Leticia Cecilia; Rosso, Paolo; Poesio, Massimo; Fake opinion detection: how similar are crowdsourced datasets to real data?; Springer; Language Resources and Evaluation; 54; 4; 12-2020; 1019-1058  
dc.identifier.issn
1574-020X  
dc.identifier.uri
http://hdl.handle.net/11336/143656  
dc.description.abstract
Identifying deceptive online reviews is a challenging task for Natural Language Processing (NLP). Collecting corpora for the task is difficult, because normally it is not possible to know whether reviews are genuine. A common workaround involves collecting (supposedly) truthful reviews online and adding them to a set of deceptive reviews obtained through crowdsourcing services. Models trained this way are generally successful at discriminating between ‘genuine’ online reviews and the crowdsourced deceptive reviews. It has been argued that the deceptive reviews obtained via crowdsourcing are very different from real fake reviews, but the claim has never been properly tested. In this paper, we compare (false) crowdsourced reviews with a set of ‘real’ fake reviews published online. We evaluate their degree of similarity and their usefulness in training models for the detection of untrustworthy reviews. We find that the deceptive reviews collected via crowdsourcing are significantly different from the fake reviews published online. In the case of the artificially produced deceptive texts, their domain similarity to the target data affects the models’ performance far more than their untruthfulness does. This suggests that the use of crowdsourced datasets for opinion spam detection may not yield models applicable to the real task of detecting deceptive reviews. As an alternative way to create large datasets for the fake review detection task, we propose methods based on the probabilistic annotation of unlabeled texts, relying on meta-information generally available on e-commerce sites. Such methods are independent of the content of the reviews and make it possible to train reliable models for the detection of fake reviews.  
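
The probabilistic-labeling proposal in the abstract can be illustrated with a minimal Python sketch. Everything below is a hypothetical illustration, not the paper's actual method: the metadata field names (verified_purchase, reviewer_review_count, same_day_reviews) and the weights are invented assumptions about signals an e-commerce site might expose.

# Hypothetical sketch: derive a soft label P(fake) for an unlabeled review
# from site metadata alone, without inspecting the review text.
def soft_label(meta: dict) -> float:
    p = 0.5  # start from an uninformative prior
    if meta.get("verified_purchase"):              # assumed weak evidence of a genuine review
        p -= 0.2
    if meta.get("reviewer_review_count", 0) <= 1:  # single-review account: assumed spam signal
        p += 0.2
    if meta.get("same_day_reviews", 0) > 5:        # review burst for one product: assumed spam signal
        p += 0.15
    return min(max(p, 0.0), 1.0)

# Review texts paired with these soft labels can then serve as probabilistic
# training targets for a content-based classifier.
corpus = [{"text": "Great product, fast shipping!",
           "meta": {"verified_purchase": False, "reviewer_review_count": 1}}]
print([(r["text"], soft_label(r["meta"])) for r in corpus])
# -> [('Great product, fast shipping!', 0.7)]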
dc.format
application/pdf  
dc.language.iso
eng  
dc.publisher
Springer  
dc.rights
info:eu-repo/semantics/restrictedAccess  
dc.rights.uri
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/  
dc.subject
CROWDSOURCING  
dc.subject
DECEPTION DETECTION  
dc.subject
GROUND TRUTH  
dc.subject
PROBABILISTIC LABELING  
dc.subject.classification
Computer Science  
dc.subject.classification
Computer and Information Sciences  
dc.subject.classification
NATURAL AND EXACT SCIENCES  
dc.title
Fake opinion detection: how similar are crowdsourced datasets to real data?  
dc.type
info:eu-repo/semantics/article  
dc.type
info:ar-repo/semantics/artículo  
dc.type
info:eu-repo/semantics/publishedVersion  
dc.date.updated
2020-08-05T16:39:51Z  
dc.identifier.eissn
1574-0218  
dc.journal.volume
54  
dc.journal.number
4  
dc.journal.pagination
1019-1058  
dc.journal.pais
Germany  
dc.journal.ciudad
Berlin  
dc.description.fil
Fil: Fornaciari, Tommaso. Università Bocconi; Italy  
dc.description.fil
Fil: Cagnina, Leticia Cecilia. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis; Argentina  
dc.description.fil
Fil: Rosso, Paolo. Universidad Politécnica de Valencia; Spain  
dc.description.fil
Fil: Poesio, Massimo. Queen Mary University of London; United Kingdom  
dc.journal.title
Language Resources and Evaluation  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/url/https://link.springer.com/article/10.1007%2Fs10579-020-09486-5  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/doi/10.1007/s10579-020-09486-5