Institutional Repository
CONICET Digital
Article

Marriage between variable selection and prediction methods to model plant disease risk

Suarez, Franco Marcelo; Bruno, Cecilia Ines; Giannini Kurina, Franca; Giménez Pecci, M. Paz; Rodríguez-Pardina, Patricia Elsa; Balzarini, Monica Graciela
Publication date: 11/2023
Publisher: Elsevier Science
Journal: European Journal of Agronomy
ISSN: 1161-0301
Language: English
Resource type: Published article
Subject classification:
Other Agricultural Sciences

Abstract

Predicting the risk of a disease in a pathosystem based on a set of climatic variables usually requires handling a high number of input variables, many of which are often irrelevant and/or redundant. Building linear predictive models entails not only dimensionality issues but also the negative impact of multicollinearity. Several feature selection methods have proved to be efficient in both linear and non-linear models, regardless of those issues. However, in a machine learning (ML) context, it is necessary to evaluate these feature selection methods embedded into the model fitting algorithm to obtain the greatest accuracy. The aim of this work was to assess different combinations of variable selection methods with linear and non-linear predictors to fit climate-based models that predict the occurrence of a disease in a pathosystem. Four selection methods were compared: stepwise, which is frequently used in linear models, combined with VIF and p-value statistical criteria (Step+VIF+Pv), and three methods commonly used in ML: filter (F), genetic algorithm (GA), and Boruta (B). The disease risk predictors were constructed with a logistic linear regression model (LR) and the random forest (RF) algorithm, using all the available variables and the subgroups of variables selected by each feature selection method. Data from three pathosystems were processed: two involving Begomovirus –one in common bean (Phaseolus vulgaris L.) and the other in soybean (Glycine max)– and the third one involving Mal de Rio Cuarto virus in maize (Zea mays L.). The data sets differed in sample size and number of variables. The accuracy of RF prediction did not vary among feature selection methods. When Step+VIF+Pv was used to reduce the model, it outperformed the other feature selection methods in fitting LR. Our proposal suggests that the appropriate pairing of variable selection and prediction models would improve the modeling of plant disease risk.
Keywords: FEATURE SELECTION, LOGISTIC REGRESSION, MULTICOLLINEARITY, PATHOSYSTEMS, PREDICTION MODELS, RANDOM FOREST
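
The abstract names a variance inflation factor (VIF) criterion as part of the Step+VIF+Pv selection but does not describe the procedure. The following is a minimal sketch of one common way to screen for multicollinearity, not the authors' implementation: it repeatedly drops the variable with the highest VIF until every remaining VIF is at or below a threshold. The threshold of 10, the function names, and the synthetic climate variables (`temp`, `temp_max`, `humidity`) are all assumptions made for illustration.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(y, predictors):
    """R^2 of an ordinary least-squares fit of y on the predictors, with intercept."""
    n = len(y)
    cols = [[1.0] * n] + predictors
    xtx = [[sum(ci[t] * cj[t] for t in range(n)) for cj in cols] for ci in cols]
    xty = [sum(ci[t] * y[t] for t in range(n)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[t] for b, c in zip(beta, cols)) for t in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((y[t] - fitted[t]) ** 2 for t in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


def vif(data):
    """VIF of each named column: 1 / (1 - R^2) from regressing it on the others.

    Exactly collinear columns would make 1 - R^2 zero; this sketch does not guard against that.
    """
    scores = {}
    for name, col in data.items():
        others = [c for other, c in data.items() if other != name]
        scores[name] = 1.0 / (1.0 - r_squared(col, others))
    return scores


def drop_collinear(data, threshold=10.0):
    """Drop the highest-VIF column until all VIFs are <= threshold (assumed cutoff)."""
    data = dict(data)
    while len(data) > 1:
        scores = vif(data)
        worst = max(scores, key=scores.get)
        if scores[worst] <= threshold:
            break
        del data[worst]
    return data


# Synthetic, hypothetical climate variables: temp_max is nearly a copy of temp.
random.seed(0)
temp = [random.gauss(20, 5) for _ in range(200)]
humidity = [random.gauss(60, 10) for _ in range(200)]
temp_max = [t + random.gauss(0, 0.1) for t in temp]
data = {"temp": temp, "humidity": humidity, "temp_max": temp_max}

kept = drop_collinear(data)
print(sorted(kept))
```

In this toy data set, one of the two near-duplicate temperature variables is removed and `humidity` is retained. A screen of this kind addresses only redundancy among predictors; the stepwise and p-value criteria mentioned in the abstract, and the F, GA, and Boruta methods, are separate steps not shown here.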
Associated files
Size: 3.547Mb
Format: PDF
License
info:eu-repo/semantics/restrictedAccess Except where explicitly stated otherwise, this item is published under the following license: Attribution-NonCommercial-NoDerivs 2.5 Argentina (CC BY-NC-ND 2.5 AR)
Identifiers
URI: http://hdl.handle.net/11336/226616
URL: https://www.sciencedirect.com/science/article/pii/S1161030123002630
DOI: https://doi.org/10.1016/j.eja.2023.126995
Collections
Articles (UFYMA)
Articles from UNIDAD DE FITOPATOLOGIA Y MODELIZACION AGRICOLA
Citation
Suarez, Franco Marcelo; Bruno, Cecilia Ines; Giannini Kurina, Franca; Giménez Pecci, M. Paz; Rodríguez-Pardina, Patricia Elsa; et al.; Marriage between variable selection and prediction methods to model plant disease risk; Elsevier Science; European Journal of Agronomy; 151; 126995; 11-2023; 1-12