Institutional Repository
CONICET Digital
Article

GPU parallelization of a hybrid pseudospectral geophysical turbulence framework using CUDA

Rosenberg, Duane; Mininni, Pablo Daniel; Reddy, Raghu; Pouquet, Annick
Publication date: 02/2020
Publisher: Molecular Diversity Preservation International
Journal: Atmosphere
ISSN: 2073-4433
Language: English
Resource type: Published article
Subject classification:
Meteorology and Atmospheric Sciences; Computer Science

Abstract

An existing hybrid MPI-OpenMP scheme is augmented with a CUDA-based fine-grain parallelization approach for multidimensional distributed Fourier transforms in a well-characterized pseudospectral fluid turbulence code. The basics of the hybrid scheme are reviewed, and heuristics are provided to show the potential benefit of the CUDA implementation. The method draws heavily on the CUDA runtime library to handle memory management and on the cuFFT library to compute local FFTs. We discuss how the interfaces to these libraries are constructed and how ISO bindings are used to facilitate platform portability. CUDA streams are implemented to overlap data transfer with cuFFT computation. Testing with a baseline solver demonstrated significant aggregate speed-up over the hybrid MPI-OpenMP solver by offloading to GPUs on an NVLink-based test system. While the batch streamed approach provided little benefit with NVLink, we saw a performance gain of 30% when tuned for the optimal number of streams on a PCIe-based system. Strong GPU scaling was found to be nearly ideal in all cases. Profiling of the CUDA kernels shows that the transform computation achieves 15% of the attainable peak FlOp-rate based on a roofline model for the system. In addition to speed-up measurements for the fiducial solver, we also considered several other solvers with different numbers of transform operations and found that aggregate speed-ups are nearly constant for all solvers.
Keywords: COMPUTATIONAL FLUIDS, CUDA, GPU, MPI, NUMERICAL SIMULATION, OPENMP, PARALLEL COMPUTING
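
The abstract reports the transform performance as a fraction of the attainable peak from a roofline model. As a reading aid, the standard roofline bound can be sketched as follows: the attainable rate is the smaller of the compute peak and the memory bandwidth multiplied by the arithmetic intensity. This is a minimal sketch, not the authors' profiling procedure. The function names and all hardware numbers are hypothetical placeholders chosen for illustration; only the 15% figure comes from the abstract.

```python
def attainable_flops(peak_flops, mem_bandwidth, arithmetic_intensity):
    """Standard roofline bound in FLOP/s.

    peak_flops           -- compute roof of the device, FLOP/s
    mem_bandwidth        -- memory roof of the device, bytes/s
    arithmetic_intensity -- FLOP performed per byte moved by the kernel
    """
    return min(peak_flops, mem_bandwidth * arithmetic_intensity)


def fraction_of_attainable(measured_flops, peak_flops, mem_bandwidth,
                           arithmetic_intensity):
    """Measured rate as a fraction of the roofline bound."""
    bound = attainable_flops(peak_flops, mem_bandwidth, arithmetic_intensity)
    return measured_flops / bound


# Hypothetical device and kernel parameters (assumptions, not from the paper).
PEAK_FLOPS = 7.0e12   # FLOP/s
BANDWIDTH = 9.0e11    # bytes/s
INTENSITY = 2.0       # FLOP/byte, a low value typical of memory-bound kernels

bound = attainable_flops(PEAK_FLOPS, BANDWIDTH, INTENSITY)

# A kernel running at 15% of the bound, matching the fraction in the abstract.
measured = 0.15 * bound
fraction = fraction_of_attainable(measured, PEAK_FLOPS, BANDWIDTH, INTENSITY)
```

With these placeholder values the bandwidth term is the smaller one, so the kernel is memory-bound and the fraction is measured against the memory roof, not the compute peak.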
 
Associated files
Size: 638.3Kb
Format: PDF
License
info:eu-repo/semantics/openAccess. Except where explicitly stated otherwise, this item is published under the following license: Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Unported (CC BY-NC-SA 2.5)
Identifiers
URI: http://hdl.handle.net/11336/146032
URL: https://www.mdpi.com/2073-4433/11/2/178
DOI: http://dx.doi.org/10.3390/atmos11020178
Collections
Articles (IFIBA)
Articles of INST. DE FISICA DE BUENOS AIRES
Citation
Rosenberg, Duane; Mininni, Pablo Daniel; Reddy, Raghu; Pouquet, Annick; GPU parallelization of a hybrid pseudospectral geophysical turbulence framework using CUDA; Molecular Diversity Preservation International; Atmosphere; 11; 2; 2-2020; 1-22