Portabilidade e Eficiência do Método Fletcher de Aplicações Sísmicas em Arquiteturas Multicore e GPU

  • Matheus Serpa, Universidade Federal do Rio Grande do Sul
  • Pablo José Pavan, Universidade Federal do Rio Grande do Sul
  • Jairo Panetta, Instituto Tecnológico de Aeronáutica
  • Antônio Azambuja, Petrobras
  • Alexandre Carissimi, Universidade Federal do Rio Grande do Sul
  • Philippe Olivier Navaux, Universidade Federal do Rio Grande do Sul

Abstract


The simulation of acoustic wave propagation is the backbone of the seismic imaging tools used by the oil and gas industry. These simulations run on HPC architectures, which deliver faster and more accurate results with each processor generation. Achieving high performance on these architectures, however, requires addressing several challenges during application development. In this article, we optimize the Fletcher method for multicore and GPU architectures and evaluate the performance, energy consumption, and energy efficiency of eight versions of the code. The results show that the CUDA version achieves the best performance and energy efficiency. The OpenACC version, which has the advantage of portability, is only 10% lower in performance and 8% lower in energy efficiency than the CUDA version.

Published
2019-11-08
SERPA, Matheus; PAVAN, Pablo José; PANETTA, Jairo; AZAMBUJA, Antônio; CARISSIMI, Alexandre; NAVAUX, Philippe Olivier. Portabilidade e Eficiência do Método Fletcher de Aplicações Sísmicas em Arquiteturas Multicore e GPU. In: SYMPOSIUM ON HIGH PERFORMANCE COMPUTING SYSTEMS (SSCAD), 20., 2019, Campo Grande. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2019. p. 169-180. DOI: https://doi.org/10.5753/wscad.2019.8666.