Um Modelo de Desempenho Multinível para Algoritmos Paralelos Wavefront 2D Clássicos
Abstract
The Classic 2D-wavefront pattern is commonly used to parallelize dynamic programming algorithms, where data elements or cells are distributed on a matrix. Since elements in a diagonal are completely independent, they can be computed in parallel. In modern systems, such as multicore clusters, a multi-level parallelism approach is usually adopted and selecting the size of each task is of critical importance to maximize performance. On one hand, coarse-grained tasks will result in low concurrency, while fine-grained tasks increase overhead. In this work we analyse this trade-off, considering aspects such as the task sizes (in two levels) and memory limitations. As a result of this analysis, we propose a model that predicts which task size configurations would yield optimal performance, based on the aforementioned parameters. To evaluate the model accuracy, we use the classic Longest Common Subsequence problem (LCS), which is an important part of DNA sequence alignment.References
Anvik, J., MacDonald, S., Szafron, D., Schaeffer, J., Bromling, S., and Tan, K. (2002). Generating parallel programs from the wavefront design pattern. In Proceedings International Parallel and Distributed Processing Symposium., IPDPS 2002, pages 8 pp–.
Arora, N. S., Blumofe, R. D., and Plaxton, C. G. (1998). Thread scheduling for multiprogrammed multiprocessors. In Proceedings of the Tenth Annual ACM Symposium on Parallel Algorithms and Architectures, pages 119–129, New York, NY, USA. ACM.
Chen, Y., Yu, S., and Leng, M. (2006). Parallel sequence alignment algorithm for clusIn Shangai. Knowledge Enterprise: Intelligent Strategies in Product tering system. Design, Manufacturing, and Management, pages 311–321.
D. A. Benson, M. Cavanaugh, K. C. I. K.-M. D. J. L. J. O. and Sayers, E. W. (2013). Genbank. Nucleic Acids Research, 41:36–42.
Eggleston, H. G. (1967). The isoperimetric problem. In Press, P., editor, Exploring University Mathematics, volume 1, chapter 7. Pergamon Press.
Mohanty, S. and Cole, M. (2007). Autotuning wavefront applications for multicore multigpu hybrid architectures. In Proceedings of Programming Models and Applications on Multicores and Manycores, PMAM'14, pages 1:1–1:9, New York, NY, USA. ACM.
Sena, A. C., Nascimento, A. P., Boeres, C., Rebello, V., and Bulcao, A. (2011). An In E approach to optimise the execution of rtm algorithm in multicore machines. Science (e-Science), 2011 IEEE 7th International Conference on, pages 403–410.
Wagner, R. A. and Fischer, M. J. (1974). The string-to-string correction problem. J. ACM, 21(1):168–173.
Arora, N. S., Blumofe, R. D., and Plaxton, C. G. (1998). Thread scheduling for multiprogrammed multiprocessors. In Proceedings of the Tenth Annual ACM Symposium on Parallel Algorithms and Architectures, pages 119–129, New York, NY, USA. ACM.
Chen, Y., Yu, S., and Leng, M. (2006). Parallel sequence alignment algorithm for clusIn Shangai. Knowledge Enterprise: Intelligent Strategies in Product tering system. Design, Manufacturing, and Management, pages 311–321.
D. A. Benson, M. Cavanaugh, K. C. I. K.-M. D. J. L. J. O. and Sayers, E. W. (2013). Genbank. Nucleic Acids Research, 41:36–42.
Eggleston, H. G. (1967). The isoperimetric problem. In Press, P., editor, Exploring University Mathematics, volume 1, chapter 7. Pergamon Press.
Mohanty, S. and Cole, M. (2007). Autotuning wavefront applications for multicore multigpu hybrid architectures. In Proceedings of Programming Models and Applications on Multicores and Manycores, PMAM'14, pages 1:1–1:9, New York, NY, USA. ACM.
Sena, A. C., Nascimento, A. P., Boeres, C., Rebello, V., and Bulcao, A. (2011). An In E approach to optimise the execution of rtm algorithm in multicore machines. Science (e-Science), 2011 IEEE 7th International Conference on, pages 403–410.
Wagner, R. A. and Fischer, M. J. (1974). The string-to-string correction problem. J. ACM, 21(1):168–173.
Published
2016-10-05
How to Cite
LAUREDO, Alexandre; SENA, Alexandre; DE CASTRO, Maria; MARZULO, Leandro; ALVES, Tiago.
Um Modelo de Desempenho Multinível para Algoritmos Paralelos Wavefront 2D Clássicos. In: SYMPOSIUM ON HIGH PERFORMANCE COMPUTING SYSTEMS (SSCAD), 17. , 2016, Aracajú.
Anais [...].
Porto Alegre: Sociedade Brasileira de Computação,
2016
.
p. 263-274.
DOI: https://doi.org/10.5753/wscad.2016.14265.