Towards multicluster computations with Julia

  • Francisco H. de Carvalho Junior UFC
  • Tiago Carneiro Interuniversity Microelectronics Centre

Abstract


The ability to aggregate the computational resources of multiple clusters is useful for solving large problems. In this context, this paper introduces an extension to Julia that enables multilevel parallelism and supports multicluster computations. The proposal is evaluated through a proof-of-concept case study based on the multizone version of the NAS Parallel Benchmarks (NPB-MZ), with a focus on inter-cluster communication and load-balancing overheads.

Published
23/10/2024
CARVALHO JUNIOR, Francisco H. de; CARNEIRO, Tiago. Towards multicluster computations with Julia. In: SIMPÓSIO EM SISTEMAS COMPUTACIONAIS DE ALTO DESEMPENHO (SSCAD), 25., 2024, São Carlos/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 276-287. DOI: https://doi.org/10.5753/sscad.2024.244307.