Additional parallelization of existing MPI programs using SAPFOR

Authors

Nikita A. Kataev, Alexander S. Kolganov

DOI:

https://doi.org/10.26089/NumMet.v22r415

Keywords:

SAPFOR, DVMH, MPI, automation of parallelization, additional parallelization, accelerators, heterogeneous clusters

Abstract

The SAPFOR and DVM systems are designed primarily to simplify the development of parallel programs for scientific and engineering computations. SAPFOR is a software development suite that aims to produce a parallel version of a sequential program semi-automatically. Fully automatic parallelization is also possible if the program is well-formed and satisfies certain requirements. SAPFOR uses the directive-based DVMH programming model to expose parallelism in the code. The DVMH model introduces the CDVMH and Fortran-DVMH (FDVMH) programming languages, which extend standard C and Fortran with parallelism specifications. We present an MPI-aware extension of the SAPFOR system that exploits new features of the DVMH model to add intra-node parallelism to existing MPI programs. This approach reduces the cost of maintaining parallel programs and allows an MPI program to use accelerators and multicore processors. The extension has been implemented for both Fortran and C. We use the NAS Parallel Benchmarks to evaluate the performance of the generated programs.

Author Biographies

Nikita A. Kataev

Alexander S. Kolganov

Published

03-11-2021

How to Cite

Kataev N.A., Kolganov A.S. Additional Parallelization of Existing MPI Programs Using SAPFOR // Numerical Methods and Programming (Vychislitel’nye Metody i Programmirovanie). 2021. 22. 239-251. doi 10.26089/NumMet.v22r415

Section

Parallel software tools and technologies