Experience in applying parallelization regions for the step-by-step parallelization of software packages using the SAPFOR system
Keywords: SAPFOR (System FOR Automated Parallelization), automation of parallelization, parallel computing, DVM (Distributed Virtual Memory), incremental parallelization for clusters
The main difficulty in developing a parallel program for a cluster is the need to make global decisions on the distribution of data and computations that take into account the properties of the entire program, and then to do the laborious work of modifying and debugging it. A large amount of code, as well as the presence of multiple modules, program variants, and languages, makes it difficult to arrive at a consistent distribution of data and computations. Experience with the previous SAPFOR system showed that, when parallelizing large programs and software packages for a cluster, one should be able to parallelize them gradually, starting with the most time-consuming fragments and adding new fragments until the desired level of parallel program efficiency is reached. For this purpose, the previous system was completely redesigned, and a new system, SAPFOR (System FOR Automated Parallelization), was created. To solve this problem, this paper considers the method of incremental (partial) parallelization. The idea of this method is that not the entire program is parallelized but only those parts of it (parallelization regions) in which additional copies of the required data are created and distributed and the corresponding computations are performed. The paper also discusses the application of automated mapping of programs onto a cluster with the proposed incremental parallelization method, using the NPB (NAS Parallel Benchmarks) software package as an example.
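As an illustrative sketch only (not taken from the paper), the kind of annotations that SAPFOR generates for a selected parallelization region can be shown as a Fortran-DVMH fragment: the DISTRIBUTE, ALIGN, and PARALLEL directives are standard Fortran-DVMH constructs, while the arrays, sizes, and loop body here are hypothetical. The fragment is compilable only with the DVM system's Fortran-DVMH compiler.

```fortran
! Hypothetical time-consuming loop nest annotated with Fortran-DVMH
! directives, similar to what SAPFOR inserts for a parallelization
! region; outside such regions the program runs sequentially.
program region_sketch
  integer, parameter :: n = 1000
  real :: a(n, n), b(n, n)
  integer :: i, j
!DVM$ DISTRIBUTE b(BLOCK, BLOCK)
!DVM$ ALIGN a(i, j) WITH b(i, j)

! Initialize the distributed array in parallel.
!DVM$ PARALLEL (j, i) ON b(i, j)
  do j = 1, n
    do i = 1, n
      b(i, j) = 1.0
    end do
  end do

! The parallelized fragment: a five-point stencil on the local part
! of the distributed data (remote accesses handled by the DVM system).
!DVM$ PARALLEL (j, i) ON a(i, j)
  do j = 2, n - 1
    do i = 2, n - 1
      a(i, j) = 0.25 * (b(i-1, j) + b(i+1, j) + b(i, j-1) + b(i, j+1))
    end do
  end do
end program region_sketch
```

In the incremental approach, only the fragments chosen as parallelization regions receive such directives and distributed copies of their data; the rest of the program keeps its original sequential form.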
- NVidia CUDA Zone. https://developer.nvidia.com/cuda-zone. Cited November 15, 2020.
- A. S. Kolganov, N. A. Kataev, and P. A. Titov, “Automated Parallelization of a Simulation Method of Elastic Wave Propagation in Media with Complex 3D Geometry Surface on High-Performance Heterogeneous Clusters,” Vestn. Ufa Aviatsion. Tekh. Univ. 21 (3), 87-96 (2017).
- DVM system. http://dvm-system.org/. Cited November 15, 2020.
- M. S. Klinov and V. A. Kryukov, “Automatic Parallelization of Fortran Programs. Mapping to Cluster,” Vestn. Lobachevskii Univ. Nizhni Novgorod, No. 2, 128-134 (2009).
- V. A. Bakhtin, O. F. Zhukova, N. A. Kataev, et al., “Automation of Software Package Parallelization,” in Proc. XVIII All-Russian Conference on Scientific Service on the Internet, Novorossiysk, Russia, September 19-24, 2016 (Keldysh Institute of Applied Mathematics, Moscow, 2016), pp. 76-85.
- V. A. Bakhtin, O. F. Zhukova, N. A. Kataev, et al., “Incremental Parallelization for Clusters in the SAPFOR System,” in Proc. XIX All-Russian Conference on Scientific Service on the Internet, Novorossiysk, Russia, September 18-23, 2017 (Keldysh Institute of Applied Mathematics, Moscow, 2017), pp. 48-52.
- A. S. Kolganov and S. V. Yashin, “Automatic Incremental Parallelization of Large Software Systems Using the SAPFOR System,” in Proc. Int. Conf. on Parallel Computing Technologies, Kaliningrad, Russia, April 2-4, 2019 (South Ural State Univ., Chelyabinsk, 2019), pp. 275-287.
- P. Banerjee, J. A. Chandy, M. Gupta, et al., “An Overview of the PARADIGM Compiler for Distributed-Memory Multicomputers,” http://www.cs.cmu.edu/~745/papers/paradigm.pdf. Cited November 15, 2020.
- BERT77 system: Automatic and Efficient Parallelizer for FORTRAN. http://www.sai.msu.su/sal/C/3/BERT_77.html. Cited November 15, 2020.
- ParaWise System. http://www.parallelsp.com/. Cited November 15, 2020.
- DVMH model. http://dvm-system.org/static_data/docs/FDVMH-user-guide-ru.pdf. Cited November 15, 2020.
- A. S. Kolganov and N. N. Korolev, “Static Analysis of Private Variables in the System of Automated Parallelization of Fortran Programs,” in Proc. Int. Conf. on Parallel Computing Technologies, Rostov-on-Don, Russia, April 2-6, 2018 (South Ural State Univ., Chelyabinsk, 2018), pp. 286-294.
- NAS Parallel Benchmarks. https://www.nas.nasa.gov/publications/npb.html. Cited November 15, 2020.
- D. H. Bailey and J. T. Barton, The NAS Kernel Benchmark Program, Report TM-86711 (NASA Ames Research Center, Moffett Field, 1985).
- T. H. Pulliam, Efficient Solution Methods for the Navier-Stokes Equations (Von Kármán Inst. for Fluid Dynamics, Rhode-Saint-Genèse, 1986).
- A. Jameson, W. Schmidt, and E. Turkel, “Numerical Solution of the Euler Equations by Finite Volume Methods Using Runge-Kutta Time Stepping Schemes,” AIAA Paper 81-1259 (1981).
- The NAS Parallel Benchmarks. https://www.nas.nasa.gov/assets/pdf/techreports/1994/rnr-94-007.pdf. Cited November 15, 2020.
- K60 Supercomputer. https://keldysh.ru/. Cited November 15, 2020.
- NAS Parallel Benchmarks with CUDA. https://www.tu-chemnitz.de/informatik/PI/sonstiges/downloads/npb-gpu/index.php.en. Cited November 15, 2020.
- NAS Parallel Benchmarks with OpenCL. http://aces.snu.ac.kr/software/snu-npb/. Cited November 15, 2020.