On accelerating transformations of programs for solving the generalized Dirichlet problem
Authors
-
Elena A. Metelitsa
-
Boris Ya. Steinberg
Keywords:
tiling
wavefront
parallelization
innermost loop
high performance computing
generalized Dirichlet problem
Abstract
The chain of transformations in the program implementation of the Gauss–Seidel algorithm for solving the generalized two-dimensional Dirichlet problem of the Poisson equation is considered in this paper. It complements the previous chain of accelerating (in particular, parallelizing) transformations of this program. The previous chain of transformations contained “skewing”, “tiling”, “hyperplane method” and “parallelization”. In this work, it is supplemented with the transformations “removal of general subexpressions”, “removal of loop invariants”, “optimization of the loop header”, “optimization of the calculation of array pointers”. A series of numerical experiments were carried out with the resulting chain of transformations on a computer with an eight-core processor. Experiments were performed for different tile sizes. The greatest obtained acceleration is 24%. An analysis of the obtained accelerations was conducted.
Section
Parallel software tools and technologies
Author Biographies
Boris Ya. Steinberg
Southern Federal University,
Vorovich Institute for Mathematics, Mechanics and Computer Science,
• Senior Researcher, Head of the Department of Algebra and Discrete Mathematics
References
- E. A. Metelitsa, “Justification of Methods for Accelerating Iterative Loops Nests,” Program Systems: Theory and Applications, 15 (1), 63-94 (2024).
doi 10.25209/2079-3316-2024-15-1-63-94
- A. P. Bagliy, E. A. Metelitsa, and B. Ya. Steinberg, “Automatic Parallelization of Iterative Loops Nests on Distributed Memory Computing Systems,” in Lecture Notes in Computer Science (Springer, Cham, 2023), Vol. 14098, pp. 18-29.
doi 10.1007/978-3-031-41673-6_2
- U. Bondhugula, M. Baskaran, S. Krishnamoorthy, et al., “Automatic Transformations for Communication-Minimized Parallelization and Locality Optimization in the Polyhedral Model,” in Lecture Notes in Computer Science (Springer, Berlin, 2008), Vol. 4959, pp. 132-146.
doi 10.1007/978-3-540-78791-4_9
- E. D. Kotina, “On Convergence of Block Iterative Methods,” Izv. Irkutsk Gos. Univ. Ser. Matem. 5 (3), 41-55 (2012).
- Z. Gong, Z. Chen, J. Szaday, et al., “An Empirical Study of the Effect of Source-Level Loop Transformations on Compiler Stability,” Proc. ACM Program. Lang. 2 (OOPSLA), Article No. 126 (2018).
doi 10.1145/3276496
- A. Vasilenko, V. Veselovskiy, E. Metelitsa, et al., “Precompiler for the ACELAN-COMPOS Package Solvers,” in Lecture Notes in Computer Science (Springer, Cham, 2021), Vol. 12942, pp. 103-116.
doi 10.1007/978-3-030-86359-3_8
- Optimizing Parallelizing System.
http://www.ops.rsu.ru/.Cited August 9, 2024.
- R. Allen and K. Kennedy, Optimizing Compilers for Modern Architectures (Morgan Kaufmann, San Francisco, 2002).
- S. S. Muchnick, Advanced Compiler Design Implementation (Morgan Kaufmann, San Francisco, 1997).
- LLVM’s Analysis and Transform Passes.
https://opensource.apple.com/source/lldb/lldb-76/llvm/docs/Passes.html . Cited August 9, 2024.
- M. Wolfe, “Loop Skewing: The Wavefront Method Revisited,” Int. J. Parallel Prog. 15 (4), 279-293 (1986).
doi 10.1007/BF01407876
- F. Irigoin and R. Triolet, “Supernode Partitioning,” in Proc. 15th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, San Diego, USA, January 10-13, 1988
https://dl.acm.org/doi/pdf/10.1145/73560.73588 . Cited August 10, 2024.
doi 10.1145/73560.73588
- M. E. Wolf and M. S. Lam, “A Loop Transformation Theory and an Algorithm to Maximize Parallelism,” IEEE Trans. Parallel Distrib. Syst. 2 (4), 452-471 (1991).
doi 10.1109/71.97902
- L. Lamport, “The Parallel Execution of DO Loops,” Commun. ACM. 17 (2), 83-93 (1974).
doi 10.1145/360827.360844
- S. L. Graham, M. Snir, and C. A. Patterson (Eds.). Getting up to Speed: The Future of Supercomputing (National Academies Press, Washington, D.C., 2005).
doi 10.17226/11148