DOI: https://doi.org/10.26089/NumMet.v25r322

On accelerating transformations of programs for solving the generalized Dirichlet problem

Authors

  • Elena A. Metelitsa
  • Boris Ya. Steinberg

Keywords

tiling
wavefront
parallelization
innermost loop
high performance computing
generalized Dirichlet problem

Abstract

This paper considers a chain of transformations applied to a program implementing the Gauss–Seidel algorithm for solving the generalized two-dimensional Dirichlet problem for the Poisson equation. It extends the previously published chain of accelerating (in particular, parallelizing) transformations of this program, which consisted of loop skewing, tiling, the hyperplane (wavefront) method, and parallelization. Here that chain is supplemented with common subexpression elimination, loop-invariant code motion, loop-header optimization, and optimization of array pointer computations. A series of numerical experiments with the resulting chain of transformations was carried out on a computer with an eight-core processor, for various tile sizes. The greatest speedup obtained is 24%. An analysis of the obtained speedups is given.


Published

2024-09-04

Issue

Section

Parallel software tools and technologies

Author Biographies

Elena A. Metelitsa

Southern Federal University,
Vorovich Institute for Mathematics, Mechanics and Computer Science,
• Junior Researcher

Boris Ya. Steinberg

Southern Federal University,
Vorovich Institute for Mathematics, Mechanics and Computer Science,
• Senior Researcher, Head of the Department of Algebra and Discrete Mathematics

