The practice of conducting performance analysis of supercomputer applications
Authors
- I.V. Afanasyev
- V.V. Voevodin
- V.Yu. Rudyak
- A.V. Emelyanenko
Keywords:
high-performance computing
supercomputers
efficiency analysis
graphics accelerators
liquid crystals
elastic continuum theory
Abstract
We propose a method for analyzing and optimizing the efficiency of supercomputer applications. The method was previously applied in practice to study a user's jobs on the Lomonosov-2 supercomputer. It proceeds in stages, from studying the general behavior of all of the user's job launches on the supercomputer to a detailed study and optimization of the source code of a selected program. The paper describes the general stages of the analysis as carried out in practice, identifies the performance metrics that deserve attention during such an analysis, and gives specific examples of job behavior and of the effect of the optimization performed for the problem of calculating liquid crystal droplets.
Section
Section 1. Numerical methods and applications