Optimization of modeling the water-oil mixture filtration in elastic porous media for clusters with Xeon Phi accelerators
Authors
- S.E. Kireev
Keywords:
high-performance computing
parallel programming
program optimization
Intel Xeon Phi accelerators
scalability
Abstract
A new parallel program optimized for clusters with Intel Xeon Phi accelerators is implemented on the basis of a previously developed program for modeling multiphase flows in finite-deformed porous media. Several optimization techniques specific to such accelerators are considered, and their effect on the program execution time is discussed. The symmetric and offload programming models for the accelerators are compared. The parallel speedup and efficiency are estimated for various numbers of cluster nodes.
Section
Section 1. Numerical methods and applications