Analysis and optimization of a $k$-chain reduction algorithm for distributed computer systems


  • M.G. Kurnosov


Keywords: root reduction, message passing models, parallel programming, distributed computer systems


In the LogP model of parallel computation, an analytical expression for the execution time of the $k$-chain reduction algorithm is derived, and the optimal value of $k$ in this model is found. A new algorithm based on the optimal value of $k$ is developed. To reduce the root process's waiting time, an algorithm with an adaptive number of chains is proposed. The execution time of the proposed algorithm grows as $O(\sqrt{P})$ with the number $P$ of processes, which is more efficient than the linear running time of the original $k$-chain algorithm. The proposed algorithms are implemented on top of the MPI standard and evaluated on computer clusters with InfiniBand QDR networks.
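The $O(\sqrt{P})$ growth rate can be illustrated with a toy cost model (this is an assumption for illustration, not the paper's derived LogP expression): a $k$-chain reduction splits the $P-1$ non-root processes into $k$ pipelined chains, so the root waits roughly one hop per link of the longest chain, about $\lceil (P-1)/k \rceil$ hops, plus the cost of receiving and merging one partial result per chain. Minimizing that sum over $k$ yields $k^* \approx \sqrt{P-1}$ and a completion time that scales as $\sqrt{P}$. The hypothetical parameters `hop` and `merge` stand in for the LogP constants:

```python
import math

def kchain_time(P, k, hop=1.0, merge=1.0):
    """Toy completion time at the root: the longest chain drains
    sequentially (hop cost per link), then the root merges one
    partial result per chain (merge cost each)."""
    chain_len = math.ceil((P - 1) / k)   # links in the longest chain
    return hop * chain_len + merge * k

def optimal_k(P, hop=1.0, merge=1.0):
    """Integer k minimizing the toy model; the continuous optimum is
    k* = sqrt(hop * (P - 1) / merge), hence O(sqrt(P)) total time."""
    return min(range(1, P), key=lambda k: kchain_time(P, k, hop, merge))

if __name__ == "__main__":
    for P in (16, 64, 256):
        k = optimal_k(P)
        print(f"P={P:4d}  k*={k:3d}  T={kchain_time(P, k):6.1f}")
```

Note how both extremes degenerate to linear time: $k = 1$ is a single pipeline of $P-1$ hops, while $k = P-1$ makes the root receive $P-1$ partial results directly, which motivates choosing $k$ near the balance point.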





Section 1. Numerical methods and applications

