DOI: https://doi.org/10.26089/NumMet.v25r212

Modified method for processing large sparse unstructured matrices on reconfigurable computing systems

Authors

  • Ilya I. Levin
  • Aleksandr V. Podoprigora

Keywords:

reconfigurable computing systems
high-performance computing systems
sparse matrix
large unstructured matrix
sparse matrix format
discrete-event transformation
dataflow intensity balance
parallelization of calculations
parallelization over non-zero elements

Abstract

When processing high-dimensional matrices with an irregular structure, the real performance of cluster multiprocessor computing systems (MCS) is low and, even with special processing methods, does not exceed 30% of the peak. Large matrices with an irregular structure can be processed efficiently on reconfigurable computing systems (RCS), for which the authors previously proposed a method for processing large sparse unstructured (LSU) matrices that achieves real performance close to 50% of the peak. This article describes a modification of that method which is characterized by parallel processing of the non-zero elements of each row and doubles the speed of the computing structure at the cost of a slight increase in the occupied hardware resource. The modified method of processing LSU matrices on an RCS provides real performance close to 90% of the peak, which significantly exceeds the known results for similar problems on cluster MCS. The theoretical conclusions are confirmed by comparing the results of ranking web pages with the PageRank algorithm on the “Arcturus” RCS and on the Fugaku supercomputer, as well as the results of solving systems of linear algebraic equations (SLAE) by the Jacobi method on the “Arcturus” RCS and on the NVIDIA Tesla K40 graphics accelerator.
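
Since both PageRank iterations and Jacobi iterations are dominated by sparse matrix-vector products, the idea of parallelizing the work within a row over its non-zero elements can be illustrated by a small software analogue. The sketch below is only an illustration under assumed conventions (CSR storage, a hypothetical function csr_spmv, an OpenMP SIMD reduction standing in for a parallel hardware reduction); it is not the authors' computing structure for the RCS.

/* Illustrative sketch: a software analogue of parallelization over the
   non-zero elements of a row, written as a CSR sparse matrix-vector product.
   The function name and data layout are assumptions made for this example;
   the method described in the abstract targets a hardware computing
   structure on an RCS, not OpenMP. */
#include <stddef.h>

void csr_spmv(size_t n_rows,
              const size_t *row_ptr,  /* CSR row pointers, length n_rows + 1 */
              const size_t *col_idx,  /* column index of each non-zero value */
              const double *val,      /* non-zero values, stored row by row  */
              const double *x,        /* dense input vector                  */
              double *y)              /* dense output vector, length n_rows  */
{
    for (size_t i = 0; i < n_rows; ++i) {
        double acc = 0.0;
        /* The partial products over the non-zero elements of row i are
           independent, so they can be formed in parallel and combined by a
           reduction; this is the software counterpart of processing the
           non-zero row elements in parallel. */
        #pragma omp simd reduction(+:acc)
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            acc += val[k] * x[col_idx[k]];
        y[i] = acc;
    }
}

In hardware, such a per-row reduction would presumably be realized as a parallel or pipelined summation of the partial products, which is consistent with the reported doubling of the computing structure's speed for a modest increase in the occupied hardware resource.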


Published

2024-04-26

Section

Parallel software tools and technologies

Author Biographies

Ilya I. Levin

Southern Federal University,
Institute of Computer Technology and Information Security,
Department of Intelligent and Multiprocessor Systems
• Head of Department

Aleksandr V. Podoprigora


References

  1. I. P. Jones and C. P. Thompson, “A Note on the Use of Non-Uniform Grids in Finite Difference Calculations in Boundary and Interior Layers,” in Computational and Asymptotic Methods (Boole Press, Dublin, 1980), pp. 332-341.
  2. I. S. Duff, R. G. Grimes, and J. G. Lewis, Users’ Guide for the Harwell-Boeing Sparse Matrix Collection (Release 1), Report Number RAL-92-086 (Rutherford Appleton Laboratory, Chilton, United Kingdom, 1992).
    https://api.semanticscholar.org/CorpusID:56998202 . Cited April 12, 2024.
  3. R. C. Burchett, H. H. Happ, and D. R. Vierath, “Reactive Power Planning of Large-Scale Systems,”
    https://api.semanticscholar.org/CorpusID:107515440 . Cited April 12, 2024.
  4. T. Davis, Sparse Matrix Database. Williams Group Matrix - Webbase-1M.
    https://sparse.tamu.edu/Williams/webbase-1M . Cited April 12, 2024.
  5. L. Georgopoulos, A. Sobczyk, D. Christofidellis, et al., Enhancing Multi-Threaded Sparse Matrix Multiplication for Knowledge Graph Oriented Algorithms and Analytics.
    https://dominoweb.draco.res.ibm.com/reports/RZ3953.pdf . Cited April 12, 2024.
  6. G. W. Somers, Acceleration of Block-Aware Matrix Factorization on Heterogeneous Platforms , Master’s Thesis in Electrical and Computer Engineering (University of Ottawa, Ottawa, 2016).
  7. C. Yang, A. Buluç, and J. D. Owens, “Design Principles for Sparse Matrix Multiplication on the GPU,” in Proc. Int. European Conference on Parallel and Distributed Computing, Torino, Italy, August 27-31, 2018.
    https://people.eecs.berkeley.edu/~aydin/Yang2018EuroPar.pdf . Cited April 12, 2024.
  8. A. V. Kalyaev and I. I. Levin, Modular Scalable Multiprocessor Systems with Structural and Procedural Organization of Calculations (Janus-K, Moscow, 2003) [in Russian].
  9. I. A. Kalyaev, I. I. Levin, E. A. Semernikov, and V. I. Shmoilov, Reconfigurable Multi-Pipeline Computing Structures (Ross. Akad. Nauk, Rostov-on-Don, 2008) [in Russian].
  10. I. I. Levin, A. I. Dordopulo, A. M. Fedorov, and Yu. I. Doronchenko, “Development of Technology for Constructing Reconfigurable Computer Systems with Liquid Cooling,” in Proc. 5th All-Russian Scientific and Technical Conference on Supercomputer Technologies, Divnomorskoe, Russia, September 17-22, 2018 (Southern Federal Univ., Rostov-on-Don, 2018), Vol. 1, pp. 184-187 [in Russian].
  11. R. G. Grimes, D. R. Kincaid, and D. M. Young, ITPACK 2.0 User’s Guide, Technical Report Number CNA-150 (Center for Numerical Analysis, University of Texas, 1979).
  12. Y. Saad, SPARSKIT: a Basic Tool Kit for Sparse Computations (Version 2), Technical Report (University of Minnesota, Minneapolis, 1994).
  13. I. I. Levin, A. I. Dordopulo, I. A. Kalyaev, and V. A. Gudkov, “High-Performance Reconfigurable Computer Systems on the Base of Virtex-7 FPGAs,” Software Engineering, No. 6, 3-7 (2014).
    http://novtex.ru/prin/eng/10.17587/prin._6_2014_1.html . Cited April 12, 2024.
  14. A. V. Podoprigora, “Discrete-Event Method Computations Organizing for Processing Large Sparse Unstructured Matrices on RCS,” Izvestiya SFedU. Engineering Sciences, No. 7, 189-197 (2021).
    https://izv-tn.tti.sfedu.ru/index.php/izv_tn/article/view/590 . Cited April 12, 2024.
  15. I. I. Levin and A. V. Podoprigora, “Method of Parallelization on Basic Macro Operations for Processing Large Sparse Unstructured Matrices on RCS,” Izvestiya SFedU. Engineering Sciences, No. 6, 72-83 (2022).
    https://izv-tn.tti.sfedu.ru/index.php/izv_tn/article/view/725 . Cited April 12, 2024.
  16. A. G. Kovalenko, “Macropipeline Implementation of Authentication Algorithms on High-Level Language COLAMO,” Izvestiya SFedU. Engineering Sciences, No. 4, 210-215 (2012).
  17. L. Page, S. Brin, R. Motwani, and T. Winograd, The PageRank Citation Ranking: Bringing Order to the Web, Technical Report Number SIDL-WP-1999-0120 (Stanford University, Stanford, 1999).
  18. M. Vandromme, J. Gurhem, M. Tsuji, et al., “Scaling the PageRank Algorithm for Very Large Graphs on the Fugaku Supercomputer,” in Lecture Notes in Computer Science (Springer, Cham, 2022), Vol. 13350, pp. 389-402.
    https://link.springer.com/chapter/10.1007/978-3-031-08751-6_28 . Cited April 12, 2024.
    doi 10.1007/978-3-031-08751-6_28
  19. E. Chow, H. Anzt, J. Scott, and J. Dongarra, “Using Jacobi Iterations and Blocking for Solving Sparse Triangular Systems in Incomplete Factorization Preconditioning,” J. Parallel Distr. Comput. 119, 219-230 (2018).
    doi 10.1016/j.jpdc.2018.04.017
  20. NVIDIA Corporation. Whitepaper NVIDIA’s Next Generation CUDA Compute Architecture: Kepler GK110/210. 2014.
    https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/tesla-product-literature/NVIDIA-Kepler-GK110-GK210-Architecture-Whitepaper.pdf . Cited April 12, 2024.
  21. S. P. Kolodziej, M. Aznaveh, M. Bullock, et al., “The SuiteSparse Matrix Collection Website Interface,” J. Open Source Softw. No. 4, Article Number 1244 (2019).
    doi 10.21105/joss.01244. Cited April 12, 2024.
  22. D. A. Sorokin, A. V. Kasarkin, and A. V. Podoprigora, “Elements of a Digital Photonic Computer,” Supercomput. Front. Innov. 10 (2), 62-76 (2023).
    doi 10.14529/jsfi2302. Cited April 12, 2024.