DOI: https://doi.org/10.26089/NumMet.v21r325

Developing a prototype of high-performance graph-processing framework for NEC SX–Aurora TSUBASA vector architecture

Authors

  • I.V. Afanasyev

Keywords:

NEC SX–Aurora TSUBASA
vector architectures
graph algorithms
graph framework
graph API
finding shortest paths in a graph
breadth-first search

Abstract

This article describes a prototype of graph-processing framework VGL (Vector Graph Library), aimed at the efficient implementation of graph algorithms for the modern NEC SX–Aurora TSUBASA vector architecture. Present day vector systems can significantly speed up various memory-intensive applications, including graph algorithms. However, approaches to the efficient implementation of graph algorithms for vector systems have been studied extremely poorly as of today: due to the highly irregular structure of real-world graphs, it is difficult to effectively use vector features of target platforms. This paper shows that the implementations of graph algorithms developed on the basis of the proposed VGL framework show the performance comparable to their manually optimized versions due to the encapsulation of a large number of graph algorithm optimizations typical for vector systems. At the same time, the proposed framework makes it possible to significantly simplify the process of developing graph algorithms for vector systems, by an order of magnitude reducing the amount of code for implemented algorithms and hiding the programming features of systems of this class from the user.


Published

2020-09-27

Issue

Section

Parallel software tools and technologies

Author Biography

I.V. Afanasyev


References

  1. F. McSherry, M. Isard, and D. G. Murray, “Scalability! but at What COST?,” In 15th Workshop in Proc. 15th Workshop on Hot Topics in Operating Systems, Kartause Ittingen, Switzerland, May 18-20, 2015 ,
    https://www.usenix.org/conference/hotos15 . Cited August 15, 2020.
  2. L. Backstrom, P. Boldi, M. Rosa, et al., “Four Degrees of Separation,” in Proc. 4th Annual ACM Web Science Conference, Web Science 2012 (Evanston, IL, USA) (ACM Press, New York, 2012), pp. 33-42.
  3. I. V. Afanasyev, Vad. V. Voevodin, Vl. V. Voevodin, et al., “Developing Efficient Implementations of Shortest Paths and Page Rank Algorithms for NEC SX-Aurora TSUBASA Architecture,” Lobachevskii J. Math. 40 (11), 1753-1762 (2019).
  4. I. V. Afanasyev, A. S. Antonov, D. A. Nikitenko, et al., “Developing Efficient Implementations of Bellman-Ford and Forward-Backward Graph Algorithms for NEC SX-ACE,” Supercomput. Front. Innov. 5 (3), 65-69 (2018).
  5. J. Shun and G. E. Blelloch, “Ligra: a Lightweight Graph Processing Framework for Shared Memory,” ACM SIGPLAN Notices 48 (8), 135-146 (2013).
  6. D. Nguyen, A. Lenharth, and K. Pingali, “A Lightweight Infrastructure for Graph Analytics,” in Proc. 24th ACM Symposium on Operating Systems Principles, Farmington, USA, November 3-6, 2013 (ACM Press, New York, 2013), pp. 456-471.
  7. Y. Wang, A. Davidson, Y. Pan, et al., “Gunrock: A High-Performance Graph Processing Library on the GPU,” in Proc. 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Barcelona, Spain, March 12-16, 2016 ACM SIGPLAN Notices 48 (2016).
    doi 10.1145/3016078.2851145
  8. F. Khorasani, K. Vora, R. Gupta, and L. N. Bhuyan, “CuSha: Vertex-Centric Graph Processing on GPUs,” in Proc. 23rd International Symposium on High-Performance Parallel and Distributed Computing, Vancouver, Canada, June 23-27, 2014 (ACM Press, New York, 2014), pp. 239-251.
  9. J. Zhong and B. He, “Medusa: Simplified Graph Processing on GPUs,” IEEE Trans. Parallel Distrib. Syst. 25 (6), 1543-1552 (2013).
  10. H. Liu and H. Howie Huang, “Enterprise: Breadth-First Graph Traversal on GPUs,” in Proc. Int. Conf. on High Performance Computing, Networking, Storage, and Analysis, Austin, USA, November 15-20, 2015 (IEEE Press, Piscataway, 2015),
    doi 10.1145/2807591.2807594
  11. K. Komatsu, S. Momose, Y. Isobe, et al., “Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA,” in Proc. Int. Conf. on High Performance Computing, Networking, Storage, and Analysis, Dallas, USA, November 11-16, 2018 (IEEE Press, Piscataway, 2018),
    doi 10.1109/SC.2018.00057
  12. Y. Yamada and S. Momose, “Vector Engine Processor of NEC’s Brand-New Supercomputer SX-Aurora TSUBASA,” in Proc. Int. Symposium on High Performance Chips, Cupertino, August 19-21, 2018 ,
    https://www.hotchips.org/hc30/2conf/2.14_NEC_vector_NEC_SXAurora_TSUBASA_HotChips30_finalb.pdf,
    Cited August 15, 2020.
  13. R. Egawa, K. Komatsu, S. Momose, et al., “Potential of a Modern Vector Supercomputer for Practical Applications: Performance Evaluation of SX-ACE,” J. Supercomput. 73, 3948-3976 (2017).
  14. K. Komatsu, R. Egawa, Y. Isobe, et al., “An Approach to the Highest Efficiency of the HPCG Benchmark on the SX-ACE Supercomputer,” in Proc. Int. Conf. on High Performance Computing, Networking, Storage, and Analysis, Austin, USA, November 15-20, 2015 ,
    http://sc15.supercomputing.org/sites/all/themes/ SC15images/tech_poster/poster_files/post277s2-file3.pdf . Cited August 15, 2020.
  15. I. V. Afanasyev, Vad. V. Voevodin, Vl. V. Voevodin, et al., “Analysis of Relationship Between SIMD-Processing Features Used in NVIDIA GPUs and NEC SX-Aurora TSUBASA Vector Processors,” in Lecture Notes in Computer Science (Springer, Heidelberg, 2019), Vol. 11657, pp. 125-139.
  16. M. Besta, M. Podstawski, L. Groner, et al., “To Push or To Pull: On Reducing Communication and Synchronization in Graph Computations,” in Proc. 26th International Symposium on High-Performance Parallel and Distributed Computing, Washington DC, USA, June 26-30, 2017 New York: ACM Press, 2017. 93-104.
  17. D. Chakrabarti, Y. Zhan, and C. Faloutsos, “R-MAT: A Recursive Model for Graph Mining,” in Proc. 2004 SIAM International Conference on Data Mining, Orlando, USA, April 22-24, 2004 (SIAM Press, Philadelphia, 2004), pp. 442-446.
  18. J. Kunegis, “KONECT: The Koblenz Network Collection,” in Proc. Int. 22nd Conf. on World Wide Web, Rio de Janeiro, Brazil, May 13-17, 2013 (ACM Press, New York, 2013), pp. 1343-1350.
  19. Stanford Large Network Dataset Collection.
    https://snap.stanford.edu/data/. Cited August 15, 2020.
  20. R. C. Murphy, K. B. Wheeler, B. W. Barrett, and J. A. Ang, “Introducing the Graph 500,”
    http://www.richardmurphy.net/archive/cug-may2010.pdf . Cited August 15, 2020.