Implementation of a nonsymmetric linear solver on GPU

Authors

  • S.N. Chadov

Keywords:

parallel computing
sparse linear solver
GPGPU
CUDA
BiCG-STAB

Abstract

An implementation of a nonsymmetric sparse linear solver on GPU is considered. The solver uses a version of the BiCG-STAB algorithm. This algorithm is shortly described. Several sparse matrix storage formats are given with consideration of the NVIDIA GPGPU hardware features. The performance of the implementation is analyzed compared to the performance of a similar algorithm run on contemporary CPUs. The impact of several factors on the performance is discussed with some suggestions on further development.


Published

2009-09-06

Issue

Section

Section 1. Numerical methods and applications

Author Biography

S.N. Chadov

Ivanovo State Power University named аfter V. I. Lenin,
Faculty of Informatics and Computer Engineering
• PhD Student


References

  1. Krüger J., Westermann R. Linear algebra operators for gpu implementation of numerical algorithms // ACM Transactions on Graphics. 2003. 22, N 3. 908-916.
  2. Bolz J., Farmer I., Grinspun E., Schröder P. Sparse matrix solvers on the gpu: conjugate gradients and multigrid // ACM Transactions on Graphics. 2003. 22, N 3. 917-924.
  3. Buatois L. et al. Concurrent number cruncher: a gpu implementation of a general sparse linear solver // International Journal of Parallel, Emergent and Distributed Systems (to appear).
  4. Barrett R. et al. Templates for the solution of linear systems: building blocks for iterative methods. Philadelphia: SIAM, 1994.
  5. Grote M., Huckle T. Parallel preconditioning with sparse approximate inverses // SIAM Journal on Scientic Computing. 1997. 18, N 3. 838-853.
  6. NVIDIA. CUDA Programming Guide, 2.1 edition. 2009.
  7. Baskaran M., Bordawekar R. Optimizing sparse matrix-vector multiplication on GPUs. IBM Tech. Rep. 2009.
  8. Bell N., Garland M. Efficient sparse matrix-vector multiplication on cuda. NVIDIA Tech. Rep. 2008.