DOI: https://doi.org/10.26089/NumMet.v27r104

Modifications of the ant colony matrix method for parametric optimization using CUDA-based GPUs

Authors

  • Yurii P. Titov

Keywords:

ACO
parametric problem
optimization
CUDA
Tesla V100

Abstract

This article focuses on the development and investigation of the matrix-based modifications of the Ant Colony Optimization (ACO) method for solving parametric optimization problems using CUDA-based Graphics Processing Units (GPUs). Unlike most existing studies focusing on the Traveling Salesman Problem (TSP), the proposed method is specialized for finding optimal values of independent parameters, which leads to a different graph structure and requires the development of novel parallelization schemes. Original parallelization strategies are proposed, including a three-stage decomposition of the algorithm and introduction of “parameter-level” parallelism in the problem dimensions. Various strategies for parallelization and memory management were developed and tested: from the basic MatrixACO-Basic modification to the optimized versions of MatrixACO-Optimization (featuring warp reduction and optimized hash-table handling) and MatrixACO-Transported (utilizing transposed data storage). Experiments were conducted on NVIDIA Tesla V100 16GB SXM2 GPU, the Jetson ORIN platform, and various personal computers using multi-dimensional test functions. Results demonstrate that for a problem with 128 parameters the optimized version of MatrixACO-Optimization provides an execution time of 0.13 ms per graph layer, which corresponds to a speed-up of more than 30 times compared to the optimized CPU implementation based on OpenMP, and in the case of a very large-scale problem with 65536 parameters it speeds up by almost 45 times. For a problem with 512 parameters, the GPU realization shows 76% higher energy efficiency compared to the CPU version, as well as an almost eightfold (7.9 times) reduction in Energy-Delay Product. The study confirms the necessity of developing specialized GPU implementations for parametric tasks and reveals the excellent scalability of the proposed algorithms.



Downloads

Published

2026-02-27

Issue

Section

Parallel software tools and technologies

Author

Yurii P. Titov

MAI

• Associate Professor


References

  1. A. Colorni, M. Dorigo and M. Vittorio, “Distributed Optimization by Ant Colonies,” Proceedings of the First European Conference on Artificial Life. Paris, France, January 1991 (Elsevier Publishing, 1991), pp. 134–142.
  2. M. Dorigo and L. M. Gambardella, “Ant Colony System: a Cooperative Learning Approach to the Traveling Salesman Problem,” Evolutionary Computation, IEEE Transactions. textbf1.1, 53–66 (1997).
    doi 10.1109/4235.585892
  3. B. Bullnheimer, R. Hartl and C. Strauss, “An improved Ant System Algorithm for the Vehicle Routing Problem,” Annals of Operations Research 89, 319–328 (1999).
    doi 10.1023/A: 1018940026670.
  4. L. M. Gambardella, E. D. Taillard and M. Dorigo, “Ant colonies for the quadratic assignment problem,” Journal of the Operational Research Society. 50 (2), 167–176 (1999).
  5. K. L. Huang and C. J. Liao, “Ant Colony Optimization Combined with Taboo Search for the Job Shop Scheduling Problem,” Computers & Operations Research. 35 (4), 1030–1046 (2008).
    doi 10.1016/j.cor.2006.07.003
  6. M. Dorigo, V. Maniezzo and A. Colorni, “Positive Feedback as a Search Strategy,” In Technical Report 91-016, Politecnico di Milano, Italy, 1991.
  7. L. M. Gambardella and M. Dorigo, “Ant-Q: A Reinforcement Learning Approach to the Traveling Salesman Problem,” In Machine Learning Proceedings 1995, Morgan Kaufmann , pp. 252–260.
    doi 10.1016/B978-1-55860-377-6.50039-6
  8. B. Bullnheimer, R. F. Hartl and C. Strauss, “Applying the Ant System to the Vehicle Routing Problems,” Annals of Operations Research. 79, 109–123 (1998).
    doi 10.1007/978-1-4615-5775-3_20
  9. T. Stützle and H. H. Hoos, “MAX–MIN Ant System,” Future Generation Computer Systems. 16 (8), 889–914 (2000).
    doi 10.1016/S0167-739X(00)00043-1
  10. O. Cordon, I. F. de Viana, F. Herrera, “Analysis of the best-worst ant system and its variants on the TSP,” Soft Computing. 9 (2–3), 177–192 (2002).
  11. M. Randall, A. A. Lewis, “A Parallel Implementation of Ant Colony Optimization,” Journal of Parallel and Distributed Computing. 62 (9), 1421–1432 (2002).
    doi 10.1006/jpdc.2002.1854
  12. M. Dorigo, T. Stützle, Ant Colony Optimization(MIT Press, Cambridge, Massachusetts, 2004).
  13. H. Bai, D. OuYang, X. Li, at al., “MAXMIN Ant System on GPU with CUDA,” In 2009 Fourth International Conference on Innovative Computing, Information and Control (ICICIC), Kaohsiung, Taiwan, December 7–9, 2009 (IEEE Computer Society), pp. 801–804.
    doi 10.1109/ICICIC.2009.255
  14. D. M. Chitty, “Applying ACO to Large Scale TSP Instances,” In: F. Chao et al. (Eds.) Advances in Computational Intelligence Systems. UKCI 2017 (Intelligent Systems and Computing, Springer, Cham), 650 (2018).
    doi 10.1007/978-3-319-66939-7_9
  15. R. Skinderowicz, “The GPU-based parallel Ant Colony System,” Journal of Parallel and Distributed Computing. 98, 48–60 (2016)
    doi 10.1016/j.jpdc.2016.04.014
  16. D. Zhang, X. You, S. Liu, K. Yang,” Multi-Colony Ant Colony Optimization Based on Generalized Jaccard Similarity Recommendation Strategy,” IEEE Access. 7, 157303-157317 (2019)
    doi 10.1109/ACCESS.2019.2949860
  17. R. Skinderowicz,” Implementing a GPU-based parallel MAX–MIN Ant System,” Future Generation Computer Systems. 106, 277–295 (2020)
    doi 10.1016/j.future.2020.01.011
  18. Z. B. Huan, G. T. Fu, T. H. Fa, at al.,” High Performance Ant Colony System Based on GPU Warp Specialization with a Static-Dynamic Balanced Candidate Set Strategy,” Future Generation Computer Systems. 125, 136–150 (2021).
    doi 10.1016/j.future.2021.06.041
  19. I. N. Sinitsyn, Y. P. Titov,” Control of Set of System Parameter Values by the Ant Colony Method,” Autom. Remote Control, 84, 893–903 (2023).
    doi 10.1134/S0005117923080106
  20. V. Sudakov, Y. Titov,” Matrix-Based ACO for Solving Parametric Problems Using Heterogeneous Reconfigurable Computers and SIMD Accelerators,” Mathematics, 13 (1284), (2025).
    doi 10.3390/math13081284
  21. V. A. Sudakov, Y. P. Titov,” Investigation of the Parametric Graph Model in the Ant Colony Method,” Math. Models Comput. Simul. 17, 126–136 (2025).
    doi 10.1134/S2070048224700996
  22. I. N. Sinitsyn, Y. P. Titov, “Investigation of algorithms for cyclic search for additional solutions when optimizing the order of hyperparameters by the ant colony method,” High Availability Systems. 19 (1), 59–73 (2023).
    http://radiotec.ru/en/journal/Highly_available_systems/number/2023-1/article/23354 Cited February 25, 2026.
  23. Y. P. Titov, Implementation of ACO Algorithm Modifications on CUDA for SIMD Compute Processor Operation [Electronic resource].
    https://github.com/kalengul/ACO_SIMD Cited January 23, 2026.
  24. S. K. Mishra,” Some New Test Functions for Global Optimization and Performance of Repulsive Particle Swarm Method,” University Library of Munich, Germany, MPRA Paper. 2718 (2006).
    https://mpra.ub.uni-muenchen.de/2718/ Cited February 25, 2026.
  25. B. N. Chetverushkin, V. A. Sudakov, Y. P. Titov,” Graph Condensation for Large Factor Models,” Dokl. Math. 109, 246–251 (2024).
    doi 10.1134/S1064562424702090