Automatic selection of the fastest algorithm implementations
Keywords:
selection of algorithm implementation
program running time
asymptotic estimates of complexity
characteristics of computing systems
machine learning
regression analysis
Abstract
We propose an approach to predicting the runtime of distributed high-performance computations that does not require experiments on every target computer system. The optimal algorithm is selected according to the asymptotic complexity of the candidate algorithms using machine learning methods. The proposed approach can significantly reduce both the number of experiments and the size of the problems that must be solved when evaluating the performance of a computer system. Estimating algorithm execution time from the known parameters of a system makes it possible to determine how efficiently that system solves certain classes of problems without running experiments on it; the forecast can then be quickly updated with a minimal number of experiments on small-size tasks on the target system. The proposed solution can be used for automatic library tuning before use, like the autotuning in the ATLAS (Automatically Tuned Linear Algebra Software) library. A comparative analysis of the runtime predictions obtained for several problems on 84 computers is given. A random forest combined with the linear least squares method yields an average relative error in the estimated execution time of 17% when the training data correspond to small problems, and of 9% when training is performed on data covering the entire range of algorithm parameters in the test samples. The resulting estimates allow the most efficient implementation of an algorithm to be selected in more than 80% of cases.
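To make the scheme concrete, below is a minimal sketch of the approach described above: the asymptotic complexity terms of a candidate algorithm, evaluated at the problem size, are joined with system characteristics and fed to a random forest trained on small-problem measurements; a linear least squares fit over the same complexity terms handles extrapolation to sizes the forest never saw; selection is then the argmin of predicted runtime over candidate implementations. The sketch uses Python with NumPy and scikit-learn (the references below cite the R randomForest package); the feature set, the synthetic measurements, and the exact way the two fits are combined are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def features(cpu_ghz, cores, n):
    # System characteristics joined with the algorithm's asymptotic complexity
    # terms evaluated at problem size n (a hypothetical feature set).
    return [cpu_ghz, cores, n * np.log2(n), n ** 2, n ** 3]

rng = np.random.default_rng(0)

# Synthetic "measurements" on small problems only, mimicking training on small
# dimensions and then predicting for larger ones.
rows, times = [], []
for _ in range(500):
    ghz = rng.uniform(1.0, 4.0)
    cores = int(rng.integers(1, 33))
    n = int(rng.integers(64, 512))
    rows.append(features(ghz, cores, n))
    # Toy ground truth: an O(n^3) kernel whose speed scales with ghz * cores.
    times.append(n ** 3 / (ghz * cores * 1e9) * rng.uniform(0.9, 1.1))

X, y = np.array(rows), np.array(times)
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, np.log(y))

def predicted_runtime(model, ghz, cores, n):
    return float(np.exp(model.predict([features(ghz, cores, n)])[0]))

# A forest alone cannot extrapolate beyond the trained size range, so the
# complexity terms are also fitted by linear least squares; a single global
# fit stands in here for the per-system refinement on small-size runs.
coef, *_ = np.linalg.lstsq(X[:, 2:], y, rcond=None)
extrapolated = float(np.dot(features(3.0, 16, 4096)[2:], coef))

# Selection: pick the candidate implementation with the smallest prediction.
candidates = {"impl_a": forest, "impl_b": forest}  # one model per implementation
best = min(candidates, key=lambda k: predicted_runtime(candidates[k], 3.0, 16, 256))
print(best, extrapolated)
```

In this sketch the logarithm of the runtime serves as the regression target so that errors behave relatively rather than absolutely, matching the relative-error figures quoted above; in the described setting the least-squares refinement would be fitted per target system from a handful of small-size runs rather than globally.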
Section
Section 1. Numerical methods and applications
References
- Rice J.R. The algorithm selection problem // Advances in Computers. 1976. 15. 65-118.
- Fink E. How to solve it automatically: selection among problem-solving methods // Proc. of the Fourth International Conference on Artificial Intelligence Planning Systems. Palo Alto: AAAI Press, 1998. 128-136.
- Roberts M., Howe A. Learning from planner performance // Artificial Intelligence. 2009. 173, N 5-6. 536-561.
- Howe A.E., Dahlman E., Hansen C., Scheetz M., von Mayrhauser A. Exploiting competitive planner performance // Recent Advances in AI Planning: Proc. of the Fifth European Conference on Planning. Heidelberg: Springer, 2000. 62-72.
- Gomes C.P., Selman B., Crato N., Kautz H. Heavy-tailed phenomena in satisfiability and constraint satisfaction problems // Journal of Automated Reasoning. 2000. 24, N 1. 67-100.
- Xu L., Hutter F., Hoos H.H., Leyton-Brown K. SATzilla2009: an automatic algorithm portfolio for SAT. Solver description. SAT competition 2009 (http://www.satcompetition.org/2009/spec2009.html).
- Gagliolo M., Schmidhuber J. Learning dynamic algorithm portfolios // Annals of Mathematics and Artificial Intelligence. 2006. 47, N 3-4. 295-328.
- Hutter F., Xu L., Hoos H.H., Leyton-Brown K. Algorithm runtime prediction: methods & evaluation // Artificial Intelligence. 2014. 206. 79-111.
- Brewer E.A. High-level optimization via automated statistical modeling // Proc. of the 5th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP-95). Vol. 30, Issue 8. New York: ACM Press, 1995. 80-91.
- Hastie T., Tibshirani R., Friedman J. The elements of statistical learning: data mining, inference, and prediction. New York: Springer, 2001.
- Rasmussen C.E., Williams C.K.I. Gaussian processes for machine learning. Cambridge: MIT Press, 2006.
- Kotthoff L., Gent I.P., Miguel I. An evaluation of machine learning in algorithm selection for search problems // Artificial Intelligence Communications. 2012. 25, N 3. 257-270.
- Charnes A., Frome E.L., Yu P.L. The equivalence of generalized least squares and maximum likelihood estimates in the exponential family // Journal of the American Statistical Association. 1976. 71, N 353. 169-171.
- Breiman L. Random forests // Machine Learning. 2001. 45, N 1. 5-32.
- Press W.H., Teukolsky S.A., Vetterling W.T., Flannery B.P. Numerical recipes: the art of scientific computing. New York: Cambridge Univ. Press, 2007.
- Broomhead D.S., Lowe D. Multivariable functional interpolation and adaptive networks // Complex Systems. 1988. 2, N 3. 321-355.
- Marcus G.F. Rethinking eliminative connectionism // Cognitive Psychology. 1998. 37, N 3. 243-282.
- Pin - A Dynamic Binary Instrumentation Tool (https://software.intel.com/en-us/articles/pin-a-dynamic-binary-instrumentation-tool; accessed: 22.08.2014).
- Goto K., van de Geijn R.A. Anatomy of high-performance matrix multiplication // ACM Transactions on Mathematical Software. 2008. 34, N 3. 1-25.
- Cormen T.H., Leiserson C.E., Rivest R.L., Stein C. Introduction to algorithms. Cambridge: MIT Press, 2009.
- Draper N., Smith H. Applied regression analysis. Moscow: Williams, 2007 (Russian translation).
- Intel® 64 and IA-32 Architectures Software Developer's Manual. Volume 2A: Instruction Set Reference, A-M. 2014 (http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-2a-manual.html; accessed: 22.08.2014).
- Fog A. Instruction tables: lists of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD and VIA CPUs (http://www.agner.org/optimize/instruction_tables.pdf; accessed: 17.07.2014).
- Intel Math Kernel Library 11.1 (https://software.intel.com/en-us/intel-mkl; accessed: 6.08.2014).
- OpenBLAS 0.2.9 (http://www.openblas.net; accessed: 15.07.2014).
- Intel Threading Building Blocks 4.2 (https://www.threadingbuildingblocks.org; accessed: 6.08.2014).
- FFTW 3.3.4 (http://www.fftw.org; accessed: 6.08.2014).
- randomForest: Breiman and Cutler's random forests for classification and regression (http://cran.r-project.org/web/packages/randomForest/index.html; accessed: 6.08.2014).
- Dongarra J.J., Luszczek P., Petitet A. The LINPACK benchmark: past, present and future // Concurrency and Computation: Practice and Experience. 2003. 15, N 9. 803-820.
- NAS Parallel Benchmarks (http://www.nas.nasa.gov/publications/npb.html; accessed: 22.08.2014).
- Graph 500 Benchmark (http://www.graph500.org/specifications; accessed: 22.08.2014).
- Automatically Tuned Linear Algebra Software (ATLAS) (http://math-atlas.sourceforge.net; accessed: 15.07.2014).
- Gergel V.P., Sidnev A.A. Methods and software tools for macro-modular program development // Vestnik of Lobachevsky State University of Nizhni Novgorod. 2012. N 2. 294-300.