Implementation of an automated performance analysis tool for UPC applications
Authors
- N.E. Andreev
- K.E. Afanasiev
Keywords:
PGAS
Unified Parallel C
Scalasca
instrumentation
tracing
performance analysis
parallel programming
Abstract
This paper describes the implementation of a performance analysis tool for applications written in Unified Parallel C (UPC). UPC belongs to the partitioned global address space (PGAS) programming model, a relatively new parallel programming paradigm. The tool's main feature is support for automated analysis, which distinguishes it from existing tools. Several components of the Scalasca package were reused in the implementation.
Section
Section 2. Programming