A parallel data clustering algorithm for Intel MIC accelerators


  • T.V. Rechkalov
  • M.L. Zymbler


parallel algorithm
Intel Xeon Phi
data layout
vectorization of computations


The PAM (Partitioning Around Medoids) is a partitioning clustering algorithm where each cluster is represented by an object from the input dataset (called a medoid). The medoid-based clustering is used in a wide range of applications: the segmentation of medical and satellite images, the analysis of DNA microarrays and texts, etc. Currently, there are parallel implementations of PAM for GPU and FPGA systems, but not for Intel Many Integrated Core (MIC) accelerators. In this paper, we propose a novel parallel PhiPAM clustering algorithm for Intel MIC systems. Computations are parallelized by the OpenMP technology. The algorithm exploits a sophisticated memory data layout and loop tiling technique, which allows one to efficiently vectorize computations with Intel MIC. Experiments performed on real data sets show a good scalability of the algorithm.





Section 1. Numerical methods and applications

Author Biographies

T.V. Rechkalov

South Ural State University
• PhD Student

M.L. Zymbler

South Ural State University
• Head of Department


  1. V. V. Voevodin and Vl. V. Voevodin, The Parallel Computing (BHV-Petersburg, St. Petersburg, 2002).
  2. Hardware Specifications of the Siberian Supercomputing Center. . Cited April 5, 2019.
  3. T. V. Rechkalov and M. L. Zymbler, “A Parallel Algorithm of Euclidean Distance Matrix Computation for the Intel Xeon Phi Knights Landing Many-Core Processor,” Vestn. Yuzhn. Ural. Gos. Univ. Ser. Vychisl. Mat. Inf. 7 (3), 65-82 (2018).
  4. D. F. Bacon, S. L. Graham, and O. J. Sharp, “Compiler Transformations for High-Performance Computing,” ACM Comput. Surv. 26 (4), 345-420 (1994).
  5. G. Chrysos, “Intel Xeon Phi Coprocessor (Codename Knights Corner),” in Proc. 2012 IEEE Hot Chips 24th Symposium (HCS), Cupertino, USA, August 27-29, 2012 (IEEE Press, New York, 2012),
    doi 10.1109/HOTCHIPS.2012.7476487
  6. A. Duran and M. Klemm, “The Intel Many Integrated Core Architecture,” in Proc. 2012 Int. Conf. on High Performance Computing and Simulation, Madrid, Spain, July 2-6, 2012 (IEEE Press, New York, 2012), pp. 365-366.
  7. E.-S.M. El-Alfy, “Detection of Phishing Websites Based on Probabilistic Neural Networks and K-Medoids Clustering,” Comput. J. 60 (12), 1745-1759 (2017).
  8. J. M. Engreitz, B. J. Daigle, J. J. Marshall, and R. B. Altman, “Independent Component Analysis: Mining Microarray Data for Fundamental Human Gene Expression Modules,” J. Biomed. Inform. 43 (6), 932-944 (2010).
  9. J. Espenshade, A. Pangborn, G. von Laszewski, et al., “Accelerating Partitional Algorithms for Flow Cytometry on GPUs,” in Proc. IEEE Int. Symp. on Parallel and Distributed Processing with Applications, Chengdu, Sichuan, China, August 10-12, 2009 (IEEE Press, New York, 2009), pp. 226-233.
  10. M. Jaros, P. Strakos, T. Karásek, et al., “Implementation of K-means Segmentation Algorithm on Intel Xeon Phi and GPU: Application in Medical Imaging,” Adv. Eng. Software 103, 21-28 (2017).
  11. J. Jeffers and J. Reinders, Intel Xeon Phi Coprocessor High Performance Programming (Morgan Kaufmann, Boston, 2013).
  12. L. Kaufman and P. J. Rousseeuw, Finding Groups in Data: An introduction to Cluster Analysis (Wiley, New York, 1990).
  13. K. J. Kohlhoff, M. H. Sosnick, W. T. Hsu, et al., “CAMPAIGN: An Open-Source Library of GPU-Accelerated Data Clustering Algorithms,” Bioinformatics 27 (16), 2321-2322 (2011).
  14. K. R. Kurte and S. S. Durbha, “High Resolution Disaster Data Clustering Using Graphics Processing Units,” in Proc. 2013 IEEE Int. Geoscience and Remote Sensing Symposium, Melbourne, Australia, July 21-26, 2013 (IEEE Press, New York, 2013), pp. 1696-1699.
  15. S. Lee, W.-K. Liao, A. Agrawal, et al., “Evaluation of K-Means Data Clustering Algorithm on Intel Xeon Phi,” in Proc. 2016 IEEE Int. Conf. on Big Data, Washington DC, USA, December 5-8, 2016 (IEEE Press, New York, 2016), pp. 2251-2260.
  16. K. Bache and M. Lichman, Individual Household Electric Power Consumption Dataset (Univ. of California, Irvine, 2013).
  17. S. P. Lloyd, “Least Squares Quantization in PCM,” IEEE Trans. Inform. Theory 28 (2), 129-136 (1982).
  18. T. Mattson, “Introduction to OpenMP,” in Proc. 2006 ACM/IEEE Conf. on Supercomputing, Tampa, USA, November 11-17, 2006 (ACM Press, New York, 2006),
    doi 10.1145/1188455.1188673
  19. N. N. Mohammed and A. M. Abdulazeez, “Evaluation of Partitioning Around Medoids Algorithm with Various Distances on Microarray Data,” in Proc. of the 2017 IEEE Int. Conf. on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), Exeter, United Kingdom, June 21-23, 2017 (IEEE Press, New York, 2017), pp. 1011-1016.
  20. H. Mushtaq, S. G. Khawaja, M. U. Akram, et al., “A Parallel Architecture for the Partitioning Around Medoids (PAM) Algorithm for Scalable Multi-Core Processor Implementation with Applications in Healthcare,” Sensors 18 (2018).
    doi 10.3390/s18124129
  21. P. T. Nguyen, K. Eckert, A. Ragone, and T. D. Noia, “Modification to K-Medoids and CLARA for Effective Document Clustering,” in Lecture Notes in Computer Science (Springer, Cham, 2017), Vol. 10352, pp. 481-491.
  22. T. Rechkalov and M. Zymbler, “Accelerating Medoids-Based Clustering with the Intel Many Integrated Core Architecture,” in Proc. 9th Int. Conf. on Application of Information and Communication Technologies, Rostov-on-Don, Russia, October 14-16, 2015 (IEEE Press, New York, 2015), pp. 413-417.
  23. A. Sodani, “Knights Landing (KNL): 2nd Generation Intel Xeon Phi Processor,” in Proc. 2015 IEEE Hot Chips 27th Symposium (HCS), Cupertino, USA, August 22-25, 2015 (IEEE Press, New York, 2015),
    doi 10.1109/HOTCHIPS.2015.7477467
  24. I. Sokolinskaya and L. Sokolinsky, “Revised Pursuit Algorithm for Solving Non-stationary Linear Programming Problems on Modern Computing Clusters with Manycore Accelerators,” in Communications in Computer and Information Science (Springer, Cham, 2016), Vol. 687, pp. 212-223.
  25. R. L. Thorndike, “Who Belongs in the Family?,” Psychometrika 18 (4), 267-276 (1953).
  26. F. Wu, Q. Wu, Y. Tan, et al., “A Vectorized K-Means Algorithm for Intel Many Integrated Core Architecture,” in Lecture Notes in Computer Science (Springer, Heidelberg, 2013), Vol. 8299, pp. 277-294.