Discovery of Typical Subsequences of Time Series on Graphical Processor

Authors

DOI:

https://doi.org/10.26089/NumMet.v22r423

Keywords:

time series, typical subsequences, matrix profile, MPdist, parallel algorithm, GPU

Abstract

Discovery of typical subsequences in a time series is one of the topical problems of time series mining. In this problem, we are to find a set of subsequences that adequately represents the specified time series. The solution of such a problem makes it possible to summarize and visualize a large time series in a wide range of applications: monitoring of the technical condition of complex machines and mechanisms, intelligent management of life support systems, monitoring of indicators of functional diagnostics of the human body, etc. The recently proposed snippet concept formalizes a typical time series subsequence as follows. A snippet of a time series is a subsequence that many other subsequences of the given series are similar to, with respect to a specialized similarity measure based on the Euclidean distance. Despite the snippets discovery algorithm shows adequate results for time series from a wide range of subject domains, it has a high computational complexity. In this article, we propose a novel parallel algorithm for snippets discovery on GPU. Parallelization is performed through the CUDA programming technology. We developed data structures that allow for efficient parallelization of GPU calculations. The experimental results show the high performance of the proposed algorithm.

Author Biographies

Mikhail L. Zymbler

South Ural State University (National Research University)
• Dr. Sci., Associate Professor, Deputy Director of the Scientific and Educational Center “Artificial Intelligence and Quantum Technologies”

Andrey I. Goglachev

South Ural State University (National Research University)
• Programmer of the Data Mining and Virtualization Department of the Supercomputer Simulation Laboratory

References

  1. S. Imani, F. Madrid, W. Ding, et al., “Matrix Profile XIII: Time Series Snippets: A New Primitive for Time Series Data Mining,” in Proc. 9th IEEE Int. Conf. on Big Knowledge (ICBK). Singapore, November 17-18, 2018 , doi{10.1109/ICBK.2018.00058}.
  2. S. Kumar, P. Tiwari, and M. Zymbler, “Internet of Things is a Revolutionary Approach for Future Technology Enhancement: A Review,” J. Big Data 6 (2019). doi{10.1186/s40537-019-0268-2}.
  3. M. L. Zymbler, Ya. A. Kraeva, E. A. Latypova, et al., “Cleaning Sensor Data in Intelligent Heating Control System,” Vestn. Yuzhn. Ural. Gos. Univ. Ser. Vychisl. Mat. Inf. 10 (3), 16-36 (2021). doi{10.14529/cmse210302}.
  4. S. A. Ivanov, K. Yu. Nikolskaya, G. I. Radchenko, et al., “Digital Twin of a City: Concept Overview,” Vestn. Yuzhn. Ural. Gos. Univ. Ser. Vychisl. Mat. Inf. 9 (4), 5-23 (2020). doi{10.14529/cmse200401}.
  5. V. V. Epishev, A. P. Isaev, R. M. Miniakhmetov, et al., “Physiological Data Mining System for Elite Sports,” Vestn. Yuzhn. Ural. Gos. Univ. Ser. Vychisl. Mat. Inf. 2 (1), 44-54 (2013). doi{10.14529/cmse130105}.
  6. S. M. Abdullaev, O. U. Lenskaya, A. O. Gayazova, et al., “Short-Range Forecasting Algorithms Using Radar Data: Translation Estimate and Life-Cycle Composite Display,” Vestn. Yuzhn. Ural. Gos. Univ. Ser. Vychisl. Mat. Inf. 3 (1), 17-32 (2014). doi{10.14529/cmse140102}.
  7. M. M. Dyshaev and I. M. Sokolinskaya, “Representation of Trading Signals Based on Kaufman Adaptive Moving Average as a System of Linear Inequalities,” Vestn. Yuzhn. Ural. Gos. Univ. Ser. Vychisl. Mat. Inf. 2 (4), 103-108 (2013). doi{10.14529/cmse130408}.
  8. A. Mueen, E. J. Keogh, Q. Zhu, et al., “Exact Discovery of Time Series Motifs,” in Proc. 2009 SIAM Int. Conf. on Data Mining (SDM), Sparks, USA, April 30-May 2, 2009 , doi{10.1137/1.9781611972795.41}.
  9. L. Ye and E. J. Keogh, “Time Series Shapelets: a New Primitive for Data Mining,” in Proc. 15th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, Paris, France, June 28-July 1, 2009 , doi{10.1145/1557019.1557122}.
  10. P. Indyk, N. Koudas, and S. Muthukrishnan, “Identifying Representative Trends in Massive Time Series Data Sets Using Sketches,” in Proc. 26th Int. Conf. on Very Large Data Bases (VLDB), Cairo, Egypt, September 10-14, 2000 ,
    https://www.vldb.org/conf/2000/P363.pdf . Cited December 18, 2021.
  11. S. Gharghabi, S. Imani, A. Bagnall, et al., “An Ultra-fast Time Series Distance Measure to Allow Data Mining in More Complex Real-world Deployments,” Data Min. Knowl. Discov. 34, 1104-1135 (2020). doi{10.1007/s10618-020-00695-8}.
  12. M. L. Zymbler and Ya. A. Kraeva, “Discovery of Time Series Motifs on Intel Many-Core Systems,” Lobachevskii J. Math. 40 (12), 2124-2132 (2019). doi{10.1134/S199508021912014X}.
  13. M. L. Zymbler and Ya. A. Kraeva, “Parallel Algorithm for Time Series Motif Discovery on Graphic Processor,” Vestn. Yuzhn. Ural. Gos. Univ. Ser. Vychisl. Mat. Inf. 9 (3), 17-34 (2020). doi{10.14529/cmse200302}.
  14. E. Keogh and J. Lin, “Clustering of Time-Series Subsequences is Meaningless: Implications for Previous and Future Research,” Knowl. Inf. Syst. 8, 154-177 (2005). doi{10.1007/s10115-004-0172-7}.
  15. A. Reiss and D. Stricker, “Introducing a New Benchmarked Dataset for Activity Monitoring,” in Proc. 16th Int. Symposium on Wearable Computers (ISWC), Newcastle, United Kingdom, June 18-22, 2012 , doi{10.1109/ISWC.2012.13}.
  16. C.-C. M. Yeh, Y. Zhu, L. Ulanova, et al., “Matrix Profile I: All Pairs Similarity Joins for Time Series: A Unifying View That Includes Motifs, Discords and Shapelets,” in Proc. IEEE 16th Int. Conf. on Data Mining (ICDM), Barcelona, Spain, December 12-15, 2016 , doi{10.1109/ICDM.2016.0179}.
  17. D. Kirk, “NVIDIA CUDA Software and GPU Parallel Computing Architecture,” in Proc. 6th Int. Symposium on Memory Management (ISMM’07), Montreal, Canada, October 21-22, 2007 , doi{10.1145/1296907.1296909}.
  18. Z. Zimmerman, K. Kamgar, N. S. Senobari, et al., “Matrix Profile XIV: Scaling Time Series Motif Discovery with GPUs to Break a Quintillion Pairwise Comparisons a Day and Beyond,” in Proc. {ACM Symposium on Cloud Computing (SoCC’19), Santa Cruz, USA, November 20-23, 2019}, doi{10.1145/3357223.3362721}.

Published

19-12-2021

How to Cite

Цымблер М. Л., Гоглачев А. И. Discovery of Typical Subsequences of Time Series on Graphical Processor // Numerical Methods and Programming (Vychislitel’nye Metody i Programmirovanie). 2021. 22. 344-359. doi 10.26089/NumMet.v22r423

Issue

Section

Parallel software tools and technologies