On Selection of Efficient Sequential Pattern Mining Algorithm Based on Characteristics of Data

Warning

This publication does not belong to the Faculty of Economics and Administration, but to the Faculty of Informatics. The official page of the publication is on the muni.cz website.
Authors

PESCHEL Jakub BATKO Michal ZEZULA Pavel

Year of publication 2022
Type Article in Proceedings
Conference 2022 IEEE International Symposium on Multimedia (ISM)
MU Faculty or unit

Faculty of Informatics

Citation
www https://ieeexplore.ieee.org/abstract/document/10019622
Doi http://dx.doi.org/10.1109/ISM55400.2022.00044
Keywords Sequential Pattern Mining; GSP; SPAM; Prefix-span
Description Sequential pattern mining, one of the core tasks in data mining, makes it possible to gain insight into datasets with complex sequential data. As the task is computationally intensive, there are many different approaches, each suited to different types of data. We explore the possibility of optimising the analysis of sequences based on characteristic (quickly obtainable) properties of the analysed data. In this paper, we propose five such characteristics and explore the efficiency of three algorithms that represent the three main approaches to sequential pattern mining. We discovered that it is possible to save up to 21% of the search time compared to the best-performing representative. Based on these characteristics, we trained a decision tree model that chooses the best algorithm for the selected data with 87% accuracy.
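The idea of "quickly obtainable" characteristics can be illustrated with a minimal Python sketch. This record does not list the five characteristics proposed in the paper, so the statistics below (number of sequences, average and maximum length, number of distinct items) and the function name `dataset_characteristics` are hypothetical stand-ins chosen only because they can be computed in a single pass over the data.

```python
from statistics import mean


def dataset_characteristics(sequences):
    """Compute cheap summary statistics of a sequence database.

    NOTE: these statistics are illustrative assumptions, not the five
    characteristics proposed in the paper.
    """
    lengths = [len(seq) for seq in sequences]
    distinct_items = {item for seq in sequences for item in seq}
    return {
        "num_sequences": len(sequences),
        "avg_length": mean(lengths) if lengths else 0.0,
        "max_length": max(lengths, default=0),
        "num_distinct_items": len(distinct_items),
    }


# A toy sequence database: each sequence is an ordered list of items.
db = [["a", "b", "c"], ["a", "c"], ["b", "c", "d", "a"]]
features = dataset_characteristics(db)
```

A feature dictionary of this kind could then serve as input to a classifier, such as the decision tree mentioned in the description, that predicts which of GSP, SPAM, or Prefix-span is likely to run fastest on the given data.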