Measuring Similarity between Sequential Datasets
Similarity measurement is a basic problem in data mining, but little work has focused on the similarity between sequential datasets. We propose the density-emerging pattern and a novel similarity measure between sequential datasets based on the quality of shared density-aware and shared emerging patterns. The similarity measurement proceeds in three stages: pattern mining, pattern quality evaluation, and similarity evaluation. We performed experiments on real protein sequence datasets to test the effectiveness and efficiency of our method. A case study on sequential dataset classification achieved high accuracy. The results show that our method can be used effectively to classify sequential datasets.
Sequential Datasets; Similarity; Density-Aware Pattern
Xiaohui Zhang, Jie Zuo
Sichuan University, Chengdu, China
International conference
2019 China Turing Conference (ACM Turing Celebration Conference - China 2019)
Chengdu
English
53-57
2019-05-17 (date first made available online on the Wanfang platform; not necessarily the paper's publication date)