Conference Topic

Discretization of Time Series Dataset Using Relative Frequency and K-Nearest Neighbor Approach

In this work, we propose an improved approach to time series data discretization, called the RFknn method, which uses Relative Frequency and K-Nearest Neighbor functions. The main idea of the method is to improve the process of determining a sufficient number of intervals for discretizing time series data. The proposed approach improves time series data representation by integrating with the Piecewise Aggregate Approximation (PAA) and Symbolic Aggregate Approximation (SAX) representations. Each interval is represented as a symbol, which supports an efficient mining process in which a better knowledge model can be obtained without major loss of knowledge. The basic idea is neither to minimize nor to maximize the number of intervals of the temporal patterns over their class labels. The performance of RFknn is evaluated on 22 temporal datasets and compared with the original SAX time series discretization method using a similar representation. We show that RFknn can improve representation precision without losing the symbolic nature of the original SAX representation. The experimental results show that RFknn gives a better representation, with error rates that are lower or comparable.
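For readers unfamiliar with the baseline, the following is a minimal sketch of the standard PAA and SAX steps that the abstract refers to. It is not the RFknn method, whose interval-selection procedure is not described in this record. The function names (`znormalize`, `paa`, `sax_breakpoints`, `sax`) are illustrative, and the sketch assumes the series length is divisible by the number of segments and that breakpoints are equiprobable quantiles of the standard normal distribution, as in the usual SAX formulation.

```python
import bisect
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series carries no shape information; map it to zeros.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: mean of each equal-length segment."""
    n = len(series)
    if n % n_segments != 0:
        raise ValueError("this sketch assumes len(series) is divisible by n_segments")
    width = n // n_segments
    return [mean(series[i * width:(i + 1) * width]) for i in range(n_segments)]


def sax_breakpoints(alphabet_size):
    """Cut points that split the standard normal into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def sax(series, n_segments, alphabet_size):
    """Map a numeric series to a SAX word over the letters 'a', 'b', ..."""
    cuts = sax_breakpoints(alphabet_size)
    word = []
    for value in paa(znormalize(series), n_segments):
        # Index of the interval the PAA value falls into.
        index = bisect.bisect_right(cuts, value)
        word.append(chr(ord("a") + index))
    return "".join(word)


if __name__ == "__main__":
    print(sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4, alphabet_size=4))
```

In this baseline the number of intervals (`alphabet_size`) is fixed in advance by the user; according to the abstract, RFknn targets the choice of that number.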

Data mining; discretization; reduction; pre-processing; time series representation; dynamic intervals

Azuraliza Abu Bakar; Almahdi Mohammed Ahmed; Abdul Razak Hamdan

Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor Darul Ehsan, Malaysia

International conference

6th International Conference on Advanced Data Mining and Applications (ADMA 2010)

Chongqing

English

193-201

2010-11-19 (date first posted online on the Wanfang platform; not the paper's publication date)