Conference Paper

Entropy-based Sequence Clustering Algorithm for Analyzing Software Fault Feature

Sequence clustering is significant for analyzing software faults. The existing similarity measures for sequence clustering are inexact for clustering software faults. In this paper, a software fault feature clustering algorithm called ECA is proposed. In ECA, the similarity of fault sequences is defined by a global and local similarity measure (CLSM), which considers both the items contained in a sequence and the order in which the items occur. The clusters are formed according to the entropy of the sequences, which is computed from the global and local similarity. The sequence with the smallest entropy is selected as the centroid of each cluster, and the clusters are then obtained by assigning each unselected sequence to the centroid with which it has the largest similarity. The optimal number of clusters is determined by the average silhouette coefficient. To analyze the fault type, the sequences to be analyzed are matched against each cluster and classified into the most similar cluster. Experimental results show that ECA improves the precision of clustering and reduces the matching scope of the software fault feature.
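The assignment and model-selection steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: this record does not define CLSM or the entropy formula, so the sketch assumes a precomputed pairwise similarity matrix with values in [0, 1] and assumes the centroids have already been chosen. The function names `assign_to_centroids` and `average_silhouette` are hypothetical, and converting similarity to distance as 1 - similarity is an assumption made only so that the standard silhouette definition can be applied.

```python
from typing import Dict, List


def assign_to_centroids(sim: List[List[float]], centroids: List[int]) -> List[int]:
    """Label each sequence with the position of its most similar centroid.

    sim[i][j] is an assumed precomputed similarity in [0, 1];
    centroids holds the indices of the sequences chosen as centroids.
    """
    labels = []
    for i in range(len(sim)):
        best = max(range(len(centroids)), key=lambda c: sim[i][centroids[c]])
        labels.append(best)
    return labels


def average_silhouette(dist: List[List[float]], labels: List[int]) -> float:
    """Standard average silhouette coefficient over a distance matrix."""
    clusters: Dict[int, List[int]] = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes s(i) = 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(labels)


# Toy similarity matrix for four sequences (illustrative values only).
sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
# Assumption: distance = 1 - similarity.
dist = [[1.0 - s for s in row] for row in sim]
labels = assign_to_centroids(sim, centroids=[0, 2])
score = average_silhouette(dist, labels)
```

Under these assumptions, the number of clusters would be chosen by repeating the assignment for each candidate count and keeping the one with the highest average silhouette value.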

software fault feature sequence entropy clustering

Yanyan Wang, Jiadong Ren, Jiaxin Liu, Jiadong Ren, Yanning Wang

College of Information Science and Engineering, Yanshan University, Qinhuangdao City, China; School of Computer Science and Technology, Beijing Institute of Technology, Beijing City, China; College of Sciences, Yanshan University, Qinhuangdao City, China

International conference

2010 International Conference on Information Security and Artificial Intelligence (ISAI 2010)

Chengdu

English

793-797

2010-12-17 (date first made available online on the Wanfang platform; not the publication date of the paper)