Enhanced Hierarchical Classification via Isotonic Smoothing

Abstract:

Hierarchical topic taxonomies have proliferated on the World Wide Web [5, 18], and exploiting the output space decompositions they induce in automated classification systems is an active area of research. In many domains, classifiers learned on a hierarchy of classes have been shown to outperform those learned on a flat set of classes. In this paper we argue that the hierarchical arrangement of classes leads to intuitive relationships between the corresponding classifiers' output scores, and that enforcing these relationships as a post-processing step after classification can improve accuracy. We formulate the task of smoothing classifier outputs as a regularized isotonic tree regression problem, and present a dynamic-programming-based method that solves it optimally. This new problem generalizes the classic isotonic tree regression problem, and both the new formulation and the algorithm might be of independent interest. In our empirical analysis of two real-world text classification scenarios, we show that our approach to smoothing classifier outputs results in improved classification accuracy.

Keywords: Hierarchical Classification; Taxonomy; Regularized Isotonic Regression; Dynamic Programming

Authors: Kunal Punera, Joydeep Ghosh

Affiliations: Yahoo! Research, 701 First Ave., Sunnyvale, CA 94089; University of Texas at Austin, Austin, TX 78712

Conference type: International conference

Conference name: The 17th International World Wide Web Conference (WWW08)

Conference location: Beijing

Conference language: English

Online publication date: 2008-04-21 (date first made available online on the Wanfang platform; not the paper's publication date)
