Analysis of the Degree of Importance of Information Using Newspapers and Questionnaires
Our objective is to clarify the factors that determine the degree of importance of information by extracting the words that characterize it, and to construct a system that estimates this degree of importance automatically. We studied the degree of importance of information by using machine learning. We first performed experiments using newspaper documents (Dn). In these experiments, we assumed that a document on the front page, or at the top of the front page, is important. Machine learning identified important documents with a precision of 0.9, which indicates that, in the case of a newspaper, the degree of importance can be estimated with high precision. Next, to estimate the degree of importance that people attach to a document, we conducted experiments using questionnaire data (Dq) as test data. In these experiments, the subjects were asked to identify which document of a pair was more important, and a high accuracy of 94% was obtained when more than 80% of the subjects gave the same answer. Furthermore, when newspaper documents (Dn) were used as training data, we obtained (i) the same accuracy with Dn only as with Dn and Dq together and (ii) a higher accuracy with Dn and Dq together than with Dq only. This observation is useful because preparing questionnaire data (Dq) can be expensive, whereas newspaper data (Dn) is free. Finally, we extracted the characteristic words that differentiate important information from less important information by examining the feature parameters learned by the machine learning method (the maximum entropy (ME) method).
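The last step of the abstract, reading characteristic words off the learned feature parameters, can be illustrated with a minimal sketch. It is not the authors' implementation: the toy documents, the binary bag-of-words features, and the names `train`, `prob_important`, and `characteristic_words` are all assumptions made for illustration. It relies on the fact that a two-class maximum entropy model is equivalent to logistic regression, and fits one by plain gradient ascent.

```python
import math
from collections import defaultdict

# Hypothetical toy data: (document text, label); 1 = important, 0 = less important.
TRAIN = [
    ("government announces election results", 1),
    ("prime minister signs election treaty", 1),
    ("government budget approved by minister", 1),
    ("local bakery wins recipe contest", 0),
    ("weekend recipe for garden picnic", 0),
    ("garden club holds local contest", 0),
]

def featurize(text):
    """Binary bag-of-words features: the set of words in the document."""
    return set(text.lower().split())

def train(data, epochs=200, lr=0.2, l2=0.01):
    """Fit a two-class maximum entropy model (logistic regression) by
    batch gradient ascent on the L2-regularized log-likelihood."""
    weights = defaultdict(float)
    docs = [(featurize(text), label) for text, label in data]
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, label in docs:
            score = sum(weights[f] for f in feats)
            p = 1.0 / (1.0 + math.exp(-score))  # P(important | document)
            for f in feats:
                grad[f] += label - p            # observed minus expected count
        for f, g in grad.items():
            weights[f] += lr * (g - l2 * weights[f])
    return weights

def prob_important(weights, text):
    """Estimated probability that a document is important."""
    score = sum(weights.get(f, 0.0) for f in featurize(text))
    return 1.0 / (1.0 + math.exp(-score))

def characteristic_words(weights, k=3):
    """Words with the k largest and the k smallest learned parameters."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return [w for w, _ in ranked[:k]], [w for w, _ in ranked[-k:]]

if __name__ == "__main__":
    model = train(TRAIN)
    top, bottom = characteristic_words(model)
    print("words indicating importance:", top)
    print("words indicating lower importance:", bottom)
```

In this sketch a word with a large positive parameter pushes a document toward the "important" class and a large negative one pushes it away. For a pairwise judgment of the kind used in the questionnaire, one could compare `prob_important` for the two documents of a pair, though the abstract does not say how the authors handled pairs.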
Degree of importance of information; newspaper; questionnaire; machine learning; analysis
Masaki MURATA, Ryo NISHIMURA, Kouichi DOI, Toshiyuki KANAMARU, Kentaro TORISAWA
NICT, Seika, Kyoto, Japan; Ryukoku University, Otsu, Shiga, Japan; PSC Inc., Chiyoda, Tokyo, Japan; Kyoto University, Sakyo, Kyoto, Japan
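International conference
Beijing
English
2008-10-19 (date first posted online on the Wanfang platform; this does not represent the paper's publication date)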