Conference Topic

A Comparison of Textual Data Mining Methods for Sex Identification in Chat Conversations

Mining textual data in chat mediums is becoming more important because these mediums contain a vast amount of information that is potentially relevant to a society's current interests, habits, social behaviors, crime tendencies, and other tendencies. Here, sex identification is taken as a base study in information mining in chat mediums. To this end, a simple discrimination function and a semantic analysis method are proposed for sex identification in Turkish chat mediums. The proposed sex identification method is then compared with the Support Vector Machine (SVM) and Naive Bayes (NB) methods. The results show that the proposed system achieves an accuracy of over 90% in sex identification.

Mining Chat Conversations; Sex Identification; Information Extraction; Text Mining; Machine Learning

Cemal Köse, Özcan Özyurt, Cevat İkibaş

Department of Computer Engineering, Faculty of Engineering, Karadeniz Technical University, 61080 Trabzon, Turkey

International conference

4th Asia Information Retrieval Symposium (AIRS 2008)

Harbin

English

638-643

2008-01-16 (date first posted online on the Wanfang platform; not the paper's publication date)