Conference paper

An Effective Feature Representation of Web Log Data by Leveraging Byte Pair Encoding and TF-IDF

  Web log data analysis is important in intrusion detection, and various machine learning techniques have been applied to it. However, compared to the abundant research on machine learning, ways to extract features from log data remain underexplored. In this paper, we present an effective feature extraction approach that leverages Byte Pair Encoding (BPE) and Term Frequency-Inverse Document Frequency (TF-IDF). We have applied this approach to various downstream machine learning algorithms and demonstrated its usefulness.

Web Log Data Analysis; Feature Representation; BPE; TF-IDF; Machine Learning

Junlang Zhan, Xuan Liao, Yukun Bao, Lu Gan, Zhiwen Tan, Mengxue Zhang, Ruan He, Jialiang Lu

Shanghai Jiao Tong University, Shanghai, China; Tencent, Shenzhen, China

International conference

2019 China Turing Conference (ACM Turing Celebration Conference - China 2019)

Chengdu

English

607-612

2019-05-17 (date first posted online on the Wanfang platform; not the paper's publication date)