CGT code-based XML Data Compression Method
XML is a de facto standard for exchanging and presenting information on the Web. However, XML data is also recognized as verbose: repeated tags and structures heavily inflate the size of the data. This verbosity poses many challenges for conventional query processing and data exchange, and compression is an important way to overcome it. Based on the features of XML documents, we put forward a new XML data compression method called CGTXDC. It uses XML Schema to construct an XML document tree that captures the structure information of the XML document, and it adopts CGT code to encode each tree node so that the structure of the original XML document is maintained. CGTXDC requires only a single pass over the input XML document during compression and does not need to build the document tree in memory. Experimental results show a much better compression ratio than that of representative XML compression methods such as XPRESS and XGrind.
XML document tree; XML Schema; CGT code; data compression
Sheng Zhang; Sha Chen; Yuping Liang
School of Computer Science and Technology, Nanchang Hangkong University, Nanchang, Jiangxi Province 33; State-owned Assets Management Office, Huanggang Normal University, Huanggang, Hubei Province 438000, C
International conference
Second International Symposium on Electronic Commerce and Security (ISECS 2009)
Nanchang
English
1112-1115
2009-05-22 (date first made available online on the Wanfang platform; not the paper's publication date)