Conference proceedings

A Large-Scale Repository of Deterministic Regular Expression Patterns and Its Applications

  Deterministic regular expressions (DREs) have been used in a myriad of areas in data management. However, to the best of our knowledge, there has been no large-scale repository of DREs in the literature. In this paper, based on a large corpus of data that we harvested from the Web, we build a large-scale repository of DREs by first collecting a repository after analyzing the determinism of the real data, and then further processing the data using normalized DREs to construct a compact repository of DREs, called the DRE pattern set. Finally, we use our DRE patterns as benchmark datasets for several algorithms that have previously lacked experiments on real DRE data. Experimental results demonstrate the usefulness of the repository.

Deterministic regular expressions; Repository; Evaluation

Haiming Chen, Yeting Li, Chunmei Dong, Xinyu Chu, Xiaoying Mou, Weidong Min

State Key Laboratory of Computer Science, ISCAS, Beijing 100190, China; State Key Laboratory of Computer Science, ISCAS, Beijing 100190, China; University of Chinese Academy of School of Software, Nanchang University, Nanchang, China

International conference

The 23rd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2019)

Macau

English

249-261

2019-04-14 (date first posted online on the Wanfang platform; not the paper's publication date)