会议专题

Very High Accuracy and Fast Dependency Parsing is not a Contradiction

In addition to a high accuracy, short parsing and training times are the most important properties of a parser. However, parsing and training times are still relatively long. To determine why, we analyzed the time usage of a dependency parser. We illustrate that the mapping of the features onto their weights in the support vector machine is the major factor in time complexity. To resolve this problem, we implemented the passive-aggressive perceptron algorithm as a Hash Kernel. The Hash Kernel substantially improves the parsing times and takes into account the features of negative examples built during the training. This has lead to a higher accuracy. We could further increase the parsing and training speed with a parallel feature extraction and a parallel parsing algorithm. We are convinced that the Hash Kernel and the parallelization can be applied successful to other NLP applications as well such as transition based dependency parsers, phrase structrue parsers, and machine translation.

Bernd Bohnet

University of Stuttgart Institut f¨ur Maschinelle Sprachverarbeitung

国际会议

The 23rd International Conference on Computational Linguistics(第23届国际计算语言学大会)

北京

英文

89-97

2010-08-01(万方平台首次上网日期,不代表论文的发表时间)