Video Script Identification based on Text Lines
In this paper, we present a new method for video script identification, which is essential for choosing an appropriate OCR engine to recognize text lines when a video frame contains more than one language. The input for script identification is the set of text lines obtained by our text detection method. We extract the upper and lower extreme points of each connected component of the Canny edges of a text line. The extracted points are connected to form upper and lower lines, whose behavior we then study. The direction of each 10-pixel segment of these lines is determined using PCA. The average angle of the segments of the upper and lower lines is computed to study the smoothness and cursiveness of the lines. In addition, to discriminate the scripts accurately, the method divides a text line horizontally into five equal zones and studies the smoothness and cursiveness of the upper and lower lines in each zone. We evaluate the method by conducting experiments on different combinations of languages: English and Chinese; English and Tamil; Chinese and Tamil; and English, Chinese and Tamil.
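The segment-direction and zoning steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the upper (or lower) line is already available as a list of (x, y) points sampled once per pixel column, so the Canny edge and connected-component steps are out of scope. The function names, the use of the absolute angle when averaging, and the reading of "five equal zones" as five equal spans along the length of the text line are all assumptions made for illustration. The PCA direction of 2-D points is computed in closed form from the covariance terms.

```python
import math

def principal_angle(points):
    """Angle in degrees (between -90 and 90) of the first principal axis of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Closed-form orientation of the dominant eigenvector of the 2x2 covariance matrix.
    return math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy))

def mean_segment_angle(line, seg_len=10):
    """Average absolute PCA angle over consecutive seg_len-point segments (assumed measure)."""
    angles = [abs(principal_angle(line[i:i + seg_len]))
              for i in range(0, len(line) - seg_len + 1, seg_len)]
    return sum(angles) / len(angles) if angles else 0.0

def zone_features(line, zones=5, seg_len=10):
    """Split the line into equal zones along its length; one mean angle per zone."""
    size = len(line) // zones
    return [mean_segment_angle(line[z * size:(z + 1) * size], seg_len)
            for z in range(zones)]

# Hypothetical profiles: a flat line and a jagged one.
flat = [(x, 0) for x in range(100)]
zigzag = [(x, x % 2) for x in range(100)]
print(zone_features(flat))
print(zone_features(zigzag))
```

Under these assumptions, a smooth line gives mean angles near zero, while a jagged line gives larger values. How such values are combined or thresholded to separate scripts is not stated in the abstract and is left open here.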
Trung Quy Phan, Palaiahnakote Shivakumara, Zhang Ding, Shijian Lu, Chew Lim Tan
School of Computing, National University of Singapore, Singapore; Institute for Infocomm Research, Singapore
International conference
Beijing
English
1240-1244
2011-09-01 (date first made available online on the Wanfang platform; not the paper's publication date)