IJCATR Volume 4 Issue 2

Haralick Texture Features based Syriac(Assyrian) and English or Arabic documents Classification

Basima Z.Yacob
10.7753/IJCATR0402.1006
Keywords: Syriac script; Haralick Texture Features; OCR; English script; Arabic script; knn algorithm

Script identification is an essential step before running a script-specific OCR system. Automatic script identification from document images supports many important applications, such as sorting and transcription of multilingual documents and indexing of large collections of such images, and it serves as a precursor to optical character recognition (OCR). This paper distinguishes Syriac documents from English documents, and Syriac documents from Arabic documents, by extracting Haralick texture features. Texture is investigated as a tool for determining the script of a document image, based on the observation that text has a distinct visual texture. A k-nearest neighbour algorithm is then used to classify 300 text blocks into one of two scripts, either Syriac and English or Syriac and Arabic, based on the Haralick texture features. The text was presented to the system at different rotation angles between 0º and 135º, and the recognition results were good.
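The pipeline described in the abstract (co-occurrence texture features followed by k-nearest-neighbour voting) can be illustrated with a minimal sketch. The abstract does not state which Haralick features, how many gray levels, which pixel distance, or which value of k were used, so everything below is an assumption: four common features (contrast, energy, homogeneity, entropy), a pixel distance of 1, the four customary directions of 0º, 45º, 90º and 135º, and k = 3. The function names `glcm`, `haralick_features` and `knn_predict` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Pixel offsets (row, col) for the four customary co-occurrence directions.
# Assumption: distance 1; the paper does not state the distance it used.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, levels, angle):
    """Symmetric, normalized gray-level co-occurrence matrix for one direction.

    `image` is a list of rows whose values are integers in [0, levels).
    """
    dr, dc = OFFSETS[angle]
    rows, cols = len(image), len(image[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # count the pair in both orders
                counts[j][i] += 1  # so the matrix is symmetric
    total = sum(sum(row) for row in counts)
    if total == 0:
        return counts
    return [[v / total for v in row] for row in counts]


def haralick_features(image, levels=2):
    """Return [contrast, energy, homogeneity, entropy], averaged over directions.

    Assumption: this feature subset is chosen for illustration only.
    """
    sums = [0.0, 0.0, 0.0, 0.0]
    for angle in OFFSETS:
        p = glcm(image, levels, angle)
        for i in range(levels):
            for j in range(levels):
                v = p[i][j]
                sums[0] += v * (i - j) ** 2       # contrast
                sums[1] += v * v                  # energy (angular second moment)
                sums[2] += v / (1 + (i - j) ** 2)  # homogeneity
                if v > 0:
                    sums[3] -= v * math.log2(v)   # entropy
    return [s / len(OFFSETS) for s in sums]


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to `query`.

    `train` is a list of (feature_vector, label) pairs; Euclidean distance.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

In use, each text block would be binarized or quantized to `levels` gray values, passed through `haralick_features`, and the resulting vector classified with `knn_predict` against labelled training blocks. Averaging over the four directions is one common way to reduce sensitivity to orientation; whether the paper averages or keeps per-direction features is not stated in the abstract.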
@article{b422015ijcatr04021006,
Title = "Haralick Texture Features based Syriac(Assyrian) and English or Arabic documents Classification",
Journal ="International Journal of Computer Applications Technology and Research(IJCATR)",
Volume = "4",
Issue ="2",
Pages ="120 - 123",
Year = "2015",
Authors ="Basima Z.Yacob"}