Abstract
Digital text written in an Indian script is difficult to use as is, because many font formats are available for typing and these formats are not mutually compatible. Gurmukhi alone has more than 225 popular ASCII-based fonts, and Devanagari has 180. To read text written in a particular font, that font must be installed on the system. This paper describes a language- and font-detection system for Gurmukhi and Devanagari, together with a font conversion system that converts ASCII-based text into Unicode. The proposed system works in two stages: the first stage applies a statistical model for automatic language detection (i.e., Gurmukhi or Devanagari) and font detection; the second stage converts the text into Unicode according to the detected font. Although we could not train the system for some fonts because font converters were not available, the system and its architecture are open to accepting any number of languages and fonts in the future. The existing system supports around 150 popular Gurmukhi font encodings and more than 100 popular Devanagari fonts. We demonstrate that font detection is 99.6% accurate and that Unicode conversion is 100% accurate in all cases.
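The two-stage design described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the statistical model, so this example assumes a simple character-frequency profile per font, scored by log-likelihood; it is not the authors' method. The font names (`toy-gurmukhi`, `toy-devanagari`), their mapping tables, and the training samples are all hypothetical and do not correspond to any real legacy font.

```python
import math
from collections import Counter

# Hypothetical legacy fonts: each maps ASCII characters to Unicode characters.
# These tables are illustrative only and are NOT real font encodings.
FONTS = {
    "toy-gurmukhi": {
        "language": "Gurmukhi",
        "map": {"k": "\u0A15", "m": "\u0A2E", "l": "\u0A32", "w": "\u0A3E"},
    },
    "toy-devanagari": {
        "language": "Devanagari",
        "map": {"d": "\u0915", "e": "\u092E", "y": "\u0932", "k": "\u093E"},
    },
}

# Hypothetical ASCII training samples typed in each font.
SAMPLES = {
    "toy-gurmukhi": ["kwm", "kml", "lwl"],
    "toy-devanagari": ["dke", "dey", "yky"],
}


def train_profile(samples):
    """Build a relative character-frequency profile from sample strings."""
    counts = Counter("".join(samples))
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def score(text, profile, floor=1e-6):
    """Log-likelihood of text under a profile; unseen characters get a small floor."""
    return sum(math.log(profile.get(ch, floor)) for ch in text)


def detect_font(text, profiles):
    """Stage 1: pick the font whose profile gives the text the highest score."""
    return max(profiles, key=lambda name: score(text, profiles[name]))


def to_unicode(text, font_name):
    """Stage 2: convert character by character using the detected font's table."""
    table = FONTS[font_name]["map"]
    return "".join(table.get(ch, ch) for ch in text)


PROFILES = {name: train_profile(samples) for name, samples in SAMPLES.items()}

legacy_text = "kwl"
font = detect_font(legacy_text, PROFILES)
print(font, FONTS[font]["language"], to_unicode(legacy_text, font))
```

Because each font is tied to one language, detecting the font also yields the language. A practical converter would need more than a one-to-one table: legacy fonts often store pre-base vowel signs before the consonant, whereas Unicode stores them after it, so a reordering step would be required.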
Cite
Singh Lehal, G., Singh, T., & Kaur Buttar, S. P. (2014). Automatic Bilingual Legacy-Fonts Identification and Conversion System. Research in Computing Science, 86(1), 9–23. https://doi.org/10.13053/rcs-86-1-1