UCSC DL
 

Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/4026

Title: Sinhala Intelligent Word Recognition with Content based Search Suggest
Authors: Kahandagamage, K. S.
Issue Date: 2017
Abstract: Optical Character Recognition (OCR) is a computer science approach to the offline character recognition problem. More advanced approaches, such as Intelligent Character Recognition and Intelligent Word Recognition, are better suited to unconstrained and cursive handwriting. Intelligent Word Recognition tries to recognize entire words rather than individual letters, which makes it a good approach for processing real-world documents with unconstrained (free-form), cursive and incomplete handwriting. This research mainly focuses on identifying multiline, unconstrained, cursive and incomplete Sinhala words in offline mode with higher accuracy. Identifying word lines in the scanned image and segmenting them into primitive components (characters or their parts) is treated as preprocessing. Image processing methods are used to remove noise, remove frames and underlines, and correct skew and slant, which increases recognition accuracy. A context-free, analytical approach is used to yield an optimal letter string in recognition. The optimal letter string is obtained by classifying the gradient features of each character; eight directions are considered for feature extraction. A search-suggest algorithm backed by an Ayurveda content-based corpus is used in post-processing to identify words. Natural language processing methods are used to match words by correcting misspelled and incomplete words. The scope of the research is limited to the Ayurveda domain, but it can be extended to any other domain by plugging in a domain-specific corpus. Prescriptions written by Sinhala Paaramparika Weda-Mahathwaru were used to validate the system, which achieved an accuracy of 62%.
URI: http://hdl.handle.net/123456789/4026
Appears in Collections: Master of Computer Science - 2017
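
The post-processing step described in the abstract (matching a recognized letter string against a domain corpus to correct misspelled and incomplete words) can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `suggest`, the thresholds, and the romanized corpus entries are hypothetical placeholders, and Python's standard `difflib` similarity ratio stands in for whatever matching method the thesis actually uses.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for a domain corpus (romanized placeholder entries,
# not taken from the thesis).
CORPUS = ["inguru", "kaha", "kottamalli", "kollu", "katuwelbatu"]


def suggest(letters, corpus=CORPUS, limit=3, min_score=0.6):
    """Rank corpus words against a recognized letter string.

    A corpus word that starts with the string (an incomplete word) scores 1.0;
    any other word is scored by its similarity ratio (a misspelled word).
    Words scoring below min_score are dropped.
    """
    scored = []
    for word in corpus:
        if word.startswith(letters):
            score = 1.0
        else:
            score = SequenceMatcher(None, letters, word).ratio()
        if score >= min_score:
            scored.append((score, word))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:limit]]
```

With this placeholder corpus, `suggest("kott")` completes the incomplete string to `["kottamalli"]`, and `suggest("ingaru")` corrects the misspelling to `["inguru"]`. Swapping `CORPUS` for another word list mirrors the abstract's point that the domain can be changed by plugging in a different corpus.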

Files in This Item:

14440387.pdf (5.37 MB, Adobe PDF)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

 
