Article ID: | iaor19982953 |
Country: | United States |
Volume: | 27 |
Issue: | 10 |
Start Page Number: | 1345 |
End Page Number: | 1363 |
Publication Date: | Oct 1994 |
Journal: | Pattern Recognition |
Authors: | Bose C.B., Kuo S.S. |
Keywords: | programming: dynamic |
We have applied a hidden Markov model (HMM) and a level-building dynamic programming algorithm to the problem of robust machine recognition of connected and degraded characters forming words in poorly printed text. The recognition system consists of preprocessing, subcharacter segmentation and feature extraction, followed by either supervised learning or recognition. A structural analysis algorithm segments a word into subcharacter segments irrespective of the character boundaries and identifies the primitive features in each segment, such as strokes and arcs. The states of the HMM for each character are statistically represented by the subcharacter segments, and the state characteristics are obtained by estimating the state probability functions from the training samples. To recognize an unknown word, subcharacter segmentation and feature extraction are performed, and the transition probabilities between character models govern the transitions between characters in the string. A level-building dynamic programming algorithm combines segmentation and recognition of the word in one operation and chooses the most probable grouping of segments into characters. Computer experiments demonstrate the robustness and effectiveness of the new system for recognizing words formed by degraded and connected characters.
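The idea of combining segmentation and recognition in one dynamic program can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no model details, so everything here is an assumption made for illustration. Each character model is reduced to a list of states that each emit exactly one subcharacter segment (no self-loops or skips), features are the hypothetical symbols `"stroke"` and `"arc"`, and `MODELS`, `TRANS`, `char_log_score` and `level_building` are invented names with toy probabilities.

```python
import math

NEG_INF = float("-inf")

def char_log_score(model, segments):
    """Log-probability that a character model emitted these segments.
    Assumption: one state per segment, so the lengths must match."""
    if len(model) != len(segments):
        return NEG_INF
    total = 0.0
    for state, seg in zip(model, segments):
        p = state.get(seg, 0.0)
        if p <= 0.0:
            return NEG_INF
        total += math.log(p)
    return total

def level_building(segments, models, trans, max_chars):
    """Return (best_log_prob, word) over all groupings of segments into characters.
    Level k holds the best score for k characters, keyed by (end_position, last_char)."""
    n = len(segments)
    levels = [{(0, None): (0.0, None)}]
    for level in range(1, max_chars + 1):
        current = {}
        for (start, prev), (score, _) in levels[level - 1].items():
            for char, model in models.items():
                end = start + len(model)
                if end > n:
                    continue
                # The first character has no predecessor, so no transition cost.
                p_trans = 1.0 if prev is None else trans.get((prev, char), 0.0)
                emit = char_log_score(model, segments[start:end])
                if p_trans <= 0.0 or emit == NEG_INF:
                    continue
                cand = score + math.log(p_trans) + emit
                key = (end, char)
                if key not in current or cand > current[key][0]:
                    current[key] = (cand, (start, prev))
        levels.append(current)

    # Choose the best hypothesis, at any level, that covers all segments.
    best_score, best = NEG_INF, None
    for level in range(1, max_chars + 1):
        for (end, char), (score, _) in levels[level].items():
            if end == n and score > best_score:
                best_score, best = score, (level, char)
    if best is None:
        return NEG_INF, ""

    # Trace back through the levels to recover the character string.
    level, char = best
    pos, chars = n, []
    while level > 0:
        chars.append(char)
        _, (pos, char) = levels[level][(pos, char)]
        level -= 1
    return best_score, "".join(reversed(chars))

# Toy models (assumed values): each state prefers one primitive feature.
STROKE = {"stroke": 0.9, "arc": 0.1}
ARC = {"arc": 0.9, "stroke": 0.1}
MODELS = {"i": [STROKE], "r": [STROKE, ARC], "n": [STROKE, ARC, STROKE]}
TRANS = {("i", "n"): 0.5, ("i", "r"): 0.3, ("r", "i"): 0.4,
         ("n", "i"): 0.5, ("i", "i"): 0.1, ("r", "n"): 0.3}

score, word = level_building(["stroke", "stroke", "arc", "stroke"],
                             MODELS, TRANS, max_chars=4)
print(word, score)  # word is "in" under these toy numbers
```

In this sketch the segment sequence is never cut into characters beforehand; the character boundaries fall out of the backtrace, which is the sense in which segmentation and recognition happen in one operation. A fuller model would allow each state to absorb a variable number of segments and would score complete left-to-right HMMs with the Viterbi algorithm inside each level.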