Segmentation of Symbols

A recognizer can view symbols at any granularity. For instance, most handwriting recognizers see individual letters and numerals as symbols. A recognizer for cursive writing, on the other hand, may see a complete word as a single symbol without distinguishing each letter of the word.

No matter how it views symbols, a recognizer must separate them within a stream of written symbols, a process called segmentation. The task of segmenting letters is greatly facilitated if the application provides box guides. In this case, the recognizer can assume that strokes lying within a box constitute a single character. The problem of accurate segmentation becomes more difficult for unguided text.
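As a minimal sketch of guided segmentation, the following Python fragment assigns each stroke to a box and treats the strokes in one box as one character. It is not part of the Pen API. The function name segment_by_box, the representation of a stroke as a list of (x, y) points, and the single row of equal-width boxes are all assumptions made for illustration.

```python
from collections import defaultdict


def segment_by_box(strokes, box_width, origin_x=0.0):
    """Group strokes into characters, one character per box guide.

    Hypothetical helper: each stroke is a list of (x, y) points, and the
    boxes form a single row of equal width starting at origin_x.
    """
    boxes = defaultdict(list)
    for stroke in strokes:
        xs = [x for x, _ in stroke]
        # Use the horizontal center of the stroke so that a stroke that
        # slightly crosses a box edge still lands in the intended box.
        center_x = (min(xs) + max(xs)) / 2
        index = int((center_x - origin_x) // box_width)
        boxes[index].append(stroke)
    # Return the groups in left-to-right box order, skipping empty boxes.
    return [boxes[i] for i in sorted(boxes)]
```

Assigning by stroke center rather than by first point is a design choice in this sketch; a real recognizer might apply a different rule for strokes that spill over a box edge.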

Segmentation is a crucial issue in recognizing different handwriting styles. The following table lists the forms of input in decreasing order of constraint on the user. The information in the table is taken from C. C. Tappert, An Adaptive System for Handwriting Recognition, IBM Research Report RC 11175, No. 50249 (May 21, 1985).

Input form        Description
Boxed input       Each character appears within its own box.
Discrete spaced   A set of strokes in a given space belongs to the same character. (This is also called external segmentation.)
Discrete run-on   Printed characters can overlap.
Cursive           Letters are connected by ligatures. The recognizer must either identify discrete letters or interpret a whole word at a time.
Mixed             The recognizer can segment discrete, run-on, and cursive writing.
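For discrete spaced input (external segmentation), one simple approach is to start a new character wherever the horizontal gap between strokes exceeds a threshold. The Python fragment below is a sketch under stated assumptions, not the method used by any particular recognizer. The name segment_by_gap, the min_gap threshold, and the representation of a stroke as a list of (x, y) points are hypothetical.

```python
def segment_by_gap(strokes, min_gap):
    """Split strokes into characters at horizontal gaps of at least min_gap.

    Hypothetical helper: each stroke is a list of (x, y) points written
    on a single horizontal line of text.
    """
    # Compute each stroke's horizontal extent and order strokes by left edge.
    extents = sorted(
        ((min(x for x, _ in s), max(x for x, _ in s), s) for s in strokes),
        key=lambda extent: extent[0],
    )
    groups = []
    right_edge = None  # right edge of the character currently being built
    for left, right, stroke in extents:
        if right_edge is None or left - right_edge >= min_gap:
            # A wide enough gap (or the first stroke) starts a new character.
            groups.append([stroke])
            right_edge = right
        else:
            # The stroke overlaps or nearly touches the current character.
            groups[-1].append(stroke)
            right_edge = max(right_edge, right)
    return groups
```

A gap test of this kind fails when characters overlap or are joined by ligatures, because no gap separates them. Those forms of input require the recognizer to segment the writing as part of recognition itself.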

Figure 8.1 illustrates these various styles.

The Pen API places few restrictions on the recognizer. At a minimum, however, a default recognizer must be able to recognize discrete characters because many applications do not use boxed input.
