Segmentation of Symbols
A recognizer can view symbols at any granularity. For instance, most
handwriting recognizers see individual letters and numerals as symbols. A recognizer for
cursive writing, on the other hand, may see a complete word as a single symbol
without distinguishing each letter of the word.
No matter how it views symbols, a recognizer must separate them within a
stream of written symbols, a process called segmentation. Segmenting letters
is much easier if the application provides box guides. In this case, the
recognizer can assume that strokes lying within a box constitute a single
character. Accurate segmentation is more difficult for unguided text.
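The boxed case can be illustrated with a minimal sketch. It is not part of the Pen API: the Stroke type, the segment_by_boxes function, and the parameters box_left, box_width, and box_count are hypothetical names chosen for illustration. The sketch assumes a single row of equal-width boxes and assigns each stroke to the box that contains the horizontal center of its bounding box.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Stroke:
    """Hypothetical stroke type: an ordered list of (x, y) pen coordinates."""
    points: list


def stroke_center_x(stroke):
    """Return the horizontal center of the stroke's bounding box."""
    xs = [x for x, _ in stroke.points]
    return (min(xs) + max(xs)) / 2


def segment_by_boxes(strokes, box_left, box_width, box_count):
    """Group strokes by the box guide that contains each stroke's center.

    Assumes one row of box_count equal-width boxes starting at box_left.
    Strokes whose center falls outside the row are dropped in this sketch.
    """
    groups = defaultdict(list)
    for stroke in strokes:
        index = int((stroke_center_x(stroke) - box_left) // box_width)
        if 0 <= index < box_count:
            groups[index].append(stroke)
    # One list of strokes per box; each list stands for a single character.
    return [groups.get(i, []) for i in range(box_count)]
```

Using the center of the bounding box rather than requiring the whole stroke to lie inside the box is a design choice of this sketch; it tolerates strokes that spill slightly over a box edge.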
Segmentation is a crucial issue for recognizing different handwriting styles.
The following table lists the forms of input in decreasing order of constraint
on the user. The information in the table is taken from IBM Research Report RC
11175, No. 50249 (May 21, 1985), An Adaptive System for Handwriting Recognition,
by C. C. Tappert.
||Each character appears within its own box.
||A set of strokes in a given space belongs to the same character. (This is also called external segmentation.)
|Discrete run on
||Printed characters can overlap.
||Letters are connected by ligatures. The recognizer must either identify discrete letters or interpret a whole word at a time.
||The recognizer can segment discrete, run-on, and cursive writing.
Figure 8.1 illustrates these various styles.
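For discrete input without boxes, where the strokes in a given space belong to the same character, one possible approach is to start a new character wherever the horizontal gap between strokes is wide enough. The following is a rough sketch under that assumption only; segment_by_gaps and min_gap are hypothetical names, not Pen API elements, and each stroke is reduced to its (left, right) horizontal extent.

```python
def segment_by_gaps(stroke_extents, min_gap):
    """Group stroke indices into characters by horizontal spacing.

    stroke_extents is a list of (left, right) pairs, one per stroke.
    A new group starts when a stroke begins at least min_gap to the
    right of everything already placed in the current group.
    """
    # Process strokes from left to right, regardless of writing order.
    order = sorted(range(len(stroke_extents)),
                   key=lambda i: stroke_extents[i][0])
    groups = []
    current_right = None
    for i in order:
        left, right = stroke_extents[i]
        if current_right is None or left - current_right >= min_gap:
            groups.append([i])          # gap is wide enough: new character
            current_right = right
        else:
            groups[-1].append(i)        # stroke belongs to current character
            current_right = max(current_right, right)
    return groups
```

A spacing rule of this kind breaks down for run-on input, where printed characters can overlap, and for cursive input, where ligatures connect the letters.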
The Pen API places few restrictions on the recognizer. At a minimum, however,
a default recognizer must be able to recognize discrete characters because many
applications do not use boxed input.