Thursday, August 6, 2009

Activity 10: Preprocessing Text

When writing by hand, the paper we use is usually ruled with lines. In this activity, we extract handwritten text from such paper by applying different image processing techniques to remove the lines and isolate the text.
Scanned image of a piece of paper with handwritten text.

We crop a portion of the above image and try to segment the handwritten letters individually.
The image was first rotated by an angle of ~1.219° using mogrify to straighten the ruled lines.
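The author worked in Scilab with the SIP toolbox's mogrify; as a rough Python/NumPy stand-in (the function name and the use of scipy.ndimage are my own, not from the post), the straightening step could look like:

```python
import numpy as np
from scipy import ndimage

def straighten(img, angle_deg=1.219):
    """Rotate a scanned page so the ruled lines become horizontal.

    The angle value is the one quoted in the post; reshape=False keeps
    the original image size, and mode='nearest' avoids dark borders.
    """
    return ndimage.rotate(img, angle_deg, reshape=False, mode='nearest')
```

The exact angle would normally be estimated per scan, e.g. from the dominant line orientation.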

The lines can be removed by filtering in Fourier space: since the ruled lines are horizontal, their energy lies along the vertical axis of the frequency domain, where it can be masked out.
Filtering in Fourier space.
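A minimal NumPy sketch of this filtering step (the strip half-width and the number of low frequencies spared around DC are illustrative choices, not values from the post):

```python
import numpy as np

def remove_horizontal_lines(img, halfwidth=1, keep_dc=2):
    """Suppress horizontal ruled lines by masking Fourier space.

    Horizontal lines put their energy in a thin vertical strip of the
    (shifted) spectrum; we zero that strip but spare a small window
    around DC so the overall brightness is preserved.
    """
    F = np.fft.fftshift(np.fft.fft2(img))
    rows, cols = F.shape
    cy, cx = rows // 2, cols // 2
    mask = np.ones_like(F)
    mask[:, cx - halfwidth:cx + halfwidth + 1] = 0
    mask[cy - keep_dc:cy + keep_dc + 1, cx - halfwidth:cx + halfwidth + 1] = 1
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))
```

Because handwritten strokes have energy spread over many orientations, they survive the mask largely intact while the periodic lines are removed.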

The filtered image was further polished using Scilab's sharpen and enhance operations (e.g., mogrify(img, ['-sharpen','2'])) to highlight the letters better. Enhancement removes noise, which helps when thresholding the image.
Sharpening and enhancing the image for thresholding.
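The post sharpens via ImageMagick through mogrify; a common equivalent that can be sketched directly in NumPy/SciPy is unsharp masking (sigma and amount below are illustrative, not the post's settings):

```python
import numpy as np
from scipy import ndimage

def unsharp_mask(img, sigma=2.0, amount=1.0):
    """Sharpen by subtracting a Gaussian-blurred copy (unsharp mask).

    The blurred image holds the low frequencies; adding back the
    difference (img - blurred) boosts edges such as letter strokes.
    """
    blurred = ndimage.gaussian_filter(img, sigma)
    return img + amount * (img - blurred)
```

The overshoot this produces at stroke boundaries is what makes the subsequent threshold separate letters from background more cleanly.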

Note that after thresholding, the image is noisy. We apply a closing operator followed by an opening operator to clean the image and segment the letters as much as possible.
From leftmost: (a) structuring element (b) result of closing operator and (c) result of opening operator.
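The clean-up step above can be sketched with SciPy's binary morphology; a small square structuring element is assumed here for illustration (the post's actual element is the one shown in its figure):

```python
import numpy as np
from scipy import ndimage

def clean_binary(bw, size=2):
    """Clean a thresholded binary image: closing, then opening.

    Closing (dilation then erosion) fills small gaps inside strokes;
    opening (erosion then dilation) removes isolated specks of noise.
    """
    se = np.ones((size, size), dtype=bool)
    closed = ndimage.binary_closing(bw, structure=se)
    return ndimage.binary_opening(closed, structure=se)
```

Order matters: closing first repairs broken strokes so the opening does not erase thin but genuine letter parts along with the noise.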

We use Scilab's built-in thin function to reduce the strokes to 1-pixel width. Scilab's bwlabel was then applied to label each connected blob, each of which represents a segmented letter.
Segmented letters using bwlabel and thin.
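The labeling half of this step maps directly onto scipy.ndimage.label (for the thinning half, skimage.morphology.skeletonize would be the usual Python counterpart of Scilab's thin; it is only mentioned here, not used):

```python
import numpy as np
from scipy import ndimage

def label_letters(bw):
    """Analogue of Scilab's bwlabel: number each connected blob.

    Returns an integer image where every connected component gets a
    distinct label (1..n), plus the component count n.
    """
    labels, n = ndimage.label(bw)
    return labels, n
```

Each label can then be extracted as its own mask (labels == k) to treat the letters one at a time.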

Note that although we are able to thin the letters to 1-pixel width, the segmentation is not perfect and still needs a lot of improvement.

Recommendation:
For the image above, the handwritten letters were written with a blue pen. Using color segmentation, we can separate the handwritten text without having to worry about Fourier filtering. Morphological operations can then be used to segment the image better. Note that most of the error in the segmentation above comes from thresholding the filtered image; segmenting by color might improve the reconstruction.
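One crude way the recommended color segmentation could be sketched (the function, the channel rule, and the margin value are all my own illustrative assumptions, not something implemented in the post):

```python
import numpy as np

def blue_ink_mask(rgb, margin=20):
    """Select blue-ink pixels in an RGB scan.

    Keeps pixels whose blue channel exceeds both red and green by a
    margin; white paper (all channels high) and gray ruled lines
    (all channels similar) fail this test, so no line filtering is
    needed. The margin is an untuned example value.
    """
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    return (b > r + margin) & (b > g + margin)
```

A more robust version might threshold in HSV or use histogram backprojection from a sample patch of ink, but the channel-difference rule already illustrates why color sidesteps the thresholding errors.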

For the next part of this activity, we find other instances of the word DESCRIPTION in the original image. We simply correlate the original image with an image of the word and find the locations where the resulting pixel values are highest.
The images to be correlated: (a) the binarized image and (b) the image of the word DESCRIPTION.
Result of the correlation. The bright dots correspond to high correlation and mark the locations of the word DESCRIPTION.
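This template-matching step can be sketched with an FFT-based cross-correlation (a standard formulation; the function name is mine):

```python
import numpy as np

def correlate_template(img, template):
    """Cross-correlate an image with a template via the FFT.

    Multiplying the image spectrum by the conjugate of the template
    spectrum and inverting gives the correlation surface; its peaks
    mark positions where the template matches the image.
    """
    F = np.fft.fft2(img)
    T = np.fft.fft2(template, s=img.shape)
    return np.real(np.fft.ifft2(F * np.conj(T)))
```

In practice the peaks are picked out by thresholding the correlation surface near its maximum, which is what produces the bright dots in the figure above.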

In this activity, I give myself a grade of 8 for a below-average reconstruction. However, since I recommended an alternative solution to the problem, I give myself a grade of 8.5 (bonus of 0.5) =).

References
[1] App Physics 186 activity 10 manual
