Thursday, August 6, 2009

Activity 12: Color Image Segmentation

Segmentation by thresholding will not always work, especially when the region of interest has the same grayscale value as the background. In such cases, segmentation can be done using the color of the region of interest. In this activity, we're going to demonstrate parametric and non-parametric color segmentation.

Normalized Chromaticity Coordinates (NCC)
Before we proceed with color segmentation, it is convenient to convert the RGB space into NCC space. NCC has the advantage of separating the chromaticity and intensity of the image, which is ideal for representing 3D objects. Essentially, a 3D object obtains its 3D effect through shading variations.
To convert from RGB space to NCC space, we do the following for each pixel: let I = R + G + B, then r = R/I, g = G/I, and b = B/I.
Note that the chromaticity of the image is essentially reduced to two dimensions, since b can be obtained from r and g using b = 1 - (r + g). The image below represents the normalized chromaticity coordinates, where the x-axis represents r and the y-axis represents g.
Normalized chromaticity coordinates.
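
As a concrete sketch, the conversion can be done in Scilab along these lines (SIP's imread is assumed, and the file name is a placeholder):

    // RGB -> NCC conversion (sketch; assumes pixel values scaled to [0,1])
    img = imread('image.jpg');
    I = img(:,:,1) + img(:,:,2) + img(:,:,3);   // per-pixel intensity R+G+B
    I(find(I == 0)) = 1;                        // guard against division by zero
    r = img(:,:,1)./I;                          // chromaticity r
    g = img(:,:,2)./I;                          // chromaticity g
    // b = 1 - (r + g), so only r and g need to be kept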

Color Segmentation: Parametric vs Non-parametric
To segment the image using its color, we must determine the probability that a pixel belongs to the color of interest. To do this, we extract the histogram of the color of interest and normalize it by the number of pixels to obtain a PDF. In NCC space, chromaticity is reduced to two dimensions, r and g, so we need a PDF for each. If we assume that r and g are independent, we can obtain the PDF for each separately and multiply them to obtain the joint PDF.

Parametric Segmentation
Parametric segmentation fits an analytic PDF to the color of interest. Assuming a Gaussian distribution independently along the r and g axes, the PDF along r is p(r) = (1/(σ_r √(2π))) exp(−(r − μ_r)²/(2σ_r²)), where μ_r is the mean and σ_r the standard deviation of the r values of the color of interest, and r is the NCC r value of an image pixel. Essentially, this PDF highlights pixels whose values are near the mean and darkens those that deviate from it. We obtain a similar PDF p(g) for the NCC g values. The joint probability is obtained by multiplying p(r) and p(g); it highlights all pixels whose values are near the desired color of interest.
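
A minimal Scilab sketch of this step, reusing r and g from the NCC conversion above and assuming rp and gp hold the chromaticities of a cropped patch of the color of interest (computed from the patch the same way):

    // Gaussian parameters estimated from the cropped patch
    mur = mean(rp);  sigr = stdev(rp);
    mug = mean(gp);  sigg = stdev(gp);

    // independent Gaussian PDFs along r and g
    pr = exp(-((r - mur).^2)/(2*sigr^2))/(sigr*sqrt(2*%pi));
    pg = exp(-((g - mug).^2)/(2*sigg^2))/(sigg*sqrt(2*%pi));

    // joint probability: bright where a pixel's color is near the color of interest
    pjoint = pr.*pg;
    imshow(pjoint/max(pjoint));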

Non-parametric Segmentation
The drawback of parametric segmentation is that it assumes a PDF independent of the actual pixel values of the image. In non-parametric segmentation, we base the segmentation on the histogram of the color of interest itself and backproject it onto the image. Essentially, we obtain the 2D histogram of the color of interest after converting it to NCC space. The resulting histogram is a 2D matrix with the r-axis along x, the g-axis along y, and the intensity representing the count (frequency). For each pixel in the image, we backproject the obtained histogram. The steps are as follows (a sketch follows the list):
  • obtain the histogram of the color of interest, converting it to NCC space first
  • for each pixel in the image, obtain its NCC r and g values
  • using the r and g values, look up the corresponding value in the histogram
  • replace the pixel value of the image with the value obtained from the histogram
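
A rough Scilab sketch of these steps, with r, g, rp, and gp as in the sketches above (the bin count of 32 is an arbitrary choice):

    // 2D chromaticity histogram of the patch
    nbins = 32;
    hst = zeros(nbins, nbins);
    ri = round(rp*(nbins - 1)) + 1;             // quantize r in [0,1] to bin indices
    gi = round(gp*(nbins - 1)) + 1;
    for k = 1:length(ri)
        hst(ri(k), gi(k)) = hst(ri(k), gi(k)) + 1;
    end
    hst = hst/sum(hst);                         // normalize to a PDF

    // backprojection: replace each pixel with its histogram value
    [nr, nc] = size(r);
    seg = zeros(nr, nc);
    for i = 1:nr
        for j = 1:nc
            seg(i, j) = hst(round(r(i,j)*(nbins - 1)) + 1, round(g(i,j)*(nbins - 1)) + 1);
        end
    end
    imshow(seg/max(seg));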
Below are reconstructed images of parametric and non-parametric segmentation.
Parametric reconstruction. Observe that the segmentation is not perfect and contains two or more colors.

Non-parametric segmentation. The images are segmented quite well except for the brown and yellow hounds. Image was taken from source [2].

Parametric reconstruction. Note that for the green shoe, artifacts of the blue shoe are present in the segmentation. Image was taken from source [3].

Non-parametric segmentation. Note that the segmentation is cleaner compared to the parametric segmentation.

In general, non-parametric segmentation gives better results in that it segments the colored objects cleanly and exclusively, compared to parametric segmentation. This is expected since in parametric segmentation we assumed a Gaussian PDF, whereas in non-parametric segmentation we base the segmentation solely on the image itself. It must be noted, however, that non-parametric segmentation depends heavily on the cropped color of interest. For this reason, parametric segmentation is probably more convenient for applications that require automatic color segmentation.

In this activity, I give myself a grade of 10 for segmenting the images properly.

Acknowledgement
I would like to acknowledge Jaya for useful discussions regarding the non-parametric segmentation.

References
[1] App Physics 186 Activity 12 Manual
[2] http://www.greytsoaps.com/greys.htm
[3] http://www.whatsalltheracquet.com/archives/pictures/renelacoste.jpg

Activity 11: Color Image Processing

When imaging using a detector (i.e., a camera), the amount of light detected depends on the sensitivity of the detector, the intensity of the illuminating light, and the reflectance properties of the object. Similarly, in digital imaging, a pixel is composed of red, green, and blue values overlaid in different proportions. The camera output per channel is an integral product of the reflectance spectrum of the object ρ(λ), the spectral power distribution of the light source S(λ), and the camera sensitivity η_C(λ): D_C = K_C ∫ S(λ) ρ(λ) η_C(λ) dλ for C ∈ {R, G, B}, with the balancing constant K_C = 1/∫ S(λ) η_C(λ) dλ.
The presence of the K values above normalizes out the effect of camera sensitivity to obtain what we call white balancing. Note that if K is removed, a camera's sensitivity to a certain spectral range will affect the overall color of the image. The kind of illumination also affects the color of the object, e.g., incandescent light appears yellowish compared to fluorescent light.

In this activity, we're going to apply two algorithms to achieve white balancing: the White Patch Algorithm and the Gray World Algorithm.

White Patch Algorithm
Observe from the equation above that the K's are essentially the inverse of the camera output when imaging a white object. White balancing here is thus just dividing the raw camera output by the image of the white object. This is precisely the White Patch Algorithm.

Gray World Algorithm

The gray world algorithm assumes that the average color of the world is gray. The averages of the red, green, and blue channels can then serve as the white balancing constants for the respective channels. For a given channel, dividing the pixel values by that channel's average makes it likely that many pixels end up with values greater than 1 (i.e., wherever pixel value > average pixel value). By normalization (dividing by the maximum) we can bring these back down to at most 1; however, this has the overall effect of darkening the image. In contrast, the white patch algorithm divides each channel by the pixel value of the 'white' patch, which most probably contains the highest pixel values in the image, so less darkening occurs.

In summary, the algorithm is as follows (a sketch in Scilab follows the list):
  • read the image
  • obtain the constants for white balancing
    • white patch algorithm: take a white patch in the image and average its pixel values for each channel; these averages serve as the white balancing constants
    • gray world algorithm: take the average of each channel and use these as the white balancing constants
    • note that before averaging, we remove all saturated pixels
  • divide each channel by the corresponding white balancing constant
  • normalize the image by dividing it by its maximum. Note that we avoided clipping since, in the gray world algorithm, many pixel values in each channel are likely greater than the channel average, which would make the image look saturated when clipped.
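
A Scilab sketch of both variants (SIP's imread/imshow assumed; the file name and the patch coordinates are placeholders):

    img = imread('scene.jpg');                  // RGB image, values in [0,1]

    // gray world: per-channel averages as the balancing constants
    // (saturated pixels should be excluded before averaging, as noted above)
    Kr = mean(img(:,:,1));
    Kg = mean(img(:,:,2));
    Kb = mean(img(:,:,3));

    // white patch alternative: average over a known white region instead, e.g.
    // patch = img(y1:y2, x1:x2, :);            // hypothetical white patch
    // Kr = mean(patch(:,:,1)); Kg = mean(patch(:,:,2)); Kb = mean(patch(:,:,3));

    // divide each channel by its constant, then normalize instead of clipping
    wb = img;
    wb(:,:,1) = img(:,:,1)/Kr;
    wb(:,:,2) = img(:,:,2)/Kg;
    wb(:,:,3) = img(:,:,3)/Kb;
    wb = wb/max(wb);
    imshow(wb);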
Below are images obtained under different lighting conditions and corrected using the white patch and gray world algorithms. They are arranged such that each column represents the white balance setting used by the camera and each row the algorithm applied. Raw means without applying either algorithm.

The picture below was taken around 9 AM in the CSRC garden (cloudy conditions).
In the image above, the raw, white patch, and gray world results have generally the same hue, but the gray world image is darker than the white patch and raw images. Generally, the raw image is still the best since it was taken using the camera's correct white balance setting.
The picture above has the wrong white balance set by the camera. The gray world algorithm was able to correct the image, adjusting the hue of the red and blue channels, and the same can be said of the white patch algorithm. However, the gray world result appears darker than the white patch result. Again we attribute this to the constants used in dividing the pixel values and to the normalization (see the discussion above).

We tried the algorithms under different lighting conditions; the results are shown below.

Image of a table in the CSRC lobby (daylight).
Note that the white patch and gray world algorithms were able to correct the "light blue hue" of the raw image's background.
Note that the raw image has a yellowish hue, which was corrected by the gray world and white patch algorithms.


Another image taken inside the room (i.e., under fluorescent lighting).
Note that the results of the gray world and white patch algorithms bring out the color of the table better than the raw image.
In this picture, we can see that the gray world algorithm performs better than the white patch algorithm. The raw image has a yellowish background hue, indicating that some channel dominates the others. Using the gray world algorithm, we take the average of each channel, reducing the effect of this dominance.


Another image taken inside the room, with different hues of red (fluorescent lighting conditions).
Observe how the gray world and white patch algorithms correct the image taken with the camera's incandescent white balance setting.

Generally, I think the gray world algorithm is more convenient and sometimes better than the white patch algorithm. Unless we can devise a way to automatically find the white patch, the white patch reconstruction will always depend on the patch we choose, whereas the gray world algorithm simply makes use of averaging. The drawback of the gray world algorithm is that it makes the image look darker. However, this can be remedied by contrast enhancement, e.g., using an exponential cumulative distribution function to brighten the image.

In this activity, I give myself a grade of 10 for obtaining the reconstructions and explaining the results.

Acknowledgement
I would like to acknowledge Kaye for lending me the McDo stuffed toy and Carmen for the blue notebook.

References
[1] App Physics 186 Activity 11 Manual

Activity 10: Preprocessing Text

When handwriting documents, the paper we use is usually ruled with lines. In this activity, we're going to extract handwritten text from these kinds of paper, applying different image processing techniques to remove the lines and extract the text.
Scanned image of a piece of paper with handwritten text.

We crop a portion of the above image and try to segment the handwritten text into individual letters.
The image was rotated by ~1.219° using mogrify.
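
The straightening amounts to a single call along these lines (SIP's mogrify passes the option string to ImageMagick, whose -rotate takes degrees; the sign and value depend on the measured tilt):

    img = mogrify(img, ['-rotate', '1.219']);   // straighten the scanned text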

The lines can be removed by filtering in Fourier space, noting that the horizontal lines produce frequencies along the vertical axis in frequency space.
Filtering in Fourier space.
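
A sketch of the filtering step (SIP's gray_imread assumed; the file name is a placeholder). In the unshifted FFT matrix, the vertical frequency axis is the first column, so it suffices to zero that column while keeping the DC term:

    txt = gray_imread('cropped_text.jpg');
    F = fft2(txt);                              // unshifted 2D spectrum
    F(2:$, 1) = 0;                              // zero the vertical frequency axis, keep DC at F(1,1)
    filtered = real(ifft(F));                   // back to image space; the ruling lines are suppressed
    imshow(filtered/max(filtered));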

The filtered image was further polished using Scilab's sharpen and enhance options (i.e., mogrify(img, ['-sharpen','2'])) to highlight the letters better. Enhance is applied to remove noise, which is useful when thresholding the image.
Sharpening and enhancing the image for thresholding.

Note that after thresholding, the image is noisy. We apply the closing operator followed by the opening operator to clean the image and segment the letters as much as possible.
From leftmost: (a) structuring element, (b) result of the closing operator, and (c) result of the opening operator.

We use Scilab's built-in thin function to reduce the letters to strokes of 1 pixel width. Scilab's bwlabel was then applied to separate each continuous blob, which represents a segmented letter; a sketch of these two calls follows the figure below.
Segmented letters using bwlabel and thin.
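
These two steps are roughly (SIP's thin and bwlabel assumed; bw is the cleaned binary image after closing and opening):

    skel = thin(bw);                            // reduce strokes to 1-pixel-wide skeletons
    [lbl, n] = bwlabel(skel);                   // label each connected blob (ideally one per letter)
    imshow(lbl/n);                              // show labels as distinct gray levels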

Note that although we are able to reduce the letters to 1 pixel width, the segmentation is not perfect and needs a lot of improvement.

Recommendation:
For the image above, the handwritten letters were written with a blue pen. Using color segmentation, we could separate the handwritten text without having to worry about filtering; morphological operations could then be used to segment the image better. It must be noted that most of the error in the above segmentation is due to thresholding after filtering the image. Segmentation by color might improve the reconstruction.

For the next part of this activity, we find other instances of the word DESCRIPTION in the original image. We simply correlate the original image with an image of the word and find the locations where the correlation is highest.
The images to be correlated: (a) the binarized image and (b) the image of the word DESCRIPTION.
Result of the correlation. The bright dots correspond to high correlation and mark the locations of the word DESCRIPTION.
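
One way to compute this correlation is via the FFT, as in the Scilab sketch below (file names are placeholders; the template image is assumed zero-padded to the same size as the page):

    page = gray_imread('binarized_page.jpg');
    tmpl = gray_imread('description_padded.jpg');    // image of the word DESCRIPTION
    corr = real(ifft(fft2(page).*conj(fft2(tmpl)))); // correlation via the FFT
    imshow(corr/max(corr));                          // brightest peaks mark the word's locations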

In this activity, I give myself a grade of 8 for a below-average reconstruction. However, I recommended an alternative solution to the problem, and for this reason I give myself a grade of 8.5 (bonus of 0.5) =).

References
[1] App Physics 186 Activity 10 Manual

Saturday, August 1, 2009

Activity 9: Binary Operations

In this activity, we're going to obtain the best estimate of cell area (pixel count) using the morphological and binary operations we have learned. Below is the image of the "simulated" cells that we're going to measure: punched paper circles imaged using a flatbed scanner.
Image of simulated cells: punched paper digitized using a scanner.

A quick look at the image tells us that we need to analyze several ROIs (regions of interest) to obtain the best estimate. Of course, we could pick an isolated cell and measure its area, but this will not always work, especially since the cell areas deviate slightly from one another and the cell shapes may not always be uniform. To obtain our regions of interest, we threshold and binarize the image for easier area measurement later on.
Below is the histogram of our image:
Histogram of the simulated cells.
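
For reference, a histogram like the one above can be produced with something like the following (SIP's gray_imread assumed; the file name is a placeholder):

    cells = gray_imread('circles.jpg');        // grayscale image, values in [0,1]
    histplot(256, cells(:));                   // 256-bin histogram of the pixel values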

We can see that most of the information in our image lies roughly between 0.5 and 0.85. Based on this, we can binarize our image using a threshold of around 0.85.

Statistically, if we obtain several measurements of different cells, we can plot the histogram of our measurements and, based on this histogram, obtain our best estimate of the cell area. This is precisely what we're going to do. We take several subimages (256 x 256) from the original image and perform area measurement on each. We do this by thresholding each subimage and performing morphological operations such as opening and closing to separate nearly touching cells and remove isolated spots. A discussion of the opening and closing operations is available in source [2].

In this activity, we take 20 subimages of size 256 x 256 at random positions in the image.
Subimages of size 256 x 256.
The subimages above were thresholded at around 0.820, resulting in the images below:
Thresholded subimages.

Observe the presence of nearly touching cells and isolated spots in the subimages. We can further clean the image by performing morphological operations. In this activity, we perform the opening operation on the above image using a circle of diameter 8 pixels as the structuring element. Note that we have to be careful when choosing the structuring element, as it can affect the measured sizes of the cells drastically.
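
A sketch of the thresholding and opening steps (SIP's erode and dilate assumed; the subimage offsets are placeholders, and the structuring element is a circle of diameter ~8 pixels built by hand):

    sub = cells(1:256, 1:256);                  // one 256x256 subimage
    bw = sub > 0.820;                           // threshold from the histogram (invert the test if cells are dark)

    // circular structuring element, diameter 8 pixels (9x9 grid, radius 4)
    se = zeros(9, 9);
    for i = 1:9
        for j = 1:9
            if (i - 5)^2 + (j - 5)^2 <= 16 then
                se(i, j) = 1;
            end
        end
    end

    // opening = erosion then dilation: removes specks and separates touching cells
    opened = dilate(erode(bw, se), se);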

Below are the resulting images after the opening operation.
Compared with the "unopened" images above, the resulting images are relatively clean, and we are able to separate several nearly touching cells. Although this separation is not 100% successful, we have enhanced the images for area measurement.

After performing morphological operations on our image, we are now in a position to measure areas by labeling all contiguous blobs and counting their pixels. We can do this in Scilab using the function bwlabel, which labels all contiguous blobs in an array. Below are the resulting images after applying bwlabel.
Labeled subimages using Scilab's bwlabel.

By looping over all the subimages and counting the area of each contiguous blob, we obtain a frequency distribution of cell areas. In our measurement, we disregarded areas greater than 800 pixels since, by visual inspection, a single cell's pixel area is clearly below this value. The counting loop is sketched below.
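
The loop is roughly (SIP's bwlabel assumed; opened is the cleaned binary subimage from the previous sketch, and areas should be initialized to [] before looping over all 20 subimages):

    [lbl, n] = bwlabel(opened);                 // label the contiguous blobs
    for k = 1:n
        a = length(find(lbl == k));             // blob area in pixels
        if a <= 800 then                        // discard merged blobs, as noted above
            areas = [areas, a];
        end
    end
    histplot(0:20:800, areas);                  // histogram with 20-pixel-wide bins

Below is the histogram of the obtained cell areas.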

Histogram of cell measurements.

From the histogram above, we obtain an area estimate of 530 ± 10 pixels, since we used bins 20 pixels wide. The number of cells with areas between 520 and 540 pixels is 87, significantly higher than for the other measured areas.

Since our approach is statistical, the more subimages, the better. Although we can use morphological operations to deal with nearly touching cells, it must be stressed that the estimation is much easier if the imaged cells are relatively sparse. We must keep this in mind when imaging cells in our experiments.

In this activity, I give myself a grade of 10 for performing the required measurement.

Acknowledgement
I would like to thank Irene for very useful conversations.

References
[1] App Physics 186 Activity 9 Manual
[2] http://en.wikipedia.org/wiki/Mathematical_morphology#Opening_and_Closing