A simple example of a feedforward neural network and image recognition

Introduction

The new field I've stepped into a little bit is image processing, or image recognition. It has some similarities with my previous field (signal processing), so it was not too difficult to try something new.

Today we're going to review a simple application which I wrote for my lab as a bonus for the final mark. The objective of the application is to recognize the areas of a picture that have the most trees (the greenest ones). All the code can be downloaded here; if you're comfortable analyzing code, you don't need to read the description below to understand how everything works.

Method

Firstly, we take a picture for training. Originally, the images are 1000×1000 pixels, but because I'm poor and my computer does not have much processing power or memory, each image is scaled by a factor of 0.40, which gives 400×400 pixels.

Sample image
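
In MATLAB, the loading and scaling step could look like the following sketch (the file name here is just an assumption):

I = imread('sample.png');          % hypothetical file name
I = im2double(imresize(I, 0.40));  % 1000x1000 -> 400x400, values in [0,1]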

The next step is to use a sliding window (not the sliding window protocol) to analyse the image. Basically, it is the same sliding window method as for a one-dimensional signal, but this time the samples are taken in two dimensions. The MATLAB implementation is extremely simple:

k = 1;                          % output row counter
for i = 2:size(I,1)-1           % rows (the image border is skipped)
  for j = 2:size(I,2)-1         % columns
    % flatten the 3x3 RGB window into one 27-value feature row
    Id(k,:) = reshape(I(i-1:i+1, j-1:j+1, :), 1, []);
    k = k + 1;
  end
end

The code illustrates the use of a 3×3 sliding window. The variable I is the image itself, and because it is an RGB image, the selected window has three dimensions (3×3×3). The most important information in the image for the forest-discovery problem is the green color, so the identification can be based on this feature alone. Instead of saving all the RGB information of the window, we can keep just the green channel:

k = 1;                          % output row counter
for i = 2:size(I,1)-1
  for j = 2:size(I,2)-1
    % mean of the green channel over the 3x3 window (one feature per pixel)
    Id(k,1) = mean(mean(I(i-1:i+1, j-1:j+1, 2)));
    k = k + 1;
  end
end

Using this code, we work with one-dimensional feature data instead of 27-dimensional data (3×3 window × 3 color channels), as it would be when using all the RGB information.

The next problem is telling when the window sees forest and when it sees something else. Labeling the data by hand would be some sort of suicidal work, which could take ages to complete. But there is a solution! It's called the k-means clustering algorithm. What k-means (KM) does is partition the data around cluster means, and logically the green-color mean of the forest areas should be the bigger one.

In MATLAB it is done quite simply:

IDX = kmeans(Id,2);

And this is it. The function returns a KM cluster label for every data sample.

The KM algorithm is a little bit random: the first time you run it, it may label the greenest zones as 1 and the less green zones as 2; run it again and the labels may be swapped around. How well KM performed the clustering can only be judged after classification, when the `forest` is (or is not) recognized. If you see that the algorithm does not work as well as it is supposed to, run it again to give KM one more chance (okay, maybe not only one chance, but it is still better than manually labeling the data).
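
One way to make the labels deterministic across runs (not in the original code, just a sketch) is to reorder them by the cluster centroids that kmeans returns as its second output:

[IDX, C] = kmeans(Id, 2);    % C holds the two cluster centroids
if C(1) < C(2)               % make label 1 always the greener cluster
  IDX = 3 - IDX;             % swap labels 1 <-> 2
end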

The next step after KM is to separate the original data into the two clusters, which we will need to train our Artificial Neural Network (a FeedForward Neural Network, FFNN) to do the classification.

P1 = Id(IDX==1,:);                      % samples from cluster 1
P2 = Id(IDX==2,:);                      % samples from cluster 2
number = min(size(P1,1), size(P2,1));   % size of the smaller cluster

Here, P1 holds one cluster of the data and P2 the other. We look for the cluster with fewer samples, because the training set should contain the same number of samples from both clusters.

The next step is to train our super-awesome (and slow) FFNN with 10 hidden neurons (the number of neurons was chosen arbitrarily):

net = feedforwardnet(10);                   % FFNN with 10 hidden neurons
P = [ P1(1:number,:); P2(1:number,:) ]';    % inputs: one sample per column
T = [ zeros(1,number) ones(1,number) ];     % targets: 0 and 1 for the two classes
net = train(net, P, T);
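
Once trained, the network can be applied to a new image to produce a map like the ones shown below. A minimal sketch, assuming `Id_new` is the feature vector of a new image of size `h`×`w`, built exactly like `Id` above:

y = net(Id_new');                   % network output for every window
labels = y > 0.5;                   % threshold into the two classes
map = reshape(labels, w-2, h-2)';   % back to (h-2)x(w-2) image layout
imshow(map);                        % white = one class, black = the other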

Training can take some time. The final algorithm does the training using 4 different pictures; the classification was then tested on 15 other pictures. Here are some examples of the classification done by this algorithm:

Acceptable classification result

Not acceptable classification result

On the left side of the illustrations there is the original image, in the middle the classification result, and on the right side the mapped data. As we can see, the classification is sometimes quite good, but sometimes it fails completely.

This algorithm requires more work (and a more powerful computer to run on), but as a starting point it is quite good. A possible additional feature to extract is the mean of the yellow color (to classify out the roads).
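
There is no yellow channel in an RGB image, but a hypothetical "yellowness" feature could be approximated from the red, green and blue channels inside the same sliding-window loop, for example:

% hypothetical second feature: 'yellowness' of the 3x3 window,
% approximated as the mean of red and green minus blue
W = I(i-1:i+1, j-1:j+1, :);
Id(k,2) = mean(mean((W(:,:,1) + W(:,:,2))/2 - W(:,:,3)));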

Conclusion

In this article, a simple image processing example was shown and discussed. As the feature to tell forest areas from non-forest areas, the mean of the green color was used. No dimensionality reduction algorithm was needed, because only one feature dimension is used. The data labeling was done with the k-means clustering algorithm, which showed quite good results, although I would not recommend it for really important tasks; this time it was more for fun than for production. The classification was done using a FeedForward Neural Network (FFNN).

The complete code can be downloaded here. Just extract it and run `lab_10`. After some time, you should see the same images as shown in this short example.