
I am trying to build a convolutional neural network (CNN) to recognize animals, vehicles, buildings, trees, and plants in a large dataset that contains a mix of these objects.

While setting up training, I became unsure how the network should be trained. Should I train it with all animals grouped under a single class, or train on each kind of animal separately?

That is, one group for lions, one for tigers, one for elephants, and so on; at test time I could write code that outputs "animal" whenever any of its subcategories is matched.
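The second option can be sketched as follows: a minimal example, assuming the network outputs one probability per fine-grained class. The mapping `FINE_TO_COARSE`, the function `coarse_prediction`, and the probabilities are all hypothetical and made up for illustration.

```python
# Hypothetical mapping from fine-grained classes to coarse categories.
FINE_TO_COARSE = {
    "lion": "animal",
    "tiger": "animal",
    "elephant": "animal",
    "car": "vehicle",
    "truck": "vehicle",
}


def coarse_prediction(fine_probs):
    """Sum fine-class probabilities per coarse category and return the best one."""
    totals = {}
    for fine_label, prob in fine_probs.items():
        coarse = FINE_TO_COARSE[fine_label]
        totals[coarse] = totals.get(coarse, 0.0) + prob
    best = max(totals, key=totals.get)
    return best, totals


# Made-up network output for one test image (assumed to sum to 1).
probs = {"lion": 0.30, "tiger": 0.25, "elephant": 0.05, "car": 0.35, "truck": 0.05}
label, totals = coarse_prediction(probs)
print(label, totals)
```

Here the single most likely fine class is "car", but the summed probability of the animal classes is higher, so the coarse result is "animal". Summing is only one possible design; mapping the top fine class directly to its category is simpler but can give a different answer.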

I ask because I have read that the dataset needs a consistent pattern for efficient detection, and that such a pattern exists only when training on a subcategory of objects rather than on the whole broad dataset.

I have attached a figure showing a sample dataset (illustrative only). I want to know whether I should use separate datasets or a single dataset.

sample image

Arun Sooraj
  • The answer depends entirely on your use case: do you plan to identify only a generic label such as "animal", or specific labels such as "lion" / "tiger"? This is true for any algorithm you apply to this problem, i.e. using a CNN makes no difference here. – shekkizh Jun 21 '16 at 22:23
  • So the convolutional neural network can find the similarities in the dataset (even if they are minimal) and recognize new data that comes in for testing, can't it? – Arun Sooraj Jun 22 '16 at 04:17
  • Yes. A CNN will be able to find high-level features that can identify the classes, given enough data and proper training - you want to define your network such that it generalizes well. – shekkizh Jun 22 '16 at 05:45
  • ok... Thank you Shekkizh... – Arun Sooraj Jun 22 '16 at 05:55

1 Answer


Whether to train on separate datasets or a single dataset depends on what you want the network to output. If you only want the convolutional neural network to classify test images as animals, without subdividing them further, train with all animal images under a single "animal" label. However, if you plan to sub-classify the images into tigers and lions, train with a separately labelled class for each.
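As a rough sketch of the two labelling choices, the same image list can be turned into either coarse or fine training labels. The file names, the `COARSE` mapping, and the helper `make_labels` are hypothetical and only illustrate the idea.

```python
# Hypothetical image list: (file name, fine-grained label).
samples = [
    ("img_001.jpg", "lion"),
    ("img_002.jpg", "tiger"),
    ("img_003.jpg", "elephant"),
    ("img_004.jpg", "oak"),
]

# Hypothetical mapping from each fine label to its coarse category.
COARSE = {"lion": "animal", "tiger": "animal", "elephant": "animal", "oak": "tree"}


def make_labels(samples, fine_grained):
    """Return (class_names, labelled_samples) with integer class indices."""
    names = [fine if fine_grained else COARSE[fine] for _, fine in samples]
    class_names = sorted(set(names))
    index = {name: i for i, name in enumerate(class_names)}
    labelled = [(path, index[name]) for (path, _), name in zip(samples, names)]
    return class_names, labelled


coarse_classes, coarse_data = make_labels(samples, fine_grained=False)
fine_classes, fine_data = make_labels(samples, fine_grained=True)
print(coarse_classes)  # the network needs one output per entry here
print(fine_classes)
```

With coarse labels, lions, tigers, and elephants all share one target index, so the network is asked only to separate animals from the other categories. With fine labels, it gets one output per kind of animal.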

The type of dataset you use for training depends heavily on the classification you need on the test dataset.

Also, make sure that you normalize the images before you use them for training.
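One common form of normalization, shown here as an assumption rather than the only option, is to scale 8-bit pixel values to [0, 1] and subtract the image mean. The sketch below uses plain Python lists so that it runs without dependencies; in practice an array library would normally be used, and `normalize_image` is a hypothetical helper name.

```python
def normalize_image(pixels):
    """Scale 8-bit pixel values to [0, 1], then subtract the image mean."""
    scaled = [[value / 255.0 for value in row] for row in pixels]
    count = sum(len(row) for row in scaled)
    mean = sum(sum(row) for row in scaled) / count
    return [[value - mean for value in row] for row in scaled]


# Tiny made-up 2x2 grayscale image.
image = [[0, 255], [255, 0]]
normalized = normalize_image(image)
print(normalized)
```

Subtracting a mean computed over the whole training set, and dividing by its standard deviation, is another widely used variant. Whichever scheme is chosen, the same one should be applied to the test images.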

Aditya
  • Thank you, Aditya, that answers my question. So, suppose my "animal" dataset has 100 lions, 100 tigers, and 100 elephants. Once the network is trained, whenever a new lion, tiger, or elephant comes in, it can identify it as "animal", can't it? I don't want to recognize them separately; I need to classify them as just "animals". – Arun Sooraj Jun 22 '16 at 04:15
  • Welcome, Arun. Yes, absolutely: the test data can be identified as an animal, provided the class label you assign to lions, tigers, and elephants during training is "animal". I worked on a similar problem before, where I trained my network on different sounds of the same type, e.g. different types of whistles all labelled as whistles. – Aditya Jun 22 '16 at 04:29