I'm using YOLOv3 and YOLOv3-Tiny from AlexeyAB's fork of Darknet. I understand that the image size must be a multiple of 32, and that batch divided by subdivisions determines the number of images that will be processed in parallel.

For example, in the default yolov3.cfg file batch is 64 and subdivisions is 16, meaning 4 images will be loaded at once, and it will take 16 of these mini batches to complete one iteration.

What I don't see documented in the wiki:

Are there restrictions on these values? Do they need to be a multiple of 16? Power of 2? Can I have batch=25 and subdivisions=5?

Stéphane

1 Answer

I believe the values do not have to be powers of 2. The important thing is that batch is divisible by subdivisions, because the code works on mini batches of batch / subdivisions images, as you can see in parser.c:

net->batch /= subdivs;

The number of images processed in every step is then defined in detector.c:

int imgs = net.batch * net.subdivisions * ngpus;
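
To make the divisibility point concrete, here is a minimal sketch of the arithmetic in the two lines quoted above. The helper names mini_batch_size and images_per_iteration are hypothetical and are not part of Darknet; they only mirror the integer division and the multiplication shown in the snippets.

```c
#include <assert.h>

/* Hypothetical helper mirroring `net->batch /= subdivs;`.
   C integer division truncates, so any remainder is discarded. */
int mini_batch_size(int batch, int subdivs)
{
    assert(subdivs > 0);
    return batch / subdivs;
}

/* Hypothetical helper mirroring `net.batch * net.subdivisions * ngpus`,
   where net.batch has already been divided by subdivisions. */
int images_per_iteration(int batch, int subdivs, int ngpus)
{
    return mini_batch_size(batch, subdivs) * subdivs * ngpus;
}
```

Under this arithmetic, batch=64 with subdivisions=16 gives mini batches of 4 and 64 images per iteration on one GPU, and batch=25 with subdivisions=5 gives mini batches of 5 and 25 images. With batch=25 and subdivisions=4 the mini batch truncates to 6, so only 24 images would be counted per iteration instead of the configured 25.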

Although BLOCK is defined as 512 in dark_cuda.h, the num_blocks value used by the kernels doesn't have to be divisible by 2, as can be seen in dark_cuda.c:

int get_number_of_blocks(int array_size, int block_size)
{
    return array_size / block_size + ((array_size % block_size > 0) ? 1 : 0);
}
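
The function above is a ceiling division, so any array size is covered by rounding the block count up. The sketch below repeats it and adds a hypothetical helper, surplus_threads (not part of Darknet), that counts how many launched threads fall past the end of the array, assuming every block runs block_size threads.

```c
#include <assert.h>

#define BLOCK 512  /* block size quoted above from dark_cuda.h */

/* Same arithmetic as the function quoted above: ceiling division. */
int get_number_of_blocks(int array_size, int block_size)
{
    assert(block_size > 0);
    return array_size / block_size + ((array_size % block_size > 0) ? 1 : 0);
}

/* Hypothetical helper: threads launched beyond array_size,
   assuming each block runs exactly block_size threads. */
int surplus_threads(int array_size, int block_size)
{
    return get_number_of_blocks(array_size, block_size) * block_size - array_size;
}
```

For example, an array of 1024 elements needs 2 blocks with no surplus, 1000 elements also needs 2 blocks with 24 surplus threads, and 1025 elements needs 3 blocks with 511 surplus threads.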

I think the only problem could be a performance issue: CUDA executes threads in warps of 32, so sizes that are not a multiple of 32 may leave part of the hardware underutilized.
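
As a rough illustration of the warp argument, the sketch below counts how many thread slots in the last warp would sit idle for a given number of active threads. The helpers warps_needed and idle_lanes are hypothetical, and this is only a simplified model: in the function quoted above the thread count is derived from array_size, not directly from batch, so the real effect of the batch settings may be smaller or different.

```c
#include <assert.h>

#define WARP_SIZE 32  /* threads per warp, as stated above */

/* Hypothetical helper: warps required to cover n_threads (ceiling division). */
int warps_needed(int n_threads)
{
    assert(n_threads >= 0);
    return (n_threads + WARP_SIZE - 1) / WARP_SIZE;
}

/* Hypothetical helper: thread slots in the final warp that do no useful work. */
int idle_lanes(int n_threads)
{
    return warps_needed(n_threads) * WARP_SIZE - n_threads;
}
```

In this model 64 active threads fill 2 warps exactly, 25 threads occupy 1 warp with 7 idle slots, and 33 threads need 2 warps with 31 idle slots.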

However, I recommend that you try training your network with these parameters to confirm that it works as desired.

AbdelAziz AbdelLatef