I have built a simple YOLO localization model in Keras like this:
model_layers = [
    keras.layers.Conv2D(32, input_shape=(input_dim, input_dim, 3), kernel_size=(3, 3), strides=1, activation='relu'),
    keras.layers.Conv2D(32, kernel_size=(3, 3), strides=1, activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=2),
    keras.layers.Conv2D(64, kernel_size=(3, 3), strides=1, activation='relu'),
    keras.layers.Conv2D(64, kernel_size=(3, 3), strides=1, activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=2),
    keras.layers.Conv2D(64, kernel_size=(3, 3), strides=1, activation='relu'),
    keras.layers.Conv2D(64, kernel_size=(3, 3), strides=1, activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=2),
    keras.layers.Conv2D(128, kernel_size=(3, 3), strides=1, activation='relu'),
    keras.layers.Conv2D(128, kernel_size=(3, 3), strides=1, activation='relu'),
    keras.layers.Conv2D(64, kernel_size=(3, 3), strides=1, activation='relu'),
    keras.layers.Conv2D(64, kernel_size=(3, 3), strides=1, activation='relu'),
    keras.layers.Conv2D(32, kernel_size=(3, 3), strides=1, activation='relu'),
    keras.layers.Conv2D(8, kernel_size=(3, 3), strides=1),  # no activation given -> linear
]
model = keras.models.Sequential(model_layers)
model.compile(loss=yolo_keras_loss, optimizer=keras.optimizers.Adam(learning_rate=0.0001))
model.summary()
As you can see, the last layer has no explicit activation, so it defaults to 'linear'.
But with regard to YOLO's output, all the values (confidence score, bounding box coordinates, and class probabilities) are normalized to [0, 1]. So should I use a sigmoid activation function or a linear activation function on the output layer?
I cannot find the output layer's activation function stated in any of the resources on YOLO that I have read.
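For reference, here is a minimal sketch of the two options I am weighing. The tiny two-layer head and the `build_head` helper are placeholders just for illustration, not my real model, and `input_dim` here is a small dummy value:

```python
import numpy as np
from tensorflow import keras

input_dim = 32  # small placeholder size, only for this sketch

def build_head(activation):
    # One conv stands in for the backbone; only the final activation differs.
    return keras.models.Sequential([
        keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu',
                            input_shape=(input_dim, input_dim, 3)),
        keras.layers.Conv2D(8, kernel_size=(3, 3), activation=activation),
    ])

x = np.random.randn(1, input_dim, input_dim, 3).astype('float32')
linear_out = build_head(None).predict(x, verbose=0)        # unbounded values
sigmoid_out = build_head('sigmoid').predict(x, verbose=0)  # values squashed into (0, 1)
print(linear_out.shape, float(sigmoid_out.min()), float(sigmoid_out.max()))
```

With `activation='sigmoid'` every output channel is forced into (0, 1), which would match the normalized targets directly; with my current linear output the network can emit arbitrary values, so any squashing would have to happen inside the loss or at decode time.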