optimal numbers of layers and resolution depend on dataset.
The smaller objects - the higher resolution is required.
The large objects - the more layers are required. There is an article on choosing the optimal number of layers, filters and resolution for MS COCO dataset: https://arxiv.org/pdf/1911.09070.pdf
It depends on what accuracy and speed do you want. To reduce rewritten_bbox % just increase resolution and/or move some masks from [yolo] layers with low resolution, the [yolo] layers with higher resolution, and train. Also iou_thresh=1 may reduce rewritten_bbox %