I came across the phrase "high-capacity CNN" in these two papers:
1. Rich feature hierarchies for accurate object detection and semantic segmentation
2. Region-based Convolutional Networks for Accurate Object Detection and Segmentation
I've searched for it on Google, but I can't seem to find a good explanation. What does "high-capacity" mean in this context?