I have a requirement to clip a rectangular part of an OCRed PDF into an image. (The PDF was originally scanned, so we had to perform OCR on it.)

I was not able to find any library that can do this directly, so I have split the task into two parts.

1. Clip the rectangular part from the PDF using iText. The result is a PDF.
2. Convert the clipped PDF into images using PDFBox.

But when converting the clipped PDF into images using PDFBox, the result is not as expected. For example, the checkbox does not appear in the JPEG image if the clipped PDF contains only a checkbox.

I have searched StackOverflow for possible solutions, but with no success.

My code is the same as the solution provided by Tilman Hausherr here. I have also tried this.

Is there a direct way to achieve the above two steps in one, or a better way to convert a PDF to an image?

Please don't mark this as a duplicate, as I have not been able to find a solution even after a lot of searching.

Roshan
  • *"As for eg we are not able to get checkbox in image if the clipped pdf contain only checkbox."* - That sounds like your "clipping" code only keeps page content, not annotations. Both iText and PDFBox can be used to create a clipped rectangle that keeps annotations, so you might want to present your pivotal clipping code for review. Additionally provide a sample PDF for which you observed the issue. – mkl Dec 01 '17 at 14:15
  • Try PDPage.setCropBox() – Tilman Hausherr Dec 01 '17 at 14:23
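
The `PDPage.setCropBox()` suggestion above could be sketched roughly as follows. This is only a minimal sketch, assuming PDFBox 2.0.x (where `PDDocument.load(File)` is available); the class name `ClipPdfRegion`, the method `renderRegion`, the file names, and the rectangle coordinates are all made up for illustration. It sets the crop box on the original page and renders that page straight to an image, which skips the intermediate clipped PDF. Since the page keeps its annotations, a checkbox with an appearance stream should still be drawn, though that has not been verified against the PDF from the question.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;

// Hypothetical helper class; assumes PDFBox 2.0.x on the classpath.
public class ClipPdfRegion {

    // Renders only the given region of one page.
    // 'region' is in PDF user space: points (1/72 inch), origin at the bottom-left.
    static BufferedImage renderRegion(File pdf, int pageIndex,
                                      PDRectangle region, float dpi) throws IOException {
        try (PDDocument doc = PDDocument.load(pdf)) {
            PDPage page = doc.getPage(pageIndex);
            // Restrict the visible area of the page; content and annotations stay in place.
            page.setCropBox(region);
            PDFRenderer renderer = new PDFRenderer(doc);
            // RGB (no alpha channel) so the result can be written as JPEG.
            return renderer.renderImageWithDPI(pageIndex, dpi, ImageType.RGB);
        }
    }

    public static void main(String[] args) throws IOException {
        // Example values only: x = 50, y = 600, width = 200, height = 100 (points).
        PDRectangle region = new PDRectangle(50, 600, 200, 100);
        BufferedImage image = renderRegion(new File("scanned-ocr.pdf"), 0, region, 300);
        ImageIO.write(image, "jpg", new File("clip.jpg"));
    }
}
```

Note that the rectangle is given in PDF coordinates, so `y` is measured from the bottom edge of the page, and the crop box is only changed in memory because the document is never saved.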
