I need to convert a PDF to images, and I'm using PDFBox for this. What I need is the images themselves: isn't there some method like getImage(PDFImageWriter) (I'm guessing at the name) or something similar? PDFBox works well, but I don't want to save all the images locally every time. I want to save an image only when I choose to (with a Save button, for example). Is there a way to do this with PDFBox, or should I change my approach?

Thanks

user2556079
  • Does the answer by malaguna help? If not, please decide what your question is about, despite that "PDFBox works well": 1) save PDF to images, i.e. one image per page, 2) extract the images that are in a PDF, 3) build a random component so that PDFBox does "not save everytime", 4) "add a button save" (if so, explain to what application). – Tilman Hausherr Oct 14 '15 at 09:23
  • I'm working on his solution right now, give me time, I'm a beginner. However my question is focused on the first point you described above. In fact, @malaguna has understood immediately what I was looking for – user2556079 Oct 14 '15 at 09:57
  • If you just want to convert to images, a simpler answer is here: https://stackoverflow.com/questions/23326562/apache-pdfbox-convert-pdf-to-images – Tilman Hausherr Oct 14 '15 at 10:15
  • Why is this question unclear? I don't think it is. All people have answered me (@malaguna first ) according to what I was wondering, How can the question be unclear? – user2556079 Oct 14 '15 at 13:43
  • See my first comment why this is unclear. Btw I didn't flag the question or influence this or vote on it. Please don't see this as an attack - creating a good question is a skill that helps to "identify the problem(s)" and sometimes even solve it. I think you were seeing your whole application, instead of just concentrating on "atomic" problems first, i.e. "how do I convert a PDF page to an image with PDFBox" and "how do I save this image". Anyway, I'm happy that you've found your answer. – Tilman Hausherr Oct 14 '15 at 13:54
  • I'm sorry, but I wasn't looking for someone to do the job for me. I was looking for a way to keep the pictures without saving them locally on the PC. I had already found how to convert each PDF page into an image. Your suggestion helped me a lot. But sorry again, you are totally wrong if you think I was looking for someone to do the job for me. Have a nice day – user2556079 Oct 14 '15 at 13:58
  • That isn't what I meant. Sorry if you had that impression. (Maybe it's a language misunderstanding by one of us) – Tilman Hausherr Oct 14 '15 at 14:01
  • @TilmanHausherr Ok, nice to hear. – user2556079 Oct 14 '15 at 14:06

1 Answer

Here is a method I used to convert every page of a PDF file to an image. This code uses PDFBox 1.8.8 and calls a method to resize the resulting image. That method is not relevant here, but I can post it too if you want.

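File pdfFile = new File ...
PDDocument document = PDDocument.load(pdfFile);

try {
    // PDFBox 1.8.x: getAllPages() returns a raw List, hence the unchecked cast
    @SuppressWarnings("unchecked")
    List<PDPage> pdfPages = document.getDocumentCatalog().getAllPages();

    if(pdfPages != null && !pdfPages.isEmpty()){
      for(PDPage page : pdfPages){
        // Render the page in memory as an RGB image at 96 DPI
        BufferedImage image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 96);
        BufferedImage resized = resizeImage(image);   // custom method (thumbnail)

        // Only needed if the image must be written to disk
        File tmpFile = createTmpFile();               // custom method
        writeImage(resized, tmpFile);

        pages.add(new Page(tmpFile.getAbsolutePath(), numOfColumns));
      }
    }
} finally {
    // Always release the document, even if rendering fails
    document.close();
}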

NOTE: This code uses some custom methods of mine, which I explain in the edits below.

The relevant part for you is deciding how many pages, or which page, you are interested in. Instead of the for loop, you could select just the page you want and convert it.

The pages variable is part of my own code: every converted image is added to a collection called pages.

Edit

I forgot the writeImage method, sorry:

private void writeImage(BufferedImage buffImage, File file) throws FileNotFoundException, IOException{
    // Pick the first available JPEG writer
    Iterator<ImageWriter> iter = ImageIO.getImageWritersByFormatName("jpeg");
    ImageWriter writer = iter.next();

    // Explicit compression mode so the quality (0.0 to 1.0) can be set
    ImageWriteParam iwp = writer.getDefaultWriteParam();
    iwp.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
    iwp.setCompressionQuality(0.95f);

    FileImageOutputStream fios = new FileImageOutputStream(file);
    try {
        IIOImage image = new IIOImage(buffImage, null, null);

        writer.setOutput(fios);
        writer.write(null, image, iwp);
    } finally {
        // Release the writer and the stream even if writing fails
        writer.dispose();
        fios.close();
    }
}
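
Since the question is about not writing every image to disk, here is a minimal sketch of the same JPEG encoding done entirely in memory. It is not part of my original code: the class name InMemoryJpeg and the method toJpegBytes are hypothetical, and it only uses the standard javax.imageio classes. The idea is to keep the resulting byte array (or the BufferedImage itself) around and write it to a file only when the user asks to save.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.MemoryCacheImageOutputStream;

// Hypothetical helper: names are illustrative, not taken from PDFBox.
public class InMemoryJpeg {

    /** Encodes the image as JPEG into a byte array without touching the disk. */
    public static byte[] toJpegBytes(BufferedImage image, float quality) throws IOException {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("jpeg");
        if (!writers.hasNext()) {
            throw new IOException("No JPEG writer available");
        }
        ImageWriter writer = writers.next();

        ImageWriteParam params = writer.getDefaultWriteParam();
        params.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        params.setCompressionQuality(quality);

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // MemoryCacheImageOutputStream buffers in memory and flushes to 'bytes' on close
        try (MemoryCacheImageOutputStream out = new MemoryCacheImageOutputStream(bytes)) {
            writer.setOutput(out);
            writer.write(null, new IIOImage(image, null, null), params);
        } finally {
            writer.dispose();
        }
        return bytes.toByteArray();
    }
}
```

When the user eventually clicks Save, the bytes can be written with something like java.nio.file.Files.write(target.toPath(), data). If you don't need JPEG bytes at all, you can simply keep the BufferedImage returned by convertToImage and call writeImage only on demand.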

Edit 2

As @user2556079 points out in the comments, there is one more method of my own (besides the one I mentioned) that I didn't explain, sorry. I clarify both here:

  • resizeImage is not relevant to this question. It only resizes the original BufferedImage and returns a new BufferedImage, because I wanted thumbnails of every page. It is not necessary if you want the original page as an image.
  • createTmpFile creates a temp file using java.io.File.createTempFile(String, String), registered with the FileCleaningTracker from Apache Commons IO, so I don't have to worry about deleting the temp files.
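
For anyone who wants something to compile against, here is a rough sketch of what the two helpers could look like. These are not my exact methods: the class name PageImageHelpers, the extra targetWidth parameter, and the file prefix are assumptions for illustration, and the temp file uses File.deleteOnExit() as a simpler stand-in for the FileCleaningTracker.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

// Hypothetical stand-ins for the custom methods used in the answer.
public class PageImageHelpers {

    /** Scales the image to the given width, keeping the aspect ratio. */
    public static BufferedImage resizeImage(BufferedImage source, int targetWidth) {
        float scale = targetWidth / (float) source.getWidth();
        int targetHeight = Math.max(1, Math.round(source.getHeight() * scale));

        BufferedImage scaled = new BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = scaled.createGraphics();
        try {
            // Bilinear interpolation gives smoother thumbnails than the default
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                               RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(source, 0, 0, targetWidth, targetHeight, null);
        } finally {
            g.dispose();
        }
        return scaled;
    }

    /** Creates a temp file that the JVM deletes on normal exit. */
    public static File createTmpFile() throws IOException {
        File tmp = File.createTempFile("pdf-page-", ".jpg");
        tmp.deleteOnExit();
        return tmp;
    }
}
```

With these, the loop above would call resizeImage(image, 200) (or any width you like) instead of resizeImage(image).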
malaguna
  • I got some errors: 1) resizeImage is undefined, 2) createTmpFile is undefined. Did you create these methods, or am I doing something wrong? – user2556079 Oct 14 '15 at 09:56
  • So, pages is part of your code. Should I also create an object called pages that holds each converted page, or what? @malaguna – user2556079 Oct 14 '15 at 10:05
  • Both methods, `resizeImage` and `createTmpFile`, are mine, but they are not relevant: the first one returns a `BufferedImage` and the second one only manages the file creation. You can replace the second with a `java.io.File.createTempFile(String, String)` call. I will edit the answer to include this info. Give me a few minutes. – malaguna Oct 14 '15 at 10:15
  • Current version is 1.8.10. – Tilman Hausherr Oct 14 '15 at 10:17
  • numOfColumns doesn't seem to be relevant either – Tilman Hausherr Oct 14 '15 at 10:18
  • @TilmanHausherr I saw the other question, there's a deprecated method (writeimage) . Maybe it's easier but I don't know if it's the best solution – user2556079 Oct 14 '15 at 10:24
  • @Tilman Hausherr, I said that page was not relevant, and therefore neither is numOfColumns. – malaguna Oct 14 '15 at 10:27
  • @user2556079 The deprecation notice told what to use instead. Anyway, it has been updated. – Tilman Hausherr Oct 14 '15 at 10:28
  • @malaguna I think your solution could work too, but I found the solution suggested in the other topic easier. Anyway, thanks. You have been very patient. – user2556079 Oct 14 '15 at 13:37