
I'm using Apache PDFBox.

I want to convert an RGB PDF file to a grayscale PDF file without rendering the pages to images, because that approach produces a huge file.

These are my steps:

  1. Export an A4 First.pdf from Adobe InDesign; it contains images, text, and vector objects.

  2. Read the First.pdf file. Done!

  3. Using LayerUtility, copy the pages from First.pdf, rotate them, and put them into a new A4 PDF file, Second.pdf. Done!

    • I prefer this method because keeping the vector objects keeps the file size small.
  4. Then I want to save the result as a grayscale PDF file (Second-grayscale.pdf).

And this is my code (not all of it):

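// imports needed by this fragment (assuming PDFBox 2.x)
import java.io.File;

import org.apache.pdfbox.multipdf.LayerUtility;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject;
import org.apache.pdfbox.util.Matrix;

PDDocument documentFirst = PDDocument.load(new File("First.pdf"));

// Second.pdf always starts out with empty pages
PDDocument documentSecond = PDDocument.load(new File("Second.pdf"));

// one LayerUtility is enough for the whole target document
LayerUtility layerUtility = new LayerUtility(documentSecond);

for (int page = 0; page < documentSecond.getNumberOfPages(); page++) {
    // get the current page from documentSecond
    PDPage tempPage = documentSecond.getPage(page);

    // create the content stream (this overwrites any existing page content)
    PDPageContentStream contentStream = new PDPageContentStream(documentSecond, tempPage);

    // import the matching page from documentFirst as a form XObject
    PDFormXObject form = layerUtility.importPageAsForm(documentFirst, page);

    contentStream.saveGraphicsState();

    // rotate by 90 degrees about the origin; adjust the translation
    // (last two arguments) so the rotated content stays on the page
    Matrix matrix = Matrix.getRotateInstance(Math.toRadians(90), 0, 0);
    contentStream.transform(matrix);

    // draw the rotated page from documentFirst onto documentSecond
    contentStream.drawForm(form);

    contentStream.restoreGraphicsState();
    contentStream.close();
}

// save the new document
documentSecond.save("Second.pdf");

documentSecond.close();
documentFirst.close();

// now convert it to grayscale, or do it in the loop above!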

I only started using Apache PDFBox this week. I have followed some examples, but most are old and no longer work. So far I have done everything I need except the grayscale conversion.

I am also open to other Java solutions that use an open-source library or free tools. (I found solutions with Ghostscript and Python.)

I read this example, but I didn't understand it, and it uses deprecated functions:

https://github.com/lencinhaus/pervads/blob/master/libs/pdfbox/src/java/org/apache/pdfbox/ConvertColorspace.java

It's about the PDF specification and changing the color space...
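
To make "changing the color space" a bit more concrete, here is a minimal sketch of only the simplest case: rewriting DeviceRGB color operators (`rg` for fill, `RG` for stroke) in a content stream into their DeviceGray counterparts (`g`, `G`). It does not use PDFBox at all; it works on a plain list of string tokens, and the class and method names (`RgbToGraySketch`, `toGray`, `convertTokens`) are made up for illustration. The weights 0.3, 0.59 and 0.11 are the ones the PDF specification gives for converting DeviceRGB to DeviceGray. Images, ICC-based and other color spaces, shadings, and nested form XObjects are all ignored here, which is why a complete conversion is much harder than this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class RgbToGraySketch {

    // DeviceRGB -> DeviceGray weights as given in the PDF specification.
    public static double toGray(double r, double g, double b) {
        return 0.3 * r + 0.59 * g + 0.11 * b;
    }

    // Rewrites "r g b rg" to "gray g" and "r g b RG" to "gray G".
    // All other tokens are copied through unchanged.
    public static List<String> convertTokens(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            boolean fill = token.equals("rg");
            boolean stroke = token.equals("RG");
            if ((fill || stroke) && out.size() >= 3) {
                int n = out.size();
                try {
                    double r = Double.parseDouble(out.get(n - 3));
                    double g = Double.parseDouble(out.get(n - 2));
                    double b = Double.parseDouble(out.get(n - 1));
                    // drop the three RGB operands, emit one gray operand
                    out.subList(n - 3, n).clear();
                    out.add(String.format(Locale.ROOT, "%.3f", toGray(r, g, b)));
                    out.add(fill ? "g" : "G");
                    continue;
                } catch (NumberFormatException e) {
                    // operands are not plain numbers: keep the operator as-is
                }
            }
            out.add(token);
        }
        return out;
    }

    public static void main(String[] args) {
        // hypothetical content stream: red fill, then a filled rectangle
        List<String> stream = Arrays.asList(
                "1", "0", "0", "rg", "0", "0", "100", "50", "re", "f");
        System.out.println(String.join(" ", convertTokens(stream)));
        // prints: 0.300 g 0 0 100 50 re f
    }
}
```

In a real document the tokens would come from a PDF content stream parser rather than a hand-written list, and every resource the page references would need the same treatment.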

nbdized
  • The example works for simple documents only. But a complete solution is highly non-trivial. – mkl Oct 05 '18 at 19:08
  • Can PDFBox do the grayscale conversion? – nbdized Oct 05 '18 at 21:48
  • Not "out of the box". – Tilman Hausherr Oct 06 '18 at 05:59
  • As far as I know there is no ready-to-use, full-fledged PDFBox conversion routine; the example you reference covers only very simple cases. On the other hand, PDFBox offers a framework for arbitrary PDF manipulations, so you can implement your own conversion based on PDFBox. So... *"can PDFBox do the GrayScale Conversion?"* - yes, but probably not out of the box. As you don't share the PDF (or a representative example of the PDFs) in question, I cannot tell whether the example you found might already suffice. – mkl Oct 06 '18 at 07:06

1 Answer


As far as I understood, you would also be interested in a Ghostscript-based solution. If you can call gs from your command line, you can convert color to grayscale with this command:

gs -sDEVICE=pdfwrite -sProcessColorModel=DeviceGray -sColorConversionStrategy=Gray -dOverrideICC -o out.pdf -f input.pdf
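
Since the question asks for something usable from Java, the same command could be launched with `ProcessBuilder`. This is only a sketch under a few assumptions: Ghostscript is installed and reachable as `gs` on the PATH (on Windows the executable is usually named differently, for example `gswin64c`), and the class and method names (`GhostscriptGrayscale`, `buildCommand`, `convert`) are invented for illustration. The file names in `main` are the ones from the question.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class GhostscriptGrayscale {

    // Builds the same argument list as the gs command line shown above.
    public static List<String> buildCommand(String gsExecutable, String input, String output) {
        return Arrays.asList(
                gsExecutable,
                "-sDEVICE=pdfwrite",
                "-sProcessColorModel=DeviceGray",
                "-sColorConversionStrategy=Gray",
                "-dOverrideICC",
                "-o", output,
                "-f", input);
    }

    // Runs Ghostscript and returns its exit code (0 normally means success).
    public static int convert(String gsExecutable, String input, String output)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(gsExecutable, input, output))
                .inheritIO() // forward Ghostscript's messages to this console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // "gs" is an assumption about the executable name on this machine
        int exitCode = convert("gs", "Second.pdf", "Second-grayscale.pdf");
        System.out.println("Ghostscript exit code: " + exitCode);
    }
}
```

Passing the arguments as a list rather than one concatenated string avoids shell quoting problems when file names contain spaces.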

My answer is taken from "How to convert a PDF to grayscale from command line avoiding to be rasterized?"

Hakan Usakli