Recently I asked THIS QUESTION about saving all the images in a PDF file to the file system, and I was able to save the images successfully.
I tested my code on many PDF files and it ran fine. Today, however, I came across THIS PDF file, from which it cannot extract some of the images (shown below).
Can anyone please tell me what else I can do to extract these images? Is it even possible to extract them? Are they really images or something else? I would really appreciate the help.
My code (please ignore the hardcoded values, as I am still testing this out):
function fn_getAllImages()
{
    var strPdf = "C:\\Users\\a614923\\Desktop\\haka\\Work\\2017\\10. October\\31\\test.PDF";
    var strout = "C:\\Users\\a614923\\Desktop\\haka\\Work\\2017\\10. October\\31\\Newfolder\\img";
    var intPage = 2; // the image is on the 2nd page

    var objPdf = JavaClasses.org_apache_pdfbox_pdmodel.PDDocument.load_3(strPdf);
    // getAllPages() is zero-based, hence intPage - 1
    var objPage = objPdf.getDocumentCatalog().getAllPages().get(intPage - 1);
    // Only the XObjects declared directly in the page's resources
    var objImages = objPage.getResources().getXObjects().values().toArray();
    var objImage, objImgBuffer, objImageFile;

    for (var i = 0; i < objImages.length; i++)
    {
        objImage = objImages.items(i);
        Log.Message(objImage.toString());

        // Skip form XObjects; aqString.Find returns -1 when the substring is not found
        if (aqString.Find(objImage.toString(), "PDXObjectForm", 0, false) != -1)
        {
            continue;
        }

        // write2file appends the file extension that matches the image type
        objImage.write2file_2(strout + i);
        //objImgBuffer = objImage.getRGBImage();
        //objImageFile = JavaClasses.java_io.File.newInstance(strout+i+".png");
        //JavaClasses.javax_imageio.ImageIO.write(objImgBuffer,"png",objImageFile);
    }
}
The image in the PDF file that I want to save (the one inside the red box below):