-3

I am working on a project that extracts data from PDF files. Can anyone tell me how to extract all the data present in a PDF file?

mramosch
vinod

1 Answer

1

You might look into using PDFBox - http://pdfbox.apache.org/

It's an open-source Java library that can be used to extract content, such as text, from PDF documents.
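As a rough starting point, plain-text extraction could look something like the sketch below. It assumes the PDFBox 2.x API is on the classpath (in 1.8.x `PDFTextStripper` lives in `org.apache.pdfbox.util`, and 3.x loads documents through `Loader.loadPDF` instead). The class name `PdfTextExtractor` and the method `extractText` are made up for illustration and are not part of PDFBox.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical helper class; assumes PDFBox 2.x.
public class PdfTextExtractor {

    /** Returns the plain text of every page in the given PDF file. */
    public static String extractText(File pdfFile) throws IOException {
        // try-with-resources closes the document even if extraction fails
        try (PDDocument document = PDDocument.load(pdfFile)) {
            PDFTextStripper stripper = new PDFTextStripper();
            return stripper.getText(document);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java PdfTextExtractor <file.pdf>");
            return;
        }
        System.out.println(extractText(new File(args[0])));
    }
}
```

This only pulls out text. Images, form fields, and table structure need other parts of the library, and scanned PDFs without a text layer will come back empty.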

tino
  • Can you give some sample code showing how to use PDFBox? – vinod Mar 12 '14 at 13:12
  • Thanks a lot. I have tried it, but it takes a long time to execute. I also have a small question: can I get the font size using PDFBox? – vinod Mar 13 '14 at 03:06
  • Maybe this link will help - http://stackoverflow.com/questions/3203790/parsing-pdf-files-especially-with-tables-with-pdfbox – tino Mar 13 '14 at 15:41