pdfbox-users mailing list archives

From rey malahay <reymala...@gmail.com>
Subject Re: hardly to found information about how to program with pdfbox
Date Sun, 02 Sep 2012 14:21:20 GMT
Hi xzz.

Try this to extract text from a specific page of a PDF file:

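1. Open an input stream on the file, then create a PDFTextStripper and a
PDFParser, e.g.:

// PDFBox 1.x packages: org.apache.pdfbox.pdfparser.PDFParser and
// org.apache.pdfbox.util.PDFTextStripper
InputStream stream = new FileInputStream("PATH_TO_YOUR_PDF_FILE");

PDFTextStripper stripper = new PDFTextStripper();
PDFParser parser = new PDFParser(stream);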


2. Once you have set up your stream, PDFTextStripper and PDFParser, you are
ready to extract the contents of the PDF file:

stripper.setSortByPosition(false);

// Page numbers are 1-based and inclusive. Start and end page can be the
// same if you just want one page.
stripper.setStartPage(the_page_where_parsing_starts);
stripper.setEndPage(the_page_where_parsing_ends);

// Parse the file, then pull the text of the selected pages into a String.
parser.parse();
String text = stripper.getText(parser.getPDDocument());
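
Since the start and end pages are 1-based and inclusive, it can help to check them against the page count before handing them to the stripper, rather than relying on how the stripper treats out-of-range values. Below is a minimal sketch of such a check. PageRange is a hypothetical helper of my own, not part of PDFBox, and it assumes you already know the document's page count.

```java
/**
 * Hypothetical helper (not part of PDFBox): a 1-based, inclusive page range
 * that is validated against the document's page count when it is created.
 */
final class PageRange {
    final int start;
    final int end;

    private PageRange(int start, int end) {
        this.start = start;
        this.end = end;
    }

    /** A range covering exactly one page (start and end are the same). */
    static PageRange single(int page, int pageCount) {
        return of(page, page, pageCount);
    }

    /** A range from start to end, both 1-based and inclusive. */
    static PageRange of(int start, int end, int pageCount) {
        if (start < 1) {
            throw new IllegalArgumentException("start page must be >= 1, got " + start);
        }
        if (end < start) {
            throw new IllegalArgumentException("end page " + end + " is before start page " + start);
        }
        if (end > pageCount) {
            throw new IllegalArgumentException("end page " + end + " is past the last page " + pageCount);
        }
        return new PageRange(start, end);
    }

    /** Number of pages the range covers. */
    int length() {
        return end - start + 1;
    }
}
```

For a single page you would create the range with PageRange.single(page, pageCount), taking pageCount from the parsed document (for example PDDocument.getNumberOfPages()), and then pass range.start to setStartPage and range.end to setEndPage.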



I hope this helps. Let me know how this goes.

Thanks,
rey malahay


On 2 September 2012 04:02, xzz <19555230@qq.com> wrote:

> Hi, I find it really hard to find out how to program with pdfbox. For
> instance, how can I get specific content, like the text or graphics on a
> certain page? I can't find it in the Tutorials or Cookbook, and the
> ExtractTextByArea sample in the Cookbook only shows the API. So is there
> any further information about it?




-- 
My heroes are the ones who survived doing it wrong, who made mistakes, but
recovered from them. - Bono
