pdfbox-users mailing list archives

From rey malahay <reymala...@gmail.com>
Subject Re: hardly to found information about how to program with pdfbox
Date Sun, 02 Sep 2012 14:21:20 GMT
Hi xzz.

Try this to extract the text from a certain page of a PDF file:

1. Open an InputStream on the file, then declare a new PDFTextStripper and
PDFParser, e.g.

InputStream stream = new FileInputStream("PATH_TO_YOUR_PDF_FILE");

PDFTextStripper stripper = new PDFTextStripper();
PDFParser parser = new PDFParser(stream);

2. Once you have set up your stream, PDF stripper and PDF parser, you are
ready to read the contents of the PDF file:

stripper.setSortByPosition( false );

// Pages are numbered from 1. Start and end page can be the same if you
// just want one page.
stripper.setStartPage(the_page_where_parsing_starts);
stripper.setEndPage(the_page_where_parsing_ends);
parser.parse();
String text = stripper.getText(parser.getPDDocument());
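
Putting the two steps together, a minimal sketch might look like the following. It assumes the PDFBox 1.x API (where PDFTextStripper lives in org.apache.pdfbox.util); the class name PageTextExtractor and the helper extractPages are hypothetical names chosen for illustration, not part of PDFBox.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.pdfbox.pdfparser.PDFParser;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.util.PDFTextStripper;

// Hypothetical wrapper class; assumes PDFBox 1.x is on the classpath.
class PageTextExtractor {

    // Returns the text of pages startPage..endPage (1-based, inclusive).
    static String extractPages(File pdf, int startPage, int endPage) throws IOException {
        // Reject bad ranges before opening the file.
        if (startPage < 1 || endPage < startPage) {
            throw new IllegalArgumentException("need 1 <= startPage <= endPage");
        }
        InputStream stream = new FileInputStream(pdf);
        try {
            // Step 1: parse the stream into a PDDocument.
            PDFParser parser = new PDFParser(stream);
            parser.parse();
            PDDocument document = parser.getPDDocument();
            try {
                // Step 2: restrict the stripper to the requested pages.
                PDFTextStripper stripper = new PDFTextStripper();
                stripper.setSortByPosition(false);
                stripper.setStartPage(startPage);
                stripper.setEndPage(endPage);
                return stripper.getText(document);
            } finally {
                // Release the resources held by the parsed document.
                document.close();
            }
        } finally {
            stream.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PageTextExtractor some.pdf  (prints the text of page 2 only)
        System.out.println(extractPages(new File(args[0]), 2, 2));
    }
}
```

Passing the same number for both page arguments gives you a single page, and closing the document in a finally block keeps the file from staying open if text extraction fails.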

I hope this helps. Let me know how this goes.

rey malahay

On 2 September 2012 04:02, xzz <19555230@qq.com> wrote:

> Hi, I find it really hard to work out how to program with PDFBox. For
> instance, how can I get specific content, such as the text or graphics on a
> certain page? I can't find this in the Tutorials or the Cookbook, and the
> ExtractTextByArea sample in the Cookbook only demonstrates the API. Is
> there any further information about this?

My heroes are the ones who survived doing it wrong, who made mistakes, but
recovered from them. - Bono
