pdfbox-users mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: PDFBox - Read pdf file line by line using C#.Net
Date Wed, 18 Feb 2009 09:33:13 GMT

On Mon, Feb 16, 2009 at 6:15 PM, Moshe Liaks <ajliaks@gmail.com> wrote:
> I use the code below to read a pdf file.
> The code is working fine. The problem is that I have to read the pdf
> line by line and not like "one big string".
> I have this need, because the text is a complex one, and I need to
> apply some filters while reading each line from the original.

You could subclass the PDFTextStripper class and do your filtering in
the writeLineSeparator() method, after buffering all the text on that
line.
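
A minimal sketch of that buffering idea, in Java, under some loud assumptions: StripperStandIn is a hypothetical stand-in for PDFTextStripper so that the example runs without the PDFBox jar, and its writeString(String) hook is an assumed shape for "the stripper hands over a piece of text". The real hook names and signatures depend on the PDFBox version, so check them before porting this. LineFilteringStripper, LineFilterDemo and the sample filter are likewise made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Demo driver: feeds a few fake "lines" through the filtering subclass.
class LineFilterDemo {
    static List<String> run() {
        // Example filter (an assumption): drop blank lines and page markers.
        LineFilteringStripper stripper = new LineFilteringStripper(
                line -> !line.trim().isEmpty() && !line.startsWith("Page "));
        stripper.process(new String[][] {
                {"Invoice ", "2009-02"},
                {"Page ", "1"},
                {},
                {"Total: ", "42"}
        });
        return stripper.getKeptLines();
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}

// Hypothetical stand-in for PDFTextStripper. It models only two hooks:
// one called per text fragment, one called at the end of each line.
abstract class StripperStandIn {
    protected abstract void writeString(String text);

    protected abstract void writeLineSeparator();

    // Simulates text extraction by driving the hooks line by line.
    void process(String[][] linesOfFragments) {
        for (String[] fragments : linesOfFragments) {
            for (String fragment : fragments) {
                writeString(fragment);
            }
            writeLineSeparator();
        }
    }
}

// Buffers the fragments of the current line, then filters the whole line
// when the line separator hook fires.
class LineFilteringStripper extends StripperStandIn {
    private final StringBuilder currentLine = new StringBuilder();
    private final List<String> keptLines = new ArrayList<>();
    private final Predicate<String> filter;

    LineFilteringStripper(Predicate<String> filter) {
        this.filter = filter;
    }

    @Override
    protected void writeString(String text) {
        currentLine.append(text); // buffer instead of writing straight out
    }

    @Override
    protected void writeLineSeparator() {
        String line = currentLine.toString();
        currentLine.setLength(0); // reset the buffer for the next line
        if (filter.test(line)) {
            keptLines.add(line);
        }
    }

    List<String> getKeptLines() {
        return keptLines;
    }
}
```

With the real library, the subclass would extend PDFTextStripper instead of the stand-in and the extraction call would drive the hooks. Since the original question uses PDFBox from C#, the same subclass should translate fairly directly.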


Jukka Zitting
