pdfbox-users mailing list archives

From "A.M. Sabuncu" <amsabu...@gmail.com>
Subject Re: Token IH?
Date Mon, 05 Jan 2015 11:25:18 GMT
Here's a screenshot of a PDF (an issue of The New Yorker magazine) opened
in Vim.  As you can see, the only legible text is a couple of PDF
keywords (endstream, endobj).

http://imgur.com/OqKL2D1
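
To illustrate why a plain byte-level search/replace usually finds nothing, here is a minimal sketch in Java. It assumes the page content is stored deflate-compressed (the common /FlateDecode filter); the class name, helper names, and the sample content stream are made up for illustration and are not taken from either PDF discussed in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class: shows that text inside a deflate-compressed
// stream is not visible to a raw byte search until it is decompressed.
class RawSearchDemo {

    // Compress bytes with zlib/deflate, as a /FlateDecode stream would be stored.
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompress zlib/deflate data back to the original bytes.
    static byte[] inflate(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unsupported input; stop instead of looping
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Naive byte-level substring search, i.e. what "open as a stream and search" does.
    static boolean contains(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up content stream that draws two lines of text.
        byte[] content = ("BT\n/F1 12 Tf\n72 720 Td\n(Hello World) Tj\n"
                + "0 -14 Td\n(Hello World again) Tj\nET\n")
                .getBytes(StandardCharsets.US_ASCII);
        byte[] needle = "Hello World".getBytes(StandardCharsets.US_ASCII);

        byte[] stored = deflate(content); // what would sit between stream/endstream

        System.out.println("found in stored bytes:   " + contains(stored, needle));
        System.out.println("found after inflating:   " + contains(inflate(stored), needle));
    }
}
```

The search should fail on the stored bytes and succeed only after inflating. Even with decompression, a raw replace is fragile: changing the stream changes its /Length and the byte offsets in the cross-reference table, and the visible text may be split across several operators or use a font-specific encoding.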

On Mon, Jan 5, 2015 at 1:12 PM, Milan Tomic <tomicmilan@yahoo.com.invalid>
wrote:

> Hi,
> Actually, the only reason I am using PDFBox is that I would like to find
> and replace some text. Is it possible (using Java) to open some PDF and do
> find/replace in some other (simple) way? Let's say I open some PDF as a
> file/stream and search/replace text (strings). Would it work or not?
>
> Best regards,
> Milan
>
>      On Monday, January 5, 2015 9:26 AM, Milan Tomic
> <tomicmilan@yahoo.com.INVALID> wrote:
>
>
>  Hi Tilman,
> 1) Yes, I am using the latest version of PDFBox, 1.8.8.
> 2) I have uploaded the PDF file here:
>
> https://drive.google.com/file/d/0B1ON9cHcd9I2WVM1d0Y5QW9oNHc/view?usp=sharing
> Which PDF explorer would you recommend so that even I could view this PDF's
> internal structure?
> Thank you in advance for your help,
> Milan Tomic
>
>
>     On Wednesday, December 31, 2014 8:17 PM, Tilman Hausherr <
> THausherr@t-online.de> wrote:
>
>
> >  From: Tilman Hausherr <THausherr@t-online.de>
> >  To: users@pdfbox.apache.org
> >  Sent: Wednesday, December 31, 2014 6:17 AM
> >  Subject: Re: Token IH?
> >
> > Hi,
> >
> > 1) What version are you using? If it is older than 1.8.8, upgrade to 1.8.8.
> > If the error still happens:
> > 2) please upload your PDF to a public place
> >
> > No, this shouldn't be "ignored"; there may be a deeper problem. The
> > only PDF operator that starts with "I" is "ID" (inline image).
> >
> > Tilman
> >
> > Am 31.12.2014 um 11:18 schrieb Milan Tomic:
> >> Hello,
> >> I got this exception:
> >> Caused by: java.io.IOException: Error: Expected operator 'ID' actual='IH'
> >>        at org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:391)
> >>        at org.apache.pdfbox.pdfparser.PDFStreamParser.parse(PDFStreamParser.java:143)
> >> while parsing a PDF exported from SugarCRM, but I have no idea what it
> >> means or how to work around this problem. Is there some setting like "ignore IDs"?
> >>
> >> Thank you in advance,
> >> Milan
> >>
