pdfbox-users mailing list archives

From Walid KRIFI <walid.kr...@gmail.com>
Subject Re: users Digest 25 Jan 2011 12:35:54 -0000 Issue 329
Date Tue, 25 Jan 2011 13:25:33 GMT
Thank you for the reply,
but when I use the -sort option I get incomprehensible characters like this:
DCC555A
3
5C77
e
577
r777ge
oa0000drlu66 6e
1
iI344t
nDe 901Dmt
.e
753e___efr
E
CCCanpDDDuta
x
PPPl:tr
p
___ S
ty
000000w
:osu
000a000p111000s000 P111A
r
777u333rB
e
222c333RhaA


Note: the document is not encrypted and the font is Arial.

Thx again.
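
For reference, here is a minimal sketch of what sorting extracted text by position means. It is not PDFBox code: the class PositionSortSketch and its Fragment type are hypothetical names invented for this illustration, and the comparison (top to bottom, then left to right, on exact coordinates) is a simplifying assumption. In the PDFBox API the -sort option corresponds to PDFTextStripper.setSortByPosition(true), whose real implementation is more involved. The sketch only shows why a positional sort changes the order of the output, and why unexpected glyph coordinates could plausibly interleave characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of position-based sorting; not PDFBox code. */
class PositionSortSketch {

    /** A piece of text together with the page coordinates where it was drawn. */
    static final class Fragment {
        final String text;
        final float x; // distance from the left edge of the page
        final float y; // distance from the top edge of the page

        Fragment(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** Returns the text in reading order: top to bottom, then left to right. */
    static String inReadingOrder(List<Fragment> fragments) {
        List<Fragment> sorted = new ArrayList<Fragment>(fragments);
        Collections.sort(sorted, new Comparator<Fragment>() {
            @Override
            public int compare(Fragment a, Fragment b) {
                int byLine = Float.compare(a.y, b.y);
                return byLine != 0 ? byLine : Float.compare(a.x, b.x);
            }
        });
        StringBuilder out = new StringBuilder();
        float lastY = Float.NaN;
        for (Fragment f : sorted) {
            if (out.length() > 0) {
                // Same y means same line in this simplified model.
                out.append(f.y == lastY ? ' ' : '\n');
            }
            out.append(f.text);
            lastY = f.y;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Fragments in content-stream order, which need not be reading order.
        List<Fragment> page = Arrays.asList(
                new Fragment("world", 60f, 10f),
                new Fragment("line", 60f, 30f),
                new Fragment("Hello", 10f, 10f),
                new Fragment("Second", 10f, 30f));
        System.out.println(inReadingOrder(page));
    }
}
```

Without the sort, the fragments would come out in the order they were drawn ("world line Hello Second"). With it, they are grouped by line first. If the reported coordinates do not match the visual layout, a sort like this can scramble the text instead of fixing it.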


2011/1/25 <users-digest-help@pdfbox.apache.org>

>
> users Digest 25 Jan 2011 12:35:54 -0000 Issue 329
>
> Topics (messages 1880 through 1886):
>
> Re: Parsing Problem
>        1880 by: Andreas Lehmkuehler
>
> Re: Type1C font Error
>        1881 by: Andreas Lehmkuehler
>
> Re: How to draw annotation rectangle in PDF
>        1882 by: Andreas Lehmkuehler
>        1883 by: prashant mangate
>
> Parsing Problem : words in disorder
>        1884 by: Walid KRIFI
>        1885 by: Andreas Lehmkuehler
>
> NSAutoreleaseNoPool leaking in Tomcat
>        1886 by: Alexander Chow
>
> Administrivia:
>
> ---------------------------------------------------------------------
> To post to the list, e-mail: users@pdfbox.apache.org
> To unsubscribe, e-mail: users-digest-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-digest-help@pdfbox.apache.org
>
> ----------------------------------------------------------------------
>
>
>
> ---------- Forwarded message ----------
> From: Andreas Lehmkuehler <andreas@lehmi.de>
> To: users@pdfbox.apache.org
> Date: Sun, 23 Jan 2011 12:13:59 +0100
> Subject: Re: Parsing Problem
> Hi,
>
>
> On 21.01.2011 10:49, Walid KRIFI wrote:
>
>> Hi All,
>> When trying to extract text from a PDF file, the words come out in the
>> wrong order.
>> Any ideas?
>>
> Sounds like a missing sort option. See [1] for further details.
>
> BR
> Andreas Lehmkühler
>
> [1] http://pdfbox.apache.org/commandlineutilities/ExtractText.html
>
>
>
> ---------- Forwarded message ----------
> From: Andreas Lehmkuehler <andreas@lehmi.de>
> To: users@pdfbox.apache.org
> Date: Sun, 23 Jan 2011 12:17:09 +0100
> Subject: Re: Type1C font Error
> Hi,
>
> On 20.01.2011 22:21, Yogesh wrote:
>
>> Hi,
>>
>> I am still getting the error
>>
>> org.apache.pdfbox.pdmodel.font.PDFontFactory createFont
>> WARNING: Failed to create Type1C font. Falling back to Type1 font
>> java.io.IOException: The handle is invalid
>>
> Did you update your local PDFBox copy and recompile it?
>
> BR
> Andreas Lehmkühler
>
>  On 2 January 2011 13:50, Andreas Lehmkuehler<andreas@lehmi.de>  wrote:
>>
>>  Hi,
>>>
>>>
>>> On 05.12.2010 07:31, Yogesh wrote:
>>>
>>>  I am getting an IOException, but the StackTrace looks similar.
>>>
>>>> This does not seem to be resolved yet, or is it?
>>>>
>>>>  PDFBOX-708 is resolved in the current trunk (revision 1054449)
>>>
>>>
>>> BR
>>> Andreas Lehmkühler
>>>
>>>
>>>  On 5 December 2010 01:05, Hesham G.<heshamgneady@gmail.com>   wrote:
>>>
>>>>
>>>>  Is your problem related to this :
>>>>
>>>>> https://issues.apache.org/jira/browse/PDFBOX-708
>>>>>
>>>>> Best regards ,
>>>>> Hesham
>>>>>
>>>>>
>>>>> ---------------------------------------------
>>>>> Included message :
>>>>>
>>>>>
>>>>>  Hello,
>>>>>
>>>>>
>>>>>>> I am trying to extract text from a set of PDF files. I keep getting
>>>>>>> the following error for some of the files.
>>>>>>>
>>>>>>> Dec 4, 2010 7:50:19 PM org.apache.pdfbox.pdmodel.font.PDFontFactory createFont
>>>>>>> WARNING: Failed to create Type1C font. Falling back to Type1 font
>>>>>>> java.io.IOException: The handle is invalid
>>>>>>> at java.io.RandomAccessFile.seek(Native Method)
>>>>>>> at org.apache.pdfbox.io.RandomAccessFile.seek(RandomAccessFile.java:59)
>>>>>>> at org.apache.pdfbox.io.RandomAccessFileInputStream.read(RandomAccessFileInputStream.java:96)
>>>>>>> at java.io.BufferedInputStream.fill(Unknown Source)
>>>>>>> at java.io.BufferedInputStream.read1(Unknown Source)
>>>>>>> at java.io.BufferedInputStream.read(Unknown Source)
>>>>>>> at java.io.FilterInputStream.read(Unknown Source)
>>>>>>> at org.apache.pdfbox.pdmodel.font.PDType1CFont.loadBytes(PDType1CFont.java:429)
>>>>>>> at org.apache.pdfbox.pdmodel.font.PDType1CFont.load(PDType1CFont.java:318)
>>>>>>> at org.apache.pdfbox.pdmodel.font.PDType1CFont.<init>(PDType1CFont.java:123)
>>>>>>> at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:124)
>>>>>>> at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:76)
>>>>>>> at org.apache.pdfbox.pdmodel.PDResources.getFonts(PDResources.java:115)
>>>>>>> at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:243)
>>>>>>> at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:225)
>>>>>>> at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:441)
>>>>>>> at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:365)
>>>>>>> at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:321)
>>>>>>> at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:241)
>>>>>>> at litexpr.text.PDFReader.readPage(PDFReader.java:96)
>>>>>>> at litexpr.Main2.main(Main2.java:51)
>>>>>>
>>>>>> How can I add these fonts, whatever they are? Please help.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> -Yogesh
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>
>
>
>
> ---------- Forwarded message ----------
> From: Andreas Lehmkuehler <andreas@lehmi.de>
> To: users@pdfbox.apache.org
> Date: Sun, 23 Jan 2011 15:52:24 +0100
> Subject: Re: How to draw annotation rectangle in PDF
> Hi,
>
> On 20.01.2011 08:11, prashant mangate wrote:
>
>> Hi,
>>
>>
>> I want to draw a rectangle on an existing PDF as a highlighter.
>>
>> The existing PDF contains a table, and I want to highlight one of its cells
>> using the following code. But the rectangle is drawn over the cell and hides
>> it. It should look transparent (i.e. like a highlighter).
>>
>> contentStream.setNonStrokingColor(Color.pink);
>> contentStream.addRect(startX, startY+startY, width, height);
>> contentStream.fillRect(1.0f, 1.0f, 1.0f, 1.0f);
>>
> I guess you are looking for a text markup annotation. [1] provides some
> samples for different types of annotations.
> Have a look at section 12.5.6, "Annotation Types", of [2] to learn more
> about annotations.
>
> BR
> Andreas Lehmkühler
>
> [1] http://svn.apache.org/repos/asf/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/examples/pdmodel/Annotation.java
> [2] http://www.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_2008.pdf
>
>
>
>
> ---------- Forwarded message ----------
> From: prashant mangate <prashant.mangate@gmail.com>
> To: users@pdfbox.apache.org
> Date: Mon, 24 Jan 2011 11:11:05 +0530
> Subject: Re: How to draw annotation rectangle in PDF
> Hi,
>
> Thanks for your kind reply.
>
> But I don't want an annotation. I want to reproduce the annotation's
> behaviour: if I draw a filled rectangle, I want to be able to set the
> color's opacity so that the text behind the rectangle stays visible.
>
>
>
>
>
> On Sun, Jan 23, 2011 at 8:22 PM, Andreas Lehmkuehler <andreas@lehmi.de> wrote:
>
> > Hi,
> >
> > On 20.01.2011 08:11, prashant mangate wrote:
> >
> >  Hi,
> >>
> >>
> >> I want to draw a rectangle on an existing PDF as a highlighter.
> >>
> >> The existing PDF contains a table, and I want to highlight one of its
> >> cells using the following code. But the rectangle is drawn over the cell
> >> and hides it. It should look transparent (i.e. like a highlighter).
> >>
> >> contentStream.setNonStrokingColor(Color.pink);
> >> contentStream.addRect(startX, startY+startY, width, height);
> >> contentStream.fillRect(1.0f, 1.0f, 1.0f, 1.0f);
> >>
> > I guess you are looking for a text markup annotation. [1] provides some
> > samples for different types of annotations.
> > Have a look at section 12.5.6, "Annotation Types", of [2] to learn more
> > about annotations.
> >
> > BR
> > Andreas Lehmkühler
> >
> > [1] http://svn.apache.org/repos/asf/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/examples/pdmodel/Annotation.java
> > [2] http://www.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_2008.pdf
> >
> >
>
>
> --
> Thanks & regards
> Prashant Mangate  (プロシャント・マングテ)
> Software Engineer
> Softbridge Solutions (India) Pvt Ltd
> Unit #103, Tower #S4 Cybercity
> Magarpatta City, Hadapsar, Pune 411028
> Mobile: (91) 9421685015
> Email: prashant.m@softbridge-s.com         URL: www.softbridge-s.com
>
>
> ---------- Forwarded message ----------
> From: Walid KRIFI <walid.krifi@gmail.com>
> To: users@pdfbox.apache.org
> Date: Mon, 24 Jan 2011 09:53:12 +0100
> Subject: Parsing Problem : words in disorder
> Please help,
> When I parse a PDF with PDFBox I get the output text, but the words are
> out of order.
> When I extract the text with Acrobat, everything works fine.
>
> Thx.
>
>
> ---------- Forwarded message ----------
> From: "Andreas Lehmkühler" <andreas@lehmi.de>
> To: users@pdfbox.apache.org
> Date: Mon, 24 Jan 2011 10:48:26 +0100 (MET)
> Subject: Re: Parsing Problem : words in disorder
> Hi,
>
> Sent: Mon, 24 Jan 2011
> From: Walid KRIFI <walid.krifi@gmail.com>
>
> > Please help,
> > When I parse a PDF with PDFBox I get the output text, but the words are
> > out of order.
> > When I extract the text with Acrobat, everything works fine.
> Please avoid double postings. I already tried to answer your question
> yesterday [1]
>
> BR
> Andreas Lehmkühler
>
> [1] http://markmail.org/message/twyzamchxqmdgqr5
>
>
>
> ---------- Forwarded message ----------
> From: Alexander Chow <alexander.chow@liferay.com>
> To: users@pdfbox.apache.org
> Date: Tue, 25 Jan 2011 12:35:16 +0000
> Subject: NSAutoreleaseNoPool leaking in Tomcat
> Hi there,
>
>
> I have been playing around with pdfbox to do some PDF processing. If I
> run pdfbox from a standalone Java application, it runs fine. However, if
> I use it from within Tomcat, I get these logs:
>
>
> 2011-01-25 12:22:41.485 java[33334:60f] *** __NSAutoreleaseNoPool(): Object
> 0x15d200f40 of class NSConcreteMapTableValueEnumerator autoreleased with no
> pool in place - just leaking
> 2011-01-25 12:22:41.507 java[33334:60f] *** __NSAutoreleaseNoPool(): Object
> 0x10063ca70 of class NSConcreteMapTableValueEnumerator autoreleased with no
> pool in place - just leaking
> 2011-01-25 12:22:41.547 java[33334:60f] *** __NSAutoreleaseNoPool(): Object
> 0x100666830 of class NSConcreteMapTableValueEnumerator autoreleased with no
> pool in place - just leaking
> 2011-01-25 12:22:41.557 java[33334:60f] *** __NSAutoreleaseNoPool(): Object
> 0x10063e5c0 of class NSConcreteMapTableValueEnumerator autoreleased with no
> pool in place - just leaking
> 2011-01-25 12:22:41.602 java[33334:60f] *** __NSAutoreleaseNoPool(): Object
> 0x100167b20 of class NSConcreteMapTableValueEnumerator autoreleased with no
> pool in place - just leaking
> 2011-01-25 12:22:41.617 java[33334:60f] *** __NSAutoreleaseNoPool(): Object
> 0x10011fb60 of class NSConcreteMapTableValueEnumerator autoreleased with no
> pool in place - just leaking
> 2011-01-25 12:22:41.760 java[33334:60f] *** __NSAutoreleaseNoPool(): Object
> 0x100677310 of class NSConcreteMapTableValueEnumerator autoreleased with no
> pool in place - just leaking
> 2011-01-25 12:22:41.765 java[33334:60f] *** __NSAutoreleaseNoPool(): Object
> 0x15d24e690 of class NSConcreteMapTableValueEnumerator autoreleased with no
> pool in place - just leaking
> 2011-01-25 12:22:41.879 java[33334:60f] *** __NSAutoreleaseNoPool(): Object
> 0x100644500 of class NSConcreteMapTableValueEnumerator autoreleased with no
> pool in place - just leaking
> 2011-01-25 12:22:41.887 java[33334:60f] *** __NSAutoreleaseNoPool(): Object
> 0x10063ebe0 of class NSConcreteMapTableValueEnumerator autoreleased with no
> pool in place - just leaking
>
>
> I figured it was because pdfbox needed to be run in headless mode, so I
> tried setting my environment to have:
>
>
> CATALINA_OPTS=-Djava.awt.headless=true
>
>
>
> Unfortunately, that didn't seem to help much either.
>
>
> Here's my OS X java -version output, if you are interested (it's the latest
> update for Snow Leopard):
>
>
> java version "1.6.0_22"
> Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-10M3261)
> Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode)
>
>
>
> Any thoughts on this?
>
>
>
>
> Cheers,
> Alex
>
>
>
>
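
On the headless question in the last digest message: one way to check whether -Djava.awt.headless=true actually reached the JVM that Tomcat runs in is to log the property and AWT's own view of it from inside the webapp. The sketch below is only a diagnostic and assumes nothing about PDFBox. HeadlessCheck is a hypothetical class name, and this does not by itself explain or remove the NSAutoreleaseNoPool messages. CATALINA_OPTS is read by Tomcat's startup scripts, so it needs to be set in the environment of the process that launches Tomcat.

```java
import java.awt.GraphicsEnvironment;

/** Hypothetical diagnostic helper; not part of PDFBox or Tomcat. */
class HeadlessCheck {

    /** Reports the system property and what AWT itself decided. */
    static String describe() {
        String property = System.getProperty("java.awt.headless");
        boolean headless = GraphicsEnvironment.isHeadless();
        return "java.awt.headless=" + property + ", isHeadless()=" + headless;
    }

    public static void main(String[] args) {
        // Run with: java -Djava.awt.headless=true HeadlessCheck
        System.out.println(describe());
    }
}
```

Calling HeadlessCheck.describe() from a servlet or a startup listener and writing the result to the log shows whether the option was picked up. If the property prints as null, the flag never reached the Tomcat process.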


-- 
------------------
Best regards
  Krifi Walid
