pdfbox-users mailing list archives

From "Hesham G." <heshamgne...@gmail.com>
Subject Re: Extracting text from Arabic PDF - Text appears reversed
Date Tue, 04 Jan 2011 18:09:31 GMT
I noticed the method PDFTextStripper.inspectFontEncoding(...). Would it help with this
problem?

Best regards,
Hesham

---------------------------------------------
Included message:


Hello,

I am using PDFBox 1.4 to extract text from an Arabic PDF file. Because Arabic is a
right-to-left language, the extracted words come out reversed.
Can this be fixed?

Here is a one-page Arabic PDF to try it on:
http://www.4shared.com/document/k6Mrafej/arabic_book.html
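
To make the symptom concrete, here is a minimal sketch of one possible post-processing workaround, assuming each extracted line comes back with its characters in exactly reversed order. It uses only the standard java.text.Bidi class and is not part of PDFBox; the class name RtlLineFix and the method fixLine are hypothetical names chosen for this illustration. Lines that mix Arabic with digits, Latin text, or combining diacritics would need proper bidirectional reordering rather than a plain reversal.

```java
import java.text.Bidi;

// Hypothetical helper for illustration only; not a PDFBox class.
public class RtlLineFix {

    // Reverses a line only when it contains right-to-left characters.
    // Assumption: the extractor returned the whole line in reversed order.
    public static String fixLine(String line) {
        char[] chars = line.toCharArray();
        if (!Bidi.requiresBidi(chars, 0, chars.length)) {
            return line; // purely left-to-right text is left untouched
        }
        return new StringBuilder(line).reverse().toString();
    }

    public static void main(String[] args) {
        // The Arabic word "salam" with its four letters stored backwards.
        String extracted = "\u0645\u0627\u0644\u0633";
        System.out.println(fixLine(extracted));
    }
}
```

In practice the string passed to fixLine would be one line of the text returned by the extractor, for example after splitting the output on line breaks.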


Best regards,
Hesham