pdfbox-users mailing list archives

From Renaud Billen <renaudbil...@nic.be>
Subject Content of pdf moved around
Date Sat, 10 Jan 2015 13:04:02 GMT
Hello,

I have a small issue with text extraction from some PDFs: some words come out in a
different order from the original.

With the PDF attached to this mail, if I use "Save as Text" from Adobe Reader, I get:

Référence: LIX-673LIX-6737 


Nom: The test company 


Type: 
Ouverture: 24/04/2007 

Titulaire: BD 
Resp.: LIX 
Co-Resp.: BB 
Client 




But with PDFBox I get:

Référence: LIX-6737
Nom: The test company
Titulaire: BD
Resp.: LIX
Co-Resp.: BB
Type:
Ouverture: 24/04/2007
Client


Could you tell me if something can be done to solve this problem?

Thanks,
Renaud


