pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: PDFText2HTML two-column problem
Date Wed, 09 Jul 2014 18:57:31 GMT
The best approach would be to open an issue in JIRA and attach the PDF and the HTML output.

Tilman

On 09.07.2014 19:54, martin.lichtblau@freenet.de wrote:
> Hello PDFBox community,
>
> our team has a nasty problem with PDFBox.
> We are using "pdfbox.util.PDFText2HTML" to convert PDF text documents to HTML.
> For our software it is very important that the text in the output HTML document
> is in the original (correct) order.
> But that isn't the case with two-column PDF documents. Somehow PDFBox changes the
> order of the text in the output if the input is a two-column document: the text
> of column two comes before the text of column one in the output HTML document.
> It would be great if you had an idea how to fix it. Thank you!
>
> Best regards
> Martin L.

