pdfbox-users mailing list archives

From Branden Visser <mrvis...@gmail.com>
Subject Re: PdfBox can provide pdf2html convert?
Date Tue, 19 Jan 2016 11:12:31 GMT
Unfortunately, ExtractText doesn't provide the same functionality as pdf2htmlEX.

pdf2htmlEX is intended to produce a picture-perfect representation of the
PDF that can be embedded in the browser, including images, graphics, etc.
From what I've seen, the ExtractText option outputs only the text and
doesn't preserve any of the alignment.

AFAIK, PDFBox has no plans to provide this kind of functionality.

Hope that helps,
Branden

On Tue, Jan 19, 2016 at 1:19 AM, Tilman Hausherr <THausherr@t-online.de> wrote:
> On 19.01.2016 at 03:54, admin wrote:
>>
>> Hi guys!
>>
>>      I would like to ask whether PDFBox can provide PDF-to-HTML
>> conversion, like pdf2htmlEX.
>>           pdf2htmlEX depends too much on the operating system.
>>           It would be great if PDFBox provided this functionality.
>>
>> --
>> tianyi li
>>
>>
>>
>
> Just use the ExtractText command line utility, with the "-html" option.
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org
>


