pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: is it possible to batch extract text from pdf files within a tree of folders within a zip file ?
Date Sun, 01 May 2016 06:10:26 GMT
On 01.05.2016 at 03:06, David Green wrote:
> Sorry for using the wrong forum.
> Is there a Tika forum?

https://mail-archives.apache.org/mod_mbox/tika-user/


>
> Your suggested command is working, after a fashion:
> java -jar c:\jars\tika-app-1.12.jar -J -t -i f: -o g:
> The directory structure is being reproduced, but the zip files are being
> copied as zip files (I think).
> The copied files retain the original filename (including the original zip
> extension) with an additional json extension,
> though when I try to open such a file using B1 file archiver, it reports a
> corrupt file.
>
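
The symptom quoted above (a file ending in .zip.json that an archiver reports as corrupt) would be consistent with the output being JSON text rather than a copied archive. A minimal sketch of reading such a file as JSON follows. It assumes the file holds a JSON array with one metadata object per document, and that the keys "X-TIKA:content" and "X-TIKA:embedded_resource_path" are present; both the layout and the key names are assumptions to check against the actual output. The helper name extract_texts and the sample data are hypothetical.

```python
import json


def extract_texts(json_text):
    """Return (embedded_path, text) pairs from one JSON output file.

    Assumed layout: a JSON array of metadata objects, one per document
    (the container first, then each embedded file). The key names below
    are assumptions; inspect a real output file to confirm them.
    """
    records = json.loads(json_text)
    if isinstance(records, dict):
        # Tolerate a single object instead of an array.
        records = [records]
    results = []
    for record in records:
        path = record.get("X-TIKA:embedded_resource_path", "")
        text = record.get("X-TIKA:content", "")
        results.append((path, text))
    return results


# Hypothetical sample standing in for the contents of one .zip.json file.
sample = json.dumps([
    {"X-TIKA:content": "outer"},
    {"X-TIKA:embedded_resource_path": "/docs/a.pdf",
     "X-TIKA:content": "Hello from a.pdf"},
])

for path, text in extract_texts(sample):
    print(repr(path), len(text))
```

For a real file, the same function could be called on the text read from disk, for example open(name, encoding="utf-8").read(), where name is one of the .json files written to the output directory.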



