jackrabbit-dev mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: office 2007 files
Date Thu, 28 May 2009 10:32:20 GMT

On Thu, May 28, 2009 at 11:56 AM, Paul Skinner
<shedloadsofbeer@hotmail.com> wrote:
> My preferred option would therefore be to use the jackrabbit-tika component.

OK, I'll resurrect it in the new JCR Commons subproject, from which
we can release it as a standalone component.

The basic idea behind the jackrabbit-tika component is that you can
replace all your configured text extractor classes with
org.apache.jackrabbit.tika.TikaTextExtractor, which uses Apache Tika
to extract text from most major file formats.
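
As a rough sketch of what that replacement might look like, assuming
your text extractors are configured through the textFilterClasses
parameter of the SearchIndex element in workspace.xml (and in the
Workspace template of repository.xml), and assuming the jackrabbit-tika
jar and its Tika dependencies are on the classpath. The path value
below is only the usual default, so adjust it to your setup:

```xml
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <param name="path" value="${wsp.home}/index"/>
  <!-- Previously a comma-separated list of extractor classes,
       now a single Tika-backed extractor (assumed setup) -->
  <param name="textFilterClasses"
         value="org.apache.jackrabbit.tika.TikaTextExtractor"/>
</SearchIndex>
```

Documents that were already indexed would presumably need to be
re-indexed before the new extractor has any effect on them.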


Jukka Zitting
