jackrabbit-dev mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: office 2007 files
Date Thu, 28 May 2009 09:49:50 GMT

On Thu, May 28, 2009 at 11:40 AM, Paul Skinner
<shedloadsofbeer@hotmail.com> wrote:
> If both poi msoffice text extractor and Apache Tika Office 2007 support is
> being targeted at 2.0 then does this mean that anyone using 1.x will not be
> able to index Office 2007 docs?

If you are running on Java 5 or higher, you can still apply the
JCR-1887 patch to get Office 2007 support in Jackrabbit 1.6. The main
problem with the change is that it causes Jackrabbit not to compile
on Java 1.4, which is still the base platform for the Jackrabbit 1.x
releases. Unless someone comes up with a way to fix this (the Tika
option provided a workaround, but there are other issues with that
approach, see JCR-1878), we'll need to rely on people patching the
sources themselves.

Alternatively, we can restore the jackrabbit-tika component I had
earlier in the Jackrabbit sandbox and release that for people who want
Office 2007 support (and all the other good Tika stuff) without having
to patch sources. The nice thing about this option is that it would
also work for many previous Jackrabbit 1.x releases.

What would work best for you?


Jukka Zitting
