incubator-odf-dev mailing list archives

From Florian Hopf
Subject Re: Is there anything happening here?
Date Fri, 08 Feb 2013 11:59:44 GMT

On 08.02.2013 00:55, Nick Burch wrote:
> One thing that I think might be interesting is seeing if you could write
> a shim to allow the ODFToolkit .ods support to implement the POI
> spreadsheet interfaces. If it could, that could open up quite a few more
> users, as people who have code to target .xls and .xlsx could then also
> output .ods with only a few lines of code changed

While talking to Yegor at ApacheCon Europe I noticed that this might
indeed be a huge opportunity. There is a lot of code out there that uses
POI and could immediately benefit from the support.

I have no idea how complicated this will be (it's been a while since I
have used POI), but I would definitely be interested in trying this.
This could at first be a part of the ODFToolkit and later probably be
moved to POI.

> Another piece of integration that would be good is with Apache Tika.
> Tika currently has a slightly hacky piece of xml processing code to turn
> ODF files into XHTML. It's low memory and fast, but not easy to
> maintain. It'd be great to use some streaming support from ODFToolkit to
> make the Tika code easier and more feature-ful, but again I think there
> are some feature gaps. It should drive a lot more use though, if
> anyone's interested and able to spend a bit of time on it?

From my understanding of the project, this is far more difficult, as
there currently is no streaming support; all access is DOM-based.
Judging from my experience with the Solr extractor, memory efficiency is
a very important issue for Tika.

As the Tika API is rather simple (at least as far as I know it), it
wouldn't be that hard to write a DOM-based version, but I doubt that this
is the right approach.
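
To make the DOM versus streaming distinction concrete, here is a minimal
sketch of pull-based text extraction. It uses only the JDK's StAX API, not
ODFToolkit or Tika, and the class name StreamingTextSketch and the sample
XML are made up for illustration. A real Tika parser would hand events to
a ContentHandler instead of collecting a string, so treat this as a rough
picture of the idea rather than a proposed implementation.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class; not part of ODFToolkit, POI or Tika.
public class StreamingTextSketch {

    // Reads the XML one event at a time. The parser only holds the current
    // event, so no in-memory tree of the whole document is built, unlike DOM.
    // The output is collected in a String here purely to keep the sketch short.
    public static String extractText(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent character chunks into a single CHARACTERS event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        StringBuilder out = new StringBuilder();
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS) {
                    String text = reader.getText().trim();
                    if (!text.isEmpty()) {
                        // Separate text runs with a single space.
                        if (out.length() > 0) {
                            out.append(' ');
                        }
                        out.append(text);
                    }
                }
            }
        } finally {
            reader.close();
        }
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // Made-up sample input standing in for a fragment of ODF content.
        String xml = "<text><p>Hello</p><p>ODF</p></text>";
        System.out.println(extractText(xml));
    }
}
```

A DOM-based version would parse the whole document into a tree first and
then walk it, which is simpler to write but keeps the full document in
memory.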


Florian Hopf
Freelance Software Developer
