poi-dev mailing list archives

From Nick Burch <apa...@gagravarr.org>
Subject Re: Hello! + Visio/XDGF integration tasks
Date Mon, 14 Sep 2015 21:13:23 GMT
On Fri, 11 Sep 2015, Dustin Spicuzza wrote:
> * Finish the IP clearance template (looks like most of it is done?)

I think we only have one item left now!

> * Change the copyright headers to Apache (where should I dump the code
> once this is done?)

Could you do a new dump with the updated headers? We can then review 
that, and use it as the basis for the import once we're all happy.

> * Integrate the visio ooxml schema into the schema jar or spin off
> (David seems to have volunteered to do this task, but I don't see a
> separate bug for it.. my two cents, add to schema jar)

In the short term, I'd suggest we do another full jar for it, and have the 
key classes included in the poi-ooxml-schemas jar

Medium term, we should release a full ooxml-schemas 1.2 jar, with the 
visio and security bits in.

Could someone raise a bug for that, and start work on it :)

> * Integrate the vsdx code into POI -- seems like the place to put it is
> in poi-scratchpad?

Currently, scratchpad is ole2 only, so it would need to go in the ooxml module.

> * Can someone create an XDGF category on bugzilla? :)

Done! Please let me know if you want any further tweaks to the description 
for it, and/or to the HDGF one, to avoid confusion.

> Some things that I know remain to be done in the code:
> * Create a basic text extractor like the other file formats have... I
> seem to recall there's a common interface, right?

POIXMLTextExtractor is the one you'll need to base it on. It can be quite 
simple to start with.
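
For a rough idea of how little a first-pass extractor has to do, here is a 
minimal sketch that uses only the JDK rather than POI itself. It assumes the 
page parts of a .vsdx live at visio/pages/pageN.xml inside the zip and that 
shape text sits in <Text> elements; both are assumptions about the file 
layout, and the class name VsdxTextSketch is made up for illustration. A real 
version would extend POIXMLTextExtractor and use POI's own package handling 
instead of reading the zip directly.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of POI: pulls shape text out of a .vsdx stream.
final class VsdxTextSketch {

    /** Returns the text of every Text element found in the page parts, one per line. */
    static String extractText(InputStream vsdx) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations so untrusted files cannot trigger XXE.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

        StringBuilder out = new StringBuilder();
        try (ZipInputStream zip = new ZipInputStream(vsdx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Assumed layout: page contents are visio/pages/page1.xml, page2.xml, ...
                if (!entry.getName().matches("visio/pages/page\\d+\\.xml")) {
                    continue;
                }
                // Buffer the entry first so the XML parser cannot close the zip stream.
                byte[] xml = zip.readAllBytes();
                Document doc = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(xml));
                // "*" matches any namespace; "Text" is the assumed element name.
                NodeList texts = doc.getElementsByTagNameNS("*", "Text");
                for (int i = 0; i < texts.getLength(); i++) {
                    out.append(texts.item(i).getTextContent().trim()).append('\n');
                }
            }
        }
        return out.toString();
    }
}
```

Calling VsdxTextSketch.extractText(new FileInputStream("drawing.vsdx")), where 
drawing.vsdx is a placeholder file name, would return one line per shape text 
under those assumptions.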

Once that's done, we'll want to add support to Apache Tika too, based on that.

> * Add some very basic unit tests -- I focused the testing on code that
> we cannot publicly release, unfortunately, instead of at the low level
> vsdx support as POI would require. Need to create some dummy sample data
> files too.

To start with, I'd suggest text extraction unit tests, then expand it from 
there.

Adding unit tests is also something that newer community members can help 
with, so if you can tempt some non-committers to help, that's a good area!

Oh, and the website will need updating

> As I embark on some of this, I see that POI uses SVN. Is it appropriate 
> for me to create a branch there and put all of this work in that until 
> it's ready for merging (obviously after some others code review it), or 
> should I put it in some other place. The new committer docs et al aren't 
> immediately clear on this point.

Depends on how confident you are :) Generally we just work on trunk, 
keeping it stable, but we do use branches for long-lived disruptive stuff 
(eg the slideshow work recently). We're a small enough project that we 
don't need lots of branch admin stuff, we can just co-ordinate on the list 
and work it out simply :)


