poi-dev mailing list archives

From "Murphy, Mark" <murphym...@metalexmfg.com>
Subject RE: 2006 ML format?
Date Wed, 23 Nov 2016 21:22:29 GMT
Without looking, can we use that code to read and modify it to allow writing a 2006ML document
as a single XML document? I have no opinion on the read only parser.

-----Original Message-----
From: Allison, Timothy B. [mailto:tallison@mitre.org] 
Sent: Wednesday, November 23, 2016 2:38 PM
To: POI Developers List <dev@poi.apache.org>
Subject: RE: 2006 ML format?

All,
  I went it alone for the 2006ml format on Tika, see details [1].  If you have any feedback
on that bit of code, I'd appreciate it!
 
Major questions:
1) Do we want to move some/most of that into POI for 2006ml?
2) Do we want to offer a streaming read-only XWPF parser based on that code for the regular
docx?

Cheers,

         Tim

[1] https://issues.apache.org/jira/browse/TIKA-2179?focusedCommentId=15691150&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15691150

-----Original Message-----
From: Allison, Timothy B. [mailto:tallison@mitre.org]
Sent: Monday, November 21, 2016 7:14 AM
To: POI Developers List <dev@poi.apache.org>
Subject: RE: 2006 ML format?

Y, I experimented with adding an InlineOPCPackage; I couldn't quite get it to work, and even
if I did, it makes a mess of our OPCPackage and ZipPackage.

I'm thinking I might use this as a reason to build a beanless SXWPF read-only SAX parser.
I suspect that we could very easily re-use whatever I develop for this format on the "modern"
ooxml...suspicions have been wrong before...only code and unit tests will tell. :)


-----Original Message-----
From: Mark Murphy [mailto:jmarkmurphy@gmail.com]
Sent: Saturday, November 19, 2016 5:19 PM
To: POI Developers List <dev@poi.apache.org>
Subject: Re: 2006 ML format?

Wow, this is nothing like what I thought it would be. I discovered that you can write a document
in this format by selecting save as xml document.

On Fri, Nov 18, 2016 at 7:03 AM, Allison, Timothy B. <tallison@mitre.org>
wrote:

> Thank you, Javen.  I worry that I'll be adding duct tape to 
> OPCPackage, but let me put together a patch and we can decide if 
> adding an InlinePackage is too Frankenstein-y for POI.
>
> -----Original Message-----
> From: Javen O'Neal [mailto:javenoneal@gmail.com]
> Sent: Thursday, November 17, 2016 5:58 PM
> To: POI Developers List <dev@poi.apache.org>
> Subject: Re: 2006 ML format?
>
> This would probably be of interest to users of POI who are not 
> necessarily using Tika.
>
> If someone spends the effort to add support for a Microsoft Office 
> format, POI seems like a better host.
>
> On Nov 17, 2016 10:55 AM, "Allison, Timothy B." <tallison@mitre.org>
> wrote:
>
> All,
>   On TIKA-2179 [1], Sean Story submitted a document that appears to be
> a 2006 ML format .xml file.  It appears to inline the components of a
> regular docx into a single xml file, no zip.  Is it worth the effort
> to build a read-only subclass of OPCPackage (say, InlinePackage) that 
> would parallel our ZipPackage?  Or would it be better to handle this 
> purely on the Tika side and rewrite the file as a temporary ZipFile 
> that can be read by our current OPCPackage?
>   Thank you.
>
>            Best,
>
>                    Tim
> [1] https://issues.apache.org/jira/browse/TIKA-2179
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


