poi-user mailing list archives

From Tim Allison <talli...@apache.org>
Subject Re: streaming detection of OLE?
Date Tue, 16 Apr 2019 19:29:15 GMT
Thank you, Dave!  The reading examples use POIFSReader, which I had hoped
was truly streaming. But it creates a POIFS, which requires a read/skip of
the entire stream IIUC, and only then iterates. Or am I missing something?

I didn’t try registering a POIFSReader listener for a specific subdoc, but
it looks like it opens a POIFS first no matter how you register a listener.
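
To illustrate why a bounded read tends to hit EOF, here is a minimal sketch
(plain JDK, no POI calls) that reads only the 512-byte OLE2 header and
computes where the first directory sector starts. The class and method names
are hypothetical, and the field offsets (sector shift at byte 30, first
directory sector at byte 48) are taken from the published compound file
header layout, not from POI's code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration; not part of POI or Tika.
class Ole2HeaderPeek {
    // Magic bytes D0 CF 11 E0 A1 B1 1A E1, read as a little-endian long.
    static final long OLE2_SIGNATURE = 0xE11AB1A1E011CFD0L;
    static final int HEADER_SIZE = 512;

    /**
     * Reads only the header from the stream and returns the byte offset of
     * the first directory sector, or -1 if this does not look like OLE2.
     */
    static long firstDirectoryOffset(InputStream in) throws IOException {
        byte[] header = new byte[HEADER_SIZE];
        int read = 0;
        while (read < HEADER_SIZE) {
            int n = in.read(header, read, HEADER_SIZE - read);
            if (n < 0) {
                return -1; // stream ended before a full header was read
            }
            read += n;
        }
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getLong(0) != OLE2_SIGNATURE) {
            return -1;
        }
        int sectorShift = buf.getShort(30);      // 9 => 512 bytes, 12 => 4096 bytes
        if (sectorShift != 9 && sectorShift != 12) {
            return -1;
        }
        long firstDirSector = buf.getInt(48) & 0xFFFFFFFFL;
        // Sector 0 starts right after the header sector, hence the +1.
        return (firstDirSector + 1) << sectorShift;
    }
}
```

The returned offset can be anywhere in the file, including near the end, so
the entry names used for detection are not guaranteed to sit within any fixed
prefix of the stream.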

On Tue, Apr 16, 2019 at 3:20 PM Dave Fisher <dave2wave@comcast.net> wrote:

> Hi Tim,
>
> Maybe the answer is using HPSF -
>
> https://poi.apache.org/components/hpsf/how-to.html
>
> Regards,
> Dave
>
> > On Apr 16, 2019, at 11:47 AM, Tim Allison <tallison@apache.org> wrote:
> >
> > All,
> >  In Tika, when we do file type detection of OLE files
> > (POIFSContainerDetector), we spool the file to disk, open a POIFS and
> > make a decision based on document/directory names.  A user on
> > TIKA-2849 does not want to copy the full file from a slow network
> > drive for detection.  When I tried using a BoundedInputStream with
> > POIFS, not surprisingly, I got EOF exceptions.
> >  Question: is there any way to do detection in a streaming mode for
> > OLE files?  Or, is this the best we can do?  Thank you!
> >
> >       Best,
> >
> >                     Tim
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> > For additional commands, e-mail: user-help@poi.apache.org
> >
