poi-user mailing list archives

From "Sanjiv Jivan" <sanjiv.ji...@gmail.com>
Subject Re: HSSF - Generating large spreadsheets in streaming manner?
Date Fri, 12 Jan 2007 18:40:52 GMT
I think that having a streaming API would be very useful, and it's not
because of trying to generate a massive, non-human-readable spreadsheet. You
have to factor in the time it takes to build the data used for the
spreadsheet too.

Consider a use case where a user is trying to download a spreadsheet with
500 to 1000 rows, but the logic involved in getting the data for the
spreadsheet takes around a minute. Without a streaming API, the user clicks
on the link and the browser waits for the full minute before it pops up a
save dialog, because the contents of the spreadsheet can only be written to
the response stream after the entire spreadsheet has been generated. Had
there been a streaming API, the contents could have been written to the
response stream on the fly, and the browser would have displayed a nice
download dialog with a progress bar.
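
To make the difference concrete, here is a minimal sketch of the "write on
the fly" pattern. It is not HSSF code: it writes plain comma-separated text
as a stand-in, because the point is only the ordering of "compute a row,
write a row, flush". The class name StreamingRows and the method streamRows
are hypothetical names made up for this illustration, and the OutputStream
stands in for a servlet response stream.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class StreamingRows {

    /**
     * Hypothetical helper: writes each row as soon as the iterator yields it
     * and flushes after every row, so the client starts receiving bytes
     * before the remaining rows have been computed.
     * Returns the number of rows written.
     */
    public static int streamRows(Iterator<List<String>> rows, OutputStream out) {
        int written = 0;
        try {
            while (rows.hasNext()) {
                // In the use case above, this is the slow step (data lookup).
                List<String> row = rows.next();
                String line = String.join(",", row) + "\n";
                out.write(line.getBytes(StandardCharsets.UTF_8));
                out.flush(); // hand the bytes to the client now, not at the end
                written++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }

    public static void main(String[] args) {
        List<List<String>> data = Arrays.asList(
                Arrays.asList("id", "name"),
                Arrays.asList("1", "first row"),
                Arrays.asList("2", "second row"));
        // System.out stands in for the HTTP response stream.
        streamRows(data.iterator(), System.out);
    }
}
```

With the non-streaming approach, the equivalent of the loop above runs to
completion in memory first and the write happens once at the end, which is
why the browser sits idle for the whole minute.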


On 3/10/06, Andrew C. Oliver <acoliver@apache.org> wrote:
>
> not yet.  Demand for the cocoon serializer hasn't been very high so it
> is mostly deprecated (unless there is some massive uptake of support for
> it).
>
> Okay, it's time for my yearly rant on this subject (not aimed at you...you
> just reminded me I hadn't done it this year):
>
> I'm always a little curious about this.  XLS is a HORRIBLE format (which
> is why I started POI, I wanted to do something difficult).  It is a
> HORRIBLY inefficient format and WAS NOT DESIGNED to stream.  Yet people
> generate massive sheets in it.  My pensiveness is that no human is
> likely to read such a large sheet or be able to do anything particularly
> useful with it.  So who are these sheets for?  Often it turns out they
> are some kind of data transfer, which is frankly BAFFLING.  Why?
> Because I could do the same transfer with like 1/10th of the storage,
> bandwidth, CPU, etc in a more well-thought out (or at least lightweight)
> format.  Yet I saw a spreadsheet today that was 100mb.  The power of
> Excel is that it can style the data and use some formulas.  This is good
> for what is to me a summary report and not RAW 100m or gigs of data..
> Of course this comes from someone who knows how to hack the underlying
> binary structures but barely knows how to run the Excel GUI.   :-)
>
> We now return you to your previously scheduled mail list activity.
>
> -Andy
>
> PS.  I wish the open office GUI wasn't so crappy, sluggish and
> well...cruddy looking and printed nicely.  Their file formats make so
> much more sense (and with compression they're reasonably efficient) and
> the brilliance of text is that it works nicely with revision control and
> revision control tags.
>
> PPS.  I also wish the open office developers would either learn C++,
> convert all of their code to C and/or port open office to a language
> they know how to write better structured code in.
>
> Brule, Jon wrote:
> > Is it possible to generate a very large spreadsheet (e.g. several
> > thousand rows) in a low-memory, streaming manner? I am looking for a
> > corollary to the event model used to parse large spreadsheets.
> >
> > If not, I assume that the Cocoon serializer, which I understand uses
> > HSSF, would not operate in a streaming manner either...
> >
> > Thank you.
> >
> > Regards,
> > Jon
> > _________________
> > Jon R. Brule
> > Paramount Computing Associates
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
> > Mailing List:     http://jakarta.apache.org/site/mail2.html#poi
> > The Apache Jakarta Poi Project:  http://jakarta.apache.org/poi/
> >
> >
>
>
>
>
>
