poi-user mailing list archives

From "Donahue, Michael" <michael.dona...@pearson.com>
Subject RE: HSSF - Generating large spreadsheets in streaming manner?
Date Fri, 12 Jan 2007 19:56:31 GMT
Sanjiv -

Speaking strictly as a web developer, this is a very bad approach to
dealing with a task that may take more than a second or two to complete.
Typically, a web application should never make the user wait more than
a few seconds for the next page to load completely.  I don't think I
would want to take your approach for a thick client either.

In the situation you described, it would be better to tell the user that
their request has been accepted and as soon as it is complete they will
be notified through some other mechanism that they can download or view
the results.  This could be through a screen/window pop.
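
A minimal sketch of that accept-now, notify-later pattern, using only the
JDK. The class name ReportJobs and its methods are hypothetical and are not
part of POI; the notification step is reduced to a simple isDone() check that
a page or client could poll.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical helper (not a POI class): accepts a slow report request,
// hands back a ticket at once, and lets the caller check on it later.
class ReportJobs {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final Map<String, CompletableFuture<byte[]>> jobs = new ConcurrentHashMap<>();

    // Queue the slow work on a background thread and return a job id immediately.
    String submit(Supplier<byte[]> slowReport) {
        String id = UUID.randomUUID().toString();
        jobs.put(id, CompletableFuture.supplyAsync(slowReport, pool));
        return id;
    }

    // True once the job has finished, whether it succeeded or failed.
    // Unknown ids are reported as not done.
    boolean isDone(String id) {
        CompletableFuture<byte[]> job = jobs.get(id);
        return job != null && job.isDone();
    }

    // Blocks until the report bytes are ready; call after isDone() to avoid waiting.
    byte[] result(String id) {
        return jobs.get(id).join();
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        ReportJobs jobs = new ReportJobs();
        // Stand-in for the real spreadsheet generation, which is assumed to be slow.
        String id = jobs.submit(() -> "row1\nrow2\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("accepted job " + id);
        System.out.println("bytes ready: " + jobs.result(id).length);
        jobs.shutdown();
    }
}
```

In a web application the id returned by submit() would go back to the user
with the "request accepted" page, and the finished bytes would be served from
a separate download link.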

There may still be a few places where a streaming API makes sense, but
I'd prefer to see effort spent on tasks that have a broader utilization
curve, like the recently added comments support.

Lastly, "THANK YOU!!" to all of the POI Project developers for all of
their efforts to make POI better.

-----Original Message-----
From: Sanjiv Jivan [mailto:sanjiv.jivan@gmail.com] 
Sent: Friday, January 12, 2007 12:41 PM
To: POI Users List; acoliver@apache.org
Subject: Re: HSSF - Generating large spreadsheets in streaming manner?

I think that having a streaming API would be very useful, and it's not
because of trying to generate a massive non-human-readable spreadsheet.
You have to factor in the time it takes to build the data to be used for
the spreadsheet too.

Consider a use case where a user is trying to download a spreadsheet with
500 - 1000 rows but the logic involved in getting the data for the
spreadsheet takes around a minute. Without a streaming API, when a user
tries to download such a file they click on the link and basically the
browser waits for 1 minute and only then pops up a save dialog, since the
contents of the spreadsheet can only be written out to the response stream
after the entire spreadsheet has been generated. Had there been a streaming
API, the contents could have been written to the response stream on the fly
and a nice download dialog with progress bar would have been displayed by
the browser.
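
To make the write-as-you-go idea concrete, here is a small sketch that pushes
each row to an OutputStream as soon as it is produced. It is only an
illustration: it emits plain comma-separated text, not XLS, it does not use
HSSF, and the class name StreamingRows and method writeRows are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration (not HSSF, not XLS): rows are written one at a
// time so the client starts receiving bytes before all rows are computed.
class StreamingRows {
    // Writes each row as soon as the iterator yields it and flushes after
    // every row. Cells are joined with commas and are not escaped.
    static int writeRows(Iterator<List<String>> rows, OutputStream out) throws IOException {
        int count = 0;
        while (rows.hasNext()) {
            String line = String.join(",", rows.next()) + "\n";
            out.write(line.getBytes(StandardCharsets.UTF_8));
            out.flush();
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // In a servlet the target would be the response stream; System.out
        // stands in for it here.
        List<List<String>> data = Arrays.asList(
                Arrays.asList("id", "name"),
                Arrays.asList("1", "alpha"));
        writeRows(data.iterator(), System.out);
    }
}
```

Because the rows arrive through an Iterator, the slow data lookup can happen
lazily inside next(), which is what lets the download begin before the last
row exists.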


On 3/10/06, Andrew C. Oliver <acoliver@apache.org> wrote:
>
> not yet.  Demand for the cocoon serializer hasn't been very high so it
> is mostly deprecated (unless there is some massive uptake of support
> for it).
>
> Okay it's time for my yearly rant on this subject (not aimed at you...
> you just reminded me I hadn't done it this year):
>
> I'm always a little curious about this.  XLS is a HORRIBLE format (which
> is why I started POI, I wanted to do something difficult).  It is a
> HORRIBLY inefficient format and WAS NOT DESIGNED to stream.  Yet people
> generate massive sheets in it.  My pensiveness is that no human is
> likely to read such a large sheet or be able to do anything particularly
> useful with it.  So who are these sheets for?  Often it turns out they
> are some kind of data transfer, which is frankly BAFFLING.  Why?
> Because I could do the same transfer with like 1/10th of the storage,
> bandwidth, CPU, etc. in a more well-thought-out (or at least lightweight)
> format.  Yet I saw a spreadsheet today that was 100 MB.  The power of
> Excel is that it can style the data and use some formulas.  This is good
> for what is to me a summary report and not RAW 100m or gigs of data.
> Of course this comes from someone who knows how to hack the underlying
> binary structures but barely knows how to run the Excel GUI.   :-)
>
> We now return you to your previously scheduled mail list activity.
>
> -Andy
>
> PS.  I wish the open office GUI wasn't so crappy, sluggish and
> well...cruddy looking, and that it printed nicely.  Their file formats
> make so much more sense (and with compression they're reasonably
> efficient) and the brilliance of text is that it works nicely with
> revision control and revision control tags.
>
> PPS.  I also wish the open office developers would either learn C++,
> convert all of their code to C and/or port open office to a language
> they know how to write better structured code in.
>
> Brule, Jon wrote:
> > Is it possible to generate a very large spreadsheet (e.g. several
> > thousand rows) in a low-memory, streaming manner? I am looking for a
> > corollary to the event model used to parse large spreadsheets.
> >
> > If not, I assume that the Cocoon serializer, which I understand uses
> > HSSF, would not operate in a streaming manner either...
> >
> > Thank you.
> >
> > Regards,
> > Jon
> > _________________
> > Jon R. Brule
> > Paramount Computing Associates
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
> > Mailing List:     http://jakarta.apache.org/site/mail2.html#poi
> > The Apache Jakarta Poi Project:  http://jakarta.apache.org/poi/
> >
> >
