poi-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Kai Grabfelder <nos...@kaigrabfelder.de>
Subject Re: How to "clone" values of a workbook
Date Thu, 12 May 2016 20:15:40 GMT
Hi Mark,

we are generating a WEB UI (tables) based on data in the calculated workbook. Basically we
are iterating over
all cells in all rows of all sheets and transforming them to HTML.

We did some performance analysis and the bad part was always the instantiation of the workbook
- depending on
the formulas used in the workbook (e.g. workbooks using VLOOKUP are pretty bad for example).

I like your idea of not reusing the instances but just creating new instances (in advance)
- we'll have to
discuss that - thx ;-)

Best Regards

Kai

Murphy, Mark schrieb am 12.05.16 um 16:09:
> Hmm... I'm not real sure what you mean by "read the data from excel and transform them
to an output format" are you using POI or an alternate tool to do that? It seems to me that
this might be the bottleneck rather than the create the workbook part. But if "create a new
instance of the workbook" is the problem, maybe you just need to preload some workbooks, and
when you use one, spawn a thread to load another from the database. That way you don't have
to go through the trouble of doing a reset, and the create happens before the request is made.
I am saying this based on the assumption that your pooling solution eventually has all the
documents loaded in the pool at least once. If this is the case you might as well preload
them and avoid the long wait for the requester of the first document of each type.
> 
> -----Original Message-----
> From: Kai Grabfelder [mailto:nospam@kaigrabfelder.de] 
> Sent: Wednesday, May 11, 2016 6:05 PM
> To: POI Users List
> Subject: Re: How to "clone" values of a workbook
> 
> Hi Mark,
> 
> in our system we have a limited number of workbooks that we mainly use for calculation.
The process withouth pooling looks like this
> 
> - create a new instance of the workbook (one of about 20 different workbook stored in
the database)
> - fill in some data (entered by the user)
> - recalculate formula (incl. creating a new formula evaluator)
> - read the data from excel and transform them to an output format (the workbook in the
database is not changed it's kind of a template and the main purpose of poi/excel is to perform
calculations)
> 
> The creation of the workbook takes 200ms till 15 seconds which is too slow for the required
use case (it's online processing in a web application).
> 
> The process with pooling looks like this:
> 
> - ask the pool to either acquire an instance of the workbook
> - the pool checks if there is an idle instance of the workbook available in the pool
> -- if not a new instance of the workbook (and the formula evaluator) is created and put
in the pool
> -- if yes the idle instance is returned
> - fill in some data (entered by the user)
> - recalculate formula (without creating a new formula evaluator
> - read the data from excel and transform them to an output format
> - ==> "reset" the instance to it's state when acquired from the pool <== I'm currently
thinking how to implement this - I guess I'll have to "remember" all changes to the workbook
and "revert" them at this step.
> - return the instance to the pool
> 
> I think SXSSF is not an option as imho it doesn't support formula evaluation. In any
way memory is not a big constraint here - the issue is the time that the system requires for
instantiating the workbook instances.
> 
> Best Regards
> 
> Kai
> 
> Murphy, Mark schrieb am 11.05.16 um 16:19:
>> I am trying to understand just what it is you are trying to do
>> 
>>>> because the creation of workbook instances is pretty costly (about 200ms
to 15 seconds for the workbooks we are dealing with) we implemented a pooling of workbook
instances which seems to work pretty well.
>> 
>> What do you mean by this? How does it work?
>> 
>>>> Since we are only performing limited write-operations on the workbook instances
(basically only calling Cell.setCellType() and Cell.setCellValue()) we need a way of "cloning"
the original workbook values when retrieving a workbook instance from the pool, remembering
them and "reseting" the workbook instance when returning it to the pool by setting all those
remembered values to the workbook instance.
>> 
>> Are you holding a group of workbooks in memory, then using them as templates to write
new workbooks with changed values? Or are you changing the values and saving it with a new
file name?
>> 
>>>> I'm wondering if there is already a functionality like this in poi or if
somebody already did sth. similar? Otherwhise I'll have to write sth. by my own (which shouldn't
be too hard) - but I thought that I'll ask anyway.
>> 
>>>> Performing a deep clone of the complete workbook is not an option as this
would make the whole pooling idea pointless.
>> 
>> It may be possible to write your new workbook using SXSSF which does not hold everything
in memory so long. You can specify how many rows to keep in memory to help control the amount
of memory used.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org For additional 
>> commands, e-mail: user-help@poi.apache.org
>> 
>> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org For additional commands, e-mail:
user-help@poi.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Mime
View raw message