poi-user mailing list archives

From <Justin.W.Coo...@wellsfargo.com>
Subject RE: Performance Issue with POI 3.6 as compared to 2.5.1
Date Wed, 09 Jun 2010 19:28:54 GMT
This was suggested by Yegor Kozlov on Dec 7, 2008:

Yegor - I created an example demonstrating how to generate large workbooks and avoid OutOfMemory:
http://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/xssf/usermodel/examples/BigGridDemo.java

If you search this list you can find his original email explaining it in more detail.
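
As a rough illustration of the idea behind that example (this is not the BigGridDemo code itself): instead of holding every cell as an object in memory, the sheet's XML is written out row by row. The sketch below uses only java.io. The class name StreamingSheetSketch, its methods, and the placeholder cell values are all hypothetical. A real .xlsx would still need this XML placed inside the workbook's zip package, which, as I understand it, BigGridDemo handles by starting from a template workbook.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

/** Hypothetical sketch: stream worksheet XML row by row, keeping no cell objects in memory. */
final class StreamingSheetSketch {

    /** Converts a zero-based column index to Excel letters: 0 -> A, 25 -> Z, 26 -> AA. */
    static String columnName(int col) {
        StringBuilder sb = new StringBuilder();
        for (int n = col + 1; n > 0; n = (n - 1) / 26) {
            sb.insert(0, (char) ('A' + (n - 1) % 26));
        }
        return sb.toString();
    }

    /** Writes a minimal SpreadsheetML worksheet with rows x cols numeric cells. */
    static void writeSheet(Writer out, int rows, int cols) throws IOException {
        out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        out.write("<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">\n");
        out.write("<sheetData>\n");
        for (int r = 1; r <= rows; r++) {
            out.write("<row r=\"" + r + "\">");
            for (int c = 0; c < cols; c++) {
                // Placeholder data: in a report this value would come from the query result.
                out.write("<c r=\"" + columnName(c) + r + "\"><v>" + (r * (c + 1)) + "</v></c>");
            }
            out.write("</row>\n");
        }
        out.write("</sheetData>\n</worksheet>\n");
    }

    public static void main(String[] args) throws IOException {
        // A StringWriter keeps the demo small; for large sheets, write to a
        // buffered file or zip-entry writer so memory use stays flat.
        StringWriter sw = new StringWriter();
        writeSheet(sw, 3, 2);
        System.out.print(sw);
    }
}
```

Memory use in this approach depends on the size of one row rather than on the size of the whole sheet, which is why it can avoid OutOfMemory on very large reports.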

Justin

-----Original Message-----
From: K raghavendra Rao [mailto:k.raghavendra.rao@oracle.com] 
Sent: Tuesday, June 08, 2010 3:06 PM
To: user@poi.apache.org
Subject: Re: Performance Issue with POI 3.6 as compared to 2.5.1

Nick, David,
Thank you both for your response.

I got held up with other work and hence couldn't respond earlier.

Please clarify the following.
>>My hunch is that you'll find HSSFWorkbook from 
3.6 to be slightly faster than from 2.5, or otherwise little different.<<

My understanding is that, to be able to generate .xlsx files I need to use XSSFWorkbook and
NOT HSSFWorkbook. Hence that rules out the possibility of using HSSFWorkbook. Please correct
me if I am wrong.

Based on David's reply, here is what I tried.
<This is the first time I am working with NIGHTLY builds. Please correct me if my approach
is wrong>
I downloaded the poi-3.7-SNAPSHOT-20100528.jar NIGHTLY build and replaced the earlier one:
poi-3.6-20091214.jar

So now I have 2 environments with the following settings to test the PERFORMANCE between POI
3.6 and 2.5.1.

Env1:
POI version 3.6 with XSSFWorkbook (updated with the above-mentioned NIGHTLY build jar; the
other 3.6 jars are unchanged)

Env2:
POI version 2.5.1 with HSSFWorkbook

The report that I am generating runs a SQL SELECT query which returns 65,000 records in 2 seconds.
Env2 produces the file in less than 10 seconds. Env1 takes around 15 minutes!!

The BIG QUESTION for my project team is: can POI EFFICIENTLY support generation of MS Excel
2007 (.xlsx) files which have more than 66,000 records? I need to make this decision and
convey it to management. We migrated from .CSV files to POI because users prefer native
MS Excel files over CSV.
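
For context on the sizes involved: a sheet in the .xls format holds at most 65,536 rows, and a sheet in the .xlsx format holds at most 1,048,576. The arithmetic can be sketched as follows. The class SheetLimits and its method names are hypothetical, and the single header row is an assumption about the report layout.

```java
/** Hypothetical helper: checks record counts against per-sheet row limits. */
final class SheetLimits {
    static final int XLS_MAX_ROWS = 65_536;      // Excel 97-2003 binary format (.xls)
    static final int XLSX_MAX_ROWS = 1_048_576;  // Excel 2007 and later (.xlsx)

    /** True if the records plus header rows fit in a single sheet. */
    static boolean fitsInOneSheet(int records, int headerRows, int maxRows) {
        return (long) records + headerRows <= maxRows;
    }

    /** Number of sheets needed when every sheet repeats the header rows. */
    static int sheetsNeeded(int records, int headerRows, int maxRows) {
        int perSheet = maxRows - headerRows;
        return (records + perSheet - 1) / perSheet; // ceiling division
    }

    public static void main(String[] args) {
        int headerRows = 1; // assumption: one header row per sheet
        System.out.println(fitsInOneSheet(65_000, headerRows, XLS_MAX_ROWS));
        System.out.println(fitsInOneSheet(66_000, headerRows, XLS_MAX_ROWS));
        System.out.println(sheetsNeeded(66_000, headerRows, XLS_MAX_ROWS));
        System.out.println(fitsInOneSheet(66_000, headerRows, XLSX_MAX_ROWS));
    }
}
```

Under these assumptions, 65,000 records fit in one .xls sheet and 66,000 do not, which matches the point at which the switch to .xlsx became necessary.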

If anybody has managed to achieve this, PLEASE HELP.

Let me know if you need any further details.

Regards,
Raghu



----- Original Message -----
From: nick.burch@alfresco.com
To: user@poi.apache.org
Sent: Tuesday, May 25, 2010 5:33:51 AM GMT -05:00 US/Canada Eastern
Subject: Re: Performance Issue with POI 3.6 as compared to 2.5.1

On Mon, 24 May 2010, K raghavendra Rao wrote:
> I was using POI 2.5.1 to generate .xls files until the record count 
> crossed the 65k+ limit set by Excel 2003. At this point, I switched to 
> POI 3.6 and to XSSFWorkbook (from the previous HSSFWorkbook)

This'll be the main cause. My hunch is that you'll find HSSFWorkbook from 
3.6 to be slightly faster than from 2.5, or otherwise little different.

XSSFWorkbook is XML-based (the whole of the OOXML file format is), and 
processing it needs a bit more memory and CPU than the older binary 
format.

Otherwise, see David's reply about some recent XSSF performance 
improvements.

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org

