poi-user mailing list archives

From Avik Sengupta <Avik.Sengu...@itellix.com>
Subject Re: Using HSSF to parse Excel
Date Wed, 12 Nov 2003 16:52:53 GMT
You will undoubtedly achieve tremendous speed and stability improvements
moving from VBA to POI. 

Whether to use the eventmodel depends on how low-level you want to go.
It will give you memory benefits, but the usermodel is an order of
magnitude easier to code against.

The important question is whether POI can handle all the Excel features
that you need. That is unfortunately impossible for us to answer. Take
rich text formatting, for example: while POI has much improved support
for rich text since that FAQ was written, it's logically impossible to
verify that it supports every bit of rich text in every Excel file out
there. Such are the travails of working without written specs.

So what we usually suggest is "if it passes your tests, it's good enough
for you". If your requirement is to flawlessly process every possible
Excel sheet that you can throw at it, then POI is not for you. If,
however, you can define a reasonable subset that you can test, it's
perfect for your job. So get POI to open your spreadsheets one by one,
and see how it goes.
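
The "open them one by one and see how it goes" step could be scripted
roughly as below. This is only a sketch: the class name
SpreadsheetTriage and the Opener hook are made up for illustration and
are not part of POI. The hook is where the actual parsing call goes.
With POI on the classpath, something like
`in -> new HSSFWorkbook(new POIFSFileSystem(in))` should do, followed by
whatever cell reads your extraction rules need. The placeholder in
main() only checks that each file is readable.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SpreadsheetTriage {

    /** Hypothetical hook that wraps whatever parsing call is under test. */
    @FunctionalInterface
    interface Opener {
        void open(InputStream in) throws Exception;
    }

    /**
     * Runs the opener against every file and records the outcome per file:
     * "OK" on success, or "FAILED: ..." with the exception that was thrown
     * (including failures to open the file itself).
     */
    static Map<Path, String> run(List<Path> files, Opener opener) {
        Map<Path, String> results = new LinkedHashMap<>();
        for (Path file : files) {
            try (InputStream in = Files.newInputStream(file)) {
                opener.open(in);
                results.put(file, "OK");
            } catch (Exception e) {
                // Keep going: one bad spreadsheet should not stop the run.
                results.put(file, "FAILED: " + e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Placeholder opener: reads one byte to prove the file is readable.
        // Swap in the real parsing call (assumed to be POI's HSSFWorkbook).
        Opener opener = in -> in.read();

        List<Path> files = new ArrayList<>();
        for (String arg : args) {
            files.add(Paths.get(arg));
        }
        run(files, opener).forEach((file, result) ->
                System.out.println(file + " -> " + result));
    }
}
```

Run it over a representative sample of your real spreadsheets. Any file
reported as FAILED is one to look at by hand before committing to the
migration.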

As for Open/Star Office, it's theoretically possible to use UNO etc.,
but from whatever I have heard, it's not very easy to set up. It will of
course also have the drawback that, again, there can be no theoretical
guarantee that it will be able to process every Excel file out there.
Only Excel can guarantee that it will process ALL Excel files properly
(well... almost :)


On Wed, 2003-11-12 at 22:20, Cope, Christopher wrote:
> I am working on a system that automatically extracts data from .xls files,
> performs manipulation of the data and then inserts the manipulated data into
> an Oracle database.
> There are numerous sets of data that we need to extract from different .xls
> files, and the Excel spreadsheets themselves come in a number of different
> formats - single worksheets, multiple worksheets, some containing macros,
> formulae etc. The data items that we need to extract can therefore be in
> various different places within a spreadsheet.
> The data extraction process is written in Java and to handle the complexity
> of where to find each data item we are using a Java rules engine.
> Currently we do not access the .xls files themselves with Java, instead we
> use the Runtime object to kick off an external VB process. The VB process
> uses the Excel 2002 XML support to save the .xls files into Microsoft's XML
> format. The Java then resumes using JAXP to read the XML files.
> We have encountered various problems with VB processes failing to terminate
> and are also keen to streamline things by keeping it all as one Java
> process. We thus want to refactor the .xls file reading process to use Java.
> So my question(s):
> Is HSSF's event model the best API to achieve this?
> If so will the fact that the spreadsheets typically have lots of formatting
> cause problems? (see http://jakarta.apache.org/poi/faq.html Q.14)
> If not what else could be used? Would it be possible to access the .xls
> files using Star Office's Universal Network Objects?
> Any comments gratefully received.
> Thanks
> Chris
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: poi-user-help@jakarta.apache.org
