poi-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From MSB <markbrd...@tiscali.co.uk>
Subject Re: Populating EXCEL Templates using configuration files
Date Fri, 12 Mar 2010 15:22:49 GMT

Hello Richard,

I understand now what it is that you are trying to accomplish but can still
offer no help or advice I am afraid. It does seem strange that the
POIFSReader is failing to return all of the content but, as you are reading
the binary formatted files, I think that the only option available to you is
to use the POIFS classes; they wil take care of locating and extracting the
data for you. As you may be aware, the binary files consist of a series of
streams that contain data and pointers to thata data. Finally, as if that
were not enough, the file is block structured so that it is difficult to
locate a extract any data without the help of the POIFS infrastructure.

Sadly, I have absolutley no experience with the POIFSReader and the
eventfilesystem, all of my work with the api has been with the usermodel and
I cannot offer any help at all I afraid. It may be worthwhile starting an
additional thread asking for help with the POIFSReader specifically in the
title. That is more likely, in my opinion, to attract the attention of
someone who has used this part of the api.

Yours

Mark B


Richard Holmes-2 wrote:
> 
> Hi Mark,
> 
> I am definatly using the Excel 2003 binary format not the XML format. 
> What
> I am referring to is a little used feature introduced in Excel 2003 which
> allows you to embed XSD (XML Schema) inside Excel and map the XML schema
> to
> different areas in the worksheet.
> 
> These mappings are stored as a root entry under the name XML and the byte
> entry is the XML schema and bindings.
> 
> Once I have read the sheet into HSSFWorksheet I can get to the root
> element
> XML using getEntry("XML") but there is no way of getting to the content of
> the Element that I can see.  So I am using the POIFSReader to extract the
> XML Schema on load, but it seems that this does not retrieve all the XML
> entry only a portion of it.
> 
> Is there another way of getting to the contents of the Entry? or is there
> a
> reason that the XML is being truncated in the POIFSReader, I can not see
> any
> reason for this.
> 
> Thanks
> Richard
> 
> On 12 March 2010 07:35, MSB <markbrdsly@tiscali.co.uk> wrote:
> 
>>
>> Richard, I could very well be wrong so do not take my word as being the
>> last
>> statement of fact on this point but you are using the wrong tool to parse
>> the xml. The POIFSFileSystem and everything related to it is aimed at
>> parsing the binary file format not the xml based file format. I would
>> suggest that you are using the wrong tool to process the file here as the
>> OPCPackage classes and eveything related to them are usually used to
>> process
>> the xml based Excel files. Just as an example, take a look at the
>> constructors for the HSSFWorkbook class and the XSSFWorkbook class; the
>> former relate to the binary file format, the latter to the xml based file
>> format.
>>
>> Can I ask, did you identify which version of the Microsoft xml file
>> format
>> you were working with? If you are having to work with the Office 2003 xml
>> file format then I would still argue that building your own parser around
>> something like Xerces is the way to go.
>>
>> Of course, I could very well be wrong about this, I frequently am.
>>
>> Yours
>>
>> Mark B
>>
>>
>> Richard Holmes-2 wrote:
>> >
>> > So fare just getting the responses has helped me look elsewhere.
>> >
>> > By using the POIFSReaderListener i found that XML schema and datastores
>> > are
>> > stored in the root filesystem under the name XML.  Reading this gives a
>> > standard Microsoft XML, can't find the link just now but it is defined
>> as
>> > a
>> > schema part and a mapping part.
>> >
>> > I have been able to delete the XML entry in the root filesystem and
>> > replace
>> > with my modified version containing a new XML location and having a
>> macro
>> > Auto_open() to refresh XML data.  I'll post more later about this.
>> >
>> > So the problem I am facing now is that when reading the XML using the
>> > POIFS
>> > reader
>> >
>> >         DocumentInputStream istream = event.getStream();
>> >         try {
>> >             int size = istream.available();
>> >             this.xmlSource = new byte[size];
>> >             istream.read(this.xmlSource);
>> >         }
>> >         catch ( IOException e ) {
>> >             // Do Nothing, shouldn't happen.
>> >         }
>> >
>> > it seems to cut the xml source short so at the end i just get "......"
>> I
>> > was
>> > wondering if there was a reason or if there was another way to get to
>> XML
>> > entry in the root?
>> >
>> > Thanks
>> > Richard
>> >
>> >
>> > On 2 March 2010 17:50, MSB <markbrdsly@tiscali.co.uk> wrote:
>> >
>> >>
>> >> It sounds as though you may be looking at the older xml format then.
>> This
>> >> is
>> >> equally easy to test; you should be able to open the file using a
>> simple
>> >> text ediot - notepad for example if you are running under Windows.
>> >>
>> >> The best bet for working directly with the xml if this is the case
>> would
>> >> be
>> >> something like Xerces which would offer you the choice between a
>> stream
>> >> or
>> >> serial parser such as SAX and the document object model or DOM.
>> >>
>> >> Yours
>> >>
>> >> Mark B
>> >>
>> >> PS Your best source of information about this file format - if it is
>> >> indeed
>> >> the earlier version - will be Microsoft. They have a gooly number of
>> >> articles available at MSDN.
>> >>
>> >>
>> >> Richard Holmes-2 wrote:
>> >> >
>> >> > Hi Mark,
>> >> >
>> >> > Thank you for your help, I had a quick try of your method but could
>> not
>> >> > open
>> >> > the sheet at all so I think we must be using a different format.
>> >> >
>> >> > Looking at the properties using the POI filesystem I see XML but it
>> is
>> >> an
>> >> > unsupported property.
>> >> >
>> >> > For the time limit I am going to look at the other suggestion using
>> >> > velocity.
>> >> >
>> >> > Many thanks
>> >> > Richard
>> >> > On 1 March 2010 16:11, MSB <markbrdsly@tiscali.co.uk> wrote:
>> >> >
>> >> >>
>> >> >> Richard, I just re-read your post and wanted to ask whether you
>> knew
>> >> that
>> >> >> there were two different 'versions' of Excel's xml file format.
The
>> >> >> earlier
>> >> >> format was introduced for, I think, version 2003 and so you need
to
>> be
>> >> >> careful to identify just which 'version' of the xml file format
you
>> >> are
>> >> >> using. The one I am familiar with, and which is supported by POI,
>> is
>> >> the
>> >> >> format that is documented in ECMA376 and, as my earlier post
>> >> indicates,
>> >> >> is
>> >> >> a
>> >> >> zipped archive consisting of several related files and folders.
The
>> >> >> earlier
>> >> >> version uses a much simpler format and it is possible to identify
>> >> which
>> >> >> one
>> >> >> you have by opening the file using a simple text editor. If the
>> file
>> >> >> conforms to the earlier format then you will be able to open the
>> file
>> >> >> with
>> >> >> a
>> >> >> text editor and view the raw xml directly; obviously this will
not
>> be
>> >> the
>> >> >> case if you are dealing with the later - version 2007 and later
-
>> >> version
>> >> >> of
>> >> >> the file format.
>> >> >>
>> >> >> Yours
>> >> >>
>> >> >> Mark B
>> >> >>
>> >> >> PS The code I included in my last post will NOT work for the
>> earlier
>> >> file
>> >> >> format.
>> >> >>
>> >> >>
>> >> >> Richard Holmes-2 wrote:
>> >> >> >
>> >> >> > Hi All,
>> >> >> >
>> >> >> > A client of mine is looking to export a java object model
to
>> various
>> >> >> EXCEL
>> >> >> > sheets.  These sheets are not of any particular format or
layout
>> and
>> >> >> they
>> >> >> > would like to be able to use some form of configuration files
to
>> >> >> populate
>> >> >> > the sheets how they wish, the EXCEL version being used is
2003.
>> >> >> >
>> >> >> > The options I am looking into are:
>> >> >> > 1) Use the XML features in EXCEL to populate the workbook
- but
>> in
>> >> this
>> >> >> > case
>> >> >> > I can not find a way of updating the XML source in Java, in
.net
>> is
>> >> it
>> >> >> > ImportXML.
>> >> >> > 2) Use some sort of configuration take the java or XML object
and
>> >> >> populate
>> >> >> > the sheets/cells in Excel, I have tried jxls but this does
not
>> have
>> >> the
>> >> >> > flexibility other than this I have come up short.  (I am looking
>> >> into
>> >> >> > using
>> >> >> > mono but I don't believe this is possible either).
>> >> >> > 3) Use EXCEL macros to re-format the data once in the sheet,
but
>> >> again
>> >> >> > this
>> >> >> > requires the users to perform actions, such as allowing macros
to
>> be
>> >> >> run.
>> >> >> > Some of our users find this a complex proceedure :)
>> >> >> >
>> >> >> > I have thought of writing my own engine but the deadlines
are too
>> >> >> short.
>> >> >> >
>> >> >> > If anyone has any ideas I would be very greatful, the easiest
for
>> me
>> >> >> would
>> >> >> > be to find a way of updating XML or find an engine that uses
a
>> >> >> > configuration
>> >> >> > file to populate EXCEL.
>> >> >> >
>> >> >> > Many thanks
>> >> >> > Richard
>> >> >> >
>> >> >> >
>> >> >>
>> >> >> --
>> >> >> View this message in context:
>> >> >>
>> >>
>> http://old.nabble.com/Populating-EXCEL-Templates-using-configuration-files-tp27741955p27745536.html
>> >> >>  Sent from the POI - User mailing list archive at Nabble.com.
>> >> >>
>> >> >>
>> >> >>
>> ---------------------------------------------------------------------
>> >> >> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> >> >> For additional commands, e-mail: user-help@poi.apache.org
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >>
>> >> --
>> >> View this message in context:
>> >>
>> http://old.nabble.com/Populating-EXCEL-Templates-using-configuration-files-tp27741955p27758922.html
>> >>  Sent from the POI - User mailing list archive at Nabble.com.
>> >>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> >> For additional commands, e-mail: user-help@poi.apache.org
>> >>
>> >>
>> >
>> >
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Populating-EXCEL-Templates-using-configuration-files-tp27741955p27874088.html
>>  Sent from the POI - User mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>>
>>
> 
> 

-- 
View this message in context: http://old.nabble.com/Populating-EXCEL-Templates-using-configuration-files-tp27741955p27877134.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Mime
View raw message