poi-dev mailing list archives

From "Andrew C. Oliver" <acoli...@apache.org>
Subject Re: I have an idea
Date Mon, 21 Jul 2003 16:24:32 GMT
Hi Robert, 

My preference is this:

1. Low level Java APIs (primarily for us)
2. High level Java APIs (for users)
3. Low level XML-transform (generator/serializer) closely coupled to the
4. XSLT <-> The Common Format

This approach doesn't preclude what you're talking about; I think it
actually enables it.

I've come to this after working with the HSSF Serializer for Cocoon where we
took a "Common Format -> low level format" approach.

I'd *like* to think that we could make the binary format irrelevant, but it's
not, because of the capability difference, granularity difference and so on.
It's the same old problem with AWT.

Take a look at what we did with the HSSF Cocoon Serializer.


We used the Gnumeric XML format.  It seemed like a good approach.  There was
no OpenOffice.org XML format at the time we started, and by the time we
finished the OOo format was still very, very fluid (prior to 1.0).

Unfortunately, the format didn't exactly match Excel's capabilities.
Gnumeric can do things Excel can't.  It does styles in a completely
different way that is not easy to match to Excel's, and Excel can do things
Gnumeric can't.  Overall the Gnumeric way is an improvement on Excel in most
instances, but that actually makes things more problematic.  It makes things
rather lossy as well.  (especially with styling/formatting)

Now the application developer could still work with the Common Format...
There would just be ONE XSLT per format.

Quattro (I didn't know that was still around!!) -> Low Level -> QXML -> XSLT
-> TCF -> XSLT -> QXML -> Low Level -> Quattro

Excel -> Low Level -> HSSFXML -> XSLT -> TCF -> XSLT -> HSSFXML -> Low Level
-> Excel
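
One hop of the chains above (format-specific XML -> XSLT -> TCF) could be sketched roughly as follows, using the standard javax.xml.transform API. This is only an illustration: the element and attribute names (records, NumberRecord, sheet, cell) and the class name are hypothetical, and neither HSSFXML nor The Common Format has a defined vocabulary yet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CommonFormatSketch {

    // Hypothetical stylesheet: the ONE XSLT that maps record-level XML for a
    // single format into an invented common-format vocabulary.
    public static final String HSSFXML_TO_TCF =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"/records\">"
        + "<sheet><xsl:apply-templates select=\"NumberRecord\"/></sheet>"
        + "</xsl:template>"
        + "<xsl:template match=\"NumberRecord\">"
        + "<cell row=\"{@row}\" col=\"{@column}\">"
        + "<xsl:value-of select=\"@value\"/>"
        + "</cell>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies an XSLT stylesheet (given as a string) to an XML document
    // (also a string) and returns the serialized result.
    public static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Invented record-level XML standing in for "HSSFXML".
        String hssfXml =
            "<records><NumberRecord row=\"0\" column=\"1\" value=\"42.5\"/></records>";
        System.out.println(transform(HSSFXML_TO_TCF, hssfXml));
    }
}
```

The reverse direction (TCF -> XSLT -> HSSFXML) would be a second stylesheet passed to the same transform method, so supporting a new format would mean writing stylesheets rather than new Java code.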

What I'm actually talking about is taking the low level "primitives" if you
will.  For HSSF these are called records:

And creating some kind of XML binding system for them.  We might even be
able to do this dynamically.  Thus get XML for free.  As the format evolves,
so does the XML capability.
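
To make the "dynamic" idea a bit more concrete, here is a minimal sketch that uses reflection to turn any record-like object into an XML element, one attribute per getter. DemoNumberRecord is a hypothetical stand-in and not a real HSSF record class, and the attribute-per-getter mapping is just one assumed convention; a real binding would have to deal with nested structures, byte arrays and so on.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

public class RecordXmlSketch {

    // Hypothetical stand-in for a low-level record; NOT a real HSSF class.
    public static class DemoNumberRecord {
        private final int row;
        private final int column;
        private final double value;

        public DemoNumberRecord(int row, int column, double value) {
            this.row = row;
            this.column = column;
            this.value = value;
        }

        public int getRow() { return row; }
        public int getColumn() { return column; }
        public double getValue() { return value; }
    }

    // Builds an empty XML element named after the record class, with one
    // attribute per public no-argument getter, sorted by attribute name.
    public static String toXml(Object record) {
        Map<String, String> attrs = new TreeMap<>();
        for (Method m : record.getClass().getMethods()) {
            String name = m.getName();
            boolean isGetter = name.startsWith("get")
                    && name.length() > 3
                    && !name.equals("getClass")
                    && m.getParameterCount() == 0
                    && !Modifier.isStatic(m.getModifiers());
            if (!isGetter) {
                continue;
            }
            String attr = Character.toLowerCase(name.charAt(3)) + name.substring(4);
            try {
                attrs.put(attr, escape(String.valueOf(m.invoke(record))));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read " + name, e);
            }
        }
        StringBuilder sb = new StringBuilder("<")
                .append(record.getClass().getSimpleName());
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            sb.append(' ').append(e.getKey())
              .append("=\"").append(e.getValue()).append('"');
        }
        return sb.append("/>").toString();
    }

    // Escapes the characters that are unsafe inside an attribute value.
    private static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        System.out.println(toXml(new DemoNumberRecord(0, 1, 42.5)));
        // prints <DemoNumberRecord column="1" row="0" value="42.5"/>
    }
}
```

Because the element is derived from whatever getters the record exposes, adding a field to a record class would show up in the XML without touching the binding code, which is the "XML for free" property.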

We are getting ahead of ourselves.  Regardless of approach 1/2 have to be

It might be a good idea to start approaching the board about the
fileformats.apache.org idea...  However, as I understand it they're all a
bit busy at the moment...  So maybe in a few weeks.


On 7/21/03 11:22 AM, "robert_weir@us.ibm.com" <robert_weir@us.ibm.com>

> Andy said:
>>> To me the vocabulary of the XML is practically irrelevant provided the
>>> format is closely coupled with the binary format, you can always count
>>> on XSLT to make a transformation.
> That's true and it is one way of doing it.  Another way is to have the XML
> format be independent of the underlying binary format.  That's pretty much
> what OpenOffice did with their formats.  They're not just record dumps of
> Office into XML.  They tried to make it be independent of any specific
> office suite.  So, in theory, the OpenOffice XML could come from Excel,
> OpenOffice, 123, Quattro Pro, or even be created on the fly from a web
> service without any real document.  I think there's great power in that.
> Instead of making the XML format irrelevant, it makes the binary format
> irrelevant.
> In the end you probably have it both ways -- a lower-level API specific to
> a given binary format.  That is used directly for projects where
> performance is of primary importance.  Then, have a higher level project
> of XML readers and writers that adapt that API to some (hopefully)
> standards-based XML format.  The application developer would then work
> more at that level.
> One step at a time...
> -Rob

Andrew C. Oliver
Custom enhancements and Commercial Implementation for Jakarta POI

For Java and Excel, Got POI?
