cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: [OT] - About MS published schemas
Date Tue, 25 Nov 2003 16:20:13 GMT

On 24 Nov 2003, at 23:02, Joerg Heinicke wrote:

> On 19.11.2003 03:43, Antonio Gallardo wrote:
>
>> Hi:
>> I found this interesting article about the recent MS published 
>> schemas.
>> I'd like to share it with the rest of the community:
>> http://www.theregister.co.uk/content/4/34045.html
>
> What a frustrating point of view! But I fear too much of it will be true:
>
> ... there's plenty of room in the specification for binary data or 
> what Microsoft calls "arbitary schema". People forget that the X in 
> XML is for extensible.
>
> As Mike Champion asked here, "What is the point of storing data in XML
> if the schema is so hideous and proprietary that no one can use it
> without proprietary API support? What advantages does WordML have over
> the HTML-like stuff that current versions of Word generate on request?
> At least you can tidy.exe the HTML-like stuff into standard XML, but
> what can you do with WordML except load it into Word...unless of course
> you are an XSLT uber-geek?"

Look at it from this angle: the POI project is spending thousands of 
man-hours figuring out the binary formats that Office uses, just to get 
at some easily parsable data. That data will have to be marked up in 
some way anyway, and I wouldn't want POI to do, say, a semantic schema 
transformation to DocBook.

So, in the end, if you buy a license for Office 2003, you are, in fact, 
buying what POI is trying to do anyway.

If you have equations or weird OLE stuff in your document, would you 
really be able to do anything with it even if it weren't binary? 
We wouldn't have support for MathML anyway.

I think that Word 2003 is going to be a big deal for roundtripping 
small documents in a CMS: if Word 2003 allows for "read only" styles, 
the issue of real-life semantic markup for document fragments is almost 
solved.

So, let's move on.

--
Stefano.
