corinthia-dev mailing list archives

From "Dennis E. Hamilton" <dennis.hamil...@acm.org>
Subject RE: What Google does to odf documents
Date Wed, 10 Jun 2015 20:57:22 GMT
Thanks, Ian.

This is an important test procedure -- taking a test document and seeing not only how well
it is accepted but also what is re-saved by a particular processor.

The Google Docs approach seems pretty ridiculous.  It has broken the "Hello world" paragraph
into fragments: the paragraph keeps a paragraph style, and each fragment ("Hello", " ", and
"world") also gets its own differently-named automatic text style.

So these forms of documents will be encountered in the wild when a Google Doc is exported
for interchange as an ODF document.  The bloat should be quite remarkable, not just in the
text but also in the definitions of automatic styles.
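
One way to put a number on that bloat is to count the automatic styles in a package's
content.xml. The following is a minimal Python sketch, not part of any tool mentioned in
this thread. It assumes the file is an ordinary zipped ODF package with a content.xml
entry; the function name count_automatic_styles and the file names in the usage comment
are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard ODF namespace URIs.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def count_automatic_styles(odt):
    """Return (number of automatic styles, size of content.xml in bytes).

    `odt` is a path or a binary file-like object holding a zipped ODF package.
    Only style:style children of office:automatic-styles are counted.
    """
    with zipfile.ZipFile(odt) as package:
        content = package.read("content.xml")
    root = ET.fromstring(content)
    auto = root.find(f"{{{OFFICE_NS}}}automatic-styles")
    styles = [] if auto is None else auto.findall(f"{{{STYLE_NS}}}style")
    return len(styles), len(content)


# Hypothetical usage, comparing a document before and after a round trip:
#   print(count_automatic_styles("original.odt"))
#   print(count_automatic_styles("downloaded.odt"))
```

Running it on the original and on the re-saved file would show how much of the growth
comes from style definitions rather than from the text itself.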

 - Dennis

-----Original Message-----
From: hammyau@gmail.com [mailto:hammyau@gmail.com] On Behalf Of Ian C
Sent: Wednesday, June 10, 2015 04:09
To: dev
Subject: What Google does to odf documents

Hi All,

One of the things my tool does is compare the structure of documents:
different versions of the same document as a user adds to it, etc.

I took a simple "Hello World" document and stored it in Google Docs. I then
downloaded it back again, with no edits, and compared the two.

They are radically different. Just consider the document body.
Original....
    <office:body>
        <office:text>
            <text:sequence-decls>
                <text:sequence-decl text:display-outline-level="0"
                    text:name="Illustration" />
                <text:sequence-decl text:display-outline-level="0"
                    text:name="Table" />
                <text:sequence-decl text:display-outline-level="0"
                    text:name="Text" />
                <text:sequence-decl text:display-outline-level="0"
                    text:name="Drawing" />
            </text:sequence-decls>
            <text:p text:style-name="Text_20_body">Hello world </text:p>
        </office:text>
    </office:body>

When downloaded.
    <office:body>
        <office:text>
            <text:p text:style-name="P1">
                <text:span text:style-name="T1_1">Hello</text:span>
                <text:span text:style-name="T1_2">
                    <text:s />
                </text:span>
                <text:span text:style-name="T1_3">world</text:span>
            </text:p>
        </office:text>
    </office:body>

It lost the text:sequence-decls... no harm there. Not really sure what they
were. But look at the simple text paragraph. It gets blown out to a span
around each word with its own style! Even the space between the words has
its own style!
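
Although the markup differs, the two paragraphs carry the same characters. The following
is a minimal Python sketch of that check, not the comparison tool described above. It
assumes the indentation in the downloaded listing came from pretty-printing and so skips
whitespace-only text nodes, and it adds the xmlns:text declaration that the excerpts omit.
The function name plain_text is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard ODF text namespace URI (the excerpts above omit the declaration).
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
S_TAG = f"{{{TEXT_NS}}}s"        # text:s, an explicit space element
C_ATTR = f"{{{TEXT_NS}}}c"       # text:c, optional repeat count on text:s
SPAN_TAG = f"{{{TEXT_NS}}}span"

ORIGINAL = (
    f'<text:p xmlns:text="{TEXT_NS}" text:style-name="Text_20_body">'
    'Hello world </text:p>'
)

DOWNLOADED = f'''<text:p xmlns:text="{TEXT_NS}" text:style-name="P1">
    <text:span text:style-name="T1_1">Hello</text:span>
    <text:span text:style-name="T1_2">
        <text:s />
    </text:span>
    <text:span text:style-name="T1_3">world</text:span>
</text:p>'''


def plain_text(node):
    """Flatten a paragraph to its characters, ignoring styles and spans.

    Assumption: whitespace-only text nodes are pretty-printing, so they are
    skipped; text:s contributes one space per count in its text:c attribute.
    """
    if node.tag == S_TAG:
        return " " * int(node.get(C_ATTR, "1"))
    parts = []
    if node.text and node.text.strip():
        parts.append(node.text)
    for child in node:
        parts.append(plain_text(child))
        if child.tail and child.tail.strip():
            parts.append(child.tail)
    return "".join(parts)


original = ET.fromstring(ORIGINAL)
downloaded = ET.fromstring(DOWNLOADED)

# Same text once the original's trailing space is ignored.
same_text = plain_text(original).rstrip() == plain_text(downloaded).rstrip()
# Number of text:span wrappers in each version.
span_counts = (len(list(original.iter(SPAN_TAG))),
               len(list(downloaded.iter(SPAN_TAG))))
print(same_text, span_counts)
```

A structural diff reports these two paragraphs as radically different, while a text-level
comparison like this one treats them as equal apart from the trailing space.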

I'm sure there is some smart reason for this. I don't understand what it is.

Let's hope we can do a better job with the round trip of a document in
Corinthia.
Then again, maybe we will discover that this is what we have to do?

-- 
Cheers,

Ian C

