corinthia-dev mailing list archives

From Franz de Copenhague <franzdecopenha...@outlook.com>
Subject RE: Git revert difficulty
Date Mon, 27 Apr 2015 12:18:48 GMT


> From: pmkelly@apache.org
> Subject: Re: Git revert difficulty
> Date: Mon, 27 Apr 2015 11:55:46 +0700
> To: dev@corinthia.incubator.apache.org
> 
> > On 27 Apr 2015, at 11:27 am, Franz de Copenhague <franz@apache.org> wrote:
> > 
> > I have a question regarding the ODFTextConverter's HTML output. ODF only uses span tags
and style names to define text content and formats like bold, italic, strike, etc., as opposed
to OOXML, which uses text run tags with separate rPr tags for inline formatting like bold,
italic, strike, etc. To me it makes sense how DOCXConverter converts text formatting to inline
styles like style="font-weight: bold" or style="font-style: italic".
> > 
> > So, considering that ODF only uses style names, what is your text formatting approach
for HTML generation from ODF documents?
> 
> Excellent question :)
> 
> ODF actually has two types of styles - normal and automatic. The latter aren’t really
styles in the traditional sense, but rather a compact way of representing direct formatting
that avoids repetition. As far as what a user sees, only the former type are actually “styles”
(like Heading 1). When you apply direct formatting to a piece of text, behind the scenes an
ODF application creates and deletes automatic styles to represent the necessary formatting information
(hence the name); they are never exposed to the user.
> 
> I actually think this is a poor design choice in the spec, because it causes confusion.
As far as I’m concerned, “automatic” styles aren’t styles at all, because the whole
point of styles is that they’re explicitly defined by the user (or provided as defaults
by an application), and are distinct from direct formatting. However, this is the terminology
used in the spec.
> 
> When translating to HTML, I think it would be best to translate all automatic styles
to direct formatting. The reason is twofold: (1) the Word filter (and likely others to
come) has the distinct concepts of styles and direct formatting, and (2) the editing code
works on this model as well. So if you encounter an element in content.xml that has a style
associated with it, check if that style is a normal style or automatic. In the former case,
set the “class” attribute of the HTML element you create to the name of the style, and
(separately) ensure that the style is present in the CSS stylesheet included in the HTML file.
In the latter case, set the “style” attribute of the HTML element to the serialised CSS properties
generated from the automatic style.
> 
> So, for example, consider the following:
> 
> <text:p text:style-name="Quote">Some sample text</text:p>
> 
> where Quote is a normal style defined in styles.xml. This would become
> 
> <p class="Quote">Some sample text</p>
> 
> Now consider the following:
> 
> <text:span text:style-name="T1">Bold</text:span>
> 
> where T1 is an automatic style defined in either content.xml or styles.xml, e.g.:
> 
> <office:automatic-styles>
>   <style:style style:name="T1" style:family="text">
>     <style:text-properties fo:font-weight="bold" style:font-weight-asian="bold" style:font-weight-complex="bold"/>
>   </style:style>
>   ...
> </office:automatic-styles>
> 
> Because we know that T1 is an automatic style, we’d produce the following HTML:
> 
> <span style="font-weight: bold">Bold</span>
> 
> —
> Dr Peter M. Kelly
> pmkelly@apache.org
> 
> PGP key: http://www.kellypmk.net/pgp-key
> (fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)
> 

I agree with your proposal.

This is good because the client editor doesn't need to care which document format generated
the HTML; it only needs to know that text formatting options are defined as inline
CSS styles, while the business logic resides in the DocFormats GET/PUT API.
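
As a rough illustration of the rule quoted above, here is a minimal Python sketch. It is
not DocFormats code: the function names (automatic_styles, html_attributes), the tiny
FO_TO_CSS table and the SAMPLE document are all hypothetical. It also assumes a single
XML root, whereas automatic styles can appear in either content.xml or styles.xml, and
it ignores style families.

```python
import xml.etree.ElementTree as ET

# Standard ODF namespace URIs.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "fo": "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0",
}

# Hypothetical, deliberately tiny mapping from fo:* text properties to CSS.
FO_TO_CSS = {"font-weight": "font-weight", "font-style": "font-style"}


def qname(prefix, local):
    """Build the '{namespace}local' form that ElementTree uses for attributes."""
    return "{%s}%s" % (NS[prefix], local)


def automatic_styles(root):
    """Return {style name: {css property: value}} for each automatic style."""
    result = {}
    for style in root.iterfind("office:automatic-styles/style:style", NS):
        css = {}
        props = style.find("style:text-properties", NS)
        if props is not None:
            for fo_name, css_name in FO_TO_CSS.items():
                value = props.get(qname("fo", fo_name))
                if value is not None:
                    css[css_name] = value
        result[style.get(qname("style", "name"))] = css
    return result


def html_attributes(style_name, automatic):
    """Automatic style -> inline 'style' attribute; normal style -> 'class'."""
    if style_name in automatic:
        css = automatic[style_name]
        return {"style": "; ".join(f"{k}: {v}" for k, v in css.items())}
    return {"class": style_name}


# Assumed sample input, modelled on the T1 example in the quoted message.
SAMPLE = """<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0">
  <office:automatic-styles>
    <style:style style:name="T1" style:family="text">
      <style:text-properties fo:font-weight="bold"/>
    </style:style>
  </office:automatic-styles>
</office:document-content>"""

auto = automatic_styles(ET.fromstring(SAMPLE))
print(html_attributes("T1", auto))     # {'style': 'font-weight: bold'}
print(html_attributes("Quote", auto))  # {'class': 'Quote'}
```

With this split, the editor only ever sees a class name (for a normal style) or inline
CSS (for direct formatting), regardless of which filter produced the HTML.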

Franz




 		 	   		  