incubator-ooo-dev mailing list archives

From Dave Fisher <dave2w...@comcast.net>
Subject Re: Doctype of websites
Date Thu, 15 Mar 2012 17:23:21 GMT
Hi Dennis,


On Mar 15, 2012, at 9:58 AM, Dennis E. Hamilton wrote:

> Here's my understanding of the situation with regard to DOCTYPE and how pages may be
> assembled from parts prior to being stored (static) or delivered (dynamic) from the server.
> 
> If there are any tools that mechanically generate web page <body> content, partly
> or in their entirety, you have to ensure that when the final page is served up from the server,
> the result is compatible with some single DOCTYPE declaration.  I assume that the CMS is the
> likely determining factor, since it will generally be designed to generate a particular grade
> of HTML.  That includes the result following server-side includes as well.

Yes. The skeleton is in control and is what needs adjustment, but it has to interpret many types,
some of which use UPPER CASE tags.


> 
> There is no reason that national-language choice should be the determining factor.  I
> wonder if that has simply been that the different authoring communities had their own preferences,
> perhaps related to agreements around authoring tools.

You don't need to wonder. That is what happened. I particularly like the Mongolian site: http://www.openoffice.org/mn/

The NLC sites are done in many different styles of HTML. There is no general conformance
to a particular DOCTYPE, as there is in, say, the "DE" site.


> 
> Likewise, the character encoding has to be the same throughout the served web page.
> I presume that is UTF-8, since there are NL concerns and it is simply a good choice.  That
> means the httpd settings must ensure that the proper MIME type, with a specific character set,
> is also part of the response header.

There are exceptions. Some sites, like Sinhalese (http://www.openoffice.org/si/), do not use
UTF-8, but instead use "Thai". There are BODY tags which are a to-do for insertion.
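
To see which encoding a given page claims, something like the following could be used. It is only a sketch: `declared_charset` and `is_utf8` are names made up for illustration, it looks only at the `<meta>` declaration near the top of the file, and it says nothing about what httpd actually sends in the Content-Type response header.

```python
import re

# Matches both <meta charset="..."> and the older http-equiv form
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
# IGNORECASE also covers pages written with UPPER CASE tags.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)


def declared_charset(html_bytes, scan_limit=4096):
    """Return the charset named in the first <meta> declaration (lowercased), or None."""
    match = META_CHARSET.search(html_bytes[:scan_limit])
    return match.group(1).decode("ascii").lower() if match else None


def is_utf8(html_bytes):
    """True only if the page explicitly declares UTF-8; undeclared pages return False."""
    return declared_charset(html_bytes) in ("utf-8", "utf8")
```

Working on raw bytes avoids having to guess the encoding before it has been read from the page.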

Of course, you might want Pashtun / Pashto, which *is* UTF-8: http://www.openoffice.org/ps/

There really are a lot of NL sites in various stages.

> 
> The only way to be able to operate this in a sane way is to have it be the same for all
> pages as delivered from the server.

Most likely it should always be the same. The question is whether conversion to a particular
DOCTYPE choice will break parts of the site. In that case we can use the ssi.mdtext trick.
NL sites are in the mix because that will be the determining factor.
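
As a rough illustration of the ssi.mdtext trick, the sketch below picks the most specific ssi.mdtext for a page and reads its settings. This is not the CMS's actual code: `parse_ssi` and `find_ssi` are hypothetical helpers, the `doctype` key is only the proposed addition, and the directory walk is an assumption about how an override such as templates/mn/ssi.mdtext would take precedence over templates/ssi.mdtext.

```python
from pathlib import Path


def parse_ssi(text):
    """Parse 'key: value' lines into a dict; a bare 'doctype:' yields an empty string."""
    settings = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, so URLs inside the value survive.
        key, _, value = line.partition(":")
        settings[key.strip()] = value.strip()
    return settings


def find_ssi(templates_dir, page_path):
    """Return the most specific ssi.mdtext for a site-relative page path, or None."""
    templates_dir = Path(templates_dir)
    parts = Path(page_path).parent.parts
    # Walk from the deepest section directory up to the templates root.
    for depth in range(len(parts), -1, -1):
        candidate = templates_dir.joinpath(*parts[:depth], "ssi.mdtext")
        if candidate.is_file():
            return candidate
    return None
```

With this arrangement a section only needs its own ssi.mdtext when it diverges; every other page falls through to the top-level file.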

So far the only content area that has a divergent ssi.mdtext is the api. I mention NL because
that is where the divergence exists. Have a look and you will see great variation.
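
For anyone who wants to measure that variation, including Regina's question about how many pages carry an XHTML doctype, a small script along these lines might do. It is a sketch under assumptions: `classify` and `count_doctypes` are illustrative names, only `*.html` files are scanned, and a page counts as XHTML simply when its DOCTYPE declaration mentions "xhtml".

```python
import re
from collections import Counter
from pathlib import Path

# Captures everything between "<!DOCTYPE" and the closing ">".
DOCTYPE = re.compile(r"<!DOCTYPE\s+([^>]*)>", re.IGNORECASE)


def classify(html_text):
    """Label a page 'xhtml', 'html', or 'none' from its DOCTYPE declaration."""
    match = DOCTYPE.search(html_text[:2048])
    if not match:
        return "none"
    return "xhtml" if "xhtml" in match.group(1).lower() else "html"


def count_doctypes(root):
    """Tally DOCTYPE families for every *.html file under root."""
    counts = Counter()
    for path in Path(root).rglob("*.html"):
        # errors="replace" keeps non-UTF-8 pages from aborting the scan.
        text = path.read_text(encoding="utf-8", errors="replace")
        counts[classify(text)] += 1
    return counts
```

Calling `count_doctypes` on one NL directory at a time would show which sections could take a common doctype as they stand.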

Regards,
Dave

> There may be similar considerations for the Community Forums and the MediaWiki as well.
> Those choices can be resolved independently but the DOCTYPE declarations should be accurate
> at all times, of course.  That is not always the case on many sites.
> 
> - Dennis
> 
> PS: I'm ignoring the HTML 4.01 vs XHTML 1.0 debate.  Going to HTML5 still requires a
> decision whether it is done using the HTML or XML flavor.  No matter what the direction, the
> problem is going to be how page assembly is done and which page-generating products have to
> be accommodated.  Finally, it is important to have valid pages under whatever the DOCTYPE
> is and also have a successful result with as many browsers and their users as possible.  It
> might be more valuable to consider what it takes to make the pages adaptable on small-format
> device browsers (i.e., smartphones and tablets) and pay close attention to accessibility requirements
> than fuss about not-yet-approved HTML specifications.
> 
> - Dennis
> 
> -----Original Message-----
> From: Dave Fisher [mailto:dave2wave@comcast.net] 
> Sent: Thursday, March 15, 2012 08:33
> To: ooo-dev@incubator.apache.org
> Subject: Re: Doctype of websites
> 
> 
> On Mar 15, 2012, at 7:37 AM, Rob Weir wrote:
> 
>> On Thu, Mar 15, 2012 at 10:33 AM, Dave Fisher <dave2wave@comcast.net> wrote:
>>> 
>>> On Mar 15, 2012, at 12:22 AM, Regina Henschel wrote:
>>> 
>>>> Hi,
>>>> 
>>>> Joe Schaefer schrieb:
>>>>>> ________________________________
>>>>>> From: Regina Henschel<rb.henschel@t-online.de>
>>>>>> To: ooo-dev@incubator.apache.org
>>>>>> Sent: Tuesday, March 13, 2012 5:31 PM
>>>>>> Subject: Re: Doctype of websites
>>>>>> 
>>>>>> Hi Joe,
>>>>>> 
>>>>>> Joe Schaefer schrieb:
>>>>>>> Those de.openoffice.org pages should redirect
>>>>>>> to www.openoffice.org/de pages, if not your
>>>>>>> DNS resolver is busted.
>>>>>> 
>>>>>> I had indeed set de.openoffice.org to 192.9.163.104. Removing it makes
>>>>>> redirecting work.
>>>>>> 
>>>>>> That means the pages at de.openoffice.org had been the original ones,
>>>>>> but will be deleted in near future. They had been imported to
>>>>>> ooo-site.apache.org/de and here they have got a different doctype. Right?
>>>>> 
>>>>> 
>>>>> 
>>>>> Well sort of. If you look at the actual document on the site
>>>>> you will probably find it contains an XHTML doctype even now.
>>>>> The thing is that the CMS build system as Dave has designed it
>>>>> will strip most of the header matter out of the file and replace
>>>>> it with a generic one supplied by a template.
>>>>> 
>>>>> 
>>>>>> 
>>>>>>   If that's not the problem
>>>>>>> then you need to refresh your pages as they
>>>>>>> are identical on the server.
>>>>>>> 
>>>>>>> As to why the doctype is different from the original
>>>>>>> document, that's probably due to the way Dave worked
>>>>>>> out the templates for the site.  If we need to scrape
>>>>>>> the doctype out of each individual page that will require
>>>>>>> some perl coding work, some templating work,
>>>>>>> and another sledgehammer style commit- ie not something
>>>>>>> to be taken lightly.
>>>>>> 
>>>>>> Our pages had been XHTML with all the differences to HTML. And we tried
>>>>>> to produce valid pages (including W3C check button). It is not
>>>>>> impossible to change the pages and it can be done bit by bit while
>>>>>> reviewing the pages. But the aim should be clear.
>>>>> 
>>>>> 
>>>>> Well I can't advise you how to proceed from here, only point out
>>>>> that there is some impedance mismatch between how your site builds
>>>>> work and what's actually in these documents.  The choice seems
>>>>> to be either standardize all the documents on a common doctype
>>>>> or have the perl code pull the doctype out of the original document
>>>>> if it exists and pass it along to the template as an argument.
>>>>> 
>>>>> 
>>>>> You might even be better off just not supplying a doctype at all
>>>>> and letting the browser figure it out.  Up to you folks.
>>>>> 
>>>> 
>>>> If we want valid pages, a common doctype is needed because the inserted part
>>>> has to be written in a way that fits this doctype. For example, for the feather logo you
>>>> need an <img .../> element in XHTML, but only <img ...> in HTML. So I think we need
>>>> to agree on one doctype.
>>>> 
>>>> Is it possible to count how many of the pages actually have an XHTML
>>>> doctype? (I'm not familiar with the command line.)
>>>> 
>>>> Kind regards
>>>> Regina
>>>> 
>>>> P.S. The feather img-Element is missing the alt-attribute.
>>> 
>>> I have been looking into this. In general the skeleton is the non-compliant part
>>> and is what should be changed. However there are many of the NLC sites that are very much
>>> HTML.
>>> 
>>> One more sledgehammer will happen ... but planning needs to be careful.
>>> 
>> 
>> What if we went subdomain by subdomain and ran HTML Tidy on the
>> content to coerce it to a single doctype. Would that butcher things?
> 
> We have a file called content/brand.mdtext that controls the branding language and logo
> for each page.
> 
> In templates we have templates/ssi.mdtext and templates/api/ssi.mdtext
> 
> David-Fishers-MacBook-Air:templates dave$ more ssi.mdtext
> brand:  /brand.html
> footer: /footer.html
> topnav: /topnav.html
> home:           home
> 
> I think that ssi.mdtext should add a line like:
> 
> doctype:	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
> 
> And if "mn" needs a different treatment:
> 
> templates/mn/ssi.mdtext
> brand:  /mn/brand.html
> footer: /footer.html
> topnav: /mn/topnav.html
> home:           home
> doctype: 
> 
> This fits the NL plan. I want to avoid divergent skeleton.html files, and it may be the
> case that some sections will want an xhtml skeleton while others get a html.
> 
> I still intend to avoid changing every file.
> 
> I've $job to pay attention to until late today ... sorry that I'm dribbling out these
> plans bit by bit.
> 
> Regards,
> Dave
> 
> 
>> 
>> -Rob
>> 
>>> Regards,
>>> Dave
>>> 
>>> 
> 

