cocoon-dev mailing list archives

From Norman Walsh <>
Subject Re: Documentation grammars, was:[Re: [RT] latest wonderings around W3C land and surroundings]
Date Sat, 01 Apr 2000 01:01:35 GMT
/ Mike Pogue <> was heard to say:
| Stefano Mazzocchi wrote:

[Mike, try not to trim out the attributions, ok?]

| > What you're saying about the XML world is equivalent to the following in
| > the Java world:
| > 
| > "my parser and your parser will surely parse a document but they will
| > have both similarities and differences... so a common API doesn't make
| > sense."
| > 
| > Since I _know_ you believe in common and open APIs, I cannot understand
| > why you don't believe in common "interfaces" for the XML world.
| > Please, I want to understand!
| I think I can say it clearest by saying that DocBook is a "Software
| Documentation Markup Language", it's not a "Shoe Documentation Markup
| Language" or a "Open Source XML Parser Documentation Markup Language".  

DocBook has always maintained that its domain is computer
software and hardware documentation. (I suppose it's not a
narrow scope, but the assertion that we have a scope keeps us
honest. Presented with new feature requests, we have
occasionally declined perfectly intelligent proposals on the
basis that they didn't meet a need in our domain.)

| And, it's not clear to me that all things in Shoe Documentation Markup
| Language can be represented in Software Documentation Markup Language (I
| think there's likely to be an information loss).

No argument about shoes, but you seem to be implying that "Open
Source XML Parser Documentation Markup Language" is not computer
software documentation, and I don't get that at all.

| I was voted down by technical writers and engineers alike, because in a
| contest between DocBook and HTML, HTML wins.  I have to agree that they
| have a point!  :-)

They definitely have a point. But what exactly is it? I think it
amounts to something along these lines: we're busy, change is
hard, documentation is rarely a priority, we know HTML, we don't
know DocBook, we want to use HTML.

I sympathize.

I would like to make a different point. The Apache Project is
undertaking a large, complex project involving at least six
major products (Xerces in two languages, Xalan in two languages,
Cocoon, and FOP--and those are just the ones I know about).
These projects are all related in some respects. At the end of
the day, there will be a lot of documentation in this project.

That documentation will need to be presented in a variety of
ways (at least on paper and on full-featured browsers, but
probably also on PDAs and electronic books, and speech readers,
and braille devices, and maybe even cell phones).

If no effort is made to develop a common vocabulary for this
documentation, things are not going to be very pleasant for the
authors and publishers of this information. HTML does not have
rich enough semantics to capture the information that you're
going to want to have captured for flexible rendering and
next-generation search and indexing tools.

Drawing on many years' experience, I assert that in the long run,
it's better to train authors now, deal with the pain and anguish
that change causes, and establish a common vocabulary now.

That common vocabulary does not have to be a single schema, and
it certainly doesn't have to be DocBook (although I am inclined
to think that DocBook is the obvious and correct choice); a small
set of schemas makes perfect sense. But make it rich enough to
get the job done tomorrow or you'll find yourself with thousands
of pages of documentation to drag uphill. And dragging
documentation uphill is a painful process that requires
significant human intervention.

| Right, so stick with the HTML DTD, for those tags that can do so!  ;-)
| ;-)

Sure. But what do you do for HTML tags that "almost do"? Do you
add a <procedure> tag, or do you accept that you can't
distinguish between procedures and numbered lists that aren't
procedures? Do you add markup for class and function synopses,
or do you rely on some collection of <sup>, <sub>, <br>, and
<font> tags to get the job done?

Authors love to twiddle with documents, getting the formatting
"just so". (I'm guilty of it too, so I know I'm throwing stones
in a glass house.) But that is not the author's job, it
distracts them from getting the words down, and it ties the
information to the presentation medium they worked in.

If you want to stick with HTML, I strongly encourage you to rip
all the presentational markup out.

| DocBook strikes me as "quiet, reverent cathedral-building".  (quotes are
| from the Cathedral and the Bazaar).

LOL! You weren't at some of the early meetings, I can tell<wink/>.

| Sorry, didn't mean it that way.  IMHO, both Stylebook and Cocoon are
| dangerously close to having an "Official DTD".  My evidence?  How many
| alternate DTD's do we have?  Few. How many site styles do we have?  Few.
| And, all the styles that we do have look pretty much alike (hierarchical
| nav bar on the left, composited banner at the top, copyright at the
| bottom, etc.)  I'd say that's strong evidence that we don't have ENOUGH
| diversity, ENOUGH randomness, and ENOUGH wasted energy, and we probably
| have too much Cathedral building going on here.

Are you arguing that we should encourage every project, and every
team within every project, to customize the DTD to suit their
particular needs?

On the surface, perhaps that sounds like a good idea, because we
imagine that they will all produce sound, logical "perfect" DTDs
for their domain. But I think it's a recipe for disaster.

Please accept the following statement as my humble opinion and
not as hubris or arrogance. Schema design is not self-evident,
it is a skill acquired through study and experience. Project
teams are not likely to have much experience in this area and
are likely to do the job badly. It would be much, much better if
they could be convinced to use a common vocabulary for as much
as was reasonable. There will naturally be some need for
customization in some projects, but it ought to be in the 20%
range, not the 80% range. IMHO.

| By the way, Zope appears to have the same exact problem.  All the Zope
| sites look very similar.  What does that tell us about Zope?  (And, by
| analogy, Stylebook and Cocoon?)

Why on *earth* is it bad for sites to have a common interface to
documentation? Why would one intentionally confuse the poor
beleaguered reader? By the same argument, we could propose that
some books should be printed front to back, some back to front,
some with all the even pages first and all the odd pages last,
etc., because all of our (our in an inclusive cultural sense, I
recognize that books vary across cultural and language
boundaries) books for the last 500 years have looked about the
same and that must be a bad thing.

To which I reply, "Huh?"

                                        Be seeing you,

Norman Walsh <>      | Any bureaucracy reorganized to
                     | enhance efficiency is
                     | indistinguishable from its
                     | predecessor.
