forrest-dev mailing list archives

From "Nicola Ken Barozzi" <nicola...@apache.org>
Subject Re: Wish/Dream List for Document DTD
Date Tue, 21 May 2002 12:58:45 GMT

----- Original Message -----
From: "Piroumian Konstantin" <KPiroumian@protek.com>
To: <forrest-dev@xml.apache.org>
Sent: Tuesday, May 21, 2002 1:48 PM
Subject: RE: Wish/Dream List for Document DTD


> > From: Nicola Ken Barozzi [mailto:nicolaken@apache.org]
> > From: "Piroumian Konstantin" <KPiroumian@protek.com>
> >
> > > Hi all!
> > >
> > > This is the summary of the discussion between me and Steven on the
> > > additions to the document format. Ideas were inspired by my rewriting
> > > of some Cocoon docs into v11 format. So here they are:
> > >
> > > 1. Special formatting for text.
> >
> > Aaaarg. We need semantics!
>
> [OT] Btw, when I suggested using a semantical markup for the site
> front page (category/software/project etc.) and a more semantical markup
> for documentation (book/chapter/section/page), I was told that it's
> better to stay closer to XHTML and document markup.

XHTML has both semantic and style elements.
The trick is to use only the first kind.

As for your proposal, the point is not semantics vs presentation, but strong
vs loose semantics.

> So, either we will have _heterogeneous semantics_ or add a few styling
> elements.

Since we are describing a *general* document structure, we should abstract
the concepts as much as possible.
That has nothing to do with style.

> > > 1.1 Text highlighting element (as proposed in
> > > resources/layout/xml.apache.org/page.html). The name of the element
> > > is not defined yet (the proposed 'hi' was considered not very good).
> > > Other proposals are welcome. This formatting can be used to draw the
> > > reader's attention to particular parts of the text.
> >
> > <hili/> <highlight/> <hi>
>
> What about a more semantical element? Say: <output /> - result of the
> program output, <warn /> (inline warning), etc.?

Highlight *is* semantical. It means that the author wanted to highlight that
aspect in particular.
It's general, and different from "emph", since the latter gives emphasis,
which has a different meaning.

> > > 1.2 Strike out text formatting. Proposal: '<strike>' (as HTML). This
> > > can be used to display deprecated class names or so.
> >
> > <deprecated/>
>
> It seems to me too Java-specific a term. Next time we will also need
> <since />, <throws />, <return /> etc. ;)

It's not Java-specific; it means that something is in a "phase-out" stage.
The fact that Java uses it for Javadocs is a coincidence.

We could generalize the meaning to a "versioning" tag system that gives a
special age flavor to the content.

Maybe a state="new|wip|stable|deprecated|old" attribute to add to some
elements.

It could be used by an automatic system that can also nag authors regularly:
1. if it's new, the skin can highlight it to the reader.
2. if it's a wip (work in progress), the skin could show a lateral line.
   When it stays in that condition for a set time, the authors can be
   nagged.
3. deprecated can be accompanied by a small automatic notice.
4. old can be struck out. Nagging here too.
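
To make the state idea concrete, here is a minimal sketch of how it could be
declared and used. The five values are the ones listed above; putting the
attribute on section and p, and leaving it optional, are only assumptions
for illustration, not something taken from the current DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE document [
  <!ELEMENT document (section)+>
  <!ELEMENT section (p)+>
  <!ELEMENT p (#PCDATA)>
  <!-- Assumption: state is optional; how a missing state is read is up to the skin -->
  <!ATTLIST section state (new|wip|stable|deprecated|old) #IMPLIED>
  <!ATTLIST p       state (new|wip|stable|deprecated|old) #IMPLIED>
]>
<document>
  <section state="wip">
    <p>Still being written; the skin could draw a lateral line here.</p>
    <p state="deprecated">The skin could add an automatic notice here.</p>
  </section>
</document>
```

Because the values form a closed list, both a skin and a nagging tool can
rely on them without guessing at free-form class names.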

This idea of using documents in conjunction with automatic systems came to
us on poi-dev:
use an automatic system that adds checkstyle or other metrics directly in
the file as special javadoc tags.


> > > 1.3 Advanced version of 1.1: special styling for some elements
> > > (table, th, tr, td, p, simple text, etc.). This can be achieved by
> > > adding 'class' to the common attributes (predefined class values can
> > > have semantical meaning, like: 'deprecated' or 'highlight'). I need it
> > > to mark some markup elements of the i18n transformer as deprecated
> > > (there is a table of the elements with links to details and it'd be
> > > fine to color the rows with deprecated elements in a different way and
> > > strike out the names of elements).
> >
> > Maybe we can logically group the "inline" elements like <emph>,
> > <deprecated>, etc and have the class attribute require one
> > from this list.
>
> It sounds like <span class="" /> or <div class="" />. No?

A bit like it, yes, but with the possible attribute values restricted to the
inline tags.

For example, in the dtd:

    <!-- Phrase Markup -->

    <!-- Strong (typically bold) -->
    <!ELEMENT strong (%text;|code)*>
    <!ATTLIST strong %common.att;>

    <!-- Emphasis (typically italic) -->
    <!ELEMENT em (%text;|code)*>
    <!ATTLIST em %common.att;>

    <!-- Code (typically monospaced) -->
    <!ELEMENT code (%text;)>
    <!ATTLIST code %common.att;>

    <!-- Superscript (typically smaller and higher) -->
    <!ELEMENT sup (%text;)>
    <!ATTLIST sup %common.att;>

    <!-- Subscript (typically smaller and lower) -->
    <!ELEMENT sub (%text;)>
    <!ATTLIST sub %common.att;>

It could be that it has an attribute like:
inline="strong|em|highlight|code|sup|sub"

We could also do a tag like
<meaning inline="strong|em|highlight|code|sup|sub"> for consistency, but
it gets harder to read.

On the other hand, it's just simpler to do:
<td><code>  </code></td>

which can achieve the same effect as:
<td inline="code"> </td>

if the skin is smart enough.
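
A minimal sketch of the attribute variant, assuming it is only added to td
and that td holds plain text (both are simplifications for this example; the
value list is the one given above):

```xml
<?xml version="1.0"?>
<!DOCTYPE table [
  <!ELEMENT table (tr)+>
  <!ELEMENT tr (td)+>
  <!-- td content is simplified to plain text for this sketch -->
  <!ELEMENT td (#PCDATA)>
  <!-- Values mirror the inline elements; leaving it out means no special meaning -->
  <!ATTLIST td inline (strong|em|highlight|code|sup|sub) #IMPLIED>
]>
<table>
  <tr>
    <td inline="code">i18n:date-time</td>
    <td>description goes here</td>
  </tr>
</table>
```

A validating parser would reject any value outside the list, which is the
main difference from a free-form class attribute.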

> > > 1.4 Custom styles for a particular document. This can be achieved by
> > > adding a reference to a CSS file from inside the document. This will
> > > also allow custom styles for elements with 'id' and usage of
> > > user-defined styles. Example: I want to display a list of supported
> > > locales using one coloring for locales with the same language, but a
> > > different country code (en_GB, en_US). I could use
> > > <li id="en">US Locale</li> and <li id="en">UK locale</li>.
> >
> > Then you should abstract the semantical meaning.
> > Maybe something like "grouping".
>
> Using 'id's for elements can be as semantical as grouping. I don't quite
> understand the idea of grouping. How would you assign a special style for
> a particular group in that case?

Simply, the skin has a generic way of handling groups, just as a chart can
pick colors to apply to different values.
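
A minimal sketch of what grouping could look like in the locale example. The
attribute name "group" is hypothetical; it is declared as NMTOKEN because,
unlike an ID, the same value may appear on several elements.

```xml
<?xml version="1.0"?>
<!DOCTYPE ul [
  <!ELEMENT ul (li)+>
  <!ELEMENT li (#PCDATA)>
  <!-- "group" is a hypothetical name; NMTOKEN allows repeated values, unlike ID -->
  <!ATTLIST li group NMTOKEN #IMPLIED>
]>
<ul>
  <li group="en">en_US</li>
  <li group="en">en_GB</li>
  <li group="fr">fr_FR</li>
  <li group="fr">fr_CA</li>
</ul>
```

The document only says which items belong together; the skin could then give
each distinct group value the next color from its own palette.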

> > > Notice: I realize that this breaks the semantical purpose of the DTD,
> > > but it's hard to believe that having only semantical markup is enough
> > > to satisfy all the possible needs for real documents.
> >
> > Not hard. You just need to add new meanings or have a more intelligent
> > stylesheet.
>
> Adding new meanings will lead to element/attribute additions, which is,
> IMO, just as bad as having non-semantical markup.

Not if you are generic enough.

If I say "small-white-horse", it's too specific, so I should say "horse", or
"animal" instead.
No need to say how it should be shown.

> Another sample of what I need: in the mentioned table, where I list all
> the markup elements, I have three columns:
> element | description | attributes
>
> How can I indicate that the first column should be <td nowrap />, because
> I don't want the element names (like i18n:date-time) to be wrapped? Should
> I provide a special stylesheet for that table only? And thus in every
> case? The semantical attribute can be something like
> <td content-type="xml-element-name" />. Do you like it? I don't. But
> <td class="xml-element-name" /> looks much better and it's more generic.

In this case, why is wrapping generally (not) needed?
What is the *purpose* of it?

We can think of a generic way of handling this, like:
<para content="text">
<para content="numeric">
<para content="date">

etc...

It keeps the semantics, ie the meaning.

If it needs that formatting because it's a date, well then just say it's a
date.
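
A minimal sketch of such a declaration, using only the three values named
above. Making "text" the default is an assumption, so that ordinary
paragraphs need no extra markup.

```xml
<?xml version="1.0"?>
<!DOCTYPE section [
  <!ELEMENT section (para)+>
  <!ELEMENT para (#PCDATA)>
  <!-- Only the three values named above; "text" as the default is an assumption -->
  <!ATTLIST para content (text|numeric|date) "text">
]>
<section>
  <para>Released on:</para>
  <para content="date">2002-05-21</para>
  <para content="numeric">1.1</para>
</section>
```

Whether a date is kept on one line, right-aligned or localized is then a
decision for the skin, not for the author.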

> > > 2. Links/References
> > >
> > > 2.1 Links to resources that are not present at build time break
> > > the build. Any good idea on how to solve this? Maybe a matcher in
> > > the sitemap that returns nothing or something like:
> > > <document>Will be added later</document>
> >
> > Currently links to dirs do not break the build, but only warn.
> > I think that this is ok, since it assumes that a dir is managed by
> > another system.
>
> Another example: I've placed a link to the JavaDoc document for the
> transformer I am describing and it broke the build. If JavaDocs processing
> is added to the sitemap then it will solve the problem, but it can lead
> to full JavaDoc generation, can't it?

Look at how the krysalis-centipede documentation is done:
just link to href="mydir/javadocs/" and the problem is solved.

> > > 2.2 Jumps between document content. E.g.: in the list of supported
> > > markup elements for a transformer it is necessary to have jumps to
> > > the details for those elements. Currently I've used special anchors
> > > near the section names with the details. Having some automagic here
> > > would be fine.
> >
> > Yes, we need this.
> > Automagic, how?
>
> Hm... Yes, fully automatic it can't be. But somebody proposed to use 'id's
> and place references using id() function. E.g.:
>
> <section id="intro" title="Introduction" />
> ...
> somewhere else <jump href="#id('intro')" />.

I like it. Maybe make a <jump href="." toid="intro">... (thinking out loud)
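
Thinking out loud a bit further: a minimal XSLT 1.0 sketch of how a skin
could resolve such jumps inside one document. It assumes HTML output, ids
and titles carried as attributes on section, and the toid attribute floated
above; nothing here is taken from the actual Forrest stylesheets.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index every element that carries an id attribute -->
  <xsl:key name="by-id" match="*[@id]" use="@id"/>

  <!-- A section with an id gets a named anchor in front of its title -->
  <xsl:template match="section[@id]">
    <a name="{@id}"/>
    <h2><xsl:value-of select="@title"/></h2>
    <xsl:apply-templates/>
  </xsl:template>

  <!-- A jump with toid becomes a fragment link; dangling ids are reported -->
  <xsl:template match="jump[@toid]">
    <xsl:choose>
      <xsl:when test="key('by-id', @toid)">
        <a href="#{@toid}"><xsl:apply-templates/></a>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message>
          <xsl:text>Broken jump, no element with id: </xsl:text>
          <xsl:value-of select="@toid"/>
        </xsl:message>
        <xsl:apply-templates/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The key lookup is what makes the check cheap: a jump to a missing id shows
up as a message during the build instead of as a dead link on the site.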

> It's a little better compared to what I've done using <anchor />s near the
> section. Maybe using full XPath also would do:
>
> <jump href="#section[@title='Introduction']" />. But 'id' semantically is
> more correct.

Nice too, but a bit intimidating for users.

> >
> > > 2.3 XPointer-ish syntax for referencing other xdocs/content (this
> > > one is really an RT, I haven't thought about how it can be done).
> >
> > My contribution to an RT: links to a topicmap entry, and every
> > file contains refs to a topic.

Start From: "Steven Noels" <stevenn@outerthought.org>
> XPointer/XLink would be "cool" (tm). Having an XLinkTransformer or the
> like would be wonderful. Our long-due book.xml-nuker could possibly
> maintain an automagic LinkBase. TopicMaps seem like one step too far to
> me currently.
End From: "Steven Noels"

I agree with the short term solution.

I made a semantical search engine in Java that could be used in part for
this: you assign words *and* topics to pages, with a certain relevance. So
users can search for words in a particular context. But this is for the
future ;-)

> > > 3. Special content/formatting
> > >
> > > 3.1 It would be fine to provide syntax highlighting for the <source>
> > > elements. Say: <source type="xml" /> or <source type="java" /> (maybe
> > > the <code> element can also be considered). It's obvious that source
> > > is easier to read when it's colored in a reader-friendly manner. Some
> > > ideas on implementation:
> > > - a special transformer (like FragmentExtractorTransformer) that will
> > >   process <source> elements and present them in some intermediate XML
> > >   (that can be transformed to HTML using XSLT) or immediately
> > >   transform to the result format using a parameter
> > > - use something like <cinclude:include
> > >   src="cocoon:/format/sample.xml/?type=source" />, but this will
> > >   require some transformer to pass the actual source text to the
> > >   formatting pipeline.
> > > - no other idea yet: yours are welcome
> >
> > I would put the parameter as you suggest, but leave the formatting to
> > the skin.
>
> The skinning system should use a special transformer, otherwise how would
> it format a CDATA section? Look:
> <source><![CDATA[
> <para title="first" name="article"  i18n:attr="title name">
> <i18n:text>This text will be translated.</i18n:text>
> </para>
> ]]></source>
>
> It should be a transformer that gets the content of the <source/> element,
> call a special pipeline, then output the result into the processed*)
> document.
>
> *) I mean the document that is being processed. How would it be in correct
> English?

maybe: "It should be a transformer that gets the content of the <source/>
element, calls a special pipeline, then outputs the result into the document
being processed." (it's clear, anyway ;-)

Yes, this is a possible way. But it seems we need Cocoon blocks to handle it
transparently :-(

For now, we can have a static preprocessing phase with an Ant task that
does the formatting before the build-deployment.
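
A minimal sketch of such a preprocessing target, assuming the formatting can
be expressed as a stylesheet and run through Ant's standard xslt task. All
paths and the stylesheet name are hypothetical placeholders, and the
stylesheet itself (the hard part) is not shown; a custom task may well be
needed instead.

```xml
<?xml version="1.0"?>
<project name="docs-preprocess" default="format-sources" basedir=".">

  <!-- All paths and the stylesheet name are hypothetical placeholders -->
  <property name="xdocs.dir"     value="src/documentation/xdocs"/>
  <property name="formatted.dir" value="build/formatted-xdocs"/>
  <property name="format.xsl"    value="src/documentation/format-source.xsl"/>

  <!-- Rewrite every xdoc once, before the site build reads it -->
  <target name="format-sources">
    <mkdir dir="${formatted.dir}"/>
    <xslt basedir="${xdocs.dir}"
          destdir="${formatted.dir}"
          includes="**/*.xml"
          extension=".xml"
          style="${format.xsl}"/>
  </target>

</project>
```

The site build would then read from the formatted directory, so the skin
never has to look inside a CDATA section itself.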

> > > 4. Form elements.
> > >
> > > 4.1. E.g.: search and feedback forms, etc. This can be used even now
> > > using 'action' attributes in forms referencing external dynamic
> > > resources (Google, forums, etc.).
> >
> > Mail forms.
>
> Exactly! Currently we can already place a feedback form at the bottom of
> every document like this:
>
> <form action="mailto:forrest-dev@xml.apache.org?Subject=[Docs]
> {document-id}">...</form>
>
> What do you think?

Maybe a link to a feedback page with the form is better, since we also need
to prevent too much spamming ;-)

Ok, add a feedback page, add a @feedback.mail@ filter in the build and put
it in the form page.

Or else we could have feedback from all projects using Forrest ;-)
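
A minimal sketch of the filter step, assuming the form page is a single file
that is copied during the build. The file name, directories and property
value are placeholders; the @feedback.mail@ token is the one named above.

```xml
<?xml version="1.0"?>
<project name="feedback-filter" default="copy-feedback" basedir=".">

  <!-- Placeholder value; each project would override it with its own list -->
  <property name="feedback.mail" value="forrest-dev@xml.apache.org"/>

  <target name="copy-feedback">
    <!-- Replace @feedback.mail@ tokens while copying the form page -->
    <copy todir="build/site-src">
      <fileset dir="src/documentation" includes="feedback.xml"/>
      <filterset>
        <filter token="feedback.mail" value="${feedback.mail}"/>
      </filterset>
    </copy>
  </target>

</project>
```

The form page would then use action="mailto:@feedback.mail@" and each
project gets its own address without touching the page source.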

--
Nicola Ken Barozzi                   nicolaken@apache.org
            - verba volant, scripta manent -
   (discussions get forgotten, just code remains)
---------------------------------------------------------------------

