forrest-dev mailing list archives

From Jeff Turner
Subject Re: Proposal for future direction of site.xml
Date Tue, 11 Feb 2003 05:32:52 GMT
On Mon, Feb 10, 2003 at 09:58:50PM +0100, Ed Steenhoek wrote:
> To follow up on what was part of the menu truncation discussion, I've 
> started this separate thread.

Hi Ed,

Some nice ideas here.  Thanks for taking time to write it all out.

Some general comments while I digest it:

 - site.xml was originally designed primarily as a way to *identify*
   pages, so the id (an xpath) could later be used in <link
   href="site:<id>"> linking.  Menu generation was a nice side-effect.
   Node identification is a requirement of any site.xml format.  Perhaps
   there should be id='...' attributes for each <item> and <folder>?

 - In Cocoon, "sitemap" has a very precise meaning: the sitemap.xmap file
   which defines all the pipelines; so "menu" would be a less confusing
   word here.  The standard Forrest sitemap is at
   src/resources/conf/sitemap.xmap.  Pretty much everything in Forrest
   revolves around it.

 - The idea of using XPath expressions in site.xml is neat, but I don't
   think it would work with our current architecture.  For those who
   missed it, one could define page 'views' of a single XML file with:

<folder label="manual" path="manual/">
  <item label="contents" href="overview.xml/chapter[@label='contents']"/>
  <item label="chapter 1" href="overview.xml/chapter[@label='chapter 1']"/>
  <item label="chapter 2" href="overview.xml/chapter[@label='chapter 2']"/>
  <item label="chapter 3" href="overview.xml/chapter[@label='chapter 3']"/>
</folder>

   But should site.xml @hrefs be dealing directly with source XML like
   this?  Ultimately, there has to be a sitemap pipeline transforming,
   say, "overview.xml/chapter[@label='contents']" into an HTML page with
   a URL.  I think site.xml should be dealing with the URI space defined
   by the Cocoon sitemap, not the raw source XML.  In the above snippet,
   how do I indicate whether I want PDF or HTML output?

 - Using @default-item to specify which subnode is the default makes
   sense.  If omitted, I presume the first entry would become the
   default.

 - I don't think wildcard expansion would work, e.g.:

  <folder label="subscriber-articles" path="articles/2003/">
     <item label="January" href="200301*"/>
     <item label="February" href="200302*"/>
  </folder>

   This would require knowledge of all possible sitemap outputs, given
   the current sources.
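
   To make that dependency concrete, here is a minimal Python sketch of
   what expanding a wildcard item would involve.  The KNOWN_OUTPUTS
   list and the expand_item() helper are hypothetical names made up for
   illustration; nothing in Forrest currently provides such a list,
   which is exactly the problem.

```python
from fnmatch import fnmatchcase

# Hypothetical list of every page the sitemap could produce under
# articles/2003/. Assumed for illustration only.
KNOWN_OUTPUTS = [
    "20030105-intro.html",
    "20030120-update.html",
    "20030203-release.html",
]


def expand_item(label, href_pattern, known_outputs):
    """Expand one wildcard <item> into (label, href) pairs, sorted by href."""
    matches = sorted(
        name for name in known_outputs if fnmatchcase(name, href_pattern)
    )
    return [(label, name) for name in matches]


january = expand_item("January", "200301*", KNOWN_OUTPUTS)
february = expand_item("February", "200302*", KNOWN_OUTPUTS)
```

   The matching itself is trivial; the hard part is the third argument,
   which has to come from somewhere that knows what the sitemap can
   emit.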

This would all probably be implementable if we had a dynamically
generated sitemap, rather than the current static text one.  Perhaps
that's a good long-term direction for Forrest to take anyway.
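
Going back to the node identification point, a rough Python sketch of
how explicit id attributes might be resolved from site:<id> links is
below.  The id attributes, the sample file content, and the
build_index() and resolve() helpers are all assumptions for
illustration; this is not how Forrest currently resolves links.

```python
import xml.etree.ElementTree as ET

# Sample in the proposed folder/item format, with hypothetical id
# attributes added to each node.
SITE_XML = """
<sitemap label="main">
  <folder id="manual" label="manual" path="manual/">
    <item id="contents" label="contents" href="contents.html"/>
    <item id="ch1" label="chapter 1" href="chapter1.html"/>
  </folder>
</sitemap>
"""


def build_index(element, prefix="", index=None):
    """Map each node id to its href, prefixed by enclosing folder paths."""
    if index is None:
        index = {}
    for child in element:
        node_id = child.get("id")
        if child.tag == "folder":
            path = prefix + child.get("path", "")
            if node_id:
                index[node_id] = path
            build_index(child, path, index)
        elif child.tag == "item" and node_id:
            index[node_id] = prefix + child.get("href", "")
    return index


def resolve(link, index):
    """Turn 'site:<id>' into an href; pass any other link through as-is."""
    if not link.startswith("site:"):
        return link
    return index[link[len("site:"):]]


index = build_index(ET.fromstring(SITE_XML))
```

With this sample, resolve("site:ch1", index) gives
"manual/chapter1.html", and an unknown id raises KeyError, which is
probably what we'd want a build to fail on.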


> The background:
> >>[X] The menu should be limited to files below the current directory
> >>[ ] The menu should display all files in the site
> >> 
> >>On top of this I take the gamble as a newbie to add another item
> >>into the discussion: if menus are truncated then shouldn't 
> >>tabs.xml be integrated into site.xml?
> >> 
> >>If tabs.xml is needed somewhere during the generation process it
> >>should be able to generate it out of site.xml. Doing so will reduce
> >>the number of places where navigation information is maintained.
> >
> > 
> >You're right, tabs could be merged into site.xml in the future.
> >I have this niggling feeling though, that site.xml is mixing 
> >concerns:
> > 
> >1) Raw data, that could be generated directly from the filesystem
> >   (directory structure, label and href, ie 90% of site.xml)
> >2) Classification of pages. Eg <about label="About"> around a block
> >   of entries. This is semantics, not structure, so probably ought
> >   to be separate.
> >3) Then again, we need to create 'views' of site.xml for each page.
> >   Currently the 'view' is hardcoded: everything (0.3) or
> >   everything-below-current (0.3.1).  Would be nice to make this 
> >   more fine-grained.
> > 
> >So there's lots that should happen with site.xml.  I think tabs fit
> >into 3), a view of the data.  Or perhaps 2).  Anyway, I'm a bit
> >reluctant to throw anything more into site.xml until 1) is 
> >automated, and the relationships become clearer.
> > 
> > --Jeff
> I don't know if I agree with the extent of automation you're looking 
> for. I do agree that things have to become clearer before making 
> further changes, and so it might be time to get a common 
> understanding of the direction in which site.xml should develop. In 
> the remainder of this mail I give my view on this.
> In the first steps of this process I ignore the presentation issues.
> To me site.xml should be a file that contains at least one primary 
> hierarchy of navigation items and optionally one or more secondary 
> hierarchies. Such a hierarchy is a mixture of containers and nodes.
> The main difference between containers and nodes is that containers 
> are parents to one or more children, whereas nodes are not.
> The order of the nodes and containers is given implicitly by their 
> position in the hierarchy. This is important because it is something 
> that imho cannot be retrieved from the filesystem.
> Secondary hierarchies can be used to define the books, as an 
> alternative view on the same information as is part of the primary 
> hierarchy or for external references.
> Please note: the usage of external references as they are now in 
> site.xml is not yet fully clear to me. If the intent is to let 
> site.xml be the place to maintain all external references and use 
> only semantic links from all source documents, then the concept is ok, 
> but I doubt this will remain user-friendly as a site becomes 
> larger and larger. Time will tell.
> The next important step is to understand what tabs mean. I think 
> that there are 2 possibilities. Tabs could be:
> 1) all (primary and alternative) hierarchies
> 2) the top level containers of the primary hierarchy
> Option 1 gives much more flexibility because it includes the 2nd 
> option (that is, if you split the primary hierarchy at its top-level 
> containers into multiple hierarchies) but also allows the creation 
> of alternative orderings of the primary hierarchy next to 
> completely independent hierarchies. As mentioned, these could include 
> 'books'.
> Sometimes an example tells it all:
> + main-sitemap
>   + container a
>     + container b
>       - node 1
>       - node 2
>     - node 3
>   - node 4
> + alt-sitemap-1
>   + container b
>     + container a
>       - node 1
>       - node 2
>       - node 3
>       - node 4
> + alt-sitemap-2
>   + container c
>     - node 5
>     - node 6
>   + container d
>     - node 7
>     - node 8
> + alt-sitemap-3
>   - node 1
>   - node 2
>   - node 3
>   - node 4
>   - node 5
>   - node 6
>   - node 7
>   - node 8
> To continue towards presentation, there are two more decisions to 
> make.
> First decision: Are we to have containers that are just that or are 
> they a node at the same time? This is important because at some stage 
> a click on the container when rendered in e.g. HTML will lead to some 
> action.
> Up until now I have come across scenarios where containers are nodes 
> at the same time and, in cases where they are not, implicitly inherit 
> node behavior from the first child node.
> The first can make the site.xml become difficult to read and 
> understand because of the additional attributes that are needed.
> The second - although more pure - is limited because of the implicit 
> coupling with the first node.
> To me a solution where a container is just that is best especially 
> when we include the ability to reference one of the child nodes as 
> the default node to be coupled with the container when applicable. 
> When presentation comes along this becomes clear I guess. Once again, 
> an example:
> + main-sitemap
>   + container a def-node=node 3
>     + container b def-node=node-1
>       - node 1
>       - node 2
>     - node 3
>   - node 4
> + alt-sitemap-1
>   + container b def-node=node-1
>     + container a def-node=node-1
>       - node 1
>       - node 2
>       - node 3
>       - node 4
> Second decision: what are the node-types we will support?
> 1) URL referencing (e.g.
>    Requires information on target (same or second window)
> 2) Semantic referencing (e.g. site:about)
>    Including cross sitemap referencing and logical pointers like the
>    'home' of a sitemap
> 3) Relative referencing (e.g. index.html)
>    Requires that its container can specify relative folder
>    information
> 4) Directory directives (e.g. news*)
>    Requires some processing that expands this single node into as
>    many nodes as there are source files matching the expression
>    pattern
> 5) XPath referencing (e.g. somefile/element/element[@attrib='value'])
>    This one is particularly interesting because it can possibly be
>    used not only for getting a piece out of a larger document but
>    also breaking up a larger document in which case it will also be
>    expanding into multiple nodes like #section in the default seed.
> Are there more types? I think that the ihtml and ehtml as given in 
> the sample seed do fit into these types but if not that might be due 
> to me not having understood the peculiarities of them.
> Once we step into the presentation of all this we need to have access 
> to some information that allows us to make the right decision at the 
> right time.
> For every node and container we need to know if it is visible in the 
> results of the build. This allows us to have nodes that are only 
> available during the build. Maybe 'active' is a better term because 
> visibility is also something to be controlled by CSS.
> For every node we need to know its target window and maybe other HTML 
> href parameters.
> A global 'property' is needed to specify where to get the relevant 
> information for the tabs. There are 3 indicators: single, multi and 
> none.
> Single indicates that only one hierarchy is used for the tabs and 
> from this hierarchy the top level folders are used to generate the 
> tabs. To prevent arbitrary selection of a hierarchy, this indicator 
> is accompanied by a reference to the hierarchy that is to be used.
> Multi indicates that all hierarchies are to be used to generate the 
> tabs. All in this sense are all hierarchies that are either 'sitemap' 
> or 'book'.
> None indicates that no tabs are to be generated.
> Of course labels and a source expression (in one of the 5 types as 
> specified above) are needed.
> For directory directives we need to specify the sort order.
> There are undoubtedly more parameters needed once all of this gets 
> more support.
> I have created a DTD and a sample XML file that contains the above 
> concept.
> Maybe this is just a beginning of a long discussion, maybe it's bad, 
> maybe it's already a large step into a good solution. Whatever it may 
> be, I think it is a good point to start even though I still consider 
> myself a newbie to Forrest.
> Regards,
> Ed

Content-Description: Text from file 'site.dtd'
> <?xml version='1.0' encoding='UTF-8' ?>
> <!--Generated by XML Authority-->
> <!ELEMENT book (section+)>
> <!ATTLIST book  label CDATA  #REQUIRED >
> <!ELEMENT chapter EMPTY>
> <!ATTLIST chapter  label CDATA  #REQUIRED
>                      href  CDATA  #REQUIRED >
> <!ELEMENT external-refs (reference)>
> <!ELEMENT folder (item | folder)+>
> <!ATTLIST folder  label        CDATA  #REQUIRED
>                     default-item CDATA  #IMPLIED
>                     active        (true | false )  #IMPLIED
>                     path         CDATA  #IMPLIED >
> <!ELEMENT global (tabs)>
> <!ELEMENT hierarchies (sitemap+ , book+)>
> <!ATTLIST item  label  CDATA  #REQUIRED
>                   href   CDATA  #REQUIRED
>                   active  (true | false )  #IMPLIED
>                   target  (self | new )  #IMPLIED >
> <!ELEMENT reference (reference*)>
> <!ATTLIST reference  label CDATA  #REQUIRED
>                        href  CDATA  #REQUIRED >
> <!ELEMENT section (chapter+)>
> <!ATTLIST section  label CDATA  #REQUIRED
>                      path  CDATA  #IMPLIED >
> <!ELEMENT site (global , hierarchies , external-refs)>
> <!ATTLIST site  label CDATA  #REQUIRED
>                   href  CDATA  #REQUIRED
>                   xmlns CDATA  #REQUIRED >
> <!ELEMENT sitemap (folder+ , item?)>
> <!ATTLIST sitemap  label CDATA  #REQUIRED >
> <!ATTLIST tabs  level      (single | multi | none )  #REQUIRED
>                   selection CDATA  #REQUIRED >
