cocoon-dev mailing list archives

From (Stephen R. Savitzky)
Subject Re: Sitemap and Links...
Date Wed, 12 Jan 2000 17:40:37 GMT
Stefano Mazzocchi writes:

> "Stephen R. Savitzky" wrote:
> > 
> >  1. There is a map element for each directory in URL-space, which either
> >     comes from a file in the directory itself, or is inherited from the
> >     parent directory.  Hence, the site map scales naturally with the size of
> >     the site, and there is no need to load descriptions for directories
> >     until they are accessed.
> I see no problems in adding something like this. In fact, it follows the
> .htaccess pattern in apache.

That's essentially what it was based on.

> The problem, though, goes again to XML Inheritance: how do we
> include/merge/import those files in the original sitemap?

We don't necessarily have an overall sitemap document -- it's constructed
dynamically.  In other words, the first time any directory is accessed, its
local sitemap file is merged into the current data structure, which is
maintained in memory as a DOM tree. 
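
A minimal sketch of that lazy merge, in Python.  It is an illustration under
stated assumptions, not the PIA's code: the class name, the load_local
callback, and the use of plain dicts in place of a DOM tree are all
hypothetical.

```python
import posixpath


class LazySiteMap:
    """Per-directory map entries, merged on first access (hypothetical sketch)."""

    def __init__(self, load_local):
        # load_local(url_dir) is an assumed callback: it returns a dict of
        # settings from the directory's own map file, or None if there is none.
        self._load_local = load_local
        self._cache = {}

    def lookup(self, url_dir):
        if not url_dir.startswith("/"):
            raise ValueError("url_dir must be an absolute URL path")
        url_dir = posixpath.normpath(url_dir)
        if url_dir in self._cache:
            return self._cache[url_dir]
        parent = posixpath.dirname(url_dir)
        # The root is its own parent; it inherits nothing.
        inherited = {} if parent == url_dir else self.lookup(parent)
        local = self._load_local(url_dir) or {}
        # Local settings override whatever the parent directory supplies.
        merged = {**inherited, **local}
        self._cache[url_dir] = merged
        return merged
```

Nothing below a directory is read until something under it is requested, so
the cost of a lookup depends on the depth of the path rather than on the size
of the site.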

> >  2. The map file can specify multiple directories which are effectively
> >     overlaid, and can specify resources elsewhere in the filesystem:

> ?? I don't get this.

I wasn't particularly clear, I'm afraid.  Basically, every URL in the tree
has two potential locations: the "real" one and the "virtual" one.  All
writing is done into the "real" location.  What that means is that you can
have all the "original" documents of a website mounted read-only in the
"virtual" location, perhaps on a CD-ROM, and write new versions of them
(customized documents, reader comments, data, and so on) into the "real"
location.

We came up with this in the context of personal, web-based applications that
sit on a proxy server.  The documents and stylesheets that represent the
delivered application are shared among all users; any user can customize any
document without interfering with other users. 
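
Roughly, the overlay could look like the following Python sketch.  The class
and method names are hypothetical, and it assumes that reads check the
writable "real" location first and fall back to the shared read-only ones;
it does nothing about ".." in paths.

```python
from pathlib import Path


class Overlay:
    """One writable 'real' root layered over read-only roots (hypothetical)."""

    def __init__(self, real_root, virtual_roots):
        self.real_root = Path(real_root)
        self.virtual_roots = [Path(r) for r in virtual_roots]

    def resolve(self, url_path):
        # Assumption: a customized copy in the real root shadows the original.
        rel = url_path.lstrip("/")
        for root in [self.real_root, *self.virtual_roots]:
            candidate = root / rel
            if candidate.is_file():
                return candidate
        return None

    def write(self, url_path, text):
        # All writing goes to the real root; the shared originals are untouched.
        target = self.real_root / url_path.lstrip("/")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")
        return target
```

Giving each user a separate real root over the same shared roots is what
keeps one user's customizations from interfering with another's.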

> >  3. There are no wildcards: resources are either described by full name or
> >     by extension:
> No, we need more power than that.

You may be right, but there may be other ways of getting that power.  The
main use of wildcards would appear to be achieving the same kind of
scalability that one can get by distributing configuration files into the

  /foo/*/bar.xml could just as easily be specified by something in foo that
  is inherited by its subdirectories.

See my last paragraph though; I think extensions are just a special case
of wildcards in most cases.  The only time they aren't is the case described
in [4] immediately below -- extensions are special because there are some
particularly obvious and easy things to do with them. 

> >  4. Most importantly, URLs passed to the server are not required to have an
> >     extension.  Extensions are tried in the order specified in the map.  So
> >     in the example above,

> This is a very interesting point. Reading Tim Berners-Lee's paper about
> "good URIs don't change", it would be IMO very significant to
> "slightly-suggest" something like this: read, never use file extensions
> to mean something

That's not exactly what I meant.  We never expose extensions to the client,
but this leaves us free to use them all over the place internally to
associate styles with documents.  It would be more difficult to use more
general wildcards for this -- it's really easy to search a finite list of

It does directly relate to Tim's point, though.  If I ask for /foo/bar the
server may hand me /foo/bar.html, /foo/bar/index.html, or something derived
on the fly by processing /foo/bar.xml with a logicsheet.  And I can go
through a site and process bar.xml into bar.html offline and get a major
speedup, without affecting links at all.
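
A rough Python sketch of that lookup, assuming an extension list of
("html", "xml") and an index document named "index"; the function name and
both defaults are hypothetical, and the real order would come from the map.

```python
from pathlib import Path


def resolve_url(doc_root, url_path, extensions=("html", "xml"), index_name="index"):
    """Map an extension-less URL path to a file, trying extensions in order."""
    rel = url_path.strip("/")
    base = Path(doc_root) / rel
    if rel:
        # /foo/bar -> /foo/bar.html, then /foo/bar.xml, in the listed order.
        for ext in extensions:
            candidate = base.parent / f"{base.name}.{ext}"
            if candidate.is_file():
                return candidate
    if base.is_dir():
        # /foo/bar -> /foo/bar/index.html, then /foo/bar/index.xml.
        for ext in extensions:
            candidate = base / f"{index_name}.{ext}"
            if candidate.is_file():
                return candidate
    return None
```

With "html" listed before "xml", dropping a pre-generated bar.html next to
bar.xml changes what is served for /foo/bar without changing the URL.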

> >  5. Elements without a namespace prefix become "properties" that are
> >     accessible via a NamedNodeMap while documents are being processed.
> >     Inside of documents they're accessible as predefined external entities.
> Hmmm, I'm not sure.. can you give us a real example?

This is something we got from WebDAV, which uses elements as properties:

  <site:Resource name="foo.xml">
    <author>S. Savitzky</author>
  </site:Resource>

  Inside of foo.xml, &author; is automatically defined as "S. Savitzky".
  Properties defined in this way are (will be) also accessible from a
  client via WebDAV. 
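
A small Python sketch of how such properties might be collected.  The
namespace URI, the site:style child, and the function name are made up for
the illustration; only the site:Resource and author elements come from the
example above.

```python
from xml.dom import minidom

# Hypothetical map entry; the xmlns URI and <site:style> are assumptions.
SAMPLE = """<site:Resource xmlns:site="urn:example:sitemap" name="foo.xml">
  <author>S. Savitzky</author>
  <site:style>plain.xsl</site:style>
</site:Resource>"""


def properties_of(resource):
    """Collect child elements that have no namespace prefix as properties."""
    props = {}
    for child in resource.childNodes:
        if child.nodeType != child.ELEMENT_NODE:
            continue
        if ":" in child.tagName:
            continue  # prefixed elements are map directives, not properties
        text = "".join(
            n.data for n in child.childNodes if n.nodeType == n.TEXT_NODE
        )
        props[child.tagName] = text.strip()
    return props


doc = minidom.parseString(SAMPLE)
print(properties_of(doc.documentElement))
```

The resulting dict plays the part of the NamedNodeMap; turning each entry
into an entity declaration for foo.xml is left out of the sketch.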

> > It might also be worthwhile looking at the W3C's RDF (Resource Description
> > Framework), and the IETF's WebDAV framework.  Our scheme is vaguely based on
> > ideas from a combination of these.
> I admittedly skipped most of RDF (too complex), but I know most of
> WebDAV. The second is very handy but it's designed to be orthogonal to
> existing URI spaces so it doesn't help that much, IMO.
> I'll go over RDF again....

I agree that RDF is too complex; I looked at WebDAV first and basically
ignored RDF.  But WebDAV looks like it maps directly into the kind of
sitemap "document" that we both need -- a single XML document that contains
arbitrary XML metadata.

Most of the PIA's extension mapping stuff, and your wildcard stuff, are
basically ways of compactly associating metadata (like styles and processing
steps) with documents.  There are probably other ways that are just as good
if not better, including something like the X window system's Xdefaults
database.  Now that I think of it, that's almost _exactly_ what we need.

Stephen R. Savitzky
Platform for Information Applications
Chief Software Scientist, Ricoh Silicon Valley, Inc. Calif. Research Center
 voice: 650.496.5710  front desk: 650.496.5700  fax: 650.854.8740
