forrest-dev mailing list archives

From "Robert Koberg" <...@koberg.com>
Subject RE: site.xml -> was -> RE: [RT] Entities in XML docs
Date Sat, 28 Dec 2002 16:39:12 GMT
Hey there,

> -----Original Message-----
> From: Jeff Turner [mailto:jefft@apache.org]
> Sent: Saturday, December 28, 2002 7:40 AM

> On Sat, Dec 28, 2002 at 05:36:32AM -0800, Robert Koberg wrote:
> ...
> > > > <page id="dreams"/>
> > >
> > > Less typing :)  And trying to treat all URL-addressable parts of the site
> > > in the same way.  It shouldn't matter if a node is a directory, file or
> > > #anchor.  In a linkmap, they're all just "things to link to".
> > >
> > > > To me, this allows for 'grouping' of IDs at the element level.
> > >
> > > How do you mean, grouping?
> >
> > [assuming?? that there will be metadata in the site.xml page and folder
> > elements and so you can't simply test for children, but even still, it is an
> > extra test in the transform]
>
> Oh I see.  Yes, I did need to use the 'has children == folder' rule to
> generate book.xml, which will indeed break with metadata.
>
> > I mean I want to know explicitly if something is a page or a folder. For
> > example in the lsb site's nav we have folder icons before folder labels and
> > page icons before page labels. If I know it is a page I can just:
> >
> > <xsl:template match="page">
>
> Though, it would be just as easy to make the node type an attribute, and
> match on that:
>
> <xsl:template match="*[@dc:format='Page']">

Sure, but who is doing more typing now? :) That is:

one:
<page id="dreams">

and several:
<xsl:template match="page" mode="nav">
<xsl:template match="page" mode="snailtrail">
<xsl:template match="page" mode="path_builder">
etc

looks prettier this way too, but basically you are right, it would not matter if
there is an attribute for this.

More importantly, however, if you use any element name as the ID you cannot
validate the document, as you mentioned previously. Well you could but the
schema could not easily be reused (for non-apache-like sites). I would think
this is a showstopper.
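
To make the validation point concrete, here is a minimal sketch in Python (not XSLT, and not Forrest code). The fixed vocabulary `ALLOWED` and the helper `unknown_elements` are assumptions made up for illustration. With a closed set of element names one reusable check covers every site, whereas with IDs as element names each site brings its own vocabulary:

```python
import xml.etree.ElementTree as ET

# Hypothetical fixed vocabulary that a reusable schema could declare up front.
ALLOWED = {"site", "folder", "page"}

def unknown_elements(xml_text):
    """Return element names (document order, no repeats) that are not in ALLOWED."""
    seen = []
    for el in ET.fromstring(xml_text).iter():
        if el.tag not in ALLOWED and el.tag not in seen:
            seen.append(el.tag)
    return seen

# Same structure written with typed elements and with IDs as element names.
typed = '<site><folder id="faq"><page id="dreams"/></folder></site>'
keyed = '<site><faq><dreams/></faq></site>'

print(unknown_elements(typed))  # []
print(unknown_elements(keyed))  # ['faq', 'dreams']
```

A real schema language does far more than this name check, but the reuse problem is the same: the second form cannot be described without listing that site's own IDs.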

>
> >   <xsl:variable name="href">
> > <!-- travels up and down the tree to find ../'s and path -->
> >     <xsl:call-template name="page_path_builder"/>
> >   </xsl:variable>
> >   <a href="{$href}">
> >     <img src="{$relative_path}images/page_icon.gif"/>
> >     <xsl:value-of select="@label"/>
> >   </a>
> > </xsl:template>
> >
> > In building the href at generation time, I know that since it is a page I
> > will use (depending on site prefs) either the page ID or page label
> > (replacing things like spaces, :, ', etc) and then concatenate the file
> > extension.
>
> Well if you stick to generic attributes, instead of *page* Id and
> label, then you can just glue the href's together and see what you end
> up with :) Eg, with:
>
> <site>
>   <primer label="Forrest Primer" href="primer.html">
>     <cvs href="#cvs"/>
>   </primer>
> </site>
>
> Then <link href="site:cvs"> gets translated to <a href="primer.html#cvs">

What happens if you have:

<site>
  <primer label="Forrest Primer" href="primer.html">
    <cvs href="#cvs"/>
  </primer>
  <old_primer label="Old Forrest Primer" href="old_primer.html">
    <cvs href="#cvs"/>
  </old_primer>
</site>

You need unique IDs.
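
A minimal sketch of the collision, in Python and not taken from Forrest. The function `resolve` is a hypothetical stand-in for whatever would translate a `site:<key>` link, and it assumes hrefs are simply concatenated from the root down, as described above:

```python
import xml.etree.ElementTree as ET

SITE_XML = """
<site>
  <primer label="Forrest Primer" href="primer.html">
    <cvs href="#cvs"/>
  </primer>
  <old_primer label="Old Forrest Primer" href="old_primer.html">
    <cvs href="#cvs"/>
  </old_primer>
</site>
"""

def resolve(site_xml, key):
    """Return every href that a 'site:<key>' link could be glued together into."""
    targets = []

    def walk(element, prefix):
        for child in element:
            # Glue this node's href onto the hrefs of its ancestors.
            href = prefix + child.get("href", "")
            if child.tag == key:
                targets.append(href)
            walk(child, href)

    walk(ET.fromstring(site_xml), "")
    return targets

print(resolve(SITE_XML, "primer"))  # ['primer.html']
print(resolve(SITE_XML, "cvs"))     # ['primer.html#cvs', 'old_primer.html#cvs']
```

Two candidates come back for `site:cvs`, so the element name alone cannot serve as the link key.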

>
> The original idea with site.xml was that it is a totally abstract
> representation of the site's information content.  Eg, it should be
> possible to replace the filesystem with a Xindice database, and have only
> the source URIs in site.xml change.  Say we have a FAQ entry:
>
> <site>
>   <faq>
>     <how_can_I_help />
>     <build_problems />
>     <useless_docs />
>   </faq>
> </site>
>
> One day, each entry might be mapped to an XML node:
>
> <site>
>   <faq src="faq.xml">
>     <how_can_I_help src="#xpointer(/faqs/question[@id='how_can_I_help'])"/>
>     <build_problems src="#xpointer(/faqs/question[@id='build_problems'])"/>
>     <useless_docs src="#xpointer(/faqs/question[@id='useless_docs'])"/>
>   </faq>
> </site>
>
> Then, by only changing @src attributes, we could map to Xindice:
>
> <site href="xmldb:xindice://localhost:4080/db/website">
>   <faq src="faq">
>     <how_can_I_help src="#/faqs/question[@id='how_can_I_help']"/>
>     <build_problems src="#/faqs/question[@id='build_problems']"/>
>     <useless_docs src="#/faqs/question[@id='useless_docs']"/>
>   </faq>
> </site>
>

I would do this currently by using an alternate URIResolver, but I am very
interested in your approach.


>
> So that's all very nice, but it's turning out to be not very
> practical.  Even to generate book.xml, I had to add these horrible
> non-addressable 'category' elements for grouping nodes:

I am not following this?

>
> <getting-involved label="Getting Involved">
>   <contrib label="Contributing" href="contrib.html"/>
>   <CVS label="CVS"
>     href="http://cvs.apache.org/viewcvs/xml-forrest/"/>
>   <mail-lists label="Mail lists" href="mail-lists.html"/>
>   <mail-archives label="Mail Archives"
>     href="mail-archives.html"/>
>   <bugs label="Bugs and Issues"
>     href="http://issues.cocoondev.org/jira/secure/BrowseProject.jspa?id=10000"/>
> </getting-involved>
>
>
> > If it is a folder, I will just append index.{html | jsp | php} and be done.
> > I have a property in a folder_conf element that tells me the index_page -
> > this is copied at generation time to index.html.
>
> Oh yes.  index_page is another thing we really need a way to indicate.
> At the very least, it can be present in menus of subdirectories as a
> '../' link.

site_index at the top config level is nice too. Think of special holiday
promotions or some such thing.

>
> > - Or perhaps I want to create a pager (<< 1 2 3 4 >>) to have each 'page'
> > in a directory show up in the horizontal list, but I don't want child
> > folders.
> >
> > - Or I want to create a site map/index page that shows the site structure
> > with meaningful icons/colors
> >
> > - Or I might want to offer a folder with individual page views or the
> > option to see all the pages (not folders) aggregated into one page view.
> >
> > - Or I might want to create a folder index page from a folder's pages using
> > dc:titles and dc:descriptions
>
> mm :)  Good ideas..
>
> > I don't see how to do the above without extra xsl:choose's or xsl:if's
>
> Or *[@dc:format='whatever'] I assume.

Sure, but you are looking at too many things to find what you need. First you
match all child elements and then have to check the appropriate attribute. The
way I am advocating would just check the element name.
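
A rough sketch of the two selection styles for the pager case (pages in a folder, no child folders), in Python rather than XSLT. The fragments, the plain `type` attribute (standing in for dc:format to keep namespaces out of it) and both function names are assumptions for illustration only:

```python
import xml.etree.ElementTree as ET

# The same hypothetical folder written with typed element names ...
BY_NAME = '<folder id="docs"><page id="a"/><folder id="sub"/><page id="b"/></folder>'
# ... and with generic nodes whose type lives in an attribute.
BY_ATTR = ('<node type="folder" id="docs"><node type="page" id="a"/>'
           '<node type="folder" id="sub"/><node type="page" id="b"/></node>')

def pager_by_name(xml_text):
    """Select child pages by element name alone."""
    return [p.get("id") for p in ET.fromstring(xml_text).findall("page")]

def pager_by_attr(xml_text):
    """Select every child element, then filter on the type attribute."""
    return [n.get("id") for n in ET.fromstring(xml_text).findall("*[@type='page']")]

print(pager_by_name(BY_NAME))  # ['a', 'b']
print(pager_by_attr(BY_ATTR))  # ['a', 'b']
```

Both give the same list; the difference is only in how much has to be inspected and written to get there.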

>
> > > > On book.xml - why is this needed anymore? Cannot the site.xml be
> > > > used in its place?
> > >
> > > Yes, book.xml isn't necessary anymore (in the linkmap CVS branch).
> > > It's still kept around as an intermediate format (see
> > > site2book.xsl) so that if necessary, users can specify it directly
> > > rather than generate from site.xml.  There are various cases where
> > > the desired menu is not the same as that generated from site.xml.
> > > In Forrest's own site, we could not generate these menus from
> > > site.xml:
> > >
> > > http://xml.apache.org/forrest/community/index.html
> > > http://xml.apache.org/forrest/community/howto/index.html
> > >
> > > Whether these pages show good menu design is another question :)
> >
> >
> > That is why you should always storyboard out the site/project before
> > setting the contracts in stone :)
>
> Our customer pays very poorly ;P

oh yea, those bastards :)

>
> > > > On the metadata front, I have been adopting a mix of Dublin Core
> > > > and mixing in the stuff my tool requires. For example, at the
> > > > bottom is a snippet of what I am currently using in the site.xml
> > > > [1].
> > >
> > > Nice!  RDF, Dublin Core.. I see an opportunity for more shameless
> > > LSB-copying ;)
> >
> > I would love it! I am trying to bend toward forrest so I can
> > eventually publish a forrest site. But I need the metadata for a
> > flexible storyboarding process.
> >
> >
> > >
> > > I'm not sure I understand it fully though..
> > >
> > > > <lsb:folder name="en-us">
> > > >     <lsb:folder_conf>
> > > >       <rdf:Description about="folder.dcxml">
> > > ...
> > > >       </rdf:Description>
> > > >     </lsb:folder_conf>
> > > >     <lsb:page_conf>
> > > >       <rdf:Description about="preamble">
> > > ...
> > > >       </rdf:Description>
> > > >     </lsb:page_conf>
> > >
> > > I gather this is describing a directory 'en-us', and a file
> > > en-us/preamble?  What is 'folder.dcxml'?
> >
> >
> > I started out creating a metadata file (*.dcxml) for each resource
> > on the site and at app startup I would crawl the metadata and
> > aggregate those into one site.xml. I found that to be a CVS
> > nightmare given the fact that I allow pages and folders to be moved
> > around. So I went back to just having the static site.xml and at
> > generation time I either include the metadata inline (page level) or
> > write it to a file (folder, binary, ??).
>
> I like the idea of storing a RDF file in each directory, providing
> metadata for those files (and overall directory metadata).  What was
> the CVS nightmare?  .dcxml files needing to be updated on every file
> move?


I am trying to automate as much as possible. Would it be a good idea to use Java
to drive CVS (remove dirs/files -> commit, then add dirs/files -> commit)? I am
not good at this type of thing, but I understand that you should not script
commits???

If a developer user (as opposed to an editor/author) wants to grab the latest
from CVS (chroot jail) I would want them to have the latest. But even if I
postponed the commit on the server it would require someone/thing to do the
commit and make sure everything is OK (there might not be a developer user in a
project).

By using site.xml for updates/changes I don't have to worry as much. Then at
generation time the metadata gets written out as individual files or included in
the HTML.

>
> > The lsb:folder tells me the location of the *.dcxml (perhaps I
> > should use *.rdf...) and the rdf:Description tells me the file name.
>
> In that case, <rdf:Description about="folder.dcxml"> means "here's
> some metadata about a file containing metadata about the folder",
> which doesn't sound right?

The metadata for the folder would only exist in the generation output. Meanwhile
it lives in the site.xml.


>
> > > I don't really understand how a
> > > directory could be considered to have a title, subject etc.  Is that just
> > > indicating what the directory should contain?
> >
> > It is a test site.xml that is using things I 'might' want to play with. But
> > as a solid case, like I mentioned above, you might want to have a folder
> > offer individual pages or one inclusive, aggregated view. In the latter case
> > the folder is actually a page.
>
> I see, makes sense.
>
> > But you could create your schema to include anything you want and
> > perhaps setting hardcoded values for some items.
> >
> >
> > >
> > > If there is a lsb:folder, shouldn't there be a lsb:page too?
> >
> >
> > My thinking (which could easily change) was that lsb:folder's are a virtual
> > representation of the folder-file system as it should be after generation.
> > The lsb:folder_conf holds meta info about a folder (including navigation -
> > lsb:nav - items). The lsb:page_conf, among other things, describes one or
> > more possible page views (don't know if I am using dc:format correctly...):
> >
> > <rng:optional>
> >   <rng:element name="format" xmlns:dc="http://purl.org/dc/elements/1.1/">
> >     <rng:value type="token">text/html</rng:value>
> >   </rng:element>
> > </rng:optional>
> > <rng:optional>
> >   <rng:element name="format" xmlns:dc="http://purl.org/dc/elements/1.1/">
> >     <rng:value type="token">text/plain</rng:value>
> >   </rng:element>
> > </rng:optional>
> > <rng:optional>
> >   <rng:element name="format" xmlns:dc="http://purl.org/dc/elements/1.1/">
> >     <rng:value type="token">application/pdf</rng:value>
> >   </rng:element>
> > </rng:optional>
> >
> > I represent these in a form and let the user 'check' which views to
> > generate. Still working on this...
>
> I don't know how this 'configure the site generation' part would fit
> in with Cocoon.  Perhaps when Cocoon blocks arrive, we could have a
> 'add PDF block' checkbox which adds the *.pdf rules.


Yea, I will probably have to bend more in this direction.

>
> > > Is it necessary to have the intermediate *_conf elements?  Why not just
> > > have <lsb:folder> and <rdf:Description> directly inside it?
> >
> > I want to know what the thing's group is to ease template matching :)
>
> I don't understand.  What XPath expression is possible with:
>
> <lsb:folder name="en-us">
>    <lsb:folder_conf>
>       <rdf:Description about="folder.dcxml">
>
> But not with:
>
> <lsb:folder name="en-us">
>       <rdf:Description about="folder.dcxml">
>
> If rdf:Description is the only child of lsb:folder, you could just do
> match="rdf:Description[parent::lsb:folder]".

I guess we are debating personal preferences that can be handled in many ways. I
simply like to group things semantically. rdf:Description, for me, is not
enough information. It's like

<div class="note">This is a note</div>
vs
<note>This is a note</note>

But what you are proposing by using keys as element names goes too far, in my
opinion, because it cannot be validated.


best,
-Rob

