forrest-dev mailing list archives

From Marc Portier <...@outerthought.org>
Subject Re: Link-Addressing, and breaking up the sitemap
Date Fri, 06 Sep 2002 14:03:34 GMT


Jeff Turner wrote:
> On Thu, Sep 05, 2002 at 05:36:58PM +0200, Marc Portier wrote:
> 
>>>+1. Just like how JSP tags all treat '/'-leading paths as relative to the
>>>servlet context, not the server root.
>>>
>>
>>yep, glad you like the idea more than the word :-)
> 
> 
> Though it later occurred to me that the sitemap renders the document, so
> it must be responsible for 'relativising' all links contained in it, not
> the crawler.
> 
that was my first idea: some transformer, but it would be active 
also in the webapp case (it wouldn't hurt, except for the elapsed 
time)

> It's all so confusing. I propose we wait until physicists come up with a
> Theory of Everything, and then we work backwards from there to discover
> the Theory of Forrest Linking.
> 
in this case, the physicist is called Nicola, if he thinks it 
would be feasible for the crawler to do this (or someone else 
under his guidance) we could just do it, no?

> 
>>>It would be best to define the goals first:
>>>
>>>- Users need to be able to customize the sitemap with their own
>>>  matchers, for whatever crazy reasons they want.
>>
>>maybe we should find reasons just to make sure this _is_ a valid 
>>goal, AND to make sure that those 'crazy' reasons will be 
>>satisfied by allowing snippets of sitemaps?
> 
> 
> Good point.
> 
> 
>>>- In addition, we'd like to provide quick'n simple ways of doing routine
>>>  customizations, like specifying javadoc prefixes, and adding pipelines
>>>  for new document types (docbook, say). Ie, a siteplan.
>>
>>adding pipelines for the document types should be covered by the 
>>CAPs and the previous discussion no?
>>
>>mmm, provided the end-user can add to that of course, but again: 
>>maybe the CAP could be told in a different way than the sitemap 
>>about possibly new stuff?
>>
>>mmm, let's look at how the CAP turns out first, then we can 
>>discuss on letting it know about newer stuff
> 
> 
> Okay. You're probably right, that the CAP can be configured externally to
> the sitemap. From Steven's mail, I imagine that CAPs would work as
> follows:
> 
> <map:match pattern="*.html">
>   <map:act type="CAPAction">
>     <map:parameter name="config" value="doctypes.properties"/>
>     <map:generate src="{1}.xml"/>
>     <map:transform src="stylesheets/{doctype}2docv11.xsl"/>
>     <map:transform src="stylesheets/document2html.xsl"/>
>     <map:serialize/>
>   </map:act>
> </map:match>
> 
> Where 'doctypes.properties' contains doctype <-> public id mappings:
> 
> docbook=-//OASIS//DTD DocBook XML V4.1.2//EN
> docv11=-//APACHE//DTD Documentation V1.1//EN
> docv10=-//APACHE//DTD Documentation V1.0//EN
> faq=-//APACHE//DTD FAQ V1.1//EN
> 
> 

yep, I hear he had Bruno cook something up... now on his hard 
drive, let us hope for a check-in soon
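
fwiw, here is a rough sketch of the lookup such a properties file implies. 
It is Python only for brevity, the names parse_doctypes and stylesheet_for 
are made up for illustration, and it is a guess at the idea, not whatever 
is sitting on that hard drive:

```python
def parse_doctypes(text):
    """Parse 'name=public id' lines into a {public id: name} dict."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        name, _, public_id = line.partition("=")
        mapping[public_id.strip()] = name.strip()
    return mapping


def stylesheet_for(public_id, mapping):
    """Return the first-step stylesheet path, or None for unknown doctypes."""
    doctype = mapping.get(public_id)
    if doctype is None:
        return None
    # Mirrors the 'stylesheets/{doctype}2docv11.xsl' pattern quoted above.
    return "stylesheets/%s2docv11.xsl" % doctype
```

the point being that adding a new doctype is then one extra line in the 
properties file and one extra stylesheet, with the sitemap left alone.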

> 
>>>- Forrest's sitemap needs to be modularized, so users can choose just
>>>  the functionality they need. If they don't want svg2png, don't include
>>>  Batik. If they don't want PDFs, don't include FOP. 
>>>
>>
>>isn't this the kind of challenge that asks cocoon to be 
>>modularized first?
> 
> 
> I don't know. Would having Cocoon blocks give us a more modular sitemap,
> any more than <map:mount> gives us?
> 

beats me, it just sounded like fop and batik blocks to me :-S

> 
>>could be me, but it is probably not the first goal of forrest to 
>>do that cocoon-work?
>>
>>
>>>To meet these goals, I'd like to chop the sitemap into functional
>>>sections:
>>>
>>>- Straight *.xml to *.html
>>>- Site statistics reporting (apachestats)
>>>- todo generation
>>>- changelog generation
>>>- faq
>>>- 'community' section with feedback
>>>- doclist
>>>- DTD documentation
>>>- PDFs-of-every-page
>>>
>>>Each of these is in a sitemap 'snippet'. Users can add project-specific
>>>functionality by adding new snippets.
>>>
>>
>>mmm, most of the ones you mention are about types and the 
>>pipeline to get them through the 2-step-view rendition towards 
>>pdf, html, whadayanameit... so that is the stuff we covered with 
>>the CAPS & hints discussion, no?
> 
> 
> Yes, true.
> 
> 
>>the new thing I want to address is how to find and 
>>cross-reference these documents...
> 
> 
> Isn't the linking issue completely separate from the issue of how to
> augment, modularize and customize the sitemap?
> 
> Oh, there it is; "Link Addressing" in the subject. I dragged this thread
> waaay off course :P Sorry..
> 

focus my man, and less mind-expanding chemicals :-)

> <snip topic=linking>
> 
>>3. we could possibly aid more...
>>- in all cases the enduser still needs the ant tasks (foreign 
>>processes) to generate the javadoc (or other stuff)
>>- he knows where they are relative to his project.home
>>- we let him tell forrest (1) where that is, (2) how other 
>>documents are referencing the root of this stuff...
> 
> 
> Good.
> 
> 
>>my current guess would be to do that with the XML snippet I 
>>proposed... (but possibly you all feel that letting him write the 
>> mingle-into-build/documentation/ ant is easier?)
> 
> 
> Ignore all my stuff about merging snippets.. it's irrelevant to this
> links topic. Anyway, so we have this XML snippet:
> 
>  <content>
>    <part name="doc"
>          location="./src/documentation/content/xdocs"/>
>    <part name="mail"
>          location="..." />
>    <part name="jdoc"
>          location="..." />
>  </content>
> 
> Now the question is how to process it.
> 

given the bot hack this could see a quick prototype that just 
produces a piece of ant script through xslt to be included...

in the long run it's maybe better to actually make it an ant-task
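
just to show how little such a task would have to do, a minimal sketch 
(Python for brevity, a real one would be an ant-task in Java). The function 
names, the 'ctx' directory and the './build/javadocs' location in the usage 
note are my assumptions, the snippet above leaves those locations open:

```python
import os
import shutil
import xml.etree.ElementTree as ET


def plan_copies(config_xml, context_dir):
    """Return (source, destination) pairs, one per <part> element."""
    root = ET.fromstring(config_xml)
    plan = []
    for part in root.findall("part"):
        name = part.get("name")
        location = part.get("location")
        if not name or not location:
            continue  # skip incomplete entries rather than guess
        # Each part lands in a directory named after it inside the context.
        plan.append((location, os.path.join(context_dir, name)))
    return plan


def copy_parts(plan):
    """Copy every planned source tree into place (needs Python 3.8+)."""
    for source, destination in plan:
        shutil.copytree(source, destination, dirs_exist_ok=True)
```

so plan_copies on a config with a "jdoc" part at ./build/javadocs and a 
context dir 'ctx' would plan a copy of that tree to ctx/jdoc, and forrest 
stays in charge of the layout inside the context dir.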

> 
>>then let some smart ant-task (part of the forrest activity) read 
>>that file, copy the described stuff over into the cocoon context 
>>dir (we stay in charge of location and organization) where the 
>>CAP-hints pipeline deals with it as soon as the webapp or crawler 
>>asks for it
> 
> 
> Okay. What happens to the HTML that contained the link? We need to
> rewrite the link to point to the 'location' attribute. As the link is
> part of the contents, which is rendered by the sitemap, we need an XSLT
> or something that rewrites links:
> 
> <xsl:template match="link">
>   <link>
>     <xsl:attribute name="href">
>       <xsl:if test="@href = 'doc'">
>         <xsl:value-of select="'./src/documentation/content/xdocs'"/>
>       </xsl:if>
>       <xsl:if test="@href = 'mail'">
>         <xsl:value-of select="..."/>
>       </xsl:if>
>       ...
>     </xsl:attribute>
>   </link>
> </xsl:template>
> 
> Suitably generalized, of course.

mmm, the idea would be that you write your documents in such a 
way that they use a common ref-prefix '/bar' for say docs 
generated by foo...

then you mention
<part name="bar" location="./build/where-foo-dropped-it" />

and tzadaam they get merged in
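
in other words the first segment of the ref picks the part, and the rest 
is looked up below its location. A small sketch of that lookup, assuming 
refs always look like /name/rest (resolve_ref is a made-up name, Python 
just to keep it short):

```python
import posixpath


def resolve_ref(href, parts):
    """Map '/name/rest' onto the location registered for part 'name'.

    'parts' is a {name: location} dict built from the <part> elements.
    Returns None for relative refs and for unknown prefixes.
    """
    if not href.startswith("/"):
        return None
    name, _, rest = href[1:].partition("/")
    location = parts.get(name)
    if location is None:
        return None
    return posixpath.normpath(posixpath.join(location, rest))
```

with parts = {"bar": "./build/where-foo-dropped-it"} a ref to 
/bar/api/index.html resolves below that build directory, and anything with 
an unknown prefix is left for the normal pipeline to handle.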


it is of course a bit naive?
in my first attempt at describing this I also thought about 
reference-aliases for the case where you can say
<part name="bar" location="./build/where-foo-dropped-it" >
    <ref-alias name="old-bar" />
    <ref-alias name="bar2" />
</part>

this would mean that there are still documents that would use the 
old or alternative reference-prefix (/old-bar or /bar2) when they 
should be using /bar

for those I think some tuned transformer could be created
suggestions:
- SAX filtering on anything that looks like a link-attribute?
- having as config a small file that is generated from the 
original one, (or is just the original one where it is 
xpath-working out what it needs)


still, a 2nd line kind of feature if you ask me
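
to make the SAX-filtering suggestion a bit more concrete, a minimal sketch 
using only the standard xml.sax of Python (a Cocoon transformer would do 
the same in Java). Assumptions of mine: link attributes are called href or 
src, refs look like /prefix/rest, and AliasFilter, load_aliases and 
rewrite_links are hypothetical names:

```python
import io
import xml.etree.ElementTree as ET
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl

LINK_ATTRS = ("href", "src")  # assumption: what "looks like a link"


def load_aliases(config_xml):
    """Build {alias: canonical name} from <part>/<ref-alias> elements."""
    aliases = {}
    for part in ET.fromstring(config_xml).iter("part"):
        for alias in part.findall("ref-alias"):
            aliases[alias.get("name")] = part.get("name")
    return aliases


class AliasFilter(XMLFilterBase):
    """SAX filter that rewrites aliased ref-prefixes in link attributes."""

    def __init__(self, parent, aliases):
        super().__init__(parent)
        self.aliases = aliases

    def _rewrite(self, value):
        if not value.startswith("/"):
            return value  # relative links are left alone
        head, sep, rest = value[1:].partition("/")
        canonical = self.aliases.get(head)
        if canonical is None:
            return value  # not an alias, nothing to do
        return "/" + canonical + sep + rest

    def startElement(self, name, attrs):
        fixed = {
            key: self._rewrite(value) if key in LINK_ATTRS else value
            for key, value in attrs.items()
        }
        super().startElement(name, AttributesImpl(fixed))


def rewrite_links(xml_text, aliases):
    """Run a document through the filter and return the serialized result."""
    out = io.StringIO()
    flt = AliasFilter(xml.sax.make_parser(), aliases)
    flt.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    flt.parse(io.StringIO(xml_text))
    return out.getvalue()
```

the alias table here is derived from the original part config, which is 
the second suggestion above, so there is no extra file to maintain.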

> 
> Minor issue: if we're rewriting links, why bother copying the javadocs
> inside the Cocoon context? We could just prepend '../javadocs', tell
> Cocoon to ignore those links, and keep Javadocs outside. No need for Ant.
> 
I feared that the javadoc example would lead to this....
I was supposing that maybe one would want to e.g. start from 
xml-javadocs or something like that?

and there is other content to consider, no?

but you _are_ right: we could take option 1: ignore that there is 
anything else

> So if I'm not mistaken, the whole thing boils down to one link-rewriting
> stylesheet.
> 
> I'll try implementing it on my own project now.
> 

let us know where it's leading you...

> 
> --Jeff
> 
> <snip stuff about merging sitemap snippets as it's not relevant>
> 
>>-- 
>>Marc Portier                            http://outerthought.org/
>>Outerthought - Open Source, Java & XML Competence Support Center
>>mpo@outerthought.org                              mpo@apache.org
>>
> 
> 

-marc=
-- 
Marc Portier                            http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
mpo@outerthought.org                              mpo@apache.org

