forrest-user mailing list archives

From Ferdinand Soethe
Subject Re: Reusing legacy HTML
Date Wed, 16 Feb 2005 13:49:31 GMT
Hi David,

RG> Yes. Please everyone, it is a big problem when committers (and other
RG> developers) are contacted off-list. One-to-one discussions reduce our
RG> effectiveness and could even lead to burn-out.

My fault. Sorry. I was trying to understand a mechanism and didn't
want to clog the list with a lengthy discussion on my rather low level
of understanding. The goal was to document this and put it
up for discussion as soon as I knew what I was talking about.

Well anyway, Ross - after being very helpful in understanding it -
also pointed this out and asked that I post this early version to
the list for comments.

So here it is. Your comments would be much appreciated before I format
it for inclusion in the Forrest documentation.

I was going to use document.dtd and write a second, shorter and more
to-the-point how-to on the specifics of processing your own legacy html.

Ferdinand Soethe


So this tries to explain what happens internally when a client asks Forrest to serve
"mytests/mybad.html", a legacy html-file with lots of junk in it.

0. Client asks Forrest to serve ".../xdocs/mytests/mybad.html"

1. Forrest looks for a matching pipeline in

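2. This pattern is checked first, but it does not apply to our request: in
   Cocoon wildcard patterns a single "*" does not match across "/", so
   "*.html" only matches files in the root directory. The expansions marked
   with "=>" show what {0} would stand for if it did apply.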
   <map:match pattern="*.html">
          <map:aggregate element="site">
            <map:part src="cocoon:/skinconf.xml"/>
            <map:part src="cocoon:/build-info"/>
            <map:part src="cocoon:/tab-{0}"/>
                           => cocoon:/tab-mytests/mybad.html
            <map:part src="cocoon:/menu-{0}"/>
                                   => cocoon:/menu-mytests/mybad.html
            <map:part src="cocoon:/body-{0}"/>
                           => cocoon:/body-mytests/mybad.html
          <map:call resource="skinit">
            <map:parameter name="type" value="site2xhtml"/>
            <map:parameter name="path" value="{0}"/>
                                            => mytests/mybad.html
3. This pattern matches the request and is used to continue processing
        <map:match pattern="**/*.html">
                                                {0}= mytests/mybad.html
                                                {1}= mytests
                                                {2}= mybad
            <map:aggregate element="site">
              <map:part src="cocoon:/skinconf.xml"/> adds skin info
              <map:part src="cocoon:/build-info"/> adds meta data
              <map:part src="cocoon:/{1}/tab-{2}.html"/> creates tabs
              <map:part src="cocoon:/{1}/menu-{2}.html"/> creates menus
              Below a cocoon pipeline is called to generate the body 
              <map:part src="cocoon:/{1}/body-{2}.html"/>
             return here for the rest of this pipeline in step 9
4.  This is the pipeline called in step 3
    Check if there is an ehtml-file (deprecated embedded html)
        <map:match pattern="**body-*.html">
                                            {0}= mytests/body-mybad.html
                                                {1}= mytests/
                                                {2}= mybad
        <map:select type="exists">
          <map:when test="{project:content.xdocs}{1}{2}.ehtml">
            <map:generate src="{project:content.xdocs}{1}{2}.ehtml" />
            <map:transform src="{forrest:stylesheets}/html2htmlbody.xsl" />
            <map:transform type="linkrewriter" src="cocoon:/{1}linkmap-{2}.html"/>

            <map:transform src="resources/stylesheets/declare-broken-site-links.xsl" />
            <map:serialize type="xml" />

      Since the .ehtml file does not exist, this pipeline generates nothing,
      so Forrest keeps looking for the next matching pipeline ...
5. ... and finds another pipeline for the same matches

  <!-- Default matches -->
  <!-- (HTML rendered from doc-v11 intermediate format -->
  <map:match pattern="**body-*.html">
                                            {0}= mytests/body-mybad.html
                                                {1}= mytests/
                                                {2}= mybad
    In the following step we ask Forrest to call the pipeline for mybad.xml.
    This triggers a new matching attempt starting from the top of the pipeline section.
    <map:generate src="cocoon:/{1}{2}.xml"/>
    Return here for the rest of this pipeline in step 8
6. The section below becomes relevant now, as it loads the project sitemap
   and inserts it right at this position of the main sitemap. (The project
   sitemap was also loaded for the earlier requests, but was irrelevant
   since it contained no matches for them.)

     This is the user pipeline, that can answer requests instead
     of the Forrest one, or let requests pass through.
     To take over the rendering of a file it must match the file name and path.
     To take over the generation of the intermediate format, it must give
     Forrest the same filename but ending with xml, and a DTD that Forrest
  <map:pipeline internal-only="false"> 4t!!!h step patterns above first
       <map:select type="exists">
         <map:when test="{project:sitemap}">
           <map:mount uri-prefix="" 

7. In the project sitemap we find this match for our call for an XML-file!

        <map:match pattern="**/mybad.xml">
                                                {0}= mytests/mybad.xml
                                                {1}= mytests

        Load my file with the html-generator. This generator
        internally uses jtidy to clean up the html and make it xhtml.
        <map:generate src="{project:content.xdocs}{1}/mybad.html" type="html"/>

        Now we call my special stylesheet to remove all
        elements that I don't want in the forrest page.
        I place it in the same directory as the source document as it
        is very specific.
        <map:transform src="{project:content.xdocs}{1}/mybadHTMLfixer.xsl"/>

        Finally call the existing stylesheet to convert html to document1.1
        <map:transform src="{forrest:stylesheets}/html2document.xsl" />
        Serialize the result as xml (it is now the body of my Forrest page
        and uses document.dtd)
        <map:serialize type="xml"/>
8. Return to the calling routine in step 5 and execute the rest of the pipeline
   to finalize the body of my Forrest page.
                        {0}= mytests/body-mybad.html
                                                {1}= mytests/
                                                {2}= mybad
            <map:transform type="idgen"/>

            <map:transform type="xinclude"/>

            Adjust links
            <map:transform type="linkrewriter" src="cocoon:/{1}linkmap-{2}.html"/>
                                                  => cocoon:/mytests/linkmap-mybad.html
            <map:transform src="resources/stylesheets/declare-broken-site-links.xsl" />
            <map:call resource="skinit">
              <map:parameter name="type" value="document2html"/>
              <map:parameter name="path" value="{1}{2}.html"/>
                                              => mytests/mybad.html 
              <map:parameter name="notoc" value="false"/>
   At the end of the pipeline this is the page body in HTML with all
   links adjusted.
9. Return to the calling routine in step 3 and finish processing

                                {0}= mytests/mybad.html
                                                {1}= mytests
                                                {2}= mybad

                At this point the body (as html) is aggregated with the menus and tabs
                and the next part just adds the final touches to the presentation.
            <map:call resource="skinit">
              <map:parameter name="type" value="site2xhtml"/>
              <map:parameter name="path" value="{0}"/>
                                              => mytests/mybad.html

         At the end, the result is delivered to the browser. 
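
For reference, the project sitemap match from step 7 could be written out as
a complete file roughly as follows. This is only a minimal sketch: the
enclosing map:sitemap, map:pipelines and map:pipeline elements and the closing
tags are assumptions based on the usual Cocoon sitemap skeleton and do not
appear in the excerpts above. The match itself and its components are copied
from step 7.

```xml
<?xml version="1.0"?>
<!-- Sketch of a project sitemap for step 7. The wrapper elements and the
     closing tags are assumptions; the match is taken from the walk-through. -->
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:pipelines>
    <map:pipeline>
      <!-- Answers the internal request cocoon:/mytests/mybad.xml from step 5 -->
      <map:match pattern="**/mybad.xml">
        <!-- {1} = mytests; the html generator cleans up the legacy file -->
        <map:generate src="{project:content.xdocs}{1}/mybad.html" type="html"/>
        <!-- Project-specific stylesheet that strips unwanted elements -->
        <map:transform src="{project:content.xdocs}{1}/mybadHTMLfixer.xsl"/>
        <!-- Existing Forrest stylesheet that converts html to document.dtd -->
        <map:transform src="{forrest:stylesheets}/html2document.xsl"/>
        <map:serialize type="xml"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

Because the pattern ends in mybad.xml, only the request for the intermediate
format of this one file is taken over; all other requests pass through to
the Forrest pipelines.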
