Date: Tue, 3 Jan 2006 19:30:08 +1300
From: Paul Bolger
To: dev@forrest.apache.org
Subject: Re: howto-custom-html-source

Thanks David. My apologies for the break in transmission: Xmas etc.

I've had a go (a few goes, actually) at getting this to work, and I'm
still not getting anywhere. I've inserted the following into my
sitemap.xmap file:

It's the first entry in the section. Bearing in mind what you said about
the directory separators, I tried a few variations on the syntax. The
result either passed the html page straight through, which I assume
means that the match isn't being made, or produced the following error:

test\src\documentation\content\xdocs\dirtyhtml\default.body.html (The system cannot find the file specified)

This happened when I used the code above.

As a matter of interest, how would one extend the match to include files
with .htm and .asp extensions?

paul b

On 18/12/05, David Crossley wrote:
> David Crossley wrote:
> > Paul Bolger wrote:
> > > I've been trying to get this to work, and I'm not sure what's going
> > > wrong. I'll explain what I'd like to be able to do: I'd like to point
> > > at a directory, and its subdirectories, processing all html files so
> > > that all content outside a #content div is stripped.
> >
> > Ah, that comment indicates a basic misunderstanding
> > about how Cocoon operates. It doesn't actually process
> > directories [1]. Rather it handles requests. Depending
> > on the components of the URL, the sitemap will respond
> > by matching certain patterns.
> >
> > You need a project sitemap (or plugin if it is common
> > functionality) to intercept the specific matches that
> > you want to transform.
> > Any matches that remain are handled
> > by the guts of forrest.
> >
> > Some of our documentation explains how to handle specific
> > matches. As usual our docs need attention. This doc
> > is close, but you need to wade through the example that
> > it points to, because only part of that is relevant.
> > http://forrest.apache.org/docs/project-sitemap.html
> >
> > Basically you need a project sitemap.xmap like this
> > where "this-tree" is the directory tree to which
> > you want to apply special processing ...
> >
>
> Of course, that should be
>
> Also your "myStripContent" transformer could probably
> just remove the bits that you don't want and then follow
> it with the forrest html transformer. So ...
>
> > (Caveat: Be careful with those directory separators
> > in the match and generate components: The ** will match
> > a slash. I just added the above for readability.)
> >
> > In other words, presume that the request is
> > localhost:8888/some-dir/this-tree/foo/bar.html
> > then your sitemap would fire and it would generate
> > xml content from xdocs/some-dir/this-tree/foo/bar.html
> > and apply your transformer to produce the forrest
> > internal document structure.
> >
> > --oOo--
> >
> > [1] Preparing a directory listing, say for a table
> > of contents page, is another matter. For that you
> > would use more complex Cocoon sitemap operations.
> > See DirectoryGenerator, which traverses the directory
> > tree and generates an xml fragment. Apply a Transformer
> > to that to turn it into forrest internal xml format.
> >
> > You would need to follow Cocoon sitemap docs. Start at
> > http://forrest.apache.org/docs/project-sitemap.html
> > Understand sitemaps and then see:
> > http://cocoon.apache.org/2.1/userdocs/directory-generator.html
> >
> > We need to add an example to our seed-sample site.
> >
> > > This How-To is
> > > very detailed and I've learnt a lot from it, but it'd be good to have
> > >
> > > a. an example file of sitemap.xmap with the extra element included (I
> > > can't find the place that it's supposed to go...)
> > >
> > > and
> > >
> > > b. an example xsl file.
> >
> > The stylesheet to strip everything except "div class=content"
> > is a simple XSLT operation. Not appropriate for this list.
> > The "XSL FAQ" is a fantastic resource http://www.dpawson.co.uk/xsl/
> > and get Michael Kay's book.
> >
> > -David
>

--
Paul Bolger
19 Raggatt St
Alice Springs NT 0870
08 8953 6780
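
The matching behaviour David describes in the quoted text (a "**" wildcard
that also matches a slash, and a request such as
some-dir/this-tree/foo/bar.html resolving to
xdocs/some-dir/this-tree/foo/bar.html) can be tried out away from Cocoon.
The Python below is only a rough sketch of that idea and is not Cocoon's own
matcher. The function names, the pattern "**/this-tree/**.html" and the
source template with {1} and {2} are assumptions made for illustration; they
are not taken from the thread's sitemap.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a sitemap-style wildcard pattern into a compiled regex.

    Assumed semantics, following the caveat quoted above:
      '**' matches any run of characters, including '/'
      '*'  matches any run of characters except '/'
    Each wildcard becomes a capture group, numbered from 1.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append("(.*)")
            i += 2
        elif pattern[i] == "*":
            parts.append("([^/]*)")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("^" + "".join(parts) + "$")


def resolve_source(request, pattern, template):
    """Return the source path for a request, or None if the pattern fails.

    Placeholders such as {1} and {2} in the (hypothetical) template are
    replaced by the text captured by the first and second wildcard.
    """
    match = wildcard_to_regex(pattern).match(request)
    if match is None:
        return None
    return re.sub(r"\{(\d+)\}",
                  lambda ref: match.group(int(ref.group(1))),
                  template)


if __name__ == "__main__":
    request = "some-dir/this-tree/foo/bar.html"
    template = "xdocs/{1}/this-tree/{2}.html"

    # '**' spans the slash in 'foo/bar', so the request is matched.
    print(resolve_source(request, "**/this-tree/**.html", template))

    # A single '*' stops at a slash, so the same request is not matched.
    print(resolve_source(request, "*/this-tree/*.html", template))
```

When a pattern does not match, the function returns None, which corresponds
to the case above where the html page is passed straight through because the
match isn't being made.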