From: Vadim Gritsenko
Date: Tue, 01 Jul 2003 14:47:35 -0400
To: cocoon-dev@xml.apache.org
CC: uv@upaya.co.uk
Subject: Re: Link view goodness (Re: residuals of MIME type bug ?)

Jeff Turner wrote:

>I'm not very familiar with the code; is there some cost in keeping the
>two-pass CLI alive, in the faint hope that caching comes to its rescue
>one day?
>

Guys,

Before you implement some approach here... Let me suggest something.

Right now, the sitemap implementation automatically adds a link gatherer to
the pipeline when it is invoked by the CLI. This link gatherer is in fact a
"hard-coded links view".

I suggest replacing this "hard-coded links view", a.k.a. the link gatherer,
with the "real" links view, BUT attaching it as a tee to the main pipeline
instead of running it as a pipeline by itself. As a result, the links view
"baby" will be kept, the two-pass "water" will be drained, and the sitemap
syntax will stay the same. Moreover, the links view will still be accessible
from the outside, meaning that you can spider the site using out-of-process
spiders.

Example. Given the pipeline:

  G --> T1 (label="content") --> T2 --> S

and the links view:

  from-label="content" --> T3 --> LinkSerializer

the pipeline built for the CLI request should be:

  G --> T1 --> Tee --> T2 --> S --> OutputStream
                 \
                  --> LinkSerializer --> NullOutputStream
                                           \
                                            --> List of links in environment

In one request, you will get:

* the regular output of the pipeline, which goes to the destination Source
* the list of links in the environment, which is what the link gatherer was
  made for

Comments?

Vadim
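
To make the proposal concrete, here is a minimal sketch of the tee idea in
plain SAX terms. It is not Cocoon's actual pipeline code, and the class name
TeeContentHandler is hypothetical: it just shows a ContentHandler that
forwards every event to two downstream handlers, so the main pipeline branch
(T2 --> S) and the links-view branch (T3 --> LinkSerializer) both see the same
stream in a single pass.

    import org.xml.sax.Attributes;
    import org.xml.sax.ContentHandler;
    import org.xml.sax.Locator;
    import org.xml.sax.SAXException;

    /** Forwards every SAX event to two downstream handlers (the "tee"). */
    public class TeeContentHandler implements ContentHandler {

        private final ContentHandler main;   // e.g. T2 --> S --> OutputStream
        private final ContentHandler branch; // e.g. T3 --> LinkSerializer

        public TeeContentHandler(ContentHandler main, ContentHandler branch) {
            this.main = main;
            this.branch = branch;
        }

        public void setDocumentLocator(Locator locator) {
            main.setDocumentLocator(locator);
            branch.setDocumentLocator(locator);
        }

        public void startDocument() throws SAXException {
            main.startDocument();
            branch.startDocument();
        }

        public void endDocument() throws SAXException {
            main.endDocument();
            branch.endDocument();
        }

        public void startPrefixMapping(String prefix, String uri) throws SAXException {
            main.startPrefixMapping(prefix, uri);
            branch.startPrefixMapping(prefix, uri);
        }

        public void endPrefixMapping(String prefix) throws SAXException {
            main.endPrefixMapping(prefix);
            branch.endPrefixMapping(prefix);
        }

        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) throws SAXException {
            main.startElement(uri, localName, qName, atts);
            branch.startElement(uri, localName, qName, atts);
        }

        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            main.endElement(uri, localName, qName);
            branch.endElement(uri, localName, qName);
        }

        public void characters(char[] ch, int start, int length) throws SAXException {
            main.characters(ch, start, length);
            branch.characters(ch, start, length);
        }

        public void ignorableWhitespace(char[] ch, int start, int length)
                throws SAXException {
            main.ignorableWhitespace(ch, start, length);
            branch.ignorableWhitespace(ch, start, length);
        }

        public void processingInstruction(String target, String data)
                throws SAXException {
            main.processingInstruction(target, data);
            branch.processingInstruction(target, data);
        }

        public void skippedEntity(String name) throws SAXException {
            main.skippedEntity(name);
            branch.skippedEntity(name);
        }
    }

Under this sketch, the CLI would wire the generator's output into a
TeeContentHandler whose "main" side continues through T2 and the serializer
to the destination, and whose "branch" side feeds the links view, giving both
the rendered page and the gathered links in one request.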