From: Grzegorz Kossakowski
To: users@cocoon.apache.org
Date: Tue, 04 Jul 2006 13:03:21 +0200
Subject: Re: caching RSS feeds or external xml files (WAS: Caching jx with flow)
Message-ID: <44AA4AF9.9060203@tuffmail.com>

Ard Schrijvers wrote:
> > Considering caching RSS feeds: is there any option to configure
> > Cocoon (maybe by using some source protocol?) to behave like a proxy
> > and not break the rest of the cached content?
> > Any chance to do conditional GETs? [1]
> >
> > [1] http://fishbowl.pastiche.org/2002/10/21/http_conditional_get_for_rss_hackers
>
> I know there are some proxy generators around (see blocks/proxy for
> GenericProxyGenerator.java, HttpProxyGenerator.java,
> WebServiceProxyGenerator.java). I haven't had a close look at them,
> but I don't think they do what you want, and you probably haven't
> configured them anyway.

IMHO a generator is not suitable for this task, as it would lead to
bogus pipelines that exist only for proxying. E.g. if you want to
include your RSS via XInclude, you have to build a dedicated pipeline
and call it through the cocoon:/ protocol. Given that, the http
scheme/Excalibur source seems more convenient.

> Anyway, by default, when generating an external rss, you will probably
> have something like:
>
> [snippet missing]
>
> Unless you have specified a specific source-factory for http in your
> cocoon.xconf, this uri scheme will be managed by
> org.apache.excalibur.source.impl.URLSourceFactory. I am not sure
> whether this URLSourceFactory delivers what you want, but I don't
> think so. Then again, I think when working with external (http)
> sources like rss, one wants the following behavior:
>
> 1) fetch the rss/xml only when modified
> 2) have the fetched content cached
> 3) have the fetched content cached with an expires (configurable)
> 4) have a configurable connection timeout (we had cocoon apps
>    completely hanging on failing external rss feeds)
>
> I think cocoon is particularly well suited for xml content aggregation
> of external sources, but I haven't found something that manages the
> above 4 (logical) demands for it (Anybody... is it in there already?
> Portal block? Do I miss something?)
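
The four requirements listed above could be sketched roughly as follows.
This is only an illustration built on plain java.net.HttpURLConnection;
the class CachedFeedFetcher, its Entry type and its constructor
parameters are hypothetical names and are not part of Cocoon or
Excalibur. It assumes the whole feed fits in memory and that serving a
stale copy is acceptable when the remote server fails or times out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not a Cocoon/Excalibur class): cached, conditional, time-limited fetch. */
class CachedFeedFetcher {

    /** Cached body plus the validators needed for a conditional GET. */
    static final class Entry {
        final byte[] body;
        final String etag;        // null if the server sent no ETag
        final long lastModified;  // 0 if the server sent no Last-Modified
        final long fetchedAt;     // millis since epoch of the last successful check

        Entry(byte[] body, String etag, long lastModified, long fetchedAt) {
            this.body = body;
            this.etag = etag;
            this.lastModified = lastModified;
            this.fetchedAt = fetchedAt;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final long expiresMillis;
    private final int timeoutMillis;

    CachedFeedFetcher(long expiresMillis, int timeoutMillis) {
        this.expiresMillis = expiresMillis;
        this.timeoutMillis = timeoutMillis;
    }

    /** Point 3: an entry younger than the expiry is served without contacting the server. */
    static boolean isFresh(Entry e, long now, long expiresMillis) {
        return e != null && now - e.fetchedAt < expiresMillis;
    }

    byte[] fetch(String url) throws IOException {
        long now = System.currentTimeMillis();
        Entry cached = cache.get(url);
        if (isFresh(cached, now, expiresMillis)) {
            return cached.body;                        // point 2: cached content
        }
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(timeoutMillis);         // point 4: never hang on a dead feed
        conn.setReadTimeout(timeoutMillis);
        if (cached != null) {                          // point 1: conditional GET
            if (cached.etag != null) conn.setRequestProperty("If-None-Match", cached.etag);
            if (cached.lastModified > 0) conn.setIfModifiedSince(cached.lastModified);
        }
        try {
            int status = conn.getResponseCode();
            if (status == HttpURLConnection.HTTP_NOT_MODIFIED && cached != null) {
                // 304: keep the old body, restart the expiry clock.
                cache.put(url, new Entry(cached.body, cached.etag, cached.lastModified, now));
                return cached.body;
            }
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                byte[] body = readAll(in);
                cache.put(url, new Entry(body, conn.getHeaderField("ETag"),
                        conn.getLastModified(), now));
                return body;
            }
        } catch (IOException e) {
            if (cached != null) return cached.body;    // stale copy beats a broken front end
            throw e;
        } finally {
            conn.disconnect();
        }
    }

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
        return out.toByteArray();
    }
}
```

Something like new CachedFeedFetcher(10 * 60 * 1000L, 5000).fetch(feedUrl)
would then ask the remote server at most once every ten minutes and give
up after five seconds. Inside Cocoon the same logic would more naturally
live behind a Source implementation registered for the http scheme than
in a standalone class like this one.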

For the last point, there was an implementation[1] of an Excalibur
source that met our expectations, and a good explanation of how to use
it was available[2]. However, this code lived in the 'scratchpad' block
and was removed in the 2.1.7 release because no one was maintaining it.
Valuable gems have been lost, unfortunately.

> I want this behavior for the following reasons:
>
> Fetching rss/xml only when modified is clear, I suppose. Caching the
> content is necessary to have the content available when you get a 304.
> You might have an external rss that lacks proper headers, forcing your
> generator to regenerate the rss every time. Therefore, making the
> cache also expire gives you the possibility to only ask for the rss
> every X minutes. Lastly, and quite importantly, a failing external
> rss can make your cocoon app wait very long for the external source,
> resulting in a broken front end.
>
> I think it is not very hard to make an external rss/xml generator that
> meets these 4 points (or to add a http scheme to the source-factories
> with a component-instance that behaves accordingly), but I am afraid
> that it must be in cocoon somewhere already...

For now I cannot work on this, but in the not-so-distant future I could
port the cached source into the 2.1.x core and document it. The
question, of course, is whether the patch would be accepted.
My gut feeling is that conditional GETs are not implemented in
Cocoon/Excalibur, and I have no idea whether it would be hard to "fix"
that.

Any hints greatly appreciated.

-- 
Grzegorz Kossakowski