cocoon-users mailing list archives

From Grzegorz Kossakowski <g...@tuffmail.com>
Subject Re: caching RSS feeds or external xml files (WAS: Caching jx with flow)
Date Tue, 04 Jul 2006 11:03:21 GMT
Ard Schrijvers wrote:
> <snip/>
>
> <!-- question -->
> Considering caching RSS feeds. Is there any option to configure cocoon 
> (maybe by using some source protocol?) to behave like proxy and do not 
> break the rest of cached content?
> Any chance to do conditional gets? [1]
>
> [1] http://fishbowl.pastiche.org/2002/10/21/http_conditional_get_for_rss_hackers
>
> <!-- -->
>
> I know there are some proxy generators around (see blocks/proxy for GenericProxyGenerator.java,
> HttpProxyGenerator.java, WebServiceProxyGenerator.java), though I haven't had a close look
> at them, but I don't think they do what you want them to do, and probably you haven't configured
> them anyway.
>   
IMHO a generator is not suitable for this task, as it would lead to bogus 
pipelines that exist only for proxying. E.g. if you want to include your RSS 
via XInclude, you have to build
<map:generate type="someproxy"/>
<map:serialize type="xml"/>
and call this pipeline through the cocoon:/ protocol. Given that, it seems 
that the http scheme / Excalibur's source is more convenient.
> Anyway, by default, when generating an external rss, you will probably have something like:
>
> <map:generate src="http://www.foo.org/rss"/>
>
> Unless you have specified a specific source-factories for http in your cocoon.xconf,
> this uri scheme will be managed by org.apache.excalibur.source.impl.URLSourceFactory. I am
> not sure if this URLSourceFactory delivers what you want, but I don't think so. Then again,
> I think when working with external (http) sources like rss, one wants the following behavior:
>
> 1) fetch the rss/xml only when modified
> 2) have the fetched content cached
> 3) have the fetched content cached with an expires (configurable)
> 4) have a configurable connection timeout (we had cocoon apps completely hanging on
>    failing external rss feeds)
>   
> I think cocoon is particularly well suited for xml content aggregation of external sources,
> but I haven't found something that managed the above 4 (logical) demands for it (Anybody... is
> it in there already? Portal block? Do I miss something?)
>   
For the last point, there was an implementation[1] of Excalibur's source 
that met our expectations. A good explanation of how to use it was 
there[2]. However, this code was in the 'scratchpad' block and was removed 
in the 2.1.7 release, as no one was maintaining it. Valuable gems have been 
lost, unfortunately.
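
For reference, here is a minimal sketch of what points 1 and 4 from the list above could look like at the HTTP level, using only java.net.HttpURLConnection from the JDK. It is not the removed scratchpad code and not an existing Cocoon or Excalibur API: the class ConditionalFeedFetcher, its methods, and the timeoutMillis parameter are hypothetical names chosen for illustration, and readAllBytes assumes Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Hypothetical helper for illustration; not part of Cocoon or Excalibur. */
class ConditionalFeedFetcher {

    /** Outcome of one request; body is null when the server answered 304. */
    static final class Result {
        final boolean notModified;
        final byte[] body;
        final String etag;        // validator to send next time, may be null
        final long lastModified;  // millis since epoch, 0 if unknown

        Result(boolean notModified, byte[] body, String etag, long lastModified) {
            this.notModified = notModified;
            this.body = body;
            this.etag = etag;
            this.lastModified = lastModified;
        }
    }

    /** Opens (but does not yet connect) a connection carrying timeouts and validators. */
    static HttpURLConnection prepare(String url, String etag, long lastModified, int timeoutMillis) {
        try {
            HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(timeoutMillis); // point 4: do not hang on a dead feed
            conn.setReadTimeout(timeoutMillis);
            if (etag != null) {
                conn.setRequestProperty("If-None-Match", etag);
            }
            if (lastModified > 0) {
                conn.setIfModifiedSince(lastModified); // sent as If-Modified-Since
            }
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Performs the conditional GET; throws IOException on failure or timeout. */
    static Result fetch(String url, String etag, long lastModified, int timeoutMillis)
            throws IOException {
        HttpURLConnection conn = prepare(url, etag, lastModified, timeoutMillis);
        try {
            if (conn.getResponseCode() == HttpURLConnection.HTTP_NOT_MODIFIED) {
                // Point 1: unchanged, so the caller keeps its cached copy and validators.
                return new Result(true, null, etag, lastModified);
            }
            try (InputStream in = conn.getInputStream()) {
                return new Result(false, in.readAllBytes(),
                        conn.getHeaderField("ETag"), conn.getLastModified());
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

A caller would keep the last Result and pass its etag and lastModified into the next fetch. A java.net.SocketTimeoutException (a subclass of IOException) then signals that the cached copy should be served instead of waiting on the remote feed.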

> I want this behavior for the following reasons:
>
> Fetching rss/xml only when modified is clear I suppose. Caching the content is necessary
> to have the content available when you get a 304. You might have an external rss that lacks
> proper headers, forcing your generator to regenerate the rss every time. Therefore,
> making the cache also expire gives you the possibility to only ask for the rss every X minutes.
> Lastly, and quite an important one, a failing external rss can make your cocoon app wait very
> long for the external source, resulting in a broken front end.
>
> I think it is not very hard to make an external rss/xml generator that meets these 4
> points (or adding a http scheme to the source-factories with a component-instance that behaves
> accordingly), but I am afraid that it must be in cocoon somewhere already...
>   
For now I cannot work on this, but in the not-so-distant future I could 
port the cached source into the 2.1.x core and document it. Of course, the 
question is whether the patch would be accepted.
My gut feeling is that conditional GETs are not implemented in 
Cocoon/Excalibur, and I have no idea how hard it would be to "fix" that.
Any hints are greatly appreciated.
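
To make the discussion concrete, the expiry and stale-fallback behaviour Ard describes can be sketched independently of the HTTP layer. The following is only an illustration, not existing Cocoon or Excalibur code: ExpiringFeedCache, its Fetcher interface, and the injected clock are hypothetical names, and the fetcher is assumed to return null for a 304 and to throw on a timeout or other failure.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch; none of these names exist in Cocoon or Excalibur. */
class ExpiringFeedCache {

    /** Returns the new body, or null if unchanged (HTTP 304); throws on failure or timeout. */
    interface Fetcher {
        String fetchIfModified() throws Exception;
    }

    private final Fetcher fetcher;
    private final long ttlMillis;    // "only ask for the rss every X minutes"
    private final LongSupplier clock; // injected so the expiry logic can be tested
    private String cachedBody;
    private long checkedAt;

    ExpiringFeedCache(Fetcher fetcher, long ttlMillis, LongSupplier clock) {
        this.fetcher = fetcher;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the feed, contacting the remote server at most once per expiry window. */
    synchronized String get() {
        long now = clock.getAsLong();
        if (cachedBody != null && now - checkedAt < ttlMillis) {
            return cachedBody; // still fresh: no network traffic at all
        }
        try {
            String body = fetcher.fetchIfModified();
            if (body != null) {
                cachedBody = body; // modified: replace the cached copy
            }
            checkedAt = now;       // modified or 304: restart the expiry window
        } catch (Exception e) {
            // Failure or timeout: serve the stale copy rather than break the front end.
            // checkedAt is left alone, so the next call retries the remote feed.
            if (cachedBody == null) {
                throw new IllegalStateException("feed unavailable and nothing cached", e);
            }
        }
        return cachedBody;
    }
}
```

In production the clock would simply be System::currentTimeMillis. Retrying on every call after a failure is a simplification; a real implementation would probably back off so that a dead feed is not contacted on each request.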

-- 
Grzegorz Kossakowski
