cocoon-users mailing list archives

From "Ard Schrijvers" <a.schrijv...@hippo.nl>
Subject caching RSS feeds or external xml files (WAS: Caching jx with flow)
Date Tue, 04 Jul 2006 10:34:23 GMT
<snip/>

<!-- question -->
I am considering caching RSS feeds. Is there any option to configure Cocoon
(maybe by using some source protocol?) to behave like a proxy without
breaking the rest of the cached content?
Is there any chance of doing conditional GETs? [1]

[1] http://fishbowl.pastiche.org/2002/10/21/http_conditional_get_for_rss_hackers

<!-- -->

I know there are some proxy generators around (see blocks/proxy for GenericProxyGenerator.java,
HttpProxyGenerator.java and WebServiceProxyGenerator.java). I haven't had a close look at
them, but I don't think they do what you want, and you probably haven't configured them
anyway. I am not familiar with the Cocoon portal block, but I would guess that XML from
different locations is frequently generated there, so one would expect this to be done
efficiently (does anybody know more about this?).

Anyway, by default, when generating from an external RSS feed, you will probably have something like:

<map:generate src="http://www.foo.org/rss"/>

Unless you have configured a specific source factory for http in the source-factories section
of your cocoon.xconf, this URI scheme will be handled by
org.apache.excalibur.source.impl.URLSourceFactory. I am not sure whether this URLSourceFactory
delivers what you want, but I don't think so. Then again, I think that when working with
external (http) sources like RSS, one wants the following behavior:

1) fetch the rss/xml only when modified
2) have the fetched content cached
3) have the cached content expire after a (configurable) period
4) have a configurable connection timeout (we had Cocoon apps hanging completely on failing
external RSS feeds)

I think Cocoon is particularly well suited for XML content aggregation from external sources,
but I haven't found anything that meets the above 4 (logical) demands (Anybody... is it in
there already? Portal block? Am I missing something?)

My reasoning behind these 4 points is as follows:

Fetching the RSS/XML only when it has been modified is clear, I suppose. Caching the content
is necessary to have the content available when you get a 304. You might have an external
RSS feed that lacks proper headers, which forces your generator to regenerate the RSS every
time. Giving the cache an expiry as well lets you ask for the RSS only once every X minutes.
Finally, and quite importantly, a failing external RSS feed can make your Cocoon app wait a
very long time for the external source, resulting in a broken front end.
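
To make the 4 points concrete, here is a rough sketch in plain Java of what I have in mind.
It is not Cocoon code and not how URLSourceFactory works: ExternalFeedCache, Fetcher and
Response are names I made up for illustration, and it uses only java.net.HttpURLConnection
(on a recent JDK, since it uses a lambda). Serving the stale copy when the feed is down is
my own assumption about what a front end would prefer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical cache for one external feed; none of these names exist in Cocoon. */
class ExternalFeedCache {

    /** Result of one fetch; body is null when the server answered 304. */
    static final class Response {
        final int status; final byte[] body; final String etag; final long lastModified;
        Response(int status, byte[] body, String etag, long lastModified) {
            this.status = status; this.body = body;
            this.etag = etag; this.lastModified = lastModified;
        }
    }

    /** Hides the HTTP call so the caching logic can be exercised without a network. */
    interface Fetcher {
        Response fetch(String url, String etag, long lastModified) throws IOException;
    }

    private final String url;
    private final long expiresMillis;
    private final Fetcher fetcher;
    private byte[] cached;      // point 2: the last body that was fetched successfully
    private String etag;        // validators remembered for the next conditional GET
    private long lastModified;
    private long lastChecked;

    ExternalFeedCache(String url, long expiresMillis, Fetcher fetcher) {
        this.url = url; this.expiresMillis = expiresMillis; this.fetcher = fetcher;
    }

    /** Returns the feed content; 'now' is the current time in milliseconds. */
    synchronized byte[] get(long now) throws IOException {
        // Point 3: inside the expiry window, answer from the cache without any request.
        if (cached != null && now - lastChecked < expiresMillis) return cached;
        try {
            Response r = fetcher.fetch(url, etag, lastModified);
            if (r.status == HttpURLConnection.HTTP_OK) {
                cached = r.body; etag = r.etag; lastModified = r.lastModified;
            } else if (r.status != HttpURLConnection.HTTP_NOT_MODIFIED || cached == null) {
                throw new IOException("Unexpected HTTP status " + r.status);
            }
            // On a 304 (point 1) the cached copy is simply kept.
        } catch (IOException e) {
            if (cached == null) throw e;  // nothing to fall back on
            // Assumption: a stale copy is better than a broken page, so keep serving it.
        }
        lastChecked = now;
        return cached;
    }

    /** Fetcher doing a conditional GET with connect and read timeouts (point 4). */
    static Fetcher http(final int connectTimeoutMs, final int readTimeoutMs) {
        return (url, etag, lastModified) -> {
            HttpURLConnection c = (HttpURLConnection) new URL(url).openConnection();
            c.setConnectTimeout(connectTimeoutMs);
            c.setReadTimeout(readTimeoutMs);
            if (etag != null) c.setRequestProperty("If-None-Match", etag);
            if (lastModified > 0) c.setIfModifiedSince(lastModified);
            int status = c.getResponseCode();
            byte[] body = null;
            if (status == HttpURLConnection.HTTP_OK) {
                try (InputStream in = c.getInputStream()) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buf = new byte[8192];
                    for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
                    body = out.toByteArray();
                }
            }
            return new Response(status, body, c.getHeaderField("ETag"), c.getLastModified());
        };
    }
}
```

You would create one instance per feed, for example
new ExternalFeedCache("http://www.foo.org/rss", 5 * 60 * 1000L, ExternalFeedCache.http(2000, 5000)),
and call get(System.currentTimeMillis()) from whatever generates the XML. The timeouts and
the five minutes are arbitrary example values.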

I think it would not be very hard to write an external RSS/XML generator that meets these 4
points (or to add an http scheme to the source-factories with a component-instance that behaves
accordingly), but I suspect that it must already be in Cocoon somewhere...

Regards Ard





