forrest-dev mailing list archives

From Rasik Pandey <rbpan...@gmail.com>
Subject Re: Add support for Googles sitemap protocol?
Date Thu, 14 Jul 2005 05:43:55 GMT
Hi Ross,
> This is a good point. How about also providing a generator that
> would get the last modified header of remote resources. The results of
> the two could be aggregated together.
I think
http://svn.apache.org/repos/asf/cocoon/trunk/src/java/org/apache/cocoon/generation/LinkStatusGenerator.java
would do the trick, although it would have to be modified to make a call to
get the "last-modified" header, so hopefully we could get that added to a
future release of Cocoon. From a quick examination of the code, it looks
like it will crawl a URL and generate an XML report, and it allows include
and exclude expressions.
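
As a rough illustration of the "last-modified" lookup described above, here is a minimal sketch in Python rather than Cocoon's Java. It is not taken from LinkStatusGenerator; the function names fetch_last_modified and to_w3c_datetime are hypothetical, and it assumes the remote server answers HEAD requests and sends a Last-Modified header at all.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime
from typing import Optional
from urllib.request import Request, urlopen


def to_w3c_datetime(last_modified: str) -> str:
    """Convert an HTTP Last-Modified value to a UTC timestamp string."""
    parsed = parsedate_to_datetime(last_modified)
    if parsed.tzinfo is None:
        # Treat a value without zone information as UTC (assumption).
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def fetch_last_modified(url: str, timeout: float = 10.0) -> Optional[str]:
    """Send a HEAD request; return the Last-Modified time, or None if absent."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        header = response.headers.get("Last-Modified")
    return to_w3c_datetime(header) if header else None
```

A crawler could call fetch_last_modified for each remote link it finds and skip the date when the result is None.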
 
> However, this still is not totally robust, because some remote resources
> will always indicate that they have changed even when the content has
> not (for example Daisy tracks changes to meta-data that Forrest does not
> currently use).
What strategy, if any, do you propose for handling this case?

>> Are you familiar with
>> http://cocoon.apache.org/2.1/userdocs/generators/linkstatus-generator.html
>> , the documentation is skimpy, but it may be what we need to handle both
>> static and dynamic cases.
> No I'm not familiar. I wonder what the docs mean by "status". Will it 
> provide the last modified header as suggested above?

See above...

> I don't have the time to experiment with it now, but I (and I am sure
> other devs) would love to hear about your findings.

See above... 

>> I may need some assistance to know how to build in hooks from
>> skinconf.xml to the sitemap format generation.
> I'm not sure what you mean by that. But there are plenty of people here
> to answer your questions as they arise.

I am sure users will need a way to specify configuration for this, such as
the includes/excludes for the LinkStatusGenerator crawls and maybe the
<changefreq> value for the Google sitemap format. Can you give me a quick
overview of how params make it from skinconf.xml to the sitemap(s) or
XSL(s)?
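
To make the configuration idea above concrete, here is a minimal sketch in Python of how include/exclude expressions and a <changefreq> value might shape the generated file. Everything in it is an assumption for illustration: the names is_selected and build_sitemap are hypothetical, the patterns are treated as regular expressions, excludes are assumed to win over includes, and the XML namespace declaration is left out for brevity. How these values would actually arrive from skinconf.xml is exactly the open question and is not shown.

```python
import re
import xml.etree.ElementTree as ET
from typing import Iterable, Optional, Sequence, Tuple


def is_selected(url: str, includes: Sequence[str], excludes: Sequence[str]) -> bool:
    """Excludes win; an empty include list means every URL is included."""
    if any(re.search(pattern, url) for pattern in excludes):
        return False
    return not includes or any(re.search(pattern, url) for pattern in includes)


def build_sitemap(
    pages: Iterable[Tuple[str, Optional[str]]],
    includes: Sequence[str] = (),
    excludes: Sequence[str] = (),
    changefreq: Optional[str] = None,
) -> str:
    """Build a <urlset> document from (location, last-modified) pairs."""
    urlset = ET.Element("urlset")  # namespace omitted in this sketch
    for loc, lastmod in pages:
        if not is_selected(loc, includes, excludes):
            continue
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
        if changefreq:
            ET.SubElement(url, "changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")
```

For example, build_sitemap(pages, excludes=[r"/api/"], changefreq="weekly") would drop any page whose URL contains /api/ and mark the rest as changing weekly.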


Regards,
Rus
http://www.discountdracula.com
