forrest-dev mailing list archives

From Ross Gardler <rgard...@apache.org>
Subject Re: Add support for Googles sitemap protocol?
Date Sun, 05 Jun 2005 12:41:42 GMT
Ferdinand Soethe wrote:
> 
> 
> 
>>The only required information in the sitemap is the URL. This means we
>>can create the Google sitemap now, with minimal effort. Over time we can
>>enhance this by adding further meta-data once it becomes available.
> 
> 
> I'm happy to go for the Google format, I just thought that our
> commitment to standards would tip the balance towards OAI. Does OAI
> not allow for a minimal form like Google's?

Good point. However, I don't think OAI has a "minimal" form. I did some 
preliminary research into it a few months ago. Let me check it out and 
I'll report back.

However, I'd still like to see support for Google sitemaps since we can 
do it very quickly and it is more "approachable" than OAI since everyone 
knows Google.

> If we go for the Google format, I'd like to suggest to use slightly
> more than the minimum format in this form (as documented in
> https://www.google.com/webmasters/sitemaps/docs/en/protocol.html)
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
>    <url>
>       <loc>http://www.yoursite.com/catalog?item=83&amp;desc=vacation_usa</loc>
>       <lastmod>2004-11-23</lastmod>
>    </url>
> </urlset>
> 
> and include the 'lastmod' right away as that would be the key to speedy
> updates. Can we do that?

No. At least not at present, unless your experiments with the cache-key 
thingy show otherwise. At present all files are rebuilt regardless of 
whether they have changed. The need to keep track of when files were 
last modified comes up fairly regularly. We all want it, but 
no one has had a sufficient itch yet.

I'd recommend getting the minimal version done, then looking at a way 
of getting the lastmod as well.

I'd like to see us being able to create meta-data with last-mod, likely 
change frequency, etc. The itch is getting stronger, but it is not strong 
enough (for me personally) just yet.
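
To make the "minimal first, lastmod later" idea concrete, here is a rough 
sketch of what the generated output could look like. It is written in 
Python purely for illustration (in Forrest this would more likely be done 
in XSLT), and the function name build_sitemap and its lastmod argument are 
hypothetical, not existing Forrest code. The namespace is the one from the 
example quoted above.

```python
from xml.sax.saxutils import escape

# Namespace taken from the example quoted in this thread.
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"


def build_sitemap(urls, lastmod=None):
    """Return a sitemap document as a string.

    urls    -- iterable of page URLs; only <loc> is emitted by default.
    lastmod -- optional dict mapping URL -> 'YYYY-MM-DD' (hypothetical
               hook for later, once last-modified data is available).
    """
    lastmod = lastmod or {}
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="%s">' % SITEMAP_NS,
    ]
    for url in urls:
        lines.append("   <url>")
        # escape() turns &, < and > into XML entities, e.g. & -> &amp;
        lines.append("      <loc>%s</loc>" % escape(url))
        if url in lastmod:
            lines.append("      <lastmod>%s</lastmod>" % lastmod[url])
        lines.append("   </url>")
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_sitemap(
        ["http://www.yoursite.com/catalog?item=83&desc=vacation_usa"]))
```

Passing a lastmod mapping later would add the extra element without 
changing the minimal case, which is the upgrade path suggested above.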

> Did you see that Google wants the urls to be url encoded? Does our
> XSLT-engine have a function for that?

http://www.exslt.org/str/functions/encode-uri/index.html

NOTE: the code on the above site is in the public domain and we *cannot* 
relicense it as ASF, so the best thing to do is to write our own version 
of the function using replace(string,pattern,replace). The above is 
useful for quick testing, but please don't commit it to our CVS.


The replace function returns a string that is created by replacing every 
occurrence of the given pattern with the replace argument.

Example: replace("Bella Italia", "l", "*")
Result: 'Be**a Ita*ia'
Example: replace("Bella Italia", "l", "")
Result: 'Bea Itaia'
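
As a rough sketch of how an encoding function could be built from nothing 
but chained replace calls, here is the idea in Python (for illustration 
only; the real thing would live in our stylesheets). The names 
REPLACEMENTS and encode_url are hypothetical, the character table is 
deliberately incomplete, and the sketch assumes the input has not already 
been percent-encoded.

```python
# (character, escape) pairs. '%' must be handled first; otherwise the
# '%' signs produced by later replacements would be encoded again.
# This table is an illustrative subset, not a complete list.
REPLACEMENTS = [
    ("%", "%25"),
    (" ", "%20"),
    ('"', "%22"),
    ("<", "%3C"),
    (">", "%3E"),
]


def replace(string, pattern, replacement):
    """Replace every occurrence of pattern in string with replacement."""
    return string.replace(pattern, replacement)


def encode_url(url):
    """Percent-encode url by applying each replacement in order."""
    for pattern, replacement in REPLACEMENTS:
        url = replace(url, pattern, replacement)
    return url


if __name__ == "__main__":
    print(replace("Bella Italia", "l", "*"))
    print(encode_url("http://example.org/my page.html"))
```

The order of the table matters: moving the "%" entry to the end would 
turn "%20" into "%2520".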

Ross
