forrest-dev mailing list archives

From Ross Gardler <rgard...@apache.org>
Subject Re: Add support for Googles sitemap protocol?
Date Wed, 13 Jul 2005 21:16:59 GMT
Rasik Pandey wrote:
> Ross Gardler wrote:
> 
>>> Ferdinand Soethe wrote:
>>> Good point. However, I don't think OAI has a "minimal" form, I did some
>>> preliminary research into it a few months ago. Let me check it out, I'll
>>> report back.
>>>
>>> However, I'd still like to see support for Google sitemaps since we can
>>> do it very quickly and it is more "approachable" than OAI since everyone
>>> knows Google.
>>>
>>> If we go for the Google format, I'd like to suggest to use slightly
>>> more than the minimum format in this form (as documented in
>>> https://www.google.com/webmasters/sitemaps/docs/en/protocol.html)
>>>
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
>>>    <url>
>>>       <loc>http://www.yoursite.com/catalog?item=83&amp;desc=vacation_usa</loc>
>>>       <lastmod>2004-11-23</lastmod>
>>>    </url>
>>> </urlset>
>>> 
>>> and include the 'lastmod' right away as that would be the key to speedy
>>> updates. Can we do that?
> 
> Why not use RSS 2.0 as the format?
> http://www.google.com/webmasters/sitemaps/docs/en/other.html#feed

The format of the document is not the problem; that part is easy. The
hard part is knowing when a page has been regenerated because of a
change.

>> I'd recommend getting the minimal done, then looking at a way of getting
>> the lastmod as well.
> 
> What do you consider the minimal? In rss <pubDate> and <link> ?

The minimum required by Google, i.e. those marked required in the following:

http://www.google.com/webmasters/sitemaps/docs/en/protocol.html#xmlTagDefinitions

(or, if we used RSS instead, whatever is required in that format).
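
To make "minimal" concrete, here is a rough Python sketch that writes a
sitemap with nothing but a <loc> per page. It assumes <loc> is the only
required child of <url>, which should be checked against the tag
definitions linked above. The namespace is the one from the example
quoted earlier; the function name build_minimal_sitemap is made up for
illustration and is not anything in Forrest.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the example quoted earlier in this thread.
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"


def build_minimal_sitemap(urls):
    """Return a sitemap document with one <url><loc>...</loc></url> per URL.

    Hypothetical helper: it emits only <loc>, on the assumption that this
    is the sole required child element of <url>.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url in urls:
        url_el = ET.SubElement(urlset, "url")
        # ElementTree escapes '&' as '&amp;' when it serializes the text.
        ET.SubElement(url_el, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_minimal_sitemap(
        ["http://www.yoursite.com/catalog?item=83&desc=vacation_usa"]
    ))
```

A <lastmod> child could be added in the same loop later, once there is
a reliable way to get the date.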

>>> Did you see that Google wants the urls to be url encoded? Does our
>>> XSLT-engine have a function for that?
>>
>> http://www.exslt.org/str/functions/encode-uri/index.html
> 
> Why not use the EncodeURL transformer?
> http://cocoon.apache.org/2.1/userdocs/transformers/encodeurl-transformer.html

Why not indeed. Thanks for the pointer.
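
For anyone unsure what "url encoded" means for the <loc> values, here
is a small Python sketch of the idea. It is not how the Cocoon
transformer or the EXSLT function works internally, just an
illustration using the standard library; encode_url is a hypothetical
name. It percent-encodes unsafe characters in the path and query and
leaves existing %XX escapes alone. The '&' to '&amp;' step is separate
XML escaping and happens when the value is written into the document.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url(url):
    """Percent-encode the path and query of a URL (illustrative helper).

    '%' is treated as safe so that already-encoded sequences are not
    encoded a second time; '=' and '&' are kept as query delimiters.
    """
    parts = urlsplit(url)
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


if __name__ == "__main__":
    print(encode_url("http://www.yoursite.com/my page/catalog?item=83&desc=vacation usa"))
```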

Ross

