lucene-solr-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Generating a sitemap
Date Thu, 18 Mar 2010 22:25:15 GMT

: Been testing nutch to crawl for solr and I was wondering if anyone had
: already worked on a system for getting the urls out of solr and generating
: an XML sitemap for Google.

it's pretty easy to just paginate through all docs in Solr, so you could 
do that -- but I'd be really surprised if Nutch wasn't also logging all the 
URLs it indexed, so you could just post-process that log to build the 
sitemap as well.
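
A rough sketch of the pagination approach, in Python. The Solr address and the name of the field holding the URL ("url" here) are assumptions and need to match the actual install and schema; the helper names are made up for illustration. It pages through a match-all query using the standard start/rows parameters and writes one <loc> entry per document.

```python
import json
import urllib.parse
import urllib.request
from xml.sax.saxutils import escape

# Assumptions: adjust both to the actual Solr install and schema.
SOLR_SELECT = "http://localhost:8983/solr/select"
URL_FIELD = "url"


def fetch_page(start, rows):
    """Fetch one page of results from Solr and return its "response" object."""
    params = urllib.parse.urlencode({
        "q": "*:*",        # match every document
        "fl": URL_FIELD,   # only return the URL field
        "start": start,
        "rows": rows,
        "wt": "json",
    })
    with urllib.request.urlopen(SOLR_SELECT + "?" + params) as resp:
        return json.load(resp)["response"]


def iter_urls(fetch=fetch_page, rows=1000):
    """Yield the URL of every document, one page of `rows` docs at a time."""
    start = 0
    while True:
        page = fetch(start, rows)
        docs = page["docs"]
        for doc in docs:
            if URL_FIELD in doc:  # skip docs that have no URL stored
                yield doc[URL_FIELD]
        start += rows
        if not docs or start >= page["numFound"]:
            break


def build_sitemap(urls):
    """Return a sitemap XML document listing the given URLs."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for u in urls:
        # escape() handles &, < and > inside the URL
        lines.append("  <url><loc>%s</loc></url>" % escape(u))
    lines.append("</urlset>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_sitemap(iter_urls()))
```

Note that the sitemap protocol caps a single file at 50,000 URLs, so a larger index would need the output split into several files plus a sitemap index.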



-Hoss

