lucene-solr-user mailing list archives

From Erick Erickson
Subject Re: Will Solr/Lucene crawl multi websites (aka a mini google with faceted search)?
Date Mon, 12 Sep 2011 01:05:32 GMT
Nope, there's nothing in Solr that crawls anything. You have to fetch the
documents from the websites and feed them into Solr yourself.
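
As a rough illustration of "feeding documents in yourself", here is a minimal Python sketch that builds the HTTP POST a scraper could send to Solr. It assumes a Solr version whose /update handler accepts a JSON array of documents; the host, the core name "products", and the field names are all hypothetical and would need to match your own setup and schema.

```python
import json
import urllib.request


def build_update_request(solr_base_url, core, docs):
    """Build (but do not send) a POST that adds docs to a Solr core.

    Assumes the core's /update handler accepts a JSON array of documents.
    """
    url = f"{solr_base_url.rstrip('/')}/{core}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical records produced by a scraper; field names are illustrative.
docs = [
    {"id": "shop-a-123", "merchant": "shop-a",
     "name": "Digital camera", "price": 149.99},
]

req = build_update_request("http://localhost:8983/solr", "products", docs)
# urllib.request.urlopen(req) would send this to a running Solr instance.
```

Committing on every request is convenient for a sketch, but for large crawls it is usually better to batch documents and commit less often.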

Or look at the Nutch project, which is designed for this kind of problem.
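
On the price-range facets asked about in the quoted message below: once the documents are indexed, Solr can compute range facets at query time. A minimal sketch of the query string, assuming a numeric field named "price" (a hypothetical name) and a Solr version that supports the facet.range parameters:

```python
from urllib.parse import urlencode


def build_facet_query(text, field="price", start=0, end=1000, gap=100):
    """Return a Solr query string that requests range facets on one field."""
    params = [
        ("q", text),
        ("facet", "true"),
        ("facet.range", field),
        # Per-field overrides use the f.<field>.<param> form.
        (f"f.{field}.facet.range.start", start),
        (f"f.{field}.facet.range.end", end),
        (f"f.{field}.facet.range.gap", gap),
    ]
    return urlencode(params)


# Appended to a select URL such as http://localhost:8983/solr/products/select?
query_string = build_facet_query("digital camera")
```

With these values Solr would return one count per 100-unit price bucket between 0 and 1000, which the front end can render as "Price $100-200" and so on.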


On Sun, Sep 11, 2011 at 8:53 PM, dpt9876 wrote:
> Hi all,
> I am wondering if Solr will do the following for a project I am working on.
> I want to create a search engine with facets for potentially hundreds of
> websites.
> Similar to say crawling amazon + + ebay and someone can search these
> 3 sites from my 1 website.
> (I realise there are better ways of doing the above example; it's for
> illustrative purposes.)
> Eventually I would build that search crawl to index say 200 or 1000
> merchants.
> Someone would come to my site and search for "digital camera".
> They would get results from all 3 indexes and, hopefully, dynamic facets, e.g.
> Price $100-200
> Price 200-300
> Resolution 1mp-2mp
> etc etc
> Can this be done on the fly?
> I ask this because I am currently developing web scrapers to crawl these
> websites and dump that data into a db, and was then thinking of tacking on a
> Solr server to crawl my db.
> The problem with that approach is that crawling the world's ecommerce sites
> will take forever, when it seems Solr might do that for me? (I have read
> about multiple indexes etc.)
> Many thanks
