lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dominique Bejean <>
Subject Re: [ANNOUNCE] Web Crawler
Date Fri, 27 May 2011 23:09:02 GMT

Sorry for the delay, but I haven't been checking the mailing list for a 
long time.

Crawl-anywhere includes 3 piece of software : a crawler, a pipeline and 
a solr indexer.

There is a default Solr schema used by Crawl-anywhere, tested with Solr 
1.4.1 and Solr 3.1.0.

But, you can configure the pipeline stage responsible for mapping 
crawled data to Solr field. IN the absolute, you can use any schema with 
any Solr version.



Le 14/05/11 15:29, abhayd a écrit :
> hi Dominique,
> I am looking for a crawler to feed solr index. After looking at various
> posts i have settled down on two
> Nutch and crawl anywhere.
> I dont see any activities on Nutch wiki so wondering if its not being
> developed anymore. But most forums say Nutch is standard for solr.
> Crawl Anywhere looks solid. Any way for users like me to decide which one we
> should go for Nutch or Crawl Anywehre?
> Concern with crawl anywhere is it supports solr 1.3 index not the latest
> version
> Any help on the is really appreciated
> --
> View this message in context:
> Sent from the Lucene - Java Users mailing list archive at
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message