lucene-solr-user mailing list archives

From Dominique Bejean <dominique.bej...@eolya.fr>
Subject Re: Crawl Anywhere -
Date Wed, 22 May 2013 13:11:26 GMT
Hi,

I hadn't seen this question until now.

Yes, I confirm that Crawl-Anywhere can crawl in a distributed environment.
If you have several huge web sites to crawl, you can dispatch the crawling
across several crawler engines. However, a single web site can only be
crawled by one crawler engine at a time.
This limitation should be removed in a future version.
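
To illustrate the constraint above, here is a minimal sketch of how
sites could be spread over engines so that each site belongs to exactly
one engine. This is not Crawl-Anywhere's actual code or configuration:
the function name dispatch_sites, the engine names and the example host
names are all hypothetical, and round-robin is just one simple way to
split the work.

```python
from collections import defaultdict


def dispatch_sites(sites, engines):
    """Assign every site to exactly one engine, round-robin.

    Hypothetical helper for illustration only; it does not call
    or reproduce any Crawl-Anywhere API.
    """
    if not engines:
        raise ValueError("at least one engine is required")
    assignment = defaultdict(list)
    for index, site in enumerate(sites):
        # A site is given to a single engine and never split
        # between two engines.
        engine = engines[index % len(engines)]
        assignment[engine].append(site)
    return dict(assignment)


# Assumed example data, not taken from a real installation.
sites = ["site-a.example", "site-b.example", "site-c.example"]
engines = ["engine1", "engine2"]
plan = dispatch_sites(sites, engines)
```

With this split, adding engines helps when there are many sites, but a
single very large site still gets no more than one engine.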

For your information, the new version 4.0.0 is now available as an
open-source project hosted on GitHub:
https://github.com/bejean/crawl-anywhere

Regards.




On 11/02/13 12:02, O. Klein wrote:
> Yes you can run CA on different machines.
>
> In "Manage" you have to set target and engine for this to work.
>
> I've never done this, so you have to contact the developer for more details.
>
>
>
> SivaKarthik wrote
>> Hi All,
>>   in our project, we need to download millions of pages...
>>   so is there any support for crawling in a distributed environment
>> using the crawl-anywhere apps?
>>    or what could be the alternatives...?
>>
>>   Thanks in advance..
>
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/ANNOUNCE-Web-Crawler-tp2607831p4039674.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

-- 
Dominique Béjean
+33 6 08 46 12 43
skype: dbejean
www.eolya.fr
www.crawl-anywhere.com
www.mysolrserver.com

