hadoop-mapreduce-user mailing list archives

From ashish vyas <mailashishv...@gmail.com>
Subject Performance improvement-Cluster vs Pseudo
Date Fri, 30 Mar 2012 08:30:19 GMT
>
> Hi,
>
>
> I have set up a Hadoop cluster (2 nodes) and I am running a Nutch crawl
> on it. I am trying to compare results and the improvement in processing
> time when I crawl 10 URLs to depth 2. When I run the crawl on the
> cluster, it takes more time than on a pseudo-distributed cluster, which in
> turn takes more time than a standalone Nutch crawl.
>
> I expected that running Nutch on a Hadoop cluster would logically bring
> processing time down, since that is why Hadoop evolved out of the Nutch
> project. Please let me know if there is any benchmark test for pseudo vs.
> cluster, and why the Nutch crawl is taking more time on the cluster.
>
>
>
> Please let me know if you need more info.
>
>
>
> Regards:
>
> Ashish Vyas
>
