nutch-user mailing list archives

From Markus Jelsma <>
Subject Re: Nutch Hadoop Optimization
Date Thu, 15 Dec 2011 19:01:11 GMT
Well, if performance is low, it's likely not a Hadoop issue. Hadoop tuning is
only required once you start pushing it to its limits.

I would indeed check the Nutch wiki. Settings such as threads, queues, etc.
have a large effect on performance.
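
As a rough sketch of the kind of settings I mean, the fragment below overrides
two fetcher properties in conf/nutch-site.xml. The values are purely
illustrative assumptions, not recommendations, and property names can differ
between Nutch releases, so check them against the nutch-default.xml that ships
with your version before copying anything.

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: local overrides of nutch-default.xml -->
<configuration>
  <!-- Total number of fetcher threads per fetch task.
       50 is an assumed example value, not a tuned one. -->
  <property>
    <name>fetcher.threads.fetch</name>
    <value>50</value>
  </property>
  <!-- Maximum threads that may fetch from a single queue at once.
       Keeping this at 1 stays polite to the crawled servers. -->
  <property>
    <name>fetcher.threads.per.queue</name>
    <value>1</value>
  </property>
</configuration>
```

Raising the thread count only helps if the fetch list is spread over enough
queues to keep those threads busy.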

> This is overwhelmingly weighted towards Hadoop configuration.
> There are some guidance notes on the Nutch wiki for performance issues
> so you may wish to give them a try first.
> On Thu, Dec 15, 2011 at 4:22 PM, Bai Shen <> wrote:
> > So I have Nutch running on a Hadoop cluster with three data nodes.  The
> > machines are all pretty beefy, but Nutch isn't performing any faster than
> > when I was running in pseudo mode on one machine.
> > 
> > How do I set up Nutch to take full advantage of the cluster?
> > 
> > Thanks.
