hadoop-common-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: configs for large clusters
Date Wed, 13 Jun 2007 19:43:27 GMT
Note that Nigel recently added some to the FAQ on the wiki:

http://wiki.apache.org/lucene-hadoop/FAQ#head-1b2c093275a1a8a7e7068de941b776fcceafbf44

It would be good to understand better which of these make significant 
differences and which do not.  And how many of these should we make the 
default?  And could some of these be set automatically, based on other 
cluster properties?

Doug

Richard wrote:
> I couldn't agree more.  Quite a large portion of the questions relate, more or less, to
> configuration.  Though there are pages explaining how (which is important), it would make
> things even easier if there were more varied examples.
> 
> 
> 
> Bwolen Yang <wbwolen@gmail.com> wrote:
>
> Hi,
> 
> As a newbie to Hadoop, I have been wondering what's the best way to
> configure my cluster, especially as one scales up.    After seeing
> Doug's update to sort 900 performance, it occurred to me that it may be
> helpful to others to see example configuration files, especially for
> large clusters.  Furthermore, if we can diff against the
> configurations over time (and/or releases), we may be able to see how
> Hadoop developers tune their own clusters (and hence follow suit :).
>  Could the configs along with rough cluster specs be posted somewhere
> on hadoop's website?  And perhaps encourage others (with different
> system setups) to post similarly?
> 
> I'm also interested in seeing how people tune their clusters for
> different kinds of machines  (e.g., single-disk machines vs 4-6 disk
> machines), and heterogeneous systems (different CPU power, disk size,
> memory size, etc.).   The heterogeneous part arises for people who are
> resource strapped and basically try hard to put together a sizeable
> system with whatever machines they have got.     In my case, a bad
> config can hurt as I add new machines (e.g., a machine with a small
> disk fills up quicker, and tasks scheduled there tend to die).
> 
> thanks
> 
> bwolen
> 
> 
> 
> Best Regards
> 
> Richard Yang
> richardyang@richardyang.net
> kusanagiyang@yahoo.com
