hbase-dev mailing list archives

From "Ramkrishna.S.Vasudevan" <ramkrishna.vasude...@huawei.com>
Subject RE: Max xceiver config
Date Thu, 22 Mar 2012 13:31:18 GMT
Your article is very good.

I think Doug can add a link to this document and also update the book to
explain how to configure the max xceiver setting.

Regards
Ram

> -----Original Message-----
> From: Lars George [mailto:lars.george@gmail.com]
> Sent: Thursday, March 22, 2012 12:29 PM
> To: dev@hbase.apache.org; lakshman.ch@huawei.com
> Subject: Re: Max xceiver config
> 
> Hi Laxman,
> 
> Did you see (sorry for the plug)
> http://www.larsgeorge.com/2012/03/hadoop-hbase-and-xceivers.html - it
> might help determining the number.
> 
> Lars
> 
> On Mar 22, 2012, at 6:43 AM, Laxman wrote:
> 
> > The HBase book recommends setting the xceiver count
> > [dfs.datanode.max.xcievers] to 4096:
> > http://hbase.apache.org/book.html#hadoop
> >
> > Why do we need an xceiver count as high as 4096?
> >
> > This means each DataNode in the cluster allows a maximum of
> >  - 4096 threads, each occupying some memory
> >  - 4096 threads reading from or writing to the disk(s) simultaneously
> >
> > This makes the system more vulnerable to over-utilization of system
> > resources (similar in effect to a DoS attack).
> >
> > Also, this recommendation was based on an issue reported against Hadoop 0.18.
> > IMO, we should not recommend such a high value or use it as the default;
> > it should be tuned according to capacity requirements.
> >
> > Related issues
> > ==============
> > HDFS-162
> >  - Reported on 0.18
> >  - Raising the xceiver count to a high value caused other problems.
> >  - Resolution: "Cannot Reproduce"
> >
> > HDFS-1861
> >  - Modified the default value to 4096
> >  - Source:
> > http://ccgtech.blogspot.in/2010/02/hadoop-hdfs-deceived-by-xciever.html
> > which again refers to HDFS-162 (reported on 0.18).
> >
> > Case study
> > ==========
> > http://lucene.472066.n3.nabble.com/Blocks-are-getting-corrupted-under-very-high-load-tc3527403.html
> > In one of our production environments, this value was set to 4096 and
> > disk waits became very long, so some processes stopped responding.
> > The OS is also configured to reboot (kernel panic reboot) when a process
> > does not respond for a specific amount of time.
> >
> > Together, these two configurations resulted in corrupted data.
> > --
> > Regards,
> > Laxman
> >
> >
> >
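
As a rough illustration of the memory point in the quoted message: if each xceiver thread reserves a fixed stack, the worst-case stack reservation grows linearly with the configured limit. The sketch below assumes 1 MiB of stack per thread, which is a common JVM default on 64-bit Linux but is not stated anywhere in this thread. The function name and the sample limits are hypothetical and chosen only for illustration.

```python
# Rough upper bound on thread-stack memory reserved by DataNode xceiver threads.
# ASSUMPTION: 1 MiB of stack per thread (a common JVM default on 64-bit Linux;
# the real value depends on the JVM and its -Xss setting).
ASSUMED_STACK_BYTES = 1024 * 1024


def xceiver_stack_bytes(max_xceivers, stack_bytes=ASSUMED_STACK_BYTES):
    """Return the stack bytes reserved if every xceiver slot is in use."""
    if max_xceivers < 0 or stack_bytes < 0:
        raise ValueError("thread count and stack size must be non-negative")
    return max_xceivers * stack_bytes


if __name__ == "__main__":
    # Sample limits are illustrative only, not recommendations.
    for limit in (256, 1024, 4096):
        gib = xceiver_stack_bytes(limit) / 2**30
        print(f"{limit:>5} xceivers: up to {gib:.2f} GiB of thread stacks")
```

Under that assumption, 4096 threads reserve up to 4 GiB of stack address space. Resident memory is usually lower because stacks are committed lazily, and per-thread I/O buffers are not counted here, so this is only a starting point for sizing.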

