hbase-user mailing list archives

From Kazuki Ohta <kazuki.o...@gmail.com>
Subject Re: massive zk expirations under heavy network load
Date Thu, 21 Apr 2011 02:24:39 GMT
Hi, All

Thanks for the helpful comments!
Good to hear that this happens rarely in other environments.

Actually, I changed the configuration so that tasks do not run on the master node,
but the same problem still happened.

So first I will upgrade the switch, and I will report back on whether that fixes the problem.


On Thu, Apr 21, 2011 at 5:32 AM, Gary Helmling <ghelmling@gmail.com> wrote:
>> I'm now using CDH3u0 on a 16-node cluster (hdp0-hdp15).
>> The configuration is below.
>> hdp0: zk + master + region + nn + dn + jt + tt
>> hdp1: zk + master + region + snn + dn + tt
>> hdp2: zk + region + dn + tt
>> hdp3 to hdp15: region + dn + tt
> I would also look at the memory configuration for your servers and the
> amount of heap allocated to each process.  Is it possible hdp0 is swapping
> when running a MR job?  Swapping will cause big headaches and is often a
> culprit for zk session timeouts.
> Between the 7 processes it has plus any child tasks started, it's not hard
> to picture overcommitting memory.
> Regardless of whether the core problem lies in network hardware or here, I
> would remove the region server, data node, and task tracker processes from
> hdp0 and hdp1 for smoother operation.
> --gh
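
A rough way to check the overcommit concern above is to add up the maximum heap of every daemon on hdp0 plus the child task heaps and compare the total with physical RAM. This is only a sketch: every heap size, the slot count, and the RAM figures below are made-up placeholders, not values from this cluster, and it counts heap only (real JVMs use more than their -Xmx).

```python
# Hypothetical max heap sizes in MB for the 7 daemons on hdp0.
# None of these numbers come from the thread; substitute your own settings.
DAEMON_HEAP_MB = {
    "zk": 1024,
    "master": 1024,
    "region": 4096,
    "nn": 2048,
    "dn": 1024,
    "jt": 1024,
    "tt": 1024,
}


def committed_mb(daemon_heap_mb, task_slots, child_heap_mb):
    """Worst-case heap if every daemon and every task slot fills its heap."""
    return sum(daemon_heap_mb.values()) + task_slots * child_heap_mb


def is_overcommitted(daemon_heap_mb, task_slots, child_heap_mb, physical_mb):
    """True if worst-case heap alone exceeds physical RAM (swap risk)."""
    return committed_mb(daemon_heap_mb, task_slots, child_heap_mb) > physical_mb


# Assumed: 8 task slots with 512 MB child heap each, on a 12 GB machine.
total = committed_mb(DAEMON_HEAP_MB, task_slots=8, child_heap_mb=512)
risky = is_overcommitted(DAEMON_HEAP_MB, 8, 512, physical_mb=12288)
print(total, "MB committed; overcommitted:", risky)
```

If the total comes out above physical RAM, the node can start swapping under a full MR load, which fits the session-timeout symptom described above.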

Kazuki Ohta: http://kzk9.net/
CTO at Preferred Infrastructure: http://preferred.jp/
