hadoop-common-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Datanodes going down frequently
Date Fri, 16 Sep 2011 05:35:51 GMT

On Fri, Sep 16, 2011 at 10:15 AM, john smith <js1987.smith@gmail.com> wrote:
> Hi All,
> Thanks for your inputs,
> @Aaron : No, they aren't recovering. They are losing network connectivity
> and they are not getting it back. I am unable to ssh to them and I need to
> manually go and restart the networking.

Ah, so the machines themselves fall off the grid? You have to 'reset' them,
hardware-wise? What state are they left in - are they still powered
on but just unresponsive over the network? Also, do only certain DNs die
out this way?

> @harsh and Raj,
> One thing I noticed in my hadoop-env.sh that  "export HADOOP_HEAPSIZE=2000"
> . Isn't this strange? Allocating my whole ram to the JVM ? Should I consider
> this? Right now I am not running any MR jobs as such .

You'll sorta need more RAM if you plan to make this into a work-heavy
cluster someday. 2 GB can soon become too low, given that your OS also
needs quite a bit of RAM for its own operations.

Assuming each slave node runs only the DataNode process,
HADOOP_HEAPSIZE=1000 should be OK. Else, scale it down to
500-700 or so (although that gets too low once you hit a few # of

Given that your OS needs good RAM too, and as your DN starts growing
in its blocks, you'll eventually run out of sufficient memory @ 2 GB -
so it's entirely dependent on what you're gonna be doing and how much
data you'll be storing and how.
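
To put rough numbers on the heap-vs-OS trade-off, here is a minimal
sketch of the arithmetic. It assumes HADOOP_HEAPSIZE is in MB and applies
to each Hadoop daemon on the node, and that the box has 2 GB (taken here
as 2048 MB). The function name is made up for illustration, and a real
JVM uses somewhat more than its max heap, so treat the result as an
optimistic upper bound.

```python
def os_headroom_mb(total_ram_mb, daemon_heaps_mb):
    """RAM left for the OS and page cache after every daemon's max heap.

    Optimistic estimate: a JVM also needs memory outside its heap
    (thread stacks, native buffers), which is not counted here.
    """
    return total_ram_mb - sum(daemon_heaps_mb)


# Assumed node size: 2 GB, as discussed in the thread.
TOTAL_RAM_MB = 2048

# One DataNode with HADOOP_HEAPSIZE=2000 leaves almost nothing for the OS.
headroom_at_2000 = os_headroom_mb(TOTAL_RAM_MB, [2000])

# One DataNode with HADOOP_HEAPSIZE=1000 leaves about half the RAM free.
headroom_at_1000 = os_headroom_mb(TOTAL_RAM_MB, [1000])

# Two daemons on the same node (say DataNode plus TaskTracker) at 700 each.
headroom_two_daemons = os_headroom_mb(TOTAL_RAM_MB, [700, 700])
```

If the headroom comes out near zero, the node is likely to swap as soon
as the heap actually fills up.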

You can monitor swapping using several tools. I find 'vmstat' to be a
good one that tells me if swapping has occurred. You can also set up
tools like Nagios and Ganglia across the cluster for these kinds of tasks.
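
As a rough illustration of the vmstat check, the sketch below reads the
'si' (swap-in) and 'so' (swap-out) columns from vmstat's text output and
flags any sample where either is non-zero. The function names are my own,
and it assumes the usual Linux procps column layout. The first sample is
skipped by default because vmstat's first report gives averages since
boot rather than current activity.

```python
import subprocess


def parse_swap_columns(vmstat_output):
    """Return a list of (si, so) pairs, one per sample row of vmstat output."""
    header = None
    samples = []
    for line in vmstat_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "si" in fields and "so" in fields:
            # Column-name row: remember where the swap columns sit.
            header = fields
            continue
        if header is None or len(fields) != len(header):
            # Banner row ("procs ---memory--- ...") or an unexpected line.
            continue
        try:
            si = int(fields[header.index("si")])
            so = int(fields[header.index("so")])
        except ValueError:
            continue
        samples.append((si, so))
    return samples


def is_swapping(vmstat_output, skip_first=True):
    """True if any sample shows swap-in or swap-out activity."""
    samples = parse_swap_columns(vmstat_output)
    if skip_first:
        # The first report is an average since boot, not a live reading.
        samples = samples[1:]
    return any(si > 0 or so > 0 for si, so in samples)


def run_vmstat(interval_s=5, count=4):
    """Run 'vmstat <interval> <count>' and return its stdout as text."""
    result = subprocess.run(
        ["vmstat", str(interval_s), str(count)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a DataNode you would call is_swapping(run_vmstat()) from a cron job or
a Nagios-style check; sustained non-zero si/so values are the sign that
the heap setting is too large for the box.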

Harsh J
