hadoop-common-user mailing list archives

From Vinod Kumar Vavilapalli <vino...@apache.org>
Subject Re: Resource limits with Hadoop and JVM
Date Mon, 16 Sep 2013 21:04:17 GMT
I assume you are on Linux, and that your tasks are resource-intensive enough to take nodes
down. If so, you should enable per-task limits; see
http://hadoop.apache.org/docs/stable/cluster_setup.html#Memory+monitoring

With memory monitoring enabled, jobs must declare their resource requirements up front,
and the TaskTrackers (TTs) enforce those limits.
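
To make the idea concrete, here is a rough Python model of the bookkeeping, not Hadoop code. It assumes the Hadoop 1.x property names mapred.cluster.map.memory.mb (memory per map slot), mapred.cluster.max.map.memory.mb (largest per-task request allowed) and mapred.job.map.memory.mb (what the job asks for). The numeric values and the function names are made up for illustration; check the linked page for what your version actually supports.

```python
import math

# Illustrative values in MB; the names mirror the Hadoop 1.x properties
# they stand in for, but the numbers are assumptions, not recommendations.
CLUSTER_MAP_MEMORY_MB = 1024      # mapred.cluster.map.memory.mb
CLUSTER_MAX_MAP_MEMORY_MB = 4096  # mapred.cluster.max.map.memory.mb


def slots_for_map_task(job_map_memory_mb):
    """Slots one map task occupies, given the job's declared per-task memory.

    job_map_memory_mb models mapred.job.map.memory.mb. A request above the
    cluster-wide maximum is rejected instead of being scheduled.
    """
    if job_map_memory_mb > CLUSTER_MAX_MAP_MEMORY_MB:
        raise ValueError(
            "requested %d MB exceeds the cluster maximum of %d MB"
            % (job_map_memory_mb, CLUSTER_MAX_MAP_MEMORY_MB)
        )
    # A task that needs more than one slot's worth of memory takes several slots.
    return math.ceil(job_map_memory_mb / CLUSTER_MAP_MEMORY_MB)


def exceeds_limit(task_usage_mb, job_map_memory_mb):
    """True if a running task has grown past what its job declared.

    In this model, that is the point at which the TaskTracker would kill the task.
    """
    return task_usage_mb > job_map_memory_mb


if __name__ == "__main__":
    print(slots_for_map_task(2048))   # a 2 GB task occupies two 1 GB slots
    print(exceeds_limit(2100, 2048))  # over its declared limit
```

The point is that a runaway task (ffmpeg, in your case) is stopped at its declared limit instead of starving everything else on the node.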

HTH
+Vinod Kumar Vavilapalli
Hortonworks Inc.
http://hortonworks.com/

On Sep 16, 2013, at 1:35 PM, Forrest Aldrich wrote:

> We recently experienced a couple of situations that brought one or more Hadoop nodes
> down (unresponsive). One was related to a bug in a utility we use (ffmpeg) that was
> resolved by compiling a new version. The next, today, occurred after attempting to join
> a new node to the cluster.
> 
> A basic start of the (local) tasktracker and datanode did not work -- so based on
> reference, I issued: hadoop mradmin -refreshNodes, which was to be followed by
> hadoop dfsadmin -refreshNodes. The load average literally jumped to 60 and the master
> (which also runs a slave) became unresponsive.
> 
> Seems to me that this should never happen. But, looking around, I saw an article from
> Spotify which mentioned the need to set certain resource limits on the JVM as well as in
> the system itself (limits.conf, we run RHEL). We are fairly new to Hadoop, so some of
> these issues are new to us.
> 
> I wonder if some of the experts here might be able to comment on this issue - perhaps
> point out settings and other measures we can take to prevent this sort of incident in
> the future.
> 
> Our setup is not complicated. We have 3 Hadoop nodes; the first is both a master and a
> slave (it has more resources, too). What the system does is split up tasks and hand them
> to ffmpeg (which is another issue as it tends to eat resources, but so far, with a
> recompile, we are good). We have two more hardware nodes to add shortly.
> 
> 
> Thanks!


