hadoop-mapreduce-user mailing list archives

From Forrest Aldrich <for...@gmail.com>
Subject Resource limits with Hadoop and JVM
Date Mon, 16 Sep 2013 20:35:09 GMT
We recently experienced a couple of situations that brought one or more
Hadoop nodes down (unresponsive). One was related to a bug in a
utility we use (ffmpeg), which was resolved by compiling a new version.
The second, today, occurred after attempting to join a new node to the
cluster.

A basic start of the (local) tasktracker and datanode did not work, so,
based on a reference I found, I issued hadoop mradmin -refreshNodes, which
was to be followed by hadoop dfsadmin -refreshNodes. The load average
jumped to 60 and the master (which also runs a slave) became
unresponsive.
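
For clarity, here is roughly the sequence I attempted. The daemon start
lines below are the stock hadoop-daemon.sh invocations, which I believe are
equivalent to what our local start script does:

    # on the new node
    hadoop-daemon.sh start datanode
    hadoop-daemon.sh start tasktracker

    # on the master, after the above did not work
    hadoop mradmin -refreshNodes
    hadoop dfsadmin -refreshNodes   # was to be next; the load hit 60 before I got here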

It seems to me that this should never happen. Looking around, though, I saw
an article from Spotify which mentioned the need to set certain resource
limits on the JVM as well as in the system itself (limits.conf; we run
RHEL). We are fairly new to Hadoop, so some of these issues are new to us.
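
To make the question concrete, this is the sort of thing I took away from
that article. The user name and values below are guesses on my part
(whichever account runs the daemons, sized to our boxes), not something we
currently have in place:

    # /etc/security/limits.conf on each node
    hadoop  soft  nofile  32768
    hadoop  hard  nofile  32768
    hadoop  soft  nproc   32768
    hadoop  hard  nproc   32768

    <!-- mapred-site.xml: cap the per-task JVM heap -->
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx512m</value>
    </property>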

I wonder if some of the experts here might be able to comment on this 
issue - perhaps point out settings and other measures we can take to 
prevent this sort of incident in the future.

Our setup is not complicated. We have 3 Hadoop nodes; the first is both a
master and a slave (and has more resources, too). The underlying system
splits up tasks and hands them to ffmpeg (which is another issue, as it
tends to eat resources, but so far with the recompile we are good). We have
two more hardware nodes to add shortly.
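
One thing I am considering, and would welcome correction on, is capping the
task slots per node so that only a couple of ffmpeg processes can run at
once. Something like this in mapred-site.xml on each tasktracker, with
values that are just guesses for our hardware:

    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>2</value>
    </property>
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>1</value>
    </property>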


Thanks!
