hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/Troubleshooting" by AndrewPurtell
Date Wed, 16 Sep 2009 18:14:06 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by AndrewPurtell:

   * See an exception with the above message in the logs (usually hadoop 0.18.x).
  === Causes ===
   * Slow datanodes are marked as down by DFSClient; eventually all replicas are marked as
'bad' (HADOOP-3831).
+  * Insufficient file descriptors available at the OS level for DFS DataNodes.
  === Resolution ===
+  * Increase the file descriptor limit of the user account under which the DFS DataNode processes
are operating. On most Linux systems, adding the following lines to /etc/security/limits.conf
will increase the file descriptor limit from the default of 1024 to 32768. Substitute the
actual user name for {{{<user>}}}. 
+    {{{
+ <user>          soft    nofile          32768
+ <user>          hard    nofile          32768
+ }}}
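     Note that limits.conf changes generally take effect at the next login of that user (they are applied by PAM), so log in again as the DataNode user, confirm the new limit with {{{ulimit -n}}}, and then restart the DataNode processes so they run under the raised limit.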
-  * Apply HADOOP-4681 to your cluster or at least to the hadoop jar used by hbase.
+  * Apply HDFS-127 (formerly HADOOP-4681) to your cluster or at least to the hadoop jar used
by hbase.
   * Try setting '''dfs.datanode.socket.write.timeout''' to zero (in hadoop 0.18.x -- see
HADOOP-3831 for details and for why this is not needed in hadoop 0.19.x).  See the thread at [http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200810.mbox/%3C20126171.post@talk.nabble.com%3E
message from jean-adrien] for some background.  Note that this is an HDFS client configuration,
so it needs to be available in $HBASE_HOME/conf; making the change only in $HADOOP_HOME/conf
is not sufficient.  Copy your amended hadoop-site.xml to the hbase conf directory, or add this
configuration to $HBASE_HOME/conf/hbase-site.xml.
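     For example, a minimal sketch of the entry in $HBASE_HOME/conf/hbase-site.xml, using the standard Hadoop XML property format:
     {{{
<!-- HDFS client setting: disable the DataNode socket write timeout (hadoop 0.18.x) -->
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>0</value>
</property>
}}}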
   * Try increasing '''dfs.datanode.handler.count''' from its default of 3. This is a server
configuration change, so it must be made in $HADOOP_HOME/conf/hadoop-site.xml. Try increasing
it to 10 first, then by additional increments of 10. It probably does not make sense to use a value
larger than the total number of nodes in the cluster. 
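     For example, a minimal sketch of a first increase to 10 in $HADOOP_HOME/conf/hadoop-site.xml:
     {{{
<!-- Number of DataNode server threads (default is 3) -->
<property>
  <name>dfs.datanode.handler.count</name>
  <value>10</value>
</property>
}}}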
