hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/Troubleshooting" by AndrewPurtell
Date Wed, 14 Jan 2009 08:51:49 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by AndrewPurtell:
http://wiki.apache.org/hadoop/Hbase/Troubleshooting

The comment on the change is:
Added #7 Problem: DFS instability and/or regionserver lease timeouts

------------------------------------------------------------------------------
   1. [#4 Problem: On migration, no files in root directory]
   1. [#5 Problem: "xceiverCount 258 exceeds the limit of concurrent xcievers 256"]
   1. [#6 Problem: "No live nodes contain current block"]
+  1. [#7 Problem: DFS instability and/or regionserver lease timeouts]
  
  [[Anchor(1)]]
  == Problem: Master initializes, but Region Servers do not ==
@@ -66, +67 @@

  === Causes ===
   * RPC timeouts may happen because of I/O contention, which blocks processes during file
swapping.
  === Resolution ===
-  * Eith
  
  [[Anchor(4)]]
  == Problem: On migration, no files in root directory ==
@@ -103, +103 @@

   * Try setting '''dfs.datanode.socket.write.timeout''' to zero.  See the thread at [http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200810.mbox/%3C20126171.post@talk.nabble.com%3E
message from jean-adrien] for some background.  Note that this is an HDFS client configuration,
so it must be available in $HBASE_HOME/conf.  Making the change only in $HADOOP_HOME/conf
is not sufficient.  Copy your amended hadoop-site.xml to the hbase conf directory or add this
configuration to $HBASE_HOME/conf/hbase-site.xml.
   * Try increasing '''dfs.datanode.handler.count''' from its default of 3. This is a server
configuration change, so it must be made in $HADOOP_HOME/conf/hadoop-site.xml. Try increasing
it to 10, then by additional increments of 10. It probably does not make sense to use a value
larger than the total number of nodes in the cluster.
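
The two settings above could be written as the following property entries. This is a minimal sketch, not a tested configuration: the values 0 and 10 are the starting points suggested above, and each entry is assumed to go inside the existing configuration element of the file named in its comment.

```xml
<!-- In $HBASE_HOME/conf/hbase-site.xml (HDFS client setting, read by HBase) -->
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>0</value>
</property>

<!-- In $HADOOP_HOME/conf/hadoop-site.xml (server setting, read by the datanodes) -->
<property>
  <name>dfs.datanode.handler.count</name>
  <value>10</value>
</property>
```

The server-side change most likely takes effect only after the datanodes are restarted.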
  
+ 
+ [[Anchor(7)]]
+ == Problem: DFS instability and/or regionserver lease timeouts ==
+  * HBase regionserver leases expire during start up
+  * HBase daemons cannot find block locations in HDFS during start up or other periods of
load
+ === Causes ===
+  * Excessive connection establishment latency (HRPC sets up connections on demand)
+  * Slow host name resolution
+  * Network bandwidth overcommitment
+ === Resolution ===
+  * Ensure that host name resolution latency is low, or use static entries in /etc/hosts
+  * Monitor the network and ensure that adequate bandwidth is available for HRPC transactions
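
One rough way to check the first resolution item is to time forward lookups for each cluster host. The sketch below is illustrative only: the helper names `lookup_ms` and `slow_hosts`, the 50 ms threshold, and the host list are assumptions, not part of HBase or Hadoop.

```python
import socket
import time


def lookup_ms(hostname):
    """Time one forward lookup of hostname; return None if it does not resolve."""
    start = time.perf_counter()
    try:
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return None
    return (time.perf_counter() - start) * 1000.0


def slow_hosts(hostnames, threshold_ms=50.0, measure=lookup_ms):
    """Return (hostname, latency_ms) pairs that fail to resolve or exceed threshold_ms.

    threshold_ms is an arbitrary illustrative cutoff, not a documented limit.
    """
    flagged = []
    for name in hostnames:
        latency = measure(name)
        if latency is None or latency > threshold_ms:
            flagged.append((name, latency))
    return flagged


if __name__ == "__main__":
    # Hypothetical host list; replace with the master, regionserver and datanode names.
    for name, latency in slow_hosts(["localhost"]):
        print(name, "unresolved" if latency is None else f"{latency:.1f} ms")
```

Run it from each node, since resolution latency can differ per host. Hosts that are flagged are candidates for static entries in /etc/hosts.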
+ 
