hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "Hbase/Troubleshooting" by stack
Date Wed, 26 Nov 2008 23:12:19 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by stack:
http://wiki.apache.org/hadoop/Hbase/Troubleshooting

The comment on the change is:
Add some 0.18.x troubleshooting

------------------------------------------------------------------------------
   1. [#3 Problem: Replay of hlog required, forcing regionserver restart]
   2. [#4 Problem: Master initializes, but Region Servers do not]
   1. [#5 Problem: On migration, no files in root directory]
+  1. [#6 Problem: "xceiverCount 258 exceeds the limit of concurrent xcievers 256"]
+  1. [#7 Problem: "No live nodes contain current block"]
  
  [[Anchor(1)]]
  == Problem: Master initializes, but Region Servers do not ==
@@ -116, +118 @@

  === Resolution ===
  * Either reduce the load or set dfs.datanode.max.xcievers (hadoop-site.xml) to a value larger than the default (256). Note that to change this tunable you need Hadoop 0.17.2 or 0.18.0 (HADOOP-3859).
  
+ [[Anchor(6)]]
+ == Problem: "xceiverCount 258 exceeds the limit of concurrent xcievers 256" ==
+  * You see an exception with the above message in the logs (usually Hadoop 0.18.x).
+ === Causes ===
+  * An upper bound on connections was added in Hadoop (HADOOP-3633/HADOOP-3859).
+ === Resolution ===
+  * Raise the maximum by setting '''dfs.datanode.max.xcievers''' (sic).  See [http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200810.mbox/%3C20126171.post@talk.nabble.com%3E message from jean-adrien] for some background.
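+ A minimal sketch of the change in hadoop-site.xml; the value 512 is only illustrative, pick one to suit your load:
+ 
+ ```xml
+ <!-- hadoop-site.xml: raise the datanode connection ceiling.
+      The property name really is spelled "xcievers" (sic).
+      512 is an illustrative value, not a recommendation. -->
+ <property>
+   <name>dfs.datanode.max.xcievers</name>
+   <value>512</value>
+ </property>
+ ```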
+ 
+ 
+ [[Anchor(7)]]
+ == Problem: "No live nodes contain current block" ==
+  * You see an exception with the above message in the logs (usually Hadoop 0.18.x).
+ === Causes ===
+  * Slow datanodes are marked as down by the DFSClient; eventually all replicas are marked as 'bad' (HADOOP-3831).
+ === Resolution ===
+  * Try setting '''dfs.datanode.socket.write.timeout''' to zero.  See [http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200810.mbox/%3C20126171.post@talk.nabble.com%3E message from jean-adrien] for some background.
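+ A minimal sketch of the corresponding hadoop-site.xml entry; zero disables the write timeout entirely, so slow datanodes are no longer timed out and dropped:
+ 
+ ```xml
+ <!-- hadoop-site.xml: 0 disables the datanode socket write timeout,
+      a workaround for replicas being marked 'bad' (HADOOP-3831). -->
+ <property>
+   <name>dfs.datanode.socket.write.timeout</name>
+   <value>0</value>
+ </property>
+ ```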
+ 
