hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/Troubleshooting" by RongEnFan
Date Sun, 10 Aug 2008 15:49:40 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by RongEnFan:
http://wiki.apache.org/hadoop/Hbase/Troubleshooting

------------------------------------------------------------------------------
  === Resolution ===
   * Either reduce the load or add more memory/machines.
  
+ 
+ == Problem: Master initializes, but Region Servers do not ==
+  * Master's log contains repeated instances of the following block:
+   ~-INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.0.1:60020. Already tried 1 time(s).[[BR]]
+   INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.0.1:60020. Already tried 2 time(s).[[BR]]
+   ...[[BR]]
+   INFO org.apache.hadoop.ipc.RPC: Server at /127.0.0.1:60020 not available yet, Zzzzz...-~
+  * Region Servers' logs contain repeated instances of the following block:
+   ~-INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:60000. Already tried 1 time(s).[[BR]]
+   INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:60000. Already tried 2 time(s).[[BR]]
+   ...[[BR]]
+   INFO org.apache.hadoop.ipc.RPC: Server at masternode/192.168.100.50:60000 not available yet, Zzzzz...-~
+  * Note that the Master believes the Region Servers have the IP address 127.0.0.1, which is localhost and therefore resolves back to the Master's own machine.
+ === Causes ===
+  * The Region Servers are erroneously informing the Master that their IP addresses are 127.0.0.1.
+ === Resolution ===
+  * Modify '''/etc/hosts''' on the region servers, from
+   {{{
+ # Do not remove the following line, or various programs
+ # that require network functionality will fail.
+ 127.0.0.1		fully.qualified.regionservername regionservername  localhost.localdomain localhost
+ ::1		localhost6.localdomain6 localhost6
+ }}}
+ 
+  * To (removing the region server's own name from the 127.0.0.1 line)
+   {{{
+ # Do not remove the following line, or various programs
+ # that require network functionality will fail.
+ 127.0.0.1		localhost.localdomain localhost
+ ::1		localhost6.localdomain6 localhost6
+ }}}
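As a quick sanity check for the misconfiguration above, the following sketch tests whether a hostname appears on the 127.0.0.1 line of a hosts file. The hostname `regionservername` and the sample file contents are assumptions for illustration; on a real region server, point `HOSTS` at `/etc/hosts` and set `NAME` to the machine's hostname.

```shell
#!/bin/sh
# Sketch: detect whether a hostname is bound to 127.0.0.1 in a hosts file.
# "regionservername" and the sample file below are hypothetical.
HOSTS=$(mktemp)
cat > "$HOSTS" <<'EOF'
127.0.0.1	fully.qualified.regionservername regionservername localhost.localdomain localhost
::1	localhost6.localdomain6 localhost6
EOF
NAME=regionservername
# If the region server's own name is on the 127.0.0.1 line, it will
# report 127.0.0.1 to the Master instead of its real address.
if grep '^127\.0\.0\.1' "$HOSTS" | grep -qw "$NAME"; then
    RESULT=BAD
else
    RESULT=OK
fi
echo "$RESULT"
rm -f "$HOSTS"
```

With the sample file above this prints `BAD`; after applying the fix (removing the region server's name from the 127.0.0.1 line) it prints `OK`.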
+ == Problem: Created Root Directory for HBase through Hadoop DFS ==
+  * On startup, the Master says that you need to run the HBase migrations script. Upon running it, the script reports that there are no files in the root directory.
+ === Causes ===
+  * HBase expects its root directory either to not exist, or to have already been initialized by a previous run of HBase. If you create a new directory for HBase yourself using Hadoop DFS, this error will occur.
+ === Resolution ===
+  * Make sure the HBase root directory does not currently exist, or was initialized by a previous run of HBase. A sure-fire solution is to use Hadoop DFS to delete the HBase root directory and let HBase create and initialize it itself.
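For example, assuming `hbase.rootdir` points at `/hbase` (check your hbase-site.xml; the path here is an assumption), the delete-and-restart sequence would look like:

```shell
# Stop HBase, remove the root directory from DFS, then restart;
# HBase creates and initializes the directory on startup.
# /hbase is an assumed value of hbase.rootdir.
${HBASE_HOME}/bin/stop-hbase.sh
${HADOOP_HOME}/bin/hadoop dfs -rmr /hbase
${HBASE_HOME}/bin/start-hbase.sh
```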
+ 
+ == Problem: Lots of DFS errors about not being able to find a block from live nodes ==
+  * Under heavy read load, you may see many DFSClient complaints that no live nodes hold a particular block, and the Hadoop DataNode logs show that xceiverCount exceeds the limit (256).
+ 
+ === Causes ===
+  * Not enough xceiver threads on the DataNode to serve the traffic.
+ 
+ === Resolution ===
+  * Either reduce the load or set dfs.datanode.max.xcievers (in hadoop-site.xml) to a value larger than the default (256). Note that in order to change this tunable, you need Hadoop 0.17.2 or 0.18.0 (HADOOP-3859).
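For reference, the override in hadoop-site.xml on each DataNode would look like the following. The value 1024 is an assumed example, not a recommendation; note that the property name really is spelled "xcievers".

```xml
<!-- hadoop-site.xml on each DataNode; restart DataNodes after changing. -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <!-- default is 256; 1024 is an assumed example value -->
  <value>1024</value>
</property>
```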
+ 
