hbase-commits mailing list archives

From mbau...@apache.org
Subject svn commit: r1301784 - in /hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase: HConstants.java HServerLoad.java master/HMaster.java master/ServerManager.java regionserver/HRegionServer.java util/Rack.java
Date Fri, 16 Mar 2012 22:05:01 GMT
Author: mbautin
Date: Fri Mar 16 22:05:01 2012
New Revision: 1301784

URL: http://svn.apache.org/viewvc?rev=1301784&view=rev
Log:
[master] Master's RS monitor

Summary:
Master will kill a region server if it fails to send heartbeats for long enough. This is the
fast way to kill a stuck region server.

If the master finds that too many nodes have become inaccessible in the same rack, it doesn't
fast-kill them. It waits for the rack to recover. If the rack doesn't recover, the zk-based RS
expiry will kick in.

Two new config params introduced:
    conf.getInt("hbase.region.server.missed.report.max.expired.per.rack", 1);
    conf.getInt("hbase.region.server.missed.report.timeout", 10000);
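The rack-aware part of the expiry logic reduces to a simple decision rule: a timed-out region server is fast-killed only if the number of timed-out servers in its rack stays within the per-rack limit (default 1); otherwise the whole rack is presumed unreachable and nothing is expired. A minimal standalone sketch of that rule, with illustrative class and method names that are not part of the patch:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RackExpiryPolicy {
    // Mirrors hbase.region.server.missed.report.max.expired.per.rack (default 1)
    private final int maxServersToExpirePerRack;

    public RackExpiryPolicy(int maxServersToExpirePerRack) {
        this.maxServersToExpirePerRack = maxServersToExpirePerRack;
    }

    /**
     * Given the rack of every server that missed its load report, return the
     * servers that are safe to fast-kill. A rack with more timed-out servers
     * than the limit is assumed to have a connectivity problem, so none of
     * its servers are expired; zk session expiry handles them later.
     */
    public List<String> serversToExpire(Map<String, String> timedOutServerToRack) {
        // Group the timed-out servers by rack.
        Map<String, List<String>> byRack = new HashMap<>();
        for (Map.Entry<String, String> e : timedOutServerToRack.entrySet()) {
            byRack.computeIfAbsent(e.getValue(), r -> new ArrayList<>()).add(e.getKey());
        }
        // Expire only the racks whose timed-out count is within the limit.
        List<String> toExpire = new ArrayList<>();
        for (List<String> servers : byRack.values()) {
            if (servers.size() <= maxServersToExpirePerRack) {
                toExpire.addAll(servers);
            }
        }
        return toExpire;
    }
}
```

With the default limit of 1, a single stuck server is expired quickly, while three timed-out servers in one rack are all spared, matching the test-plan logs below.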

Test Plan:
verified that machines are being correctly grouped into racks

===

single server test

master quickly recognized the unresponsive server but didn't kill it immediately
2012-03-05 15:43:05,487 DEBUG org.apache.hadoop.hbase.master.ServerManager: Server serverName=<hostname>,60020,1330990313077,
load=(requests=0, regions=11, usedHeap=153, maxHeap=23987) timed out. Will wait 9988ms for
others in rack before expiring it

Master expires the server after 10s
2012-03-05 15:43:15,493 WARN org.apache.hadoop.hbase.master.ServerManager: Expiring Server
serverName=<hostname>,60020,1330990313077, load=(requests=0, regions=11, usedHeap=153,
maxHeap=23987) because of missed load reports

restore connectivity to the server. the region server receives a STOP directive from the master
2012-03-05 15:44:21,990 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Attempt=6
java.net.SocketTimeoutException: 5000 millis timeout while waiting for channel to be ready
for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.30.253.200:60000]
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:213)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:406)
        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:310)
        at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:867)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:734)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:262)
        at $Proxy0.regionServerReport(Unknown Source)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:527)
        at java.lang.Thread.run(Thread.java:619)
2012-03-05 15:44:26,000 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGIONSERVER_STOP

===

Three servers on 1 rack failing test. Increased zk session timeout to 5 mins.

master detects too many servers dead and doesn't do anything. zk nodes have not expired.
2012-03-05 15:58:35,984 INFO
org.apache.hadoop.hbase.master.ServerManager: Too many servers count=3 timed out in rack /<rack>.
. Timed out Servers = [serverName=<hostname>,60020,1330990278361, load=(requests=0,
regions=11, usedHeap=149, maxHeap=23987), serverName=<hostname>,60020,1330991115760,
load=(requests=0, regions=12, usedHeap=67, maxHeap=23987), serverName=<hostname>,60020,1330990235580,
load=(requests=0, regions=11, usedHeap=149, maxHeap=23987)]. Not expiring these servers, hoping
for rack to become accessible
2012-03-05 15:58:40,987 INFO
org.apache.hadoop.hbase.master.ServerManager: Too many servers count=3 timed out in rack /<rack>.
. Timed out Servers = [serverName=<hostname>,60020,1330990278361, load=(requests=0,
regions=11, usedHeap=149, maxHeap=23987), serverName=<hostname>,60020,1330991115760,
load=(requests=0, regions=12, usedHeap=67, maxHeap=23987), serverName=<hostname>,60020,1330990235580,
load=(requests=0, regions=11, usedHeap=149, maxHeap=23987)]. Not expiring these servers, hoping
for rack to become accessible
2012-03-05 15:58:45,583 INFO org.apache.hadoop.hbase.master.ServerManager: 89 region servers,
0 dead, average load 11.55056179775281

I restore connectivity to all these servers. Notice that the last server to become accessible
is not killed.

2012-03-05 15:59:16,006 DEBUG org.apache.hadoop.hbase.master.ServerManager: Server serverName=<hostname>,60020,1330990278361,
load=(requests=0, regions=11, usedHeap=149, maxHeap=23987) timed out. Will wait 9940ms for
others in rack before expiring it
2012-03-05 15:XX:XX DEBUG org.apache.hadoop.hbase.master.ServerManager: Restarted receiving
reports from server <hostname>,60020,1330990278361, load=(

The region servers keep trying to send reports but they don't die
2012-03-05 15:40:46,607 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Attempt=58
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:406)
        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:310)
        at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:867)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:734)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:262)
        at $Proxy0.regionServerReport(Unknown Source)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:527)
        at java.lang.Thread.run(Thread.java:619)
2012-03-05 15:41:37,063 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats:
Sizes: Total=118.06349MB (123798544), Free=14274.287MB (14967674864), Max=14392.351MB (15091473408),
Counts: Blocks=0, Access=22, Hit=0, Miss=22, cachingAccesses=0, cachingHits=0, Evictions=0,
Evicted=0, Ratios: Hit Ratio=0.0%, Miss Ratio=100.0%, Evicted/Run=NaN

===

servers expiring on 2 racks. 3 servers each. one rack has 5 min zk session expiry and another
has 1 min.

2012-03-05 16:24:52,487 INFO
org.apache.hadoop.hbase.master.ServerManager: Too many servers count=3 timed out in rack /<rack>.
. Timed out Servers = [serverName=<hostname>,60020,1330990278361, load=(requests=0,
regions=11, usedHeap=156, maxHeap=23987), serverName=<hostname>,60020,1330991115760,
load=(requests=0, regions=12, usedHeap=78, maxHeap=23987), serverName=<hostname>,60020,1330990235580,
load=(requests=0, regions=11, usedHeap=156, maxHeap=23987)]. Not expiring these servers, hoping
for rack to become accessible
2012-03-05 16:24:57,491 INFO
org.apache.hadoop.hbase.master.ServerManager: Too many servers count=3 timed out in rack /<rack>.
. Timed out Servers = [serverName=<hostname>,60020,1330985785026, load=(requests=0,
regions=11, usedHeap=161, maxHeap=23987), serverName=<hostname>,60020,1330142550846,
load=(requests=0, regions=11, usedHeap=4992, maxHeap=23987), serverName=<hostname>,60020,1330140718667,
load=(requests=0, regions=11, usedHeap=13599, maxHeap=23987)]. Not expiring these servers,
hoping for rack to become accessible

One node's zk session expired. This one had a 5 min zk session too!!

2012-03-05 16:25:03,005 INFO org.apache.hadoop.hbase.master.ServerManager: <hostname>,60020,1330991115760
znode expired

otherwise everything looks OK

Reviewers: kannan, liyintang, kranganathan, rthiessen

Reviewed By: kannan

CC: kranganathan

Differential Revision: https://phabricator.fb.com/D420784

Added:
    hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/Rack.java
Modified:
    hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/HConstants.java
    hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/HServerLoad.java
    hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
    hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
    hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java

Modified: hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/HConstants.java
URL: http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/HConstants.java?rev=1301784&r1=1301783&r2=1301784&view=diff
==============================================================================
--- hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/HConstants.java (original)
+++ hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/HConstants.java Fri Mar 16
22:05:01 2012
@@ -453,6 +453,8 @@ public final class HConstants {
   public static final String DISTRIBUTED_LOG_SPLITTING_KEY =
       "hbase.master.distributed.log.splitting";
 
+  public static final int REGION_SERVER_MSG_INTERVAL = 1 * 1000;
+
   private HConstants() {
     // Can't be instantiated with this ctor.
   }

Modified: hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/HServerLoad.java
URL: http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/HServerLoad.java?rev=1301784&r1=1301783&r2=1301784&view=diff
==============================================================================
--- hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/HServerLoad.java (original)
+++ hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/HServerLoad.java Fri Mar
16 22:05:01 2012
@@ -49,6 +49,10 @@ public class HServerLoad implements Writ
   private int maxHeapMB;
   /** per-region load metrics */
   private ArrayList<RegionLoad> regionLoad = new ArrayList<RegionLoad>();
+  // lastLoadRefreshTime and missedLastLoadReport are only maintained by the
+  // master. They are not serialized and reported by the region servers
+  public volatile /* transient */ long lastLoadRefreshTime = 0;
+  public volatile /* transient */ boolean missedLastLoadReport = false;
 
   /**
    * Encapsulates per-region loading metrics.

Modified: hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
URL: http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/HMaster.java?rev=1301784&r1=1301783&r2=1301784&view=diff
==============================================================================
--- hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/HMaster.java (original)
+++ hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/HMaster.java Fri Mar
16 22:05:01 2012
@@ -289,10 +289,11 @@ public class HMaster extends Thread impl
     if (conf.getBoolean(HConstants.MASTER_TYPE_BACKUP,
         HConstants.DEFAULT_MASTER_TYPE_BACKUP)) {
       // ephemeral node expiry will be detected between about 40 to 60 seconds;
+      // (if the session timeout is set to 60 seconds)
       // plus add a little extra since only ZK leader can expire nodes, and
       // leader maybe a little  bit delayed in getting info about the pings.
       // Conservatively, just double the time.
-      int stallTime = conf.getInt("zookeeper.session.timeout", 60 * 1000) * 2;
+      int stallTime = getZKSessionTimeOutForMaster(conf) * 2;
 
       LOG.debug("HMaster started in backup mode. Stall " + stallTime +
           "ms giving primary master a fair chance to be the master...");
@@ -384,14 +385,37 @@ public class HMaster extends Thread impl
     return this.logSplitThreadPool;
   }
 
+  private int getZKSessionTimeOutForMaster(Configuration conf) {
+    int zkTimeout = conf.getInt("hbase.master.zookeeper.session.timeout", 0);
+    if (zkTimeout != 0) {
+      return zkTimeout;
+    }
+    return conf.getInt(HConstants.ZOOKEEPER_SESSION_TIMEOUT,
+        HConstants.DEFAULT_ZOOKEEPER_SESSION_TIMEOUT);
+  }
+
   private void initializeZooKeeper() throws IOException {
     boolean abortProcesstIfZKExpired = conf.getBoolean(
         HConstants.ZOOKEEPER_SESSION_EXPIRED_ABORT_PROCESS, true);
+    // Set this property to set zk session timeout for master which is different
+    // from what region servers use. The master's zk session timeout can be
+    // much shorter than region server's. It is easier to recycle master because
+    // it doesn't handle data. The region server can have an inflated zk session
+    // timeout because they also rely on master to kill them if they miss any
+    // heartbeat
+    int zkTimeout = conf.getInt("hbase.master.zookeeper.session.timeout", 0);
+    Configuration localConf;
+    if (zkTimeout != 0) {
+      localConf = new Configuration(conf);
+      localConf.setInt(HConstants.ZOOKEEPER_SESSION_TIMEOUT, zkTimeout);
+    } else {
+      localConf = conf;
+    }
     if (abortProcesstIfZKExpired) {
-      zooKeeperWrapper = ZooKeeperWrapper.createInstance(conf,
+      zooKeeperWrapper = ZooKeeperWrapper.createInstance(localConf,
           getZKWrapperName(), new RuntimeHaltAbortStrategy());
     } else {
-      zooKeeperWrapper = ZooKeeperWrapper.createInstance(conf,
+      zooKeeperWrapper = ZooKeeperWrapper.createInstance(localConf,
           getZKWrapperName(), new Abortable() {
 
             @Override

Modified: hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
URL: http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java?rev=1301784&r1=1301783&r2=1301784&view=diff
==============================================================================
--- hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
(original)
+++ hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
Fri Mar 16 22:05:01 2012
@@ -23,7 +23,9 @@ import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.Collections;
+import java.util.HashMap;
 import java.util.HashSet;
+import java.util.List;
 import java.util.Map;
 import java.util.Set;
 import java.util.SortedMap;
@@ -53,6 +55,8 @@ import org.apache.hadoop.hbase.client.Re
 import org.apache.hadoop.hbase.ipc.HRegionInterface;
 import org.apache.hadoop.hbase.master.RegionManager.RegionState;
 import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.hadoop.hbase.util.EnvironmentEdgeManager;
+import org.apache.hadoop.hbase.util.Rack;
 import org.apache.hadoop.hbase.util.Threads;
 import org.apache.zookeeper.WatchedEvent;
 import org.apache.zookeeper.Watcher;
@@ -173,7 +177,9 @@ public class ServerManager {
         master.getFileSystem(), master.getOldLogDir());
     Threads.setDaemonThreadRunning(oldLogCleaner,
       n + ".oldLogCleaner");
-
+    rackInfo = new Rack(c);
+    Threads.setDaemonThreadRunning(new ServerTimeoutMonitor(c),
+        n + "ServerManager-Timeout-Monitor");
   }
 
   /**
@@ -459,24 +465,30 @@ public class ServerManager {
   throws IOException {
     // Refresh the info object and the load information
     this.serversToServerInfo.put(serverInfo.getServerName(), serverInfo);
-    HServerLoad load = this.serversToLoad.get(serverInfo.getServerName());
-    if (load != null) {
-      this.master.getMetrics().incrementRequests(load.getNumberOfRequests());
-      if (!load.equals(serverInfo.getLoad())) {
-        updateLoadToServers(serverInfo.getServerName(), load);
+    HServerLoad oldLoad = this.serversToLoad.get(serverInfo.getServerName());
+    HServerLoad newLoad = serverInfo.getLoad();
+    if (oldLoad != null) {
+      // XXX why are we using oldLoad to update metrics
+      this.master.getMetrics().incrementRequests(oldLoad.getNumberOfRequests());
+      if (!oldLoad.equals(newLoad)) {
+        updateLoadToServers(serverInfo.getServerName(), oldLoad);
       }
     }
 
     // Set the current load information
-    load = serverInfo.getLoad();
-    this.serversToLoad.put(serverInfo.getServerName(), load);
+    newLoad.lastLoadRefreshTime = EnvironmentEdgeManager.currentTimeMillis();
+    if (oldLoad != null && oldLoad.missedLastLoadReport) {
+      LOG.info("Restarted receiving reports from server " + serverInfo);
+    }
+    newLoad.missedLastLoadReport = false;
+    this.serversToLoad.put(serverInfo.getServerName(), newLoad);
     synchronized (loadToServers) {
-      Set<String> servers = this.loadToServers.get(load);
+      Set<String> servers = this.loadToServers.get(newLoad);
       if (servers == null) {
         servers = new HashSet<String>();
       }
       servers.add(serverInfo.getServerName());
-      this.loadToServers.put(load, servers);
+      this.loadToServers.put(newLoad, servers);
     }
 
     // Next, process messages for this server
@@ -1057,4 +1069,180 @@ public class ServerManager {
     serverMonitorThread.stopThread();
   }
 
+  // should ServerTimeoutMonitor and ServerMonitor be merged XXX?
+  private class ServerTimeoutMonitor extends Thread {
+    private final Log LOG =
+        LogFactory.getLog(ServerTimeoutMonitor.class.getName());
+    int timeout;
+    int shortTimeout;
+    int maxServersToExpirePerRack;
+
+    public ServerTimeoutMonitor(Configuration conf) {
+      this.shortTimeout = Math.max(2000,
+          2 * conf.getInt("hbase.regionserver.msginterval",
+                          HConstants.REGION_SERVER_MSG_INTERVAL));
+      this.timeout =
+          conf.getInt("hbase.region.server.missed.report.timeout", 10000);
+      if (shortTimeout > timeout) {
+        timeout = shortTimeout;
+      }
+      // XXX what should be the default value of maxServersToExpirePerRack? We
+      // could set it to the max number of servers that will go down in a rack
+      // if a 'line card' in a rack switch were to go bad.
+      maxServersToExpirePerRack =
+          conf.getInt("hbase.region.server.missed.report.max.expired.per.rack",
+                      1);
+      LOG.info("hbase.region.server.missed.report.max.expired.per.rack=" +
+          maxServersToExpirePerRack);
+      LOG.info("hbase.region.server.missed.report.timeout=" +
+          timeout + "ms shortTimeout=" + shortTimeout + "ms");
+    }
+
+    @Override
+    public void run() {
+      try {
+        while (true) {
+          boolean waitingForMoreServersInRackToTimeOut =
+              expireTimedOutServers(timeout, maxServersToExpirePerRack);
+          if (waitingForMoreServersInRackToTimeOut) {
+            sleep(shortTimeout/2);
+          } else {
+            sleep(timeout/2);
+          }
+        }
+      } catch (Exception e) {
+        // even InterruptedException is unexpected
+        LOG.fatal("ServerManager Timeout Monitor thread, unexpected exception",
+            e);
+      }
+      return;
+    }
+  }
+
+  private long lastDetailedLogAt = 0;
+  private long lastLoggedServerCount = 0;
+  private HashSet<String> inaccessibleRacks = new HashSet<String>();
+  Rack rackInfo;
+  /**
+   * @param timeout
+   * @param maxServersToExpire If more than these many servers expire in a rack
+   * then do nothing. Most likely there is something wrong with rack
+   * connectivity. Wait for rack to recover or wait for region servers to
+   * lose their zk sessions (which typically should have a much longer
+   * timeout)
+   * @return true if servers timed out but were not expired because
+   * we would like to wait and see whether more servers in the rack time out or
+   * not
+   */
+  boolean expireTimedOutServers(long timeout, int maxServersToExpire) {
+    long curTime = EnvironmentEdgeManager.currentTimeMillis();
+    boolean waitingForMoreServersInRackToTimeOut = false;
+    boolean reportDetails = false;
+    if ((curTime > lastDetailedLogAt + (3600 * 1000)) ||
+        lastLoggedServerCount != serversToLoad.size()) {
+      reportDetails = true;
+      lastDetailedLogAt = curTime;
+      lastLoggedServerCount = serversToLoad.size();
+    }
+    // rack -> time of last report from rack
+    Map<String, Long> rackLastReportAtMap = new HashMap<String, Long>();
+    // rack -> list of timed out servers in rack
+    Map<String, List<HServerInfo>> rackTimedOutServersMap =
+        new HashMap<String, List<HServerInfo>>();
+    for (Map.Entry<String, HServerLoad> e : this.serversToLoad.entrySet()) {
+      HServerInfo si = this.serversToServerInfo.get(e.getKey());
+      if (si == null) continue; // server removed
+      HServerLoad load = e.getValue();
+      String rack =
+          rackInfo.getRack(si.getServerAddress().getInetSocketAddress());
+      long timeOfLastPingFromThisServer = load.lastLoadRefreshTime;
+      if (timeOfLastPingFromThisServer <= 0 ) {
+        // invalid value implies that the master has discovered the rs
+        // but hasn't yet had the first report from the rs. It is usually
+        // in the master failover path. It might be a while before the rs
+        // discovers the new master and starts reporting to the new master
+        continue;
+      }
+      Long timeOfLastPingFromThisRack = rackLastReportAtMap.get(rack);
+      if (timeOfLastPingFromThisRack == null ||
+          (timeOfLastPingFromThisServer > timeOfLastPingFromThisRack)) {
+        // this is ok to do even if load.missedLastLoadReport is true
+        rackLastReportAtMap.put(rack, timeOfLastPingFromThisServer);
+      }
+      if (curTime <= timeOfLastPingFromThisServer + timeout) {
+        if (reportDetails) {
+          LOG.debug("rack=" + rack + ", recently heard from server=" +
+              si.getServerName());
+        }
+        if (load.missedLastLoadReport) {
+          waitingForMoreServersInRackToTimeOut = true;
+        }
+        continue; // not expired
+      }
+      List<HServerInfo> timedOutServersInThisRack =
+          rackTimedOutServersMap.get(rack);
+      if (timedOutServersInThisRack == null) {
+        timedOutServersInThisRack = new ArrayList<HServerInfo>();
+        rackTimedOutServersMap.put(rack, timedOutServersInThisRack);
+      }
+      timedOutServersInThisRack.add(si);
+    }
+    // In rackTimedOutServersMap[rack] we have all the expired servers in the
+    // rack
+    // In rackLastReportAtMap[rack] we have the time when the last report from
+    // this rack was received
+    for (Map.Entry<String, List<HServerInfo>> e:
+      rackTimedOutServersMap.entrySet()) {
+      String rack = e.getKey();
+      List<HServerInfo> timedOutServers = e.getValue();
+      if (timedOutServers.size() > maxServersToExpire) {
+        // Too many servers timed out in this rack. We expect something is
+        // wrong with the rack and not with the servers. We will not expire
+        // these servers. We will pretend as if they never timed out - set the
+        // load.missedLastReport to false. We could have also reset the
+        // load.lastLoadRefreshTime but we don't do that
+        for (HServerInfo si : timedOutServers) {
+          HServerLoad load = serversToLoad.get(si.getServerName());
+          if (load == null) continue; //server vanished
+          load.missedLastLoadReport = false;
+        }
+        if (!inaccessibleRacks.contains(rack)) {
+          inaccessibleRacks.add(rack);
+          LOG.info("Too many servers count=" + timedOutServers.size() +
+              " timed out in rack " + rack + ". . Timed out Servers = " +
+              timedOutServers + ". Not expiring these servers, hoping for rack" +
+              " to become accessible");
+        }
+        continue; // next rack
+      }
+      if (inaccessibleRacks.contains(rack)) {
+        LOG.info("rack " + rack + " has become accessible");
+      }
+      inaccessibleRacks.remove(rack);
+      long lastHeardFromRackAt = rackLastReportAtMap.get(rack);
+      for (HServerInfo si : timedOutServers) {
+        HServerLoad load = serversToLoad.get(si.getServerName());
+        if (load == null) continue; //server vanished
+        if (load.missedLastLoadReport) {
+          LOG.warn("Expiring Server " + si + " because of missed load reports");
+          // Since the server load is fetched again, therefore it might have
+          // changed since we last read it. We do look at
+          // load.missedLastLoadReport one more time but that isn't enough
+          // guarantee that we will not expire a server that has just reported.
+          this.expireServer(si);
+        } else {
+          // wait for some more time to make sure that no other server in the
+          // rack becomes inaccessible. Also advance the timeout to the timeout
+          // of the last server from the rack to send a report
+          LOG.debug("Server " + si + " timed out. Will wait " +
+          (lastHeardFromRackAt + timeout - curTime) + "ms for others in" +
+              " rack before expiring it");
+          load.missedLastLoadReport = true;
+          load.lastLoadRefreshTime = lastHeardFromRackAt;
+          waitingForMoreServersInRackToTimeOut = true;
+        }
+      }
+    }
+    return waitingForMoreServersInRackToTimeOut;
+  }
 }

Modified: hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
URL: http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java?rev=1301784&r1=1301783&r2=1301784&view=diff
==============================================================================
--- hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
(original)
+++ hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
Fri Mar 16 22:05:01 2012
@@ -313,7 +313,8 @@ public class HRegionServer implements HR
     this.numRetries =  conf.getInt("hbase.client.retries.number", 2);
     this.threadWakeFrequency = conf.getInt(HConstants.THREAD_WAKE_FREQUENCY,
         10 * 1000);
-    this.msgInterval = conf.getInt("hbase.regionserver.msginterval", 1 * 1000);
+    this.msgInterval = conf.getInt("hbase.regionserver.msginterval",
+        HConstants.REGION_SERVER_MSG_INTERVAL);
 
     sleeper = new Sleeper(this.msgInterval, this.stopRequested);
 

Added: hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/Rack.java
URL: http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/Rack.java?rev=1301784&view=auto
==============================================================================
--- hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/Rack.java (added)
+++ hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/Rack.java Fri Mar 16
22:05:01 2012
@@ -0,0 +1,43 @@
+package org.apache.hadoop.hbase.util;
+
+import java.lang.reflect.Constructor;
+import java.net.InetSocketAddress;
+import java.util.Arrays;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hbase.regionserver.wal.HLog;
+import org.apache.hadoop.net.DNSToSwitchMapping;
+import org.apache.hadoop.net.IPv4AddressTruncationMapping;
+
+public class Rack {
+  static final Log LOG = LogFactory.getLog(Rack.class);
+  private DNSToSwitchMapping switchMapping;
+  public Rack(Configuration conf) {
+    Class<DNSToSwitchMapping> clz = (Class<DNSToSwitchMapping>)
+        conf.getClass("hbase.util.ip.to.rack.determiner",
+        IPv4AddressTruncationMapping.class);
+    try {
+      switchMapping = clz.newInstance();
+    } catch (InstantiationException e) {
+      LOG.warn("using IPv4AddressTruncationMapping, failed to instantiate " +
+          clz.getName(), e);
+    } catch (IllegalAccessException e) {
+      LOG.warn("using IPv4AddressTruncationMapping, failed to instantiate " +
+          clz.getName(), e);
+    }
+    if (switchMapping == null) {
+      switchMapping = new IPv4AddressTruncationMapping();
+    }
+  }
+
+  public String getRack(InetSocketAddress addr) {
+    String rack = switchMapping.resolve(Arrays.asList(
+        new String[]{addr.getAddress().getHostAddress()})).get(0);
+    if (rack != null && rack.length() > 0) {
+      return rack;
+    }
+    return "unknown";
+  }
+}


