accumulo-dev mailing list archives

From keith-turner <...@git.apache.org>
Subject [GitHub] accumulo issue #121: ACCUMULO-4353: Stabilize tablet assignment during trans...
Date Mon, 27 Jun 2016 18:30:23 GMT
Github user keith-turner commented on the issue:

    https://github.com/apache/accumulo/pull/121
  
    > If the tablet server returns before master notices it's gone, master will see it as a new empty tablet server.
    
    One possible beginning of a solution is to change the behavior of `Master.gatherTabletInformation()` to do the following:
    
     * Check whether the instance matches the tablet server it's talking to when getting status.  This could be done by adding the instance id to the returned Thrift struct.
     * If the instance id does not match, then add the tserver to `badServers`.  This will prevent balancing (not sure if it will have other undesired consequences).
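
A minimal sketch of the check described in the two bullets above, not the actual `Master` code. `StatusSketch`, `GatherSketch`, and the `instanceId` field are hypothetical names for illustration; the real Thrift status struct has no such field today, which is the point of the proposal. Treating a missing response as a bad server is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the status struct a tablet server returns. */
final class StatusSketch {
  // The proposed addition: the instance id of the server that answered.
  final String instanceId;

  StatusSketch(String instanceId) {
    this.instanceId = instanceId;
  }
}

final class GatherSketch {
  /**
   * expected: host:port -> instance id the master obtained from ZooKeeper.
   * reported: host:port -> status returned by whatever server answered there.
   * Servers whose reported instance id differs are added to badServers and
   * left out of the returned map, so they are not considered for balancing.
   */
  static Map<String, StatusSketch> gather(Map<String, String> expected,
      Map<String, StatusSketch> reported, Set<String> badServers) {
    Map<String, StatusSketch> result = new HashMap<>();
    for (Map.Entry<String, String> e : expected.entrySet()) {
      StatusSketch status = reported.get(e.getKey());
      // Assumption: no response at all is handled like a mismatch.
      if (status == null || !e.getValue().equals(status.instanceId)) {
        badServers.add(e.getKey());
        continue;
      }
      result.put(e.getKey(), status);
    }
    return result;
  }
}
```

With this shape, a tserver that restarted on the same host+port after the master read ZooKeeper reports a different instance id and lands in `badServers` instead of appearing as a new empty server.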
    
    This will detect a new tserver instance after obtaining info from ZooKeeper.  I am still looking at the code and thinking about what else may need to be done.
    
    @ShawnWalker offline you were wondering whether it was possible to have info about the same host+port twice in the master's memory.  I looked at how the current code works, and it attempts to prevent this.  Below is what I found:
     
     * The master uses `tserverStatus` for balancing, which is populated at [Master line 977](https://github.com/apache/accumulo/blob/97fdfc5912ce07615b9e85899675adc3d1b12578/server/master/src/main/java/org/apache/accumulo/master/Master.java#L977) by calling `gatherTabletInformation()`.
     * `gatherTabletInformation()` calls `LiveTServerSet.getCurrentServers()`, which returns `currentInstances`, which is updated by [LiveTServerSet.checkServer()](https://github.com/apache/accumulo/blob/97fdfc5912ce07615b9e85899675adc3d1b12578/server/base/src/main/java/org/apache/accumulo/server/master/LiveTServerSet.java#L273).
    
    The implementation of `checkServer()` uses the `current` map to ensure a tserver host+port only has one entry in `currentInstances`.  The `current` map key is based on the ZooKeeper lock name, which is host+port as set on [TabletServer line 2309](https://github.com/apache/accumulo/blob/97fdfc5912ce07615b9e85899675adc3d1b12578/server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java#L2309).
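
A simplified sketch of how keying by lock name keeps one entry per host+port. This is an illustration, not the real `LiveTServerSet`: `LiveSetSketch` is a hypothetical class, and plain strings stand in for the actual instance and info types.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class LiveSetSketch {
  // Lock name (host+port) -> session of the instance currently holding the lock.
  private final Map<String, String> current = new HashMap<>();
  // One "host+port[session]" string per live instance.
  private final Set<String> currentInstances = new HashSet<>();

  private static String instance(String lockName, String session) {
    return lockName + "[" + session + "]";
  }

  /** session == null means the lock for lockName is no longer held. */
  void checkServer(String lockName, String session) {
    String old = current.get(lockName);
    if (old != null && !old.equals(session)) {
      // The previous holder is gone or replaced: drop its entry first,
      // so the same host+port never appears twice.
      current.remove(lockName);
      currentInstances.remove(instance(lockName, old));
    }
    if (session != null && !session.equals(old)) {
      current.put(lockName, session);
      currentInstances.add(instance(lockName, session));
    }
  }

  Set<String> getCurrentServers() {
    return new HashSet<>(currentInstances);
  }
}
```

Because `current` is keyed by the lock name alone, a restarted server on the same host+port replaces the old entry rather than adding a second one.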
    


