accumulo-notifications mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [accumulo] jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers
Date Wed, 02 Oct 2019 14:58:38 GMT
jzgithub1 commented on issue #1225: Use fewer ZooKeeper watchers
URL: https://github.com/apache/accumulo/issues/1225#issuecomment-537533690
 
 
   I have implemented what @ctubbsii recommended at the top of the ticket, but it is not polished
yet (even though it seems to run well), so I will not open a pull request at this time.
   
   What I learned along the way by trace-debugging the ZooCache.get(zPath) function and the process
function in ZCacheWatcher (and other parts of the code) pretty much convinces me that
@ctubbsii's idea is a sound solution. The main reason is that it will prevent the
removal from the ZooCache of all of the table configurations for all of the tables during a
"createtable", "deletetable", or "clonetable" operation. The TServers wipe out ZNodes
from the cache that start with "/accumulo/{INSTANCE_ID}/tables" during those
operations (this occurs for important reasons that I will try to understand more thoroughly).
Then, when ZooCache.get(path) is called for a configuration path that should not have been deleted,
the value is not in the ZooCache and has to be refreshed by a new call to ZooKeeper's exists
and then getData. If you have a lot of tables over many TServers doing a lot of table
adds/deletes, this will burden ZooKeeper.
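
   As a rough illustration of the cost described above (a standalone model, not the actual
ZooCache code), the sketch below evicts cached entries by path prefix and counts the simulated
exists + getData round trips needed to repopulate them. The class names, the in-memory map
standing in for ZooKeeper, and the example property path are all assumptions made for the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a ZooKeeper-backed cache; this is NOT the real ZooCache.
class PrefixEvictingCache {
  private final Map<String, String> backingStore; // simulates data held in ZooKeeper
  private final Map<String, String> cache = new HashMap<>();
  int remoteCalls = 0; // counts simulated exists + getData round trips

  PrefixEvictingCache(Map<String, String> backingStore) {
    this.backingStore = backingStore;
  }

  // Serve from the cache when possible; otherwise pay for exists + getData.
  String get(String path) {
    if (cache.containsKey(path)) {
      return cache.get(path);
    }
    remoteCalls++; // simulated exists()
    if (!backingStore.containsKey(path)) {
      return null;
    }
    remoteCalls++; // simulated getData()
    String value = backingStore.get(path);
    cache.put(path, value);
    return value;
  }

  // Drop every cached entry under the prefix, as happens on a table create/delete/clone.
  void evictPrefix(String prefix) {
    cache.keySet().removeIf(p -> p.startsWith(prefix));
  }
}

class PrefixEvictionDemo {
  public static void main(String[] args) {
    Map<String, String> zk = new HashMap<>();
    // Illustrative path and value only.
    String conf = "/accumulo/INSTANCE_ID/tables/1/conf/table.split.threshold";
    zk.put(conf, "1G");

    PrefixEvictingCache c = new PrefixEvictingCache(zk);
    c.get(conf); // first read: 2 remote calls
    c.get(conf); // cache hit: no remote calls
    c.evictPrefix("/accumulo/INSTANCE_ID/tables");
    c.get(conf); // entry was wiped, so it is re-fetched: 2 more remote calls
    System.out.println("remote calls: " + c.remoteCalls); // prints "remote calls: 4"
  }
}
```

   Multiplied across many tables and many TServers, each prefix eviction turns into a burst of
such round trips, which is the load being avoided.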
   
   I moved the table configurations out of the aforementioned ZooKeeper path to /accumulo/{INSTANCE_ID}/table_configs/table/{TABLE_ID}/conf
to prevent the erasure and re-fetching of table configurations from ZooKeeper.
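
   To show why the relocated path escapes the wipe, here is a small sketch with hypothetical
helper names (not Accumulo's API). It assumes the existing layout keeps a table's configuration
under /accumulo/{INSTANCE_ID}/tables/{TABLE_ID}/conf and that the eviction is a plain
string-prefix match on "/accumulo/{INSTANCE_ID}/tables".

```java
// Hypothetical helper; method names are illustrative and not part of Accumulo.
class TableConfigPaths {
  static String tablesRoot(String instanceId) {
    return "/accumulo/" + instanceId + "/tables";
  }

  // Assumed existing layout: configuration lives under the tables root.
  static String oldConfPath(String instanceId, String tableId) {
    return tablesRoot(instanceId) + "/" + tableId + "/conf";
  }

  // Relocated layout described in this comment.
  static String newConfPath(String instanceId, String tableId) {
    return "/accumulo/" + instanceId + "/table_configs/table/" + tableId + "/conf";
  }

  // True if a prefix-based wipe of the tables root would remove this path.
  static boolean wouldBeEvicted(String path, String instanceId) {
    return path.startsWith(tablesRoot(instanceId));
  }
}

class TableConfigPathsDemo {
  public static void main(String[] args) {
    String id = "INSTANCE_ID";
    System.out.println("old path evicted: "
        + TableConfigPaths.wouldBeEvicted(TableConfigPaths.oldConfPath(id, "1"), id)); // true
    System.out.println("new path evicted: "
        + TableConfigPaths.wouldBeEvicted(TableConfigPaths.newConfPath(id, "1"), id)); // false
  }
}
```

   Note that "table_configs" does not begin with "tables", so the prefix match skips it.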
   
   I can see in the trace debug that calls to get the table configurations are using the new
ZNode path and that they are consistently retrieved from the ZooCache without contacting ZooKeeper
again. This is a good thing. The present code pulls from the cache too, but the cache has to be
refreshed again if the "/accumulo/{INSTANCE_ID}/tables" path has been wiped out in the "NodeDeleted"
case in ZCacheWatcher.process.
   Even though a Watcher is still placed on each table configuration item in my code, it's
usually a one-time event. I don't think that putting a watcher on the ZNode path "/accumulo/{INSTANCE_ID}/table_configs/"
is required. We would have to implement a NodeChildrenChanged case in the ZCacheWatcher.process
function (I have done this in one of my branches), which is burdensome on ZooKeeper. Maybe
just calling ZooCache.get(event.getPath()) in the NodeDataChanged case of process would
be more efficient.
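
   The trade-off between the two cases can be sketched as follows. This is a simplified model,
not ZCacheWatcher itself: the enum only mirrors the names of ZooKeeper's event types so the
example stands alone, the map stands in for ZooKeeper, and the read counter is an assumption
used to compare the work each case triggers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a cache watcher; NOT the real ZCacheWatcher.
class WatcherSketch {
  // Mirrors the names of ZooKeeper's event types; defined locally for a standalone sketch.
  enum EventType { NodeDataChanged, NodeChildrenChanged, NodeDeleted }

  final Map<String, String> store; // simulates ZooKeeper
  final Map<String, String> cache = new HashMap<>();
  int reads = 0; // simulated reads against ZooKeeper

  WatcherSketch(Map<String, String> store) {
    this.store = store;
  }

  void process(EventType type, String path) {
    switch (type) {
      case NodeDataChanged: {
        // Refresh only the node that changed: one read.
        refresh(path);
        break;
      }
      case NodeChildrenChanged: {
        // Re-read everything under the parent: one read per descendant.
        String prefix = path.endsWith("/") ? path : path + "/";
        for (String p : store.keySet()) {
          if (p.startsWith(prefix)) {
            refresh(p);
          }
        }
        break;
      }
      case NodeDeleted: {
        cache.remove(path);
        break;
      }
    }
  }

  private void refresh(String path) {
    reads++;
    cache.put(path, store.get(path));
  }
}

class WatcherSketchDemo {
  public static void main(String[] args) {
    String root = "/accumulo/INSTANCE_ID/table_configs";
    Map<String, String> zk = new HashMap<>();
    zk.put(root + "/table/1/conf", "a");
    zk.put(root + "/table/2/conf", "b");
    zk.put(root + "/table/3/conf", "c");

    WatcherSketch w = new WatcherSketch(zk);
    w.process(WatcherSketch.EventType.NodeDataChanged, root + "/table/1/conf");
    System.out.println("reads after data change: " + w.reads); // 1
    w.process(WatcherSketch.EventType.NodeChildrenChanged, root);
    System.out.println("reads after children change: " + w.reads); // 4
  }
}
```

   In this model the per-node refresh costs one read, while the parent-level case scales with
the number of table configurations.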
   
   I have re-set these new ZNode paths inside the ZooKeeper CLI and seen them update in
the ZooCache just fine with trace debugging. In addition, I have run accumulo-testing's createtable
and ingest, and everything works fine. Cloning the ingested table works too. I will
ask @ivakegg to take a look at my solution sometime this week if he has time.
   
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
