accumulo-notifications mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] keith-turner opened a new issue #537: Recovery of WAL may see an incomplete set of logs
Date Thu, 21 Jun 2018 18:08:51 GMT
keith-turner opened a new issue #537: Recovery of WAL may see an incomplete set of logs
URL: https://github.com/apache/accumulo/issues/537
 
 
   Tablet servers track the active set of WALs (write-ahead logs) in ZooKeeper.  When a tablet
server dies, all WALs listed in ZooKeeper are used for recovery.  Tablet servers determine
which write-ahead logs are active based on which tablets reference WALs.  If a tablet server
allocates three WALs over time (W1, W2, and W3), then it's possible that tablets only reference
W1 and W3.  If that tablet server dies, then only W1 and W3 would be used for recovery.  However,
W2 may contain information that is important to some tablets.  Consider the following data.
   
    * Data in W1 :
      * Mutation for tablet T1 setting rowX:colY=valZ
    * Data in W2 :
      * Mutation for tablet T1 deleting rowX:colY
      * Start Minor Compaction event for T1
      * Finish Minor Compaction event for T1
    * Data in W3 :
      * Other data unrelated to T1
   
   So if the tablet server dies and only W1 and W3 are used for recovery, then tablet T1 will
bring back the deleted rowX:colY.  It does this because it does not see the data in W2 during
recovery.  If the data in W2 was seen during recovery, then the tablet would know it had minor
compacted and no data needed to be recovered.
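
   The scenario can be sketched as a toy simulation. This is only an illustration, not Accumulo's recovery code: the event tuples, the `recover` function, and the unrelated tablet name `T9` are hypothetical, and it assumes the finish event in W2 belongs to T1 and that a finished minor compaction means every T1 mutation logged before its start event is already persisted.

```python
# Toy model of the W1/W2/W3 scenario; not Accumulo's actual recovery logic.
# All names and the event layout here are hypothetical.

W1 = [("mutation", "T1", "put", "rowX:colY", "valZ")]
W2 = [
    ("mutation", "T1", "delete", "rowX:colY", None),
    ("compaction_start", "T1"),
    ("compaction_finish", "T1"),
]
W3 = [("mutation", "T9", "put", "rowA:colB", "valC")]  # unrelated to T1


def recover(tablet, logs):
    """Return the data replayed for `tablet` from the given logs, in order."""
    pending = []       # mutations not yet covered by a finished minor compaction
    start_mark = None  # number of pending mutations when a compaction started
    for log in logs:
        for event in log:
            if event[1] != tablet:
                continue
            kind = event[0]
            if kind == "mutation":
                pending.append(event[2:])  # (op, key, value)
            elif kind == "compaction_start":
                start_mark = len(pending)
            elif kind == "compaction_finish" and start_mark is not None:
                # Mutations logged before the start event are assumed persisted.
                pending = pending[start_mark:]
                start_mark = None

    recovered = {}
    for op, key, value in pending:
        if op == "put":
            recovered[key] = value
        else:
            recovered.pop(key, None)
    return recovered


print(recover("T1", [W1, W2, W3]))  # all three logs are seen
print(recover("T1", [W1, W3]))      # W2 is missing from the recovery set
```

   With all three logs, the compaction events in W2 tell the model that nothing needs to be replayed for T1. With only W1 and W3, the original mutation from W1 is replayed and rowX:colY=valZ comes back.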
   
   Discovered this issue as a result of looking into and discussing #535 with @ctubbsii.
 This bug only impacts Accumulo 1.8.0 and later.  The bug is a result of the change in 1.8.0
to track WALs per tablet server instead of per tablet.  Before 1.8.0, the tablet T1 would
have had no WALs associated with it after minor compacting.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
