accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ACCUMULO-4000) log recovery failed after hard reset
Date Wed, 16 Sep 2015 15:40:45 GMT
Eric Newton created ACCUMULO-4000:
-------------------------------------

             Summary: log recovery failed after hard reset
                 Key: ACCUMULO-4000
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4000
             Project: Accumulo
          Issue Type: Bug
         Environment: very large cluster, accumulo 1.6.2, hadoop 2.5.0 (cdh 5.3)
            Reporter: Eric Newton
            Assignee: Eric Newton


A hardware failure occurred on a single node within a large cluster.  Tablets were migrated
away, but one tablet would not recover.  The Closer run by the master to release the write
lease on the WAL failed repeatedly.

Afterwards, it was determined that the file was small, probably just opened and in use at the
moment the machine failed.  The block could not be recovered from any replica.

One question raised: does the write pipeline acknowledge the sync before the pipeline
completes?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
