From: "Eric Newton (JIRA)"
To: notifications@accumulo.apache.org
Reply-To: jira@apache.org
Date: Tue, 29 Dec 2015 16:17:49 +0000 (UTC)
Subject: [jira] [Commented] (ACCUMULO-4092) metadata table corruption on recovery

    [ https://issues.apache.org/jira/browse/ACCUMULO-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074041#comment-15074041 ]

Eric Newton commented on ACCUMULO-4092:
---------------------------------------

I talked to [~kturner], and he pointed out one really good way to detect this early: use conditional mutations to verify the state of the metadata table before making updates.
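To make the conditional-mutation idea concrete, here is a minimal sketch of "verify the state, then update". It is an in-memory model only: `MetadataTable`, `conditional_update`, and `ConditionFailed` are hypothetical names made up for illustration, and none of this is Accumulo's client API or the tserver's actual update path. In a real system the comparison and the write would have to happen atomically on the server that hosts the metadata tablet.

```python
from dataclasses import dataclass, field


class ConditionFailed(Exception):
    """Raised when the stored row does not match what the writer expected."""


@dataclass
class MetadataTable:
    # Hypothetical stand-in for the metadata table: maps a tablet id to the
    # set of file references currently recorded for that tablet.
    rows: dict = field(default_factory=dict)

    def conditional_update(self, tablet, expected_files, new_files):
        """Replace the tablet's file set only if it still equals expected_files."""
        current = self.rows.get(tablet, frozenset())
        if current != frozenset(expected_files):
            # The table has diverged from the writer's view. Reject the
            # update so the inconsistency is reported when it first appears.
            raise ConditionFailed(
                f"{tablet}: expected {sorted(expected_files)}, found {sorted(current)}"
            )
        self.rows[tablet] = frozenset(new_files)


# Usage: a writer that believes the tablet holds f1 and f2 swaps in f3.
table = MetadataTable({"tablet-1": frozenset({"f1.rf", "f2.rf"})})
table.conditional_update("tablet-1", {"f1.rf", "f2.rf"}, {"f3.rf"})
```

The point of the design is that a stale or corrupted row makes the very next update fail loudly, rather than being discovered at unload time long after the relevant WALs have been garbage collected.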
> metadata table corruption on recovery
> -------------------------------------
>
>                 Key: ACCUMULO-4092
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4092
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.6.4
>         Environment: large production system, 1.6.2 with local patches, hadoop 2.2
>            Reporter: Eric Newton
>
> I suspect that we are getting metadata table corruption on WAL recovery. There have been several hints that this has occurred over the past 2 years, but I have not had strong evidence for it until today.
> A large production cluster was recently upgraded to 1.6.4. Upon shutdown, it had several consistency check failures.
> When a tablet is unloaded, it double-checks the entries for the tablet held in memory against the metadata for the tablet. When the production system was restarted for the upgrade, this check failed for several tablets. In particular, there were file references for the tablet, that did not exist in memory.
> This particular system has a very large table which is organized by date. Almost all of the tablets that failed the check occurred on the same date. If the metadata tablet for those tablets was recovered on that date, and there is some bug recovering the WAL entries, they would have affected multiple tablets on the same day.
> After searching around the logs, we did find that the metadata tablet for the corrupt tablets did experience a recovery on the date in question. Unfortunately, the WAL files were GC'd many weeks ago.
> We need more information to track down the bug. Some possible ways to get this information include:
> 1) add periodic consistency checks: It's simple, and would detect problems earlier. In a test environment, we might be able to keep all the archived WALs.
> 2) upon metadata tablet recovery, the master could issue a request for consistency checks for the affected tablets. If checks fail, the recovery logs could be archived.
> 3) add metadata splits to the long-running tests which would add many more metadata tablet recoveries
> I suspect the bug is subtle, and may not cause data loss, since we don't see data loss in continuous ingest tests. But that doesn't mean that deleted data isn't being returned to a table, since the CI test does not delete data.
> The uptime for this system is measured in months and includes several hundred nodes. The metadata tablet is spread over most of the cluster.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
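The unload-time check described in the issue (in-memory file entries compared against the tablet's metadata entries) could also be run periodically, as option 1 suggests. A minimal sketch of the comparison itself follows. The function name `check_tablet_files` and the file names are hypothetical, and this is not the tserver's actual check.

```python
def check_tablet_files(in_memory_files, metadata_files):
    """Compare the files a tablet holds in memory against its metadata entries.

    Returns (only_in_metadata, only_in_memory) as sorted lists.
    Two empty lists mean the two views agree.
    """
    in_memory = set(in_memory_files)
    metadata = set(metadata_files)
    return sorted(metadata - in_memory), sorted(in_memory - metadata)


# Usage: the failure mode reported in the issue, where the metadata table
# references a file that the tablet does not hold in memory.
only_in_metadata, only_in_memory = check_tablet_files(
    in_memory_files=["A0001.rf", "A0002.rf"],
    metadata_files=["A0001.rf", "A0002.rf", "A0003.rf"],
)
print(only_in_metadata)  # ['A0003.rf']
print(only_in_memory)    # []
```

Reporting the two differences separately keeps the direction of the mismatch visible, which matters here because the issue only observed extra references on the metadata side.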