Date: Fri, 7 Sep 2012 12:20:07 +1100 (NCT)
From: "Christopher Tubbs (JIRA)"
To: dev@accumulo.apache.org
Message-ID: <1013684742.47908.1346980807809.JavaMail.jiratomcat@arcas>
In-Reply-To: <457052284.28479.1336159368123.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (ACCUMULO-575) Potential data loss when datanode fails immediately after minor compaction

     [ https://issues.apache.org/jira/browse/ACCUMULO-575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs updated ACCUMULO-575:
---------------------------------------
    Affects Version/s:     (was: 1.5.0)

> Potential data loss when datanode fails immediately after minor compaction
> --------------------------------------------------------------------------
>
>                 Key: ACCUMULO-575
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-575
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.4.1, 1.4.0
>            Reporter: John Vines
>             Fix For: 1.5.0
>
>
> So this one popped into my head a few days ago, and I've done some research.
>
> Context:
> 1. The in-memory map is written to an RFile.
> 2. Yadda yadda yadda, FSOutputStream.close() is called.
> 3. close() calls complete(), which does not return until dfs.replication.min is reached. dfs.replication.min defaults to 1 and is rarely changed.
> 4. We read the file back to make sure it was written correctly (this has probably been a mitigating factor in why we haven't run into this issue).
> 5. We write the file to the !METADATA table.
> 6. We write the minor compaction to the walog.
>
> If the datanode goes down after step 6 but before the file is replicated further, we will have data loss. The namenode will know the file is corrupt, but we cannot restore it automatically, because the walog already records the minor compaction as complete. Step 4 has probably provided enough of a time buffer to significantly decrease the chance of this happening.
>
> I have not explicitly tested this, but I want to validate the scenario by dropping a datanode in a multi-node system immediately after closing the FSOutputStream. If data can be lost this way, we may want to consider adding a wait between steps 4 and 5 that polls the namenode until replication reaches at least min(2, # of nodes).
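For illustration, here is a minimal sketch (not from the original issue or the Accumulo code base) of what the replication wait proposed in the last quoted paragraph might look like, assuming Hadoop's FileSystem.getFileBlockLocations() and BlockLocation.getHosts() are used to ask the namenode how many replicas of each block currently exist. The class and method names, poll interval, and timeout are hypothetical.

// Rough sketch only -- ReplicationWait, the poll interval, and the timeout are
// illustrative, not existing Accumulo code.
import java.io.IOException;

import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationWait {

  /**
   * Polls the namenode until every block of the freshly closed RFile reports at
   * least minReplicas locations, or until the timeout expires. Intended to run
   * between reading the file back (step 4) and updating !METADATA (step 5).
   */
  public static boolean waitForReplication(FileSystem fs, Path rfile, int minReplicas,
      long timeoutMillis) throws IOException, InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (System.currentTimeMillis() < deadline) {
      FileStatus status = fs.getFileStatus(rfile);
      // Each BlockLocation lists the datanodes currently holding a replica of that block.
      BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
      boolean allReplicated = true;
      for (BlockLocation block : blocks) {
        if (block.getHosts().length < minReplicas) {
          allReplicated = false;
          break;
        }
      }
      if (allReplicated)
        return true;
      Thread.sleep(250); // illustrative poll interval
    }
    return false; // caller decides whether to warn, retry, or fail the operation
  }
}

A caller would pass something like Math.min(2, number of datanodes) for minReplicas, so a single-node instance is not blocked forever, and would decide how to react if the timeout is reached.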
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira